Toward a unified approach to fitting loss models

1
Toward a unified approach to fitting loss models
  • Jacques Rioux and Stuart Klugman, for
    presentation at the IAC, Feb. 9, 2004

2
Handout/slides
  • E-mail me
  • Stuart.klugman_at_drake.edu

3
Overview
  • What problem is being addressed?
  • The general idea
  • The specific ideas
  • Models to consider
  • Recording the data
  • Representing the data
  • Testing a model
  • Selecting a model

4
The problem
  • Too many models
      • Two books, 26 distributions!
      • Can mix or splice to get even more
  • Data can be confusing
      • Deductibles, limits
  • Too many tests and plots
      • Chi-square, K-S, A-D, p-p, q-q, D

5
The general idea
  • Limited number of distributions
  • Standard way to present data
  • Retain flexibility on testing and selection

6
Distributions
  • Should be
      • Familiar
      • Few
      • Flexible

7
A few familiar distributions
  • Exponential
      • Only one parameter
  • Gamma
      • Two parameters, a mode if α > 1
  • Lognormal
      • Two parameters, a mode
  • Pareto
      • Two parameters, a heavy right tail

8
Flexible
  • Add flexibility by allowing mixtures
  • That is, $F(x) = a_1 F_1(x) + \cdots + a_k F_k(x)$,
    where $a_1 + \cdots + a_k = 1$ and all $a_j > 0$.
  • Some restrictions
      • Only the exponential can be used more than once.
      • Cannot use both the gamma and lognormal.
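A minimal sketch of a mixture cdf as a weighted sum of component cdfs, F(x) = a_1 F_1(x) + ... + a_k F_k(x). The function names, the weight notation a_j, the mean parameterization of the exponential, and all parameter values are assumptions made for illustration; none of them come from the slides.

```python
import math

def exponential_cdf(x, theta):
    """Exponential cdf with mean theta."""
    return 1.0 - math.exp(-x / theta) if x > 0 else 0.0

def lognormal_cdf(x, mu, sigma):
    """Lognormal cdf: Phi((ln x - mu) / sigma)."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def mixture_cdf(x, weights, component_cdfs):
    """F(x) = sum_j a_j F_j(x), with positive weights that sum to 1."""
    if len(weights) != len(component_cdfs):
        raise ValueError("one weight per component is required")
    if any(a <= 0 for a in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be positive and sum to 1")
    return sum(a * F(x) for a, F in zip(weights, component_cdfs))

# Hypothetical parameter values, chosen only for illustration.
components = [
    lambda x: lognormal_cdf(x, mu=6.0, sigma=1.0),
    lambda x: exponential_cdf(x, theta=2000.0),
]
weights = [0.6, 0.4]
print(mixture_cdf(1000.0, weights, components))
```

Passing the components as callables keeps the mixture code independent of which familiar distributions are combined.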

9
Why mixtures?
  • Allows different shape at beginning and end (e.g.
    mode from lognormal, tail from Pareto).
  • By using several exponentials, can have almost any
    tail weight (see Keatinge).

10
Estimating parameters
  • Use only maximum likelihood
      • Asymptotically optimal
      • Can be applied in all settings, regardless of the
        nature of the data
      • Likelihood value can be used to compare different
        models
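To illustrate how maximum likelihood handles deductibles and maximum payments, here is a minimal sketch for the one-parameter exponential model. It assumes each record is a (payment, deductible, maximum payment) tuple and that a payment equal to the maximum is right-censored; that record layout and the function names are hypothetical, not taken from the paper.

```python
import math

def exp_loglik(theta, records):
    """Loglikelihood of an exponential (mean theta) loss model.

    records: iterable of (payment, deductible, max_payment).
    Each loss is left-truncated at its deductible; a payment equal to
    max_payment is treated as right-censored (an assumption).
    """
    total = 0.0
    for y, d, u in records:
        log_sf_d = -d / theta              # log S(d), the truncation adjustment
        if y < u:                          # exact loss x = y + d: log f(x) - log S(d)
            x = y + d
            total += (-math.log(theta) - x / theta) - log_sf_d
        else:                              # censored: log S(d + u) - log S(d)
            total += (-(d + u) / theta) - log_sf_d
    return total

def exp_mle(records):
    """Closed-form maximizer: total payments / number of uncensored records."""
    uncensored = sum(1 for y, d, u in records if y < u)
    return sum(y for y, d, u in records) / uncensored
```

For the exponential the maximizer has a closed form because the distribution is memoryless; for the other distributions and for mixtures, the same loglikelihood structure would be maximized numerically.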

11
Representing the data
  • Why do we care?
      • Graphical tests require a graph of the empirical
        density or distribution function.
      • Hypothesis tests require the functions themselves.

12
What is the issue?
  • None if
      • All observations are discrete or grouped
      • No truncation or censoring
  • But if there is truncation or censoring,
      • For discrete data, the Kaplan-Meier product-limit
        estimator provides the empirical distribution
        function (and is the nonparametric MLE as well).
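A minimal sketch of the product-limit estimator with left truncation and right censoring. It assumes the common risk-set convention that an observation is at risk at y when its truncation point is below y and its recorded value is at least y; the record layout and function name are hypothetical.

```python
def kaplan_meier(records):
    """Product-limit estimate of the survival function.

    records: iterable of (value, truncation_point, is_censored).
    Returns a list of (y_j, S(y_j)) at each distinct uncensored value y_j.
    The estimate is conditional on exceeding the smallest truncation point.
    """
    records = list(records)
    event_points = sorted({x for x, d, censored in records if not censored})
    survival = 1.0
    steps = []
    for y in event_points:
        # Risk set: truncated below y and still under observation at y.
        at_risk = sum(1 for x, d, censored in records if d < y <= x)
        # Number of uncensored observations exactly at y.
        events = sum(1 for x, d, censored in records if not censored and x == y)
        survival *= 1.0 - events / at_risk
        steps.append((y, survival))
    return steps
```

The empirical distribution function is then one minus the returned survival values.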

13
Issue: grouped data
  • For grouped data,
      • If completely grouped, the histogram represents
        the pdf, the ogive the cdf.
      • If some observations are grouped and some are not,
        or there are multiple deductibles and limits, our
        suggestion is to replace the observations in each
        interval with that many equally spaced points.
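A minimal sketch of the replacement step. The slides do not say exactly where the equally spaced points are placed; this sketch assumes they sit at the midpoints of equal-width subintervals, and the function name is hypothetical.

```python
def spread_points(lower, upper, count):
    """Replace `count` grouped observations in (lower, upper) with
    `count` equally spaced points (midpoints of equal subintervals)."""
    if count <= 0:
        return []
    width = (upper - lower) / count
    return [lower + (k + 0.5) * width for k in range(count)]

print(spread_points(0.0, 100.0, 4))  # [12.5, 37.5, 62.5, 87.5]
```

The resulting points can then be treated as individual observations when building the empirical distribution function.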

14
Review
  • Given a data set, we have the following:
      • A way to represent the data.
      • A limited set of models to consider.
      • Parameter estimates for each model.
  • The remaining tasks are:
      • Decide which models are acceptable.
      • Decide which model to use.

15
Example
  • The paper has two examples; we will look only at
    the second one.
  • Data are individual payments, but the policies
    that produced them had different deductibles
    (100, 250, 500) and different maximum payments
    (1,000, 3,000, 5,000).
  • There are 100 observations.

16
Empirical cdf

17
Distribution function plot
  • Plot the empirical and model cdfs together. Note that,
    because the smallest deductible in this example
    is 100, the empirical cdf begins there.
  • To be comparable, the model cdf is calculated as
    $F^*(x) = \frac{F(x) - F(100)}{1 - F(100)}$ for $x \ge 100$.
18
Example model
  • All plots and tests that follow are for a mixture
    of a lognormal and exponential distribution. The
    parameters are

19
Distribution function plot

20
Confidence bands
  • It is possible to create 95% confidence bands.
    That is, we are 95% confident that the true
    distribution is completely within these bands.
  • Formulas adapted from Klein and Moeschberger with
    a modification for multiple truncation points
    (their formula allows only multiple censoring
    points).

21
CDF plot with bounds

22
Other CDF pictures
  • Any function of the cdf, such as the limited
    expected value, could be plotted.
  • The only one shown here is the difference plot:
    magnify the previous plot by plotting the
    difference of the two distribution functions.

23
CDF difference plot

24
Histogram plot
  • Plot a histogram of the data against the density
    function of the model.
  • For data that were not grouped, can use the
    empirical cdf to get cell probabilities.

25
Histogram plot

26
Hypothesis tests
  • Null: the model fits
  • Alternative: it doesn't
  • Three tests
      • Kolmogorov-Smirnov
      • Anderson-Darling
      • Chi-square

27
Kolmogorov-Smirnov
  • Test statistic is maximum difference between the
    empirical and model cdfs. Each difference is
    multiplied by a scaling factor related to the
    sample size at that point.
  • Critical values are way off when parameters are
    estimated from the data.
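A minimal sketch of the standard, unscaled K-S statistic for individual observations that share a single truncation point. The sample-size scaling factor mentioned on the slide is not specified in these slides, so it is deliberately left out; the function name and arguments are assumptions.

```python
def ks_statistic(sample, model_cdf, truncation=0.0):
    """Maximum absolute difference between empirical and model cdfs.

    The model cdf is conditioned on exceeding `truncation`:
    F*(x) = (F(x) - F(t)) / (1 - F(t)).
    The empirical cdf is checked just before and at each jump.
    """
    xs = sorted(sample)
    n = len(xs)
    f_t = model_cdf(truncation)
    d_max = 0.0
    for i, x in enumerate(xs):
        f_star = (model_cdf(x) - f_t) / (1.0 - f_t)
        d_max = max(d_max, abs(i / n - f_star), abs((i + 1) / n - f_star))
    return d_max

# Usage with a uniform(0, 1) model cdf, for illustration only.
uniform_cdf = lambda x: min(max(x, 0.0), 1.0)
print(ks_statistic([0.25, 0.5, 0.75], uniform_cdf))  # 0.25
```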

28
Anderson-Darling
  • Test statistic looks complex: an integral of the
    weighted squared difference between the two cdfs,
    where e is empirical and m is model.
  • The paper shows how to turn this into a sum.
  • More emphasis on fit in tails than for K-S test.
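For reference, the standard Anderson-Darling statistic for n complete (untruncated, uncensored) observations can be written as below. The subscripts e and m follow the slide's wording, but the exact notation is an assumption, and the paper's version for truncated and censored data may differ from this textbook form.

```latex
% Integral form: squared cdf difference, weighted most heavily in the tails.
A^2 = n \int_{-\infty}^{\infty}
      \frac{\left[ F_e(x) - F_m(x) \right]^2}{F_m(x)\left[ 1 - F_m(x) \right]}
      \, f_m(x) \, dx

% Equivalent sum over the ordered observations x_{(1)} \le \dots \le x_{(n)}.
A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1)
      \left[ \ln F_m\!\left(x_{(i)}\right)
           + \ln\!\left( 1 - F_m\!\left(x_{(n+1-i)}\right) \right) \right]
```

The weight 1 / (F_m(x)[1 - F_m(x)]) grows as F_m(x) approaches 0 or 1, which is why this test puts more emphasis on the tails than the K-S test does.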

29
Chi-square test
  • You have seen this one before.
  • It is the only one with an adjustment for
    estimating parameters.
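A minimal sketch of the chi-square statistic and its degrees-of-freedom adjustment: one degree of freedom is lost for each estimated parameter. The function names are hypothetical, and the p-value lookup is omitted because it needs a chi-square distribution function that is not in the Python standard library.

```python
def chi_square_statistic(observed, expected):
    """Sum over cells of (observed - expected)^2 / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def chi_square_df(num_cells, num_estimated_params):
    """Degrees of freedom: cells - 1 - number of estimated parameters."""
    return num_cells - 1 - num_estimated_params

# Illustrative counts only, not the data from the example.
print(chi_square_statistic([10, 20, 30], [20, 20, 20]))  # 10.0
print(chi_square_df(num_cells=6, num_estimated_params=3))  # 2
```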

30
Results
  • K-S: 0.5829
  • A-D: 0.2570
  • Chi-square: p-value of 0.5608
  • The model is clearly acceptable. Simulation
    study needed to get p-values for these tests.
    Simulation indicates that the p-values are over
    0.9.

31
Comparing models
  • Good picture
  • Better test numbers
  • Likelihood criterion such as Schwarz Bayesian.
    The SBC is the loglikelihood minus (r/2)ln(n)
    where r is the number of parameters and n is the
    sample size.
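A minimal sketch of the SBC as defined above, loglikelihood minus (r/2)ln(n), with larger values preferred. The loglikelihood values in the usage example are hypothetical and are not the results from the paper.

```python
import math

def sbc(loglik, num_params, sample_size):
    """Schwarz Bayesian criterion: loglik - (r / 2) * ln(n)."""
    return loglik - (num_params / 2.0) * math.log(sample_size)

# Hypothetical loglikelihoods for a sample of n = 100 observations.
candidates = {
    "exponential": (-750.0, 1),     # (loglikelihood, number of parameters)
    "lognormal/exp": (-746.0, 4),
}
scores = {name: sbc(ll, r, 100) for name, (ll, r) in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

With these made-up numbers the mixture has the higher loglikelihood, but the penalty for its three extra parameters leaves the exponential with the higher SBC.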

32
Several models

33
Which is the winner?
  • Referee A: loglikelihood rules, pick the
    gamma/exp/exp mixture
      • This is a world of one big model and the best is
        the best; simplicity is never an issue.
  • Referee B: SBC rules, pick the exponential
      • Parsimony is most important; pay a penalty for
        extra parameters.
  • Me: lognormal/exp. Great pictures, better
    numbers than the exponential, but simpler than the
    three-component mixture.

34
Can this be automated?
  • We are working on software
      • Test version can be downloaded at
        www.cbpa.drake.edu/mixfit.
      • MLEs are good. Pictures and test statistics are
        not quite right.
      • May crash.
  • Here is a quick demo.