Title: Combining GLM and data mining techniques
1Combining GLM and data mining techniques
- Greg Taylor
- Taylor Fry Consulting Actuaries
- University of Melbourne
- University of New South Wales
- Casualty Actuarial Society
- Special Interest Seminar on Predictive Modeling
- Boston, October 4-5 2006
2Overview
- Examine general form of model of claims data
- Examine the specific case of a GLM to represent the data
- Consider how the GLM structure is chosen
- Introduce and discuss Artificial Neural Networks (ANNs)
- Consider how these may assist in formulating a GLM
- Presentation draws heavily on work of colleague Dr Peter Mulquiney
3Model of claims data
- General form of claims data model
- $Y_i = f(X_i; \beta) + \varepsilon_i$
- $Y_i$ = some observation on claims experience
- $\beta$ = vector of parameters that apply to all observations
- $X_i$ = vector of attributes (covariates) of i-th observation
- $\varepsilon_i$ = vector of centred stochastic error terms
- Examples
- $Y_i = Y_{ad}$ = paid losses in (a,d) cell
- a = accident period
- d = development period
- $Y_i$ = cost of i-th completed claim
5Examples (contd)
- $Y_{ad}$ = paid losses in (a,d) cell
- $E[Y_{ad}] = \beta_d \sum_{r=1}^{d-1} Y_{ar}$ (chain ladder)
- $E[Y_{ad}] = A d^b \exp(-cd) = \exp[\alpha + \beta \ln d - \gamma d]$ (Hoerl curve for each accident period's payments)
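The chain ladder form above can be sketched in a few lines of Python. This is only an illustration under assumptions not in the slides: the triangle values are made up, the function names are hypothetical, and $\beta_d$ is estimated by a simple ratio of sums, which is one common choice rather than necessarily the one intended here.

```python
# Incremental paid losses: one row per accident period a, one column per
# development period d = 1, 2, ... (list index 0 holds d = 1).
# Later accident periods have fewer observed cells. Figures are invented.
triangle = [
    [100.0, 50.0, 25.0],
    [120.0, 60.0],
    [110.0],
]

def estimate_beta(triangle, d):
    """Assumed estimator: sum of Y_ad divided by the sum of cumulative paid
    to d-1, over accident periods where cell (a, d) is observed (d >= 2)."""
    numerator = 0.0
    denominator = 0.0
    for row in triangle:
        if len(row) >= d:
            numerator += row[d - 1]
            denominator += sum(row[: d - 1])
    return numerator / denominator

def forecast_next_cell(row, beta_d):
    """E[Y_ad] = beta_d * sum_{r=1}^{d-1} Y_ar, with d = len(row) + 1."""
    return beta_d * sum(row)

beta_2 = estimate_beta(triangle, 2)  # (50 + 60) / (100 + 120) = 0.5
beta_3 = estimate_beta(triangle, 3)  # 25 / (100 + 50)
next_for_third = forecast_next_cell(triangle[2], beta_2)   # 0.5 * 110 = 55.0
next_for_second = forecast_next_cell(triangle[1], beta_3)  # about 30
```

Each forecast depends only on the cumulative payments already observed for that accident period, which is what distinguishes the chain ladder from the Hoerl curve's parametric dependence on d.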
7Examples (contd)
- $Y_i$ = cost of i-th completed claim
- $Y_i \sim$ Gamma
- $E[Y_i] = \exp[\alpha + \beta t_i]$
- where
- $a_i$ = accident period to which i-th claim belongs
- $t_i$ = operational time at completion of i-th claim = proportion of claims from accident period $a_i$ completed before i-th claim
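Operational time is easiest to see by computing it. The sketch below assumes hypothetical claim records, no tied completion dates, and uses the number of claims in the sample as the denominator; in practice the denominator would more likely be an estimate of the ultimate number of claims for the accident period. The parameter values passed to `expected_cost` are invented.

```python
import math
from collections import defaultdict

# Hypothetical completed claims: (claim id, accident period, completion date index).
claims = [
    ("c1", 2001, 3),
    ("c2", 2001, 1),
    ("c3", 2001, 7),
    ("c4", 2001, 5),
    ("c5", 2002, 2),
    ("c6", 2002, 4),
]

def operational_times(claims):
    """Return {claim id: t_i}, where t_i is the proportion of claims from the
    same accident period that were completed before claim i."""
    by_period = defaultdict(list)
    for claim_id, period, completed in claims:
        by_period[period].append((completed, claim_id))
    result = {}
    for items in by_period.values():
        items.sort()  # order claims by completion date within the period
        n = len(items)
        for rank, (_, claim_id) in enumerate(items):
            result[claim_id] = rank / n
    return result

def expected_cost(alpha, beta, t):
    """Mean claim size E[Y_i] = exp(alpha + beta * t_i)."""
    return math.exp(alpha + beta * t)

t = operational_times(claims)
# With beta > 0, claims completed later in operational time have a larger mean.
early = expected_cost(9.0, 1.5, t["c1"])
late = expected_cost(9.0, 1.5, t["c3"])
```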
8Examples of individual claim models
- More generally
- $E[Y_i] = \exp\{\ldots\}$, where the exponent is the sum of
- a function of operational time
- a function of accident period (legislative change)
- a function of completion period (superimposed inflation)
- a joint function (interaction) of operational time and accident period (change in payment pattern attributable to legislative change)
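In symbols, this structure is a log-link linear predictor. The notation below is assumed for illustration and does not come from the slides: $f_1, \ldots, f_4$ are unspecified functions and $c_i$ denotes the completion period of the i-th claim.

```latex
\ln E[Y_i] = f_1(t_i) + f_2(a_i) + f_3(c_i) + f_4(t_i, a_i)
```

Under this reading, the earlier model $E[Y_i] = \exp[\alpha + \beta t_i]$ is the special case $f_1(t_i) = \alpha + \beta t_i$ with $f_2 = f_3 = f_4 = 0$.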
12Examples of individual claim models (contd)
- Models of this type may be very detailed
- May include
- Operational time effect (payment pattern)
- Seasonality
- Creeping change in payment pattern
- Abrupt change in payment pattern
- Accident period effect (legislative change)
- Completion quarter effect (superimposed inflation)
- Variations in superimposed inflation over time
- Variations of superimposed inflation with operational time
- etc
13Identification of data features
- Typically largely ad hoc, using
- Trial and error regressions
- Diagnostics, e.g. residual plots
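A numerical counterpart to a residual plot is to average the residuals within each level of a candidate predictor. The sketch below assumes relative residuals (observed minus fitted, divided by fitted) and uses invented figures; the function name and quarter labels are hypothetical.

```python
from collections import defaultdict

def mean_residual_by_group(observed, fitted, groups):
    """Average of (observed - fitted) / fitted within each group label."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for y, mu, g in zip(observed, fitted, groups):
        totals[g] += (y - mu) / mu
        counts[g] += 1
    return {g: totals[g] / counts[g] for g in totals}

# Invented data: a model that fits 100 everywhere, checked by accident quarter.
observed = [90.0, 110.0, 130.0, 150.0]
fitted = [100.0, 100.0, 100.0, 100.0]
accident_quarter = ["2004Q1", "2004Q1", "2004Q2", "2004Q2"]

trend = mean_residual_by_group(observed, fitted, accident_quarter)
# The 2004Q1 average is zero, while the 2004Q2 average is clearly positive,
# which would suggest an unmodelled accident quarter effect.
```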
14Identification of data features - illustration
- Modelling about 60,000 Auto Bodily Injury claims
- First fitting just an operational time effect
15Identification of data features - illustration
- But there appear to be unmodelled trends by
- Accident quarter
- Completion (finalisation) quarter
16Identification of data features - illustration
- Final model includes terms for
- Operational time
- Seasonality
- Claim frequency
- Decrease induces increased claim sizes
- Accident quarter
- Change in Scheme rules
- Change in operational time effect with change in Scheme rules
- Superimposed inflation
- Varying with operational time
17Identification of data features - alternative approach
- Final model is complex in structure
- Structure identified in ad hoc manner
- More rigorous approach desirable
- Try Artificial Neural Network (ANN)
- Essentially a form of non-linear regression
18(Feed-forward) ANN for regression problem Y = f(X)
- Start with vector of P inputs $X = \{x_p\}$, $p = 1, \ldots, P$
- Create hidden layer with M hidden units
- Make M linear combinations of inputs
- Linear combinations then passed through layer of activation functions $g(h_m)$
19ANN for Regression problem Y = f(X)
- Activation function
- Commonly a sigmoidal curve
- Function $g$ introduces non-linearity to model
- $g$ keeps response bounded
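The slide does not say which sigmoidal curve is used. One common choice, assumed here purely for illustration, is the logistic function:

```latex
g(h) = \frac{1}{1 + e^{-h}}, \qquad 0 < g(h) < 1 \ \text{for all real } h
```

It is increasing in $h$, nearly linear around $h = 0$, and flattens towards 0 and 1 for large negative and positive $h$, which is the bounded, non-linear behaviour described above.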
20ANN for Regression problem Y = f(X)
- Y is then given by a linear combination of the outputs from the hidden layer
- This function can describe any continuous function
- 2 hidden layers ⇒ ANN can describe any function
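The three steps (linear combinations, activation, linear output) can be sketched as a forward pass through a single hidden layer. The logistic activation, the intercept terms and all function and argument names are assumptions made for this illustration.

```python
import math

def logistic(h):
    """Assumed sigmoidal activation g(h) = 1 / (1 + exp(-h))."""
    return 1.0 / (1.0 + math.exp(-h))

def ann_forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Single-hidden-layer feed-forward network.

    x               list of P inputs
    hidden_weights  M lists of P weights (one list per hidden unit)
    hidden_biases   M intercepts for the hidden units
    output_weights  M weights applied to the hidden outputs
    output_bias     intercept of the output unit
    """
    hidden_outputs = []
    for w_m, b_m in zip(hidden_weights, hidden_biases):
        h_m = b_m + sum(w * x_p for w, x_p in zip(w_m, x))  # linear combination
        hidden_outputs.append(logistic(h_m))                # Z_m = g(h_m)
    # Output is a linear combination of the hidden-layer outputs.
    return output_bias + sum(W * z for W, z in zip(output_weights, hidden_outputs))

# With all hidden weights and biases zero, every Z_m equals g(0) = 0.5.
y = ann_forward(
    x=[1.0, 2.0],
    hidden_weights=[[0.0, 0.0], [0.0, 0.0]],
    hidden_biases=[0.0, 0.0],
    output_weights=[2.0, 4.0],
    output_bias=1.0,
)
```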
21Illustration of ANN
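[Figure: network diagram. Inputs Xi are combined with weights wm to give hm, passed through the activation g to give Zm, and combined with weights Wm to give the output.]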
22Training of ANN
- Weights are usually determined by minimising the least-squares error
- Weight decay penalty function stops overfitting
- Larger $\lambda$ ⇒ smaller weights
- Smaller weights ⇒ smoother fit
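The effect of the decay parameter can be seen on a deliberately tiny stand-in: a single linear weight rather than an ANN. Minimising the sum of squared errors of y against w·x plus a penalty of decay·w² has the closed-form solution used below. An actual ANN would be fitted numerically, and the data and function name here are invented.

```python
def fit_weight(xs, ys, decay):
    """Minimise sum((y - w*x)^2) + decay * w^2 over the single weight w.
    Setting the derivative to zero gives w = sum(x*y) / (sum(x^2) + decay)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + decay)

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]  # exactly y = 2x, so the unpenalised weight is 2

w_none = fit_weight(xs, ys, 0.0)     # 28 / 14 = 2.0
w_small = fit_weight(xs, ys, 0.05)   # slightly below 2
w_large = fit_weight(xs, ys, 14.0)   # 28 / 28 = 1.0
```

The fitted weight shrinks towards zero as the decay parameter grows, which is the mechanism behind the smoother fit.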
23Training of ANN - example
- Training data set: 70% of available data
- Test data set: 30% of available data
- Network structure
- Single hidden layer
- 20 units
- Weight decay $\lambda = 0.05$
- These tuning parameters determined by cross-validation
- Prediction error in test data set
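A 70/30 split of this kind can be sketched as follows. The slides do not say how the split was made, so random assignment with a fixed seed is an assumption, as are the function name and the placeholder records.

```python
import random

def train_test_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and cut it into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

claims = list(range(10))  # placeholder claim records
train, test = train_test_split(claims)
```

The model is fitted on `train` only, and the prediction error is then measured on `test`, which the fitting never saw.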
24Comparison of GLM and ANN
- GLM
- Average absolute error
- 33,777
- ANN
- Average absolute error
- 33,559
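For reference, an average absolute error is computed as in the sketch below. The figures are invented and are unrelated to the results quoted above; the function name is hypothetical.

```python
def average_absolute_error(actual, predicted):
    """Mean of |actual - predicted| over all observations."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 340.0]
error = average_absolute_error(actual, predicted)  # (10 + 10 + 40) / 3 = 20.0
```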
25GLM and ANN forecasts
- Both by simple extrapolation of trends here
- ANN case
- Development quarter 10 red
- Development quarter 20 green
- Development quarter 30 yellow
- Development quarter 40 blue
- Note negative superimposed inflation
- May be undesirable
- But ANN useful in searching out general form of past superimposed inflation
- Which can then be modelled explicitly in GLM
[Figure: ANN extrapolation]
27Application of ANN
- Generalisation of preceding remark
- ANN may be most useful as an automated tool for seeking out detailed trends in data
- Apply ANN to data set
- Study trends in fitted model against a range of predictors or pairs of predictors
- Use this knowledge to choose the functional forms of the terms included in the linear predictor of the GLM
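One way to study the trend of a fitted model against a single predictor is a partial-dependence-style profile: fix that predictor at each value on a grid, leave the other predictors at their observed values, and average the predictions. The slides do not name this technique, so it is offered only as one possible reading. `toy_model` is a hypothetical stand-in for a fitted ANN, and the rows are invented.

```python
def profile(model, rows, feature_index, grid):
    """For each grid value v, set the chosen feature to v in every row
    and average the model's predictions over the rows."""
    averages = []
    for v in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = v
            total += model(modified)
        averages.append(total / len(rows))
    return averages

def toy_model(row):
    """Stand-in for a fitted ANN: any callable mapping a row to a prediction."""
    operational_time, accident_quarter = row
    return 1000.0 * (1.0 + operational_time) + 10.0 * accident_quarter

rows = [(0.2, 1), (0.5, 2), (0.9, 3)]  # (operational time, accident quarter)
curve = profile(toy_model, rows, feature_index=0, grid=[0.0, 0.5, 1.0])
```

The shape of `curve` against the grid suggests a functional form for the corresponding GLM term. For pairs of predictors, the same idea applies on a two-dimensional grid.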
28Application of ANN (contd)
- Ultimate test of the GLM is to apply ANN to its residuals, seeking structure
- There should be none
- The example indicates that the chosen GLM structure may
- Over-estimate the more recent experience at the mid-ages of claim
- Under-estimate it at the older ages
29Conclusions
- GLMs provide a powerful and flexible family of models for claims data
- Complex GLM structures may be required for adequate representation of the data
- The identification of these may be difficult
- The identification procedures are likely to be ad hoc
- ANNs provide an alternative form of non-linear regression
- These are likely to involve their own shortcomings if left to stand on their own
- They may, however, provide considerable assistance if used in parallel with GLMs to identify GLM structure