Title: Validation of predictive regression models
1 Validation of predictive regression models
- Ewout W. Steyerberg, PhD
- Clinical epidemiologist
- Frank E. Harrell, PhD
- Biostatistician
2 Personal background
- Ewout Steyerberg: Erasmus MC, Rotterdam, the Netherlands
- Frank Harrell: Health Evaluation Sciences, Univ of Virginia, Charlottesville, VA, USA
- Validation of predictions from regression models is of paramount importance
3 Learning objectives: knowledge of
- common types of regression models
- fundamental assumptions of regression models
- performance criteria of predictive models
- principles of different types of validation
4 Performance objectives
- To be able to explain why validation is necessary for predictive models
- To be able to judge the adequacy of a validation procedure
5 Predictive models provide quantitative estimates of an outcome, e.g.
- Quality of life one year after surgery
- Death at 30 days after surgery
- Long term survival
6 Predictive models are often based on regression analysis
- $y = a + \sum_i b_i x_i$
- $y$: outcome variable
- $a$: intercept
- $b_i$: regression coefficient $i$
- $x_i$: predictor variable $i$
- $i$ in 1, ..., many; usually 2 to 20
7 Three examples of regression
- Quality of life one year after surgery
  - continuous outcome, linear regression
- Death at 30 days after surgery
  - binary outcome, logistic regression
- Long term survival
  - time-to-outcome, Cox regression
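The three examples share the same linear predictor and differ in how it is linked to the outcome. As a sketch in standard textbook notation (the link functions and the baseline hazard $h_0(t)$ are not spelled out on the slides and are added here as assumptions):

```latex
\begin{align*}
\text{linear:}   &\quad E[y] = a + \sum_i b_i x_i \\
\text{logistic:} &\quad \log\frac{P(y=1)}{1-P(y=1)} = a + \sum_i b_i x_i \\
\text{Cox:}      &\quad h(t \mid x) = h_0(t)\,\exp\Big(\sum_i b_i x_i\Big)
\end{align*}
```

In the Cox form there is no separate intercept $a$; it is absorbed into the baseline hazard.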
8 Predictive models make assumptions
- Distribution
- Linearity of continuous variables
- Additivity of effects
9 Example: a simple logistic regression model
- $\text{30-day mortality} = a + b_1 \cdot \text{sex} + b_2 \cdot \text{age}$
- Assumptions
  - Distribution of 30-day mortality is binomial
  - Age has a linear effect
  - The effects of sex and age can be added
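To make the example concrete, the following minimal sketch turns the linear predictor into a predicted probability. It assumes the formula is read on the logit scale, as is usual for logistic regression. The coefficient values and the 0/1 coding of sex are hypothetical and chosen only for illustration; they do not come from the slides.

```python
import math

def predicted_risk(intercept, coefs, x):
    """Inverse logit of the linear predictor a + sum(b_i * x_i)."""
    lp = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-lp))

# Hypothetical coefficients (assumptions, not estimates from any data set).
A = -7.0       # intercept a
B_SEX = 0.4    # b1, with sex coded 0/1
B_AGE = 0.07   # b2, per year of age

# Same sex, two different ages: additivity means the age effect
# on the logit scale does not depend on sex.
risk_70 = predicted_risk(A, [B_SEX, B_AGE], [1, 70])
risk_50 = predicted_risk(A, [B_SEX, B_AGE], [1, 50])
print(f"risk at 70: {risk_70:.3f}, risk at 50: {risk_50:.3f}")
```

With a positive age coefficient, the predicted risk rises with age, and the linearity assumption means each extra year adds the same amount on the logit scale.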
10 Assessing model assumptions
- Examine model residuals
- Perform specific tests
  - add nonlinear terms, e.g. $\text{age} + \text{age}^2$
  - add interaction terms, e.g. $\text{sex} \times \text{age}$
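One way to carry out such a test is to fit the model with and without the extra terms and compare the two fits. The sketch below builds the extended predictors and computes a likelihood ratio test p-value. The choice of a likelihood ratio test, the function names, and the log-likelihood values are assumptions for illustration; the slides do not name a specific test.

```python
import math

def expand_terms(sex, age):
    """Predictors for the extended model: sex, age, age^2, sex*age."""
    return [sex, age, age ** 2, sex * age]

def lr_test_p_value(loglik_base, loglik_extended, df):
    """Likelihood ratio test of an extended model against a nested base model.

    Uses closed forms of the chi-square tail for df = 1 or 2 only.
    """
    stat = 2.0 * (loglik_extended - loglik_base)
    if stat <= 0.0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or 2")

# Extended design rows for three hypothetical patients (sex coded 0/1).
rows = [expand_terms(s, a) for s, a in [(0, 60), (1, 60), (1, 75)]]

# Hypothetical log-likelihoods of the base and extended fits (2 added terms).
p_value = lr_test_p_value(-412.6, -410.1, df=2)
print(rows, round(p_value, 3))
```

A small p-value would suggest that the linearity or additivity assumption is violated and that the extra terms improve the fit.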
11 Model assumptions and predictions
- Better predictions if assumptions are met
- Some violation inherent in empirical data
- Evaluate predictions in new data
12 Evaluation of predictions
- Calibration
  - average of predictions correct?
  - low and high predictions correct?
- Discrimination
  - distinguish low risk from high risk patients?
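The three questions above can each be turned into a simple number. The sketch below is one possible set of measures for a binary outcome, on a small made-up data set: the difference between observed and mean predicted risk, observed versus predicted risk within a low and a high group, and the concordance (c) statistic, which for binary outcomes equals the area under the ROC curve. The function names and data are hypothetical.

```python
def calibration_in_the_large(y, p):
    """Observed event rate minus mean predicted probability (0 is ideal)."""
    return sum(y) / len(y) - sum(p) / len(p)

def calibration_by_group(y, p, n_groups=2):
    """(mean predicted, observed rate) per group of patients ordered by prediction."""
    pairs = sorted(zip(p, y))
    size = len(pairs) // n_groups
    groups = []
    for g in range(n_groups):
        start = g * size
        end = (g + 1) * size if g < n_groups - 1 else len(pairs)
        chunk = pairs[start:end]
        mean_pred = sum(pi for pi, _ in chunk) / len(chunk)
        obs_rate = sum(yi for _, yi in chunk) / len(chunk)
        groups.append((mean_pred, obs_rate))
    return groups

def c_statistic(y, p):
    """Share of (event, non-event) pairs in which the event has the higher
    prediction; tied predictions count as one half."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    concordant = 0.0
    for pe in events:
        for pn in non_events:
            if pe > pn:
                concordant += 1.0
            elif pe == pn:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))

# Made-up outcomes (1 = died) and predicted probabilities for 8 patients.
y_obs = [0, 0, 0, 1, 0, 1, 1, 1]
p_pred = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.9]

print(calibration_in_the_large(y_obs, p_pred))
print(calibration_by_group(y_obs, p_pred))
print(c_statistic(y_obs, p_pred))
```

A c statistic of 0.5 corresponds to no discrimination and 1.0 to perfect separation of patients with and without the outcome.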
13 Example: predicted probabilities
14 Three types of validation
- Apparent: performance on sample used to develop model
- Internal: performance on population underlying the sample
- External: performance on related but slightly different population
15 Apparent validity
- Easy to calculate
- Results in optimistic performance estimates
16 Apparent estimates optimistic since same data used for
- Definition of model structure, e.g. selection and coding of variables
- Estimation of model parameters, e.g. regression coefficients
- Evaluation of model performance, e.g. calibration and discrimination
17 Internal validity
- More difficult to calculate
- Test model in new data, drawn at random from the underlying population
18 Why internal validation?
- Honest estimate of performance should be obtained, at least for a population similar to the development sample
- Internally validated performance sets an upper limit to what may be expected in other settings (external validity)
19 External validity
- Moderately easy to calculate when new data are available
- Test model in new data, different from development population
20 Why external validation?
- Various factors may differ from development population, including
  - different selection of patients
  - different definitions of variables
  - different diagnostic or therapeutic procedures
21 Internal validation techniques
- Split-sample
  - development / validation
- Cross-validation
  - alternating development / validation
  - extreme case: develop on n-1 patients, validate on 1 (jack-knife)
- Bootstrap
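The first two techniques differ only in how patients are assigned to development and validation sets. The sketch below generates those assignments as index lists. The 50/50 split, the fold count, and the function names are assumptions for illustration; the slides do not prescribe them.

```python
import random

def split_sample(n, frac_dev=0.5, seed=0):
    """Random split of n patient indices into development and validation parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(n * frac_dev)
    return idx[:cut], idx[cut:]

def cross_validation_folds(n, k, seed=0):
    """k (development, validation) index pairs; every patient is validated once.

    Setting k = n gives the extreme case: develop on n-1, validate on 1.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [(sorted(set(idx) - set(fold)), sorted(fold)) for fold in folds]

dev, val = split_sample(10)
five_fold = cross_validation_folds(10, k=5)
leave_one_out = cross_validation_folds(6, k=6)
print(dev, val)
print(five_fold[0])
print(leave_one_out[0])
```

In each case the model is fitted on the development indices only and its performance is measured on the validation indices.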
22 Bootstrap is the preferred internal validation technique
- bootstrap sample for model development: n patients drawn with replacement
- original sample for validation: n patients
- difference: optimism
- efficiency: development and validation on n patients
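The bootstrap procedure can be sketched as follows: fit the model in each bootstrap sample, measure its performance in that sample and in the original sample, and average the difference. The code uses simulated data, a plain gradient-ascent logistic fit, and the c statistic as the performance measure. All of these, including the function names and the small number of replicates in the demo, are assumptions made to keep the example self-contained; they are not the authors' implementation.

```python
import math
import random

def fit_logistic(X, y, n_iter=100, lr=0.5):
    """Logistic regression by plain gradient ascent (illustrative only)."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the intercept
    n = len(y)
    for _ in range(n_iter):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            lp = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            err = yi - 1.0 / (1.0 + math.exp(-lp))
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

def predict(w, X):
    return [1.0 / (1.0 + math.exp(-(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))))
            for xi in X]

def c_statistic(y, p):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    ev = [pi for yi, pi in zip(y, p) if yi == 1]
    ne = [pi for yi, pi in zip(y, p) if yi == 0]
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in ev for b in ne)
    return score / (len(ev) * len(ne))

def bootstrap_optimism(X, y, n_boot=200, seed=0):
    """Return (apparent, mean optimism, optimism-corrected) c statistic."""
    rng = random.Random(seed)
    n = len(y)
    apparent = c_statistic(y, predict(fit_logistic(X, y), X))
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # n patients, with replacement
        Xb, yb = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # redraw if the sample lacks events or non-events
        wb = fit_logistic(Xb, yb)
        perf_boot = c_statistic(yb, predict(wb, Xb))  # bootstrap model, bootstrap sample
        perf_orig = c_statistic(y, predict(wb, X))    # bootstrap model, original sample
        optimism.append(perf_boot - perf_orig)
    mean_optimism = sum(optimism) / n_boot
    return apparent, mean_optimism, apparent - mean_optimism

# Simulated data: sex coded 0/1, age standardised; true coefficients are made up.
sim = random.Random(1)
X, y = [], []
for _ in range(150):
    sex, age = sim.randint(0, 1), sim.gauss(0.0, 1.0)
    lp = -1.0 + 0.5 * sex + 1.0 * age
    y.append(1 if sim.random() < 1.0 / (1.0 + math.exp(-lp)) else 0)
    X.append([sex, age])

# 20 replicates keep the demo fast; more replicates give a more stable estimate.
apparent, optimism, corrected = bootstrap_optimism(X, y, n_boot=20)
print(round(apparent, 3), round(optimism, 3), round(corrected, 3))
```

Because every model is developed and tested on n patients, no data are set aside, which is the efficiency argument made above.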
23 Example: bootstrap results for logistic regression model
- $\text{30-day mortality} = a + b_1 \cdot \text{sex} + b_2 \cdot \text{age}$
- Apparent area under the ROC curve: 0.77
- Mean area of 200 bootstrap samples: 0.772
- Mean area of 200 tests in original: 0.762
- Optimism in apparent performance: 0.01
- Optimism-corrected area: 0.76
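The arithmetic behind these figures can be written out as follows. The symbol names are introduced here for illustration only; the numbers are those on the slide.

```latex
\begin{align*}
\text{optimism} &= \overline{\text{AUC}}_{\text{boot}} - \overline{\text{AUC}}_{\text{orig}}
                 = 0.772 - 0.762 = 0.010 \\
\text{AUC}_{\text{corrected}} &= \text{AUC}_{\text{apparent}} - \text{optimism}
                 = 0.77 - 0.01 = 0.76
\end{align*}
```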
24 External validation techniques
- Temporal validation: same investigators, validate in recent years
- Spatial validation (other place): same investigators, cross-validate across centers
- Fully external: other investigators, other centers
25 Example: external validity of logistic regression model
- $\text{30-day mortality} = a + b_1 \cdot \text{sex} + b_2 \cdot \text{age}$
- Apparent area in 785 patients: 0.77
- Tested in 20,318 other patients: 0.74
- Tested by other investigators: ?
26 Example: external validation
27 Summary
- Apparent validity gives an optimistic estimate of model performance
- Internal validity may be estimated by bootstrapping
- External validity should be determined in other populations
28 Key references
- tutorial and book on multivariable models (Harrell 1996, Stat Med 15:361-87; Harrell, Regression Modeling Strategies, Springer 2001)
- empirical evaluations of strategies (Steyerberg 2000, Stat Med 19:1059-79)
- internal validation (Steyerberg 2001, JCE 54:774-81)
- external validation (Justice 1999, Ann Intern Med 130:515-24; Altman 2000, Stat Med 19:453-73)
29 Links
- Interactive text book on predictive modeling: http://www.neri.org/symptom/mockup/Chapter_8/
- Harrell's Regression Modeling Strategies: http://hesweb1.med.virginia.edu/biostat/rms/