Title: Modeling with Observational Data
1. Modeling with Observational Data
2. What is a model?
Y = f(x1, x2, x3, …, xn)
Y = a + b1x1 + b2x2 + … + bnxn
Y = a + b1x1 + b2x2 + … + bnxn + e
3. "All models are wrong, some are useful" -- George Box
- A useful model is:
- Not very biased
- Interpretable
- Replicable (predicts in a new sample)
5. Some Premises
- Statistics is a cumulative, evolving field
- Newer is not necessarily better, but should be entertained in the context of the scientific question at hand
- Data analytic practice resides along a continuum, from exploratory to confirmatory. Both are important, but the difference has to be recognized.
- There's no substitute for thinking about the problem
6. Observational Studies
- Undeserved reputation
- Especially if conducted and analyzed wisely
- Biggest threats:
- Third variable
- Selection bias (see above)
- Poor planning
7. Correlation between results of randomized trials and observational studies: http://www.epidemiologic.org/2006/11/agreement-of-observational-and.html
8. Mean of Estimates
9. Head-to-head comparisons
11. Statistics is a cumulative, evolving field: how do we know this stuff?
12. Concept of Simulation
Y = bX + error
[Diagram: k simulated samples, each yielding an estimate b*1, b*2, b*3, …, b*k]
13. Concept of Simulation
Y = bX + error
[Diagram: the same scheme; the collected estimates b*1, …, b*k are then evaluated]
14. Simulation Example
Y = .4X + error
[Diagram: repeated samples drawn from this model, each yielding an estimate b*1, …, b*k]
15. Simulation Example
Y = .4X + error
[Diagram: the same scheme; the k estimates are then evaluated]
16. True Model: Y = .4x1 + e
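The simulation idea can be sketched in a few lines of R (the software used later in this deck). This is a hedged illustration, not the code behind the slides; the sample size and number of replications are arbitrary choices here.

set.seed(123)
k <- 1000                        # number of simulated samples
n <- 100                         # size of each sample
bs <- replicate(k, {
  x <- rnorm(n)
  y <- .4 * x + rnorm(n)         # true model: Y = .4x + e
  coef(lm(y ~ x))[["x"]]         # keep the estimated slope
})
mean(bs)                         # evaluate: centers near the true .4
sd(bs)                           # evaluate: sampling variability of the estimate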
17. Ingredients of a Useful Model
- Correct probability model
- Based on theory
- Good measures/no loss of information
- Comprehensive
- Parsimonious
- Tested fairly
- Flexible
[Diagram: these ingredients feed into a Useful Model]
18. Correct Model
- Gaussian: General Linear Model
- Multiple linear regression
- Binary (or ordinal): Generalized Linear Model
- Logistic regression
- Proportional odds/ordinal logistic
- Time to event:
- Cox regression or parametric survival models
19. Generalized Linear Model

DV type                             Models
Normal                              General Linear Model / linear regression; ANOVA/t-test; ANCOVA; regression w/ transformed DV
Binary/Binomial                     Logistic regression; chi-square
Count, heavy skew, lots of zeros    Poisson, ZIP, negative binomial, gamma

All can be applied to clustered (e.g., repeated measures) data.
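In R, all of these arrive through one interface; a minimal sketch with a hypothetical data frame d and made-up variable names:

# The DV type determines the family (and hence the model)
fit_normal <- glm(y ~ x1 + x2, family = gaussian, data = d)     # linear regression
fit_binary <- glm(event ~ x1 + x2, family = binomial, data = d) # logistic regression
fit_count  <- glm(count ~ x1 + x2, family = poisson, data = d)  # Poisson regression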
20. Factor Analytic Family
- Structural Equation Models
- Partial Least Squares
- Latent Variable Models (Confirmatory Factor Analysis)
- Multiple regression
- Common Factor Analysis
- Principal Components
21. Use Theory
- Theory and expert information are critical in helping sift out artifact
- Numbers can look very systematic when they are in fact random
- http://www.tufts.edu/~gdallal/multtest.htm
28. Measure well
- Adequate range
- Representative values
- Watch for ceiling/floor effects
29. Using all the information
- Preserving cases in data sets with missing data
- Conventional approaches:
- Use only complete cases
- Fill in with mean or median
- Use a missing data indicator in the model
30. Missing Data
- Imputation or related approaches are almost ALWAYS better than deleting incomplete cases
- Multiple Imputation
- Full Information Maximum Likelihood
31. Multiple Imputation
32. http://www.lshtm.ac.uk/msu/missingdata/mi_web/node5.html
34. Modern Missing Data Techniques
- Preserve more information from the original sample
- Incorporate uncertainty about missingness into final estimates
- Produce better estimates of population (true) values
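As one concrete, hedged sketch of multiple imputation in R: the mice package (not mentioned in the original slides) with hypothetical variables y, x1, x2 in a data frame d that contains missing values.

library(mice)
imp  <- mice(d, m = 5, seed = 1)    # five imputed copies of the data
fits <- with(imp, lm(y ~ x1 + x2))  # fit the model within each copy
pool(fits)                          # combine estimates across imputations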
35. Don't waste information from variables
- Use all the information about the variables of interest
- Don't create clinical cutpoints before modeling
- Model with ALL the data first, then use prediction to make decisions about cutpoints
36. Dichotomizing for Convenience = Dubious Practice (C.R.A.P.)
- Convoluted Reasoning and Anti-intellectual Pomposity
- Streiner & Norman, Biostatistics: The Bare Essentials
37. Implausible measurement assumption
[Figure: a continuous depression score with a cutpoint splitting "not depressed" from "depressed"; cases A and B fall below the cut and C above it, even though B lies far closer to C than to A]
38. Loss of power
http://psych.colorado.edu/~mcclella/MedianSplit/
Sometimes, through sampling error, you can get a lucky cut:
http://www.bolderstats.com/jmsl/doc/medianSplit.html
39. Dichotomization, by definition, reduces the magnitude of the estimate by a minimum of about 30%.
Dear Project Officer,
In order to facilitate analysis and interpretation, we have decided to throw away about 30% of our data. Even though this will waste about three or four hundred thousand dollars' worth of subject recruitment and testing money, we are confident that you will understand.
Sincerely,
Dick O. Tomi, PhD
Prof. Richard Obediah Tomi, PhD
40. Power to detect a non-zero b-weight when x is continuous versus dichotomized
True model: y = .4x + e
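A hedged sketch of this power comparison in R (not the slides' actual code; the sample size and replication count are arbitrary):

set.seed(123)
pvals <- replicate(2000, {
  x <- rnorm(60)
  y <- .4 * x + rnorm(60)                    # true model: y = .4x + e
  xcut <- as.numeric(x > median(x))          # median split of the same x
  c(cont = summary(lm(y ~ x))$coefficients["x", 4],
    cut  = summary(lm(y ~ xcut))$coefficients["xcut", 4])
})
rowMeans(pvals < .05)   # empirical power: continuous vs. dichotomized x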
41. Dichotomizing will obscure non-linearity
[Figure: outcome plotted against CESD score collapsed into Low vs. High]
42. Dichotomizing will obscure non-linearity: same data as previous slide, modeled continuously
43. Type I error rates for the relation between x2 and y after dichotomizing two continuous predictors.
Maxwell and Delaney calculated the effect of dichotomizing two continuous predictors as a function of the correlation between them. The true model is y = .5x1 + 0x2, where all variables are continuous. If x1 and x2 are dichotomized, the error rate for the relation between x2 and y increases as the correlation between x1 and x2 increases.

       Correlation between x1 and x2
N      0      .3     .5     .7
50     .05    .06    .08    .10
100    .05    .08    .12    .18
200    .05    .10    .19    .31
44. Is it ever a good idea to categorize quantitatively measured variables?
- Yes:
- when the variable is truly categorical
- for descriptive/presentational purposes
- for hypothesis testing, if enough categories are made
- However, using many categories can lead to problems of multiple significance tests and still run the risk of misclassification
45. CONCLUSIONS
- Cutting:
- Doesn't always make measurement sense
- Almost always reduces power
- Can fool you with too much power in some instances
- Can completely miss important features of the underlying function
- Modern computing/statistical packages can handle continuous variables
- Want to make good clinical cutpoints? Model first, decide on cuts afterward.
46. Statistical Adjustment/Control
- What does it mean to adjust or control for another variable?
59. Difficulties
- What if the lines aren't parallel?
- What if there is poor overlap between groups?
60. A Note on Mediation vs. Confounding
- Mathematically identical: no test can tell you which is which
- Depends on YOUR causal hypothesis
- Criteria for either:
- All three variables (predictor, confounder/mediator, outcome) must be related
61. Possible Models
Initial condition: all related
[Diagram: A, B, and C, all interrelated]
62. Possible Models
Initial condition: all related
[Diagram: the same three variables shown in alternative arrangements]
63. Possible Models
Typical regression result
[Diagram: paths among A, B, and C]
64. Possible Models
Mediational relation between A and C
[Diagram: A related to C through B]
65. Possible Models
Spurious relation between A and C
[Diagram: B drives both A and C]
66. Possible Models
Or worse
[Diagram: an unmeasured variable U underlies the relations among A, B, and C]
67.
- With a cross-sectional design, the best you can do is say that observed relations are consistent/not consistent with the hypothesized relation
- Prospective is better, but still vulnerable to outside variables
- Interpretation of the mediator/confounder distinction is entirely substantive
68. Not always a clear difference between mediator and confounder
- Beware that adjustment for a confounder might actually be modeling an explanatory mechanism
- E.g., the relation between depression and mortality
- Often adjusted for medical comorbidity
- Comorbidity, however, might be a proxy for poor self-care, which in turn is linked to depression
69. Sample size and the problem of underfitting vs. overfitting
- Model assumption is that ALL relevant variables be included: the antiparsimony principle, or "as big as a house"
- Tempered by the fact that estimating too many unknowns with too little data will yield junk
- In other words, you can't build a mansion with a shanty's worth of wood
70. Sample Size Requirements
- Linear regression:
- minimum of N = 50 + 8 per predictor (Green, 1991), or maybe more (Kelley & Maxwell, 2003)
- Logistic regression:
- minimum of N = 10-15 per predictor among the smallest outcome group (Peduzzi et al., 1996)
- Survival analysis:
- minimum of N = 10-15 events per predictor (Peduzzi et al., 1995)
71. Consequences of inadequate sample size
- Lack of power for individual tests
- Unstable estimates
- Spurious good fit: lots of unstable estimates will produce spuriously good-looking (big) regression coefficients
72. All noise, but good fit
[Figure: R-squares from multivariable models where the population is completely random numbers, plotted against the events-per-predictor ratio]
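The all-noise phenomenon is easy to reproduce; a minimal sketch in R, assuming 30 cases and 10 pure-noise predictors:

set.seed(123)
n <- 30
y <- rnorm(n)                         # outcome: pure noise
X <- matrix(rnorm(n * 10), nrow = n)  # 10 predictors: pure noise
summary(lm(y ~ X))$r.squared          # routinely sizable despite no true relations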
73. Simulation: number of events/predictor ratio
Y = .5x1 + 0x2 + .2x3 + 0x4, where r among x1-x4 = .4; N/p = 3, 5, 10, 20, 50
74. Parameter stability and the N/p ratio
75. Peduzzi's simulation: number of events/predictor ratio
P(survival) = a + b1*NYHA + b2*CHF + b3*VES + b4*DM + b5*STD + b6*HTN + b7*LVC
Events/p = 2, 5, 10, 15, 20, 25
Relative bias = ((estimated b - true b)/true b) * 100
76. Simulation results: number of events/predictor ratio
77. Simulation results: number of events/predictor ratio
78. Approaches to variable selection
- Stepwise automated selection
- Pre-screening using univariate tests
- Combining or eliminating redundant predictors
- Fixing some coefficients
- Theory, expert opinion, and experience
- Penalization/random effects
- Propensity scoring
- Matches individuals on multiple dimensions to improve baseline balance
- Tibshirani's Lasso
79. Any variable selection technique based on looking at the data first will likely be biased
80. "I now wish I had never written the stepwise selection code for SAS."
--Frank Harrell, author of the forward and backward selection algorithm for SAS PROC REG
81. Automated Selection: Derksen and Keselman (1992) Simulation Study
- Studied backward and forward selection
- Some authentic variables and some noise variables among candidate variables
- Manipulated correlation among candidate predictors
- Manipulated sample size
82. Automated Selection: Derksen and Keselman (1992) Simulation Study
- The degree of correlation between candidate predictors affected the frequency with which the authentic predictors found their way into the model.
- The greater the number of candidate predictors, the greater the number of noise variables included in the model.
- Sample size was of little practical importance in determining the number of authentic variables contained in the final model.
83. Simulation results: number of noise variables included
[Figure: noise variables included, by sample size; 20 candidate predictors, 100 samples]
84-85. Simulation results: R-square from noise variables
[Figure: R-square from noise variables, by sample size; 20 candidate predictors, 100 samples]
86. SOME of the problems with stepwise variable selection
1. It yields R-squared values that are badly biased high.
2. The F and chi-squared tests quoted next to each variable on the printout do not have the claimed distribution.
3. The method yields confidence intervals for effects and predicted values that are falsely narrow (see Altman and Andersen, Stat Med).
4. It yields P-values that do not have the proper meaning, and the proper correction for them is a very difficult problem.
5. It gives biased regression coefficients that need shrinkage (the coefficients for remaining variables are too large; see Tibshirani, 1996).
6. It has severe problems in the presence of collinearity.
7. It is based on methods (e.g., F tests for nested models) that were intended to be used to test pre-specified hypotheses.
8. Increasing the sample size doesn't help very much (see Derksen and Keselman).
9. It allows us to not think about the problem.
10. It uses a lot of paper.
87. Chatfield, C. "Model uncertainty, data mining and statistical inference (with discussion)." JRSS A, 1995, 158, 419-466.
- Bias by selecting the model because it fits the data well; bias in standard errors.
P. 420: "... need for a better balance in the literature and in statistical teaching between techniques and problem solving strategies."
P. 421: "It is 'well known' to be 'logically unsound and practically misleading' (Zhang, 1992) to make inferences as if a model is known to be true when it has, in fact, been selected from the same data to be used for estimation purposes. However, although statisticians may admit this privately (Breiman (1992) calls it a 'quiet scandal'), they (we) continue to ignore the difficulties because it is not clear what else could or should be done."
P. 421: "Estimation errors for regression coefficients are usually smaller than errors from failing to take into account model specification."
P. 422: "Statisticians must stop pretending that model uncertainty does not exist and begin to find ways of coping with it."
P. 426: "It is indeed strange that we often admit model uncertainty by searching for a best model but then ignore this uncertainty by making inferences and predictions as if certain that the best fitting model is actually true."
88. Phantom Degrees of Freedom
- Faraway (1992) showed that any pre-modeling strategy costs a df over and above the df used later in modeling.
- Pre-modeling strategies included variable selection, outlier detection, linearity tests, and residual analysis.
- Thus, although not accounted for in the final model, these phantom df will render the model too optimistic.
89. Phantom Degrees of Freedom
- Therefore, if you transform, select, etc., you must include those df in (i.e., penalize for) the final model.
90. Conventional Univariate Pre-selection
- Non-significant tests also cost a df
- Non-significance is NOT necessarily related to importance
- Variables may not behave the same way in a multivariable model: a variable not significant in a univariate test may be very important in the presence of other variables
91. Conventional Univariate Pre-selection
- Despite the convention, testing for confounding has not been systematically studied; in many cases it leads to overadjustment and an underestimate of the true effect of the variable of interest.
- At the very least, pulling variables in and out of models inflates the model fit, often dramatically.
92. Better approach
- Pick variables a priori
- Stick with them
- Penalize appropriately for any data-driven decision about how to model a variable
93. Spending DF wisely
- If there is not enough N per predictor, combine covariates using techniques that do not look at Y in the sample: PCA, FA, conceptual clustering, collapsing, scoring, established indexes.
- Save DF for a finer-grained look at the variables of most interest, e.g., non-linear functions.
94. What to do
- Penalization/random effects
- Propensity scoring
- Matches individuals on multiple dimensions to improve baseline balance
- Tibshirani's Lasso
95.
                           Canadian Study              UK Study                    US Study
                           No Smoke  Cig.  Cig./Pipe   No Smoke  Cig.  Cig./Pipe   No Smoke  Cig.  Cig./Pipe
A. Death rates per 1,000
   person-years            20.2      20.5  35.5        11.3      14.1  20.7        13.5      13.5  17.4
B. Average age in years    54.9      50.5  65.9        49.1      49.8  55.7        57.0      53.2  59.7
C. Adjusted death rates using K subclasses
   K = 2                   20.2      26.4  24.0        11.3      12.7  13.6        13.5      16.4  14.9
   K = 3                   20.2      28.3  21.2        11.3      12.8  12.0        13.5      17.7  14.2
   K = 9-11                20.2      29.5  19.8        11.3      14.8  11.0        13.5      21.2  13.7
96. Propensity Score Example
- Observational data on SSRI use in post-myocardial-infarction patients
- Early use of SSRI as an adjustment covariate revealed excess risk for all-cause mortality among SSRI users
- Can use a propensity score to help rule out confounders
97. Step 1: "Kitchen Sink" model predicting SSRI use
- Why is it OK to use lots of predictors in this case?
- We are working strictly at the sample level
99. Generate conditional probabilities of being on an SSRI for each patient

ID  probssri
1   0.07071829
2   0.10357308
3   0.08324767
4   0.09562251
5   0.10424651
6   0.28105882
7   0.09824793
100. Step 2: Remove non-overlapping cases
[Figure: density of the propensity score for the SSRI = 0 and SSRI = 1 groups]
101. Perform primary analysis predicting survival
- Surv ~ ssri
- Surv ~ ssri + logit(pssri)
- Surv ~ ssri + logit(pssri) + BDI
- Surv ~ ssri + logit(pssri) + BDI + others
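A hedged sketch of these steps in R, with hypothetical variable names (ssri, time, death, and baseline covariates x1-x3); the analysis behind the slides is not shown here:

library(survival)
# Step 1: kitchen-sink logistic model predicting SSRI use
ps_model <- glm(ssri ~ x1 + x2 + x3, family = binomial, data = d)
d$pssri  <- fitted(ps_model)   # conditional probability of SSRI use
d$lp     <- qlogis(d$pssri)    # logit of the propensity score
# Step 2 is trimming non-overlapping cases; then the adjusted model:
coxph(Surv(time, death) ~ ssri + lp, data = d)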
102. Step 3: Unadjusted estimate

Factor        HR     Lower 0.95  Upper 0.95
ssri          0.22   0.18        1.05
Hazard Ratio  1.85   1.20        2.86
103. Step 4: Adjusted for propensity (linear)

Factor        Effect  S.E.  Lower 0.95  Upper 0.95
ssri          0.61    0.24  0.15        1.08
Hazard Ratio  1.85    NA    1.16        2.95
LOGIT         0.00    0.14  -0.27       0.28
Hazard Ratio  1.00    NA    0.76        1.33
105. Better Step 4: Adjusted for propensity (non-linear)

Factor        Effect  S.E.  Lower 0.95  Upper 0.95
ssri          0.55    0.24  0.07        1.03
Hazard Ratio  1.73    NA    1.07        2.79
LOGIT         0.02    0.25  -0.47       0.51
Hazard Ratio  1.02    NA    0.62        1.67
107. Limitations
- There may still be differences/confounding not measured and therefore not captured by the propensity score
- If overlap is poor, generalizability is limited
- Many reviewers are not familiar with it
108. What to do about heterogeneous slopes?
- We know there is always heterogeneity of slopes, perhaps even important heterogeneity
- The proper test is a product interaction term, NOT within-subgroups tests (see BMJ series), which bring:
- Increased error rate
- Differential power
- Danger of accepting the null
- Sparse cells and unstable estimates
- Tension between the low power of the interaction test and the high error rate/instability
- Especially true in observational data
- I honestly don't know what to do... any ideas?
109. If you worry about Type I
- Use a pooled test (see, for example, Cohen & Cohen, or Harrell)
- If the pooled test is not significant, stop there
110. If Type II is a bigger concern
- Report non-significant effects, acknowledging the uncertainty, but conveying the need to investigate more
- Cf. the HRT data: was there an age x HRT interaction?
111. Validation
- Apparent fit
- Usually too optimistic
- Internal
- cross-validation, bootstrap
- honest estimate of model performance
- provides an upper limit to what would be found on external validation
- External validation
- replication with a new sample, different circumstances
112. Validation
- Steyerberg et al. (1999) compared validation methods
- Found that split-half was far too conservative
- Bootstrap was equal or superior to all other techniques
113. Conclusions
- Measure well
- Use all the information
- Recognize the limitations based on how much data you actually have
- In the confirmatory mode, be as explicit as possible about the model a priori, test it, and live with it
- By all means explore data, but recognize, and state frankly, the limits post hoc analysis places on inference
114. http://myspace.com/monkeynavigatedrobots
115. Advanced topics and examples
116. Bootstrap
[Diagram: k resamples drawn WITH REPLACEMENT from "My Sample," each yielding an estimate (θ*1, θ*2, …, θ*k); the collected estimates are then evaluated]
117. Original sample: 1, 3, 4, 5, 7, 10
Bootstrap resamples (drawn with replacement):
7  1  1  4  5  10
10 3  2  2  2  1
3  5  1  4  2  7
2  1  1  7  2  7
4  4  1  4  2  10
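A hedged sketch of the resampling in R (the rows above are illustrative; sample() will produce its own draws):

set.seed(123)
x <- c(1, 3, 4, 5, 7, 10)                      # original sample
boot_stats <- replicate(1000,
  mean(sample(x, length(x), replace = TRUE)))  # statistic from each resample
quantile(boot_stats, c(.025, .975))            # simple percentile interval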
118. Can use the data to determine where to spend DF
- Use Spearman's rho to test importance
- Not peeking, because we have chosen to include the term in the model regardless of its relation to Y
- Use more DF for non-linearity
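Harrell's Hmisc package automates this; a sketch, assuming the titanic3 data frame used in the example that follows:

library(Hmisc)
# Generalized Spearman rho^2 per predictor; larger values argue for
# spending more df (e.g., more spline knots) on that variable
sp <- spearman2(survived ~ fare + age + sex, data = titanic3)
plot(sp)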
119. Example: predict survival from age, gender, and fare on the Titanic (example using R software)
120. If you have already decided to include them (and promise to keep them in the model), you can peek at predictors in order to see where to add complexity
122. Non-linearity using splines
123. Linear Spline (piecewise regression)
Y = a + b1(x < 10) + b2(10 < x < 20) + b3(x > 20)
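In base R, one common equivalent parameterization uses hinge terms, with a slope change at each knot; a hedged sketch with the slide's knots at 10 and 20 and a hypothetical data frame d:

# Piecewise-linear fit: the coefficients on the pmax() terms are the
# changes in slope at x = 10 and x = 20
fit <- lm(y ~ x + pmax(x - 10, 0) + pmax(x - 20, 0), data = d)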
124. Cubic Spline (non-linear piecewise regression)
[Figure: a smooth curve with the knot locations marked]
125. Logistic regression model

fitfare <- lrm(survived ~ (rcs(fare, 3) + age + sex)^2, x = TRUE, y = TRUE)
anova(fitfare)

rcs(fare, 3): spline with 3 knots
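To see the fitted non-linear fare effect, something like the following works in rms, the successor to the Design library used for these slides (hedged; function names have changed across versions, and rms requires a datadist):

library(rms)
dd <- datadist(titanic3); options(datadist = "dd")  # rms needs variable ranges
plot(Predict(fitfare, fare, sex))  # predicted log odds across fare, by sex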
126. Wald Statistics, Response: survived

Factor                                     Chi-Square  d.f.  P
fare (Factor + Higher Order Factors)       55.1        6     <.0001
  All Interactions                         13.8        4     0.0079
  Nonlinear (Factor + Higher Order)        21.9        3     0.0001
age (Factor + Higher Order Factors)        22.2        4     0.0002
  All Interactions                         16.7        3     0.0008
sex (Factor + Higher Order Factors)        208.7       4     <.0001
  All Interactions                         20.2        3     0.0002
fare * age (Factor + Higher Order)         8.5         2     0.0142
  Nonlinear                                8.5         1     0.0036
  Nonlinear Interaction: f(A,B) vs. AB     8.5         1     0.0036
fare * sex (Factor + Higher Order)         6.4         2     0.0401
  Nonlinear                                1.5         1     0.2153
  Nonlinear Interaction: f(A,B) vs. AB     1.5         1     0.2153
age * sex (Factor + Higher Order)          9.9         1     0.0016
TOTAL NONLINEAR                            21.9        3     0.0001
TOTAL INTERACTION                          24.9        5     0.0001
TOTAL NONLINEAR + INTERACTION              38.3        6     <.0001
TOTAL                                      245.3       9     <.0001
134. Bootstrap Validation

Index      Training  Corrected
Dxy        0.6565    0.646
R2         0.4273    0.407
Intercept  0.0000    -0.011
Slope      1.0000    0.952
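These indexes are what validate() in Design/rms reports; a hedged sketch using the earlier fit (x = TRUE and y = TRUE were set for exactly this purpose):

library(rms)
validate(fitfare, B = 200)  # bootstrap: training vs. optimism-corrected indexes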
135. Summary
- Think about your model
- Collect enough data
136. Summary
- Measure well
- Don't destroy what you've measured
137. Summary
- Pick your variables ahead of time and collect enough data to test the model you want
- Keep all your variables in the model unless they are extremely unimportant
138. Summary
- Use more df on important variables, fewer df on nuisance variables
- Don't peek at Y to combine, discard, or transform variables
139. Summary
- Estimate validity and shrinkage with the bootstrap
140. Summary
- By all means tinker with the model later, but be aware of the costs of tinkering
- Don't forget to say you tinkered
- Go collect more data
141. Web links for references, software, and more
- Harrell's regression modeling text: http://hesweb1.med.virginia.edu/biostat/rms/
- R software: http://cran.r-project.org/
- SAS macros for spline estimation: http://hesweb1.med.virginia.edu/biostat/SAS/survrisk.txt
- Some results comparing validation methods: http://hesweb1.med.virginia.edu/biostat/reports/logistic.val.pdf
- SAS code for bootstrap: ftp://ftp.sas.com/pub/neural/jackboot.sas
- S-Plus home page: insightful.com
- Mike Babyak's e-mail: michael.babyak@duke.edu
- This presentation: http://www.duke.edu/~mababyak
142.
- www.duke.edu/~mababyak
- michael.babyak@duke.edu
- symptomresearch.nih.gov/chapter_8/
143. Observational Data and Clinical Trials
http://www.epidemiologic.org/2006/11/agreement-of-observational-and.html
http://www.epidemiologic.org/2006/10/resolving-differences-of-studies-of.html
Propensity Scoring
Rubin Symposium notes: http://www.symposion.com/nrccs/rubin.htm
Rosenbaum, P.R. & Rubin, D.B. (1984). Reducing bias in observational studies using sub-classification on the propensity score. Journal of the American Statistical Association, 79, 516-524.
Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.
Rosenbaum, P.R. & Rubin, D.B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41-55.
Mediation and Confounding
MacKinnon DP, Krull JL, Lockwood CM. Equivalence of the mediation, confounding and suppression effect. Prev Sci 2000; 1: 173-81.
144. General Modeling
Harrell FE Jr. Regression modeling strategies with applications to linear models, logistic regression and survival analysis. New York: Springer; 2001.
Sample Size
Kelley, K. & Maxwell, S.E. (2003). Sample size for multiple regression: Obtaining regression coefficients that are accurate, not simply significant. Psychological Methods, 8, 305-321.
Kelley, K. & Maxwell, S.E. (in press). Power and accuracy for omnibus and targeted effects: Issues of sample size planning with applications to multiple regression. In J. Brannon, P. Alasuutari, & L. Bickman (Eds.), Handbook of Social Research Methods. New York, NY: Sage Publications.
Green SB. How many subjects does it take to do a regression analysis? Multivar Behav Res 1991; 26: 499-510.
Peduzzi PN, Concato J, Holford TR, Feinstein AR. The importance of events per independent variable in multivariable analysis, II: accuracy and precision of regression estimates. J Clin Epidemiol 1995; 48: 1503-10.
Peduzzi PN, Concato J, Kemper E, Holford TR, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol 1996; 49: 1373-9.
145. Dichotomization
Cohen, J. (1983). The cost of dichotomization. Applied Psychological Measurement, 7, 249-253.
MacCallum, R.C., Zhang, S., Preacher, K.J., & Rucker, D.D. (2002). On the practice of dichotomization of quantitative variables. Psychological Methods, 7(1), 19-40.
Maxwell, S.E. & Delaney, H.D. (1993). Bivariate median splits and spurious statistical significance. Psychological Bulletin, 113, 181-190.
Royston, P., Altman, D.G., & Sauerbrei, W. (2006). Dichotomizing continuous predictors in multiple regression: a bad idea. Statistics in Medicine, 25, 127-141.
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/CatContinuous
146. Pretesting
Grambsch PM, O'Brien PC. The effects of preliminary tests for nonlinearity in regression. Stat Med 1991; 10: 697-709.
Faraway JJ. The cost of data analysis. J Comput Graph Stat 1992; 1: 213-29.
Validation and Penalization
Steyerberg EW, Harrell FE Jr, Borsboom GJ, Eijkemans MJ, Vergouwe Y, Habbema JD. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol 2001; 54: 774-81.
Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc B 1996; 58: 267-88.
Greenland S. When should epidemiologic regressions use random coefficients? Biometrics 2000 Sep; 56(3): 915-21.
Moons KGM, Donders ART, Steyerberg EW, Harrell FE. Penalized maximum likelihood estimation to directly adjust diagnostic and prognostic prediction models for overoptimism: a clinical example. J Clin Epidemiol 2004; 57: 1262-1270.
Steyerberg EW, Eijkemans MJ, Habbema JD. Application of shrinkage techniques in logistic regression analysis: a case study. Stat Neerl 2001; 55: 76-88.
147. Variable Selection
Thompson B. Stepwise regression and stepwise discriminant analysis need not apply here: a guidelines editorial. Ed Psychol Meas 1995; 55: 525-34.
Altman DG, Andersen PK. Bootstrap investigation of the stability of a Cox regression model. Stat Med 1989; 8: 771-83.
Derksen S, Keselman HJ. Backward, forward and stepwise automated subset selection algorithms: frequency of obtaining authentic and noise variables. Br J Math Stat Psychol 1992; 45: 265-82.
Steyerberg EW, Harrell FE, Habbema JD. Prognostic modeling with logistic regression analysis: in search of a sensible strategy in small data sets. Med Decis Making 2001; 21: 45-56.
Cohen J. Things I have learned (so far). Am Psychol 1990; 45: 1304-12.
Roecker EB. Prediction error and its estimation for subset-selected models. Technometrics 1991; 33: 459-68.