Title: Indices of the Robustness of Causal Inferences from Statistical Analyses
1. Indices of the Robustness of Causal Inferences from Statistical Analyses
- Kenneth A Frank
- Michigan State University
- AERA Stat Institute
2. Motivating Question
- Assume you have a significant result.
- Q. But have you controlled for ...?
- Inference could always be altered by an unmeasured confound.
- Q. Is the causal mechanism consistent across sub-populations?
- If not, do you really understand the causal mechanisms?
3. Initial Regression Results for Mloipe (Amount of Intended Exposure to Math Content)

NAME       beta     SE(beta)   t
INTERVEN   11.521   2.064       5.581   (algebra 1 by 9th grade)
PREGPA     -0.128   0.985      -0.130
PVT        -0.036   0.065      -0.561
LGINC      -0.760   1.122      -0.677
PARED6      0.363   0.661       0.548
FEMALE      1.548   1.617       0.957
4. Potential Confounding Variable
- But those who took algebra I may have been more
inclined to pursue higher level material anyway
because of educational aspirations.
5. The Impact of a Confounding Variable on a Regression Coefficient
6. What Must the Impact Be to Alter the Inference?
- y = outcome (mloipe)
- x = predictor of interest (on math track)
- $r_{xy}$ = the correlation between x (social capital) and y (use of computers) = .26
- $r_{x \cdot cv}$ = correlation between x and an unmeasured confounding variable
- $r_{y \cdot cv}$ = correlation between y and an unmeasured confounding variable
- $k = r_{x \cdot cv} \times r_{y \cdot cv}$ = impact of the unmeasured confound
- Question: What must k be to alter the inference regarding math track on mloipe?
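7. Defining a Threshold for Inference
- Define $r^{\#}$ as the value of r that is just statistically significant:
$r^{\#} = t_{critical} / \sqrt{t_{critical}^2 + df}$
n is the sample size and q is the number of parameters estimated; together they determine the residual degrees of freedom df.
$r^{\#}$ can also be defined in terms of effect sizes or correlation coefficients.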
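The threshold can be sketched numerically from the standard identity between a t-ratio and a (partial) correlation, r = t / sqrt(t^2 + df). The function names, the critical value 1.96, and df = 760 are illustrative assumptions, not values taken from the slides; how df is formed from n and q depends on the convention used for counting parameters.

```python
import math

def r_threshold(t_critical: float, df: float) -> float:
    """Correlation that is just statistically significant for a critical t and residual df."""
    return t_critical / math.sqrt(t_critical ** 2 + df)

def t_from_r(r: float, df: float) -> float:
    """Inverse mapping: the t-ratio implied by a correlation r with residual df."""
    return r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Hypothetical inputs: two-sided 5% critical value and 760 residual degrees of freedom.
r_sharp = r_threshold(1.96, 760)
print(round(r_sharp, 3))
print(round(t_from_r(r_sharp, 760), 2))  # round trip back to the critical t
```

The round trip through `t_from_r` is only a consistency check that the two functions are inverses of each other.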
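8. Defining the Threshold for Impact
Assuming $r_{x \cdot cv} = r_{y \cdot cv}$ (which maximizes the impact of the confounding variable), the partial correlation of x and y given cv is
$r_{xy|cv} = (r_{xy} - k) / (1 - k)$, with $k = r_{x \cdot cv} \times r_{y \cdot cv}$.
Set $r_{xy|cv} = r^{\#}$ (the value of r that is just statistically significant) and solve for k to find the threshold for the impact of a confounding variable (TICV):
$TICV = (r_{xy} - r^{\#}) / (1 - r^{\#})$.
Thus the inference is altered if k is greater than the TICV.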
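A minimal sketch of this threshold, assuming a positive correlation and a positive significance threshold. The function names are hypothetical; the inputs .195 and .071 are the bivariate values that appear elsewhere in the slides, so the result omits the multivariate correction and is not expected to equal the .129 reported for the application.

```python
import math

def partial_r(r_xy: float, r_x_cv: float, r_y_cv: float) -> float:
    """Correlation between x and y after partialling out a confounding variable cv."""
    return (r_xy - r_x_cv * r_y_cv) / math.sqrt((1 - r_x_cv ** 2) * (1 - r_y_cv ** 2))

def ticv(r_xy: float, r_sharp: float) -> float:
    """Impact threshold, assuming r_x_cv == r_y_cv and r_xy > r_sharp > 0."""
    return (r_xy - r_sharp) / (1 - r_sharp)

r_xy, r_sharp = 0.195, 0.071      # no covariate correction applied here
k = ticv(r_xy, r_sharp)
component = math.sqrt(k)          # each correlation when the two components are equal

# A confound with exactly this impact moves the partial correlation to the threshold.
print(round(k, 3), round(partial_r(r_xy, component, component), 3))
```

Any confound whose impact exceeds `k` would push the partial correlation below the threshold under these assumptions.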
9. Multivariate Extension, with Covariates
$k = r_{x \cdot cv|z} \times r_{y \cdot cv|z}$
Maximizing the impact with covariates z in the model implies: [two equations, not recoverable]
10. Application to Math Track and MLOIPE
$r_{interven \cdot mloipe|z} = .195$
TICV = .129, including the multivariate correction
11. What Must Be the Impact to Alter the Inference?
- If k > .129, then the inference would be altered.
- If $r_{x \cdot cv} = r_{y \cdot cv}$, then each would have to be greater than $k^{1/2} \approx .37$ to alter the inference.
- (With the multivariate correction, $r_{y \cdot cv} = .38$ and $r_{x \cdot cv} = .34$.)
- Furthermore, correlations must be partialled for covariates z.
- Impacts of existing covariates on Math track are all less than .002.
12. Actual Impact of College Aspirations (CASP1)

Variable    Parameter Estimate   Standard Error   t Value   Pr > |t|
Intercept   82.01538             7.28308          11.26
interven    11.67616             2.21234           5.28
pregpa      -0.39604             1.06669          -0.37     0.7105
pvt         -0.01879             0.06339          -0.30     0.7670
lginc       -0.54225             1.07824          -0.50     0.6152
pared6       0.57486             0.67084           0.86     0.3918
female       1.93583             1.62987           1.19     0.2353
casp1        1.88547             0.89355           2.11     0.0352

- Impact of CASP1 on Intervention = .093 × .111 = .01.
- Impact needed to be larger than .129 to alter the inference.
- Inference retained.
- What about the next unmeasured confound?
- TICV(Interven) = .120, still a high threshold. At what point do we approach the confidence of a randomized experiment?
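The comparison on this slide can be sketched as a product of two component correlations checked against the threshold. The function names are hypothetical, and the slide does not say which of .093 and .111 is the correlation with the predictor and which is the correlation with the outcome; the product is the same either way.

```python
def impact(r_x_cv: float, r_y_cv: float) -> float:
    """Impact of a covariate: product of its correlations with the predictor and the outcome."""
    return r_x_cv * r_y_cv

def alters_inference(k: float, threshold: float) -> bool:
    """The inference is altered only if the impact exceeds the threshold."""
    return k > threshold

# Component correlations for CASP1 and the TICV of .129, as reported on the slides.
casp1_impact = impact(0.093, 0.111)
print(round(casp1_impact, 3), alters_inference(casp1_impact, 0.129))
```

The same check could be repeated for any measured covariate to see how observed impacts compare with the threshold.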
13. Impact Threshold of a Suppressor with Covariates
Set $r^{\#}$ to be negative and statistically significant. [equations not recoverable]
Note: if $r^{\#} = 0$, then $TISV = TICV = r_{xy|z}$; the impact must equal the original correlation to reduce the coefficient to zero.
14. Non-additive Effects
- Q. Does the relationship apply in other populations?
- What if we had analyzed a sample of 9th and 10th graders (at wave I) instead of just 9th graders?
- What would the relationship have to be in the alternative sample to change the inference from a combined sample?
- What would the relationship between interven and Mloipe have to be among 10th graders such that, if they were used to replace some 9th graders, the inference would be altered?
15. Correlation for Observed and Unobserved Data
- Define $\pi$ as the proportion of the sample that is replaced with an alternate sample.
- $r'_{xy}$ is the correlation in the unobserved data.
- $R_{xy}$ is the combined correlation for observed and unobserved data.
- $R_{xy} = (1 - \pi) r_{xy} + \pi r'_{xy}$.
16. Thresholds for Sample Replacement
- Set $R_{xy} = r^{\#}$ and solve for $r'_{xy}$.
- If half the sample is replaced ($\pi = .5$), the inference is altered if $r'_{xy} < TR(\pi = .5)$.
- If $r'_{xy} = 0$, the inference is altered if $\pi > 1 - r^{\#}/r_{xy}$, the threshold for replacement $TR(r'_{xy} = 0)$.
- Assumes means and variances are constant across samples; alternative calculations are available.
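The two thresholds follow from a short derivation. The symbols are choices made for this sketch: $\pi$ for the replaced proportion, $r'_{xy}$ for the correlation in the unobserved data, and $r^{\#}$ for the just-significant correlation. Means and variances are assumed equal across the two samples, as the slide states.

```latex
\begin{align*}
R_{xy} &= (1-\pi)\, r_{xy} + \pi\, r'_{xy} = r^{\#}
\;\Rightarrow\;
r'_{xy} = \frac{r^{\#} - (1-\pi)\, r_{xy}}{\pi}, \\
\pi = .5 &:\quad TR(\pi = .5) = 2 r^{\#} - r_{xy}, \\
r'_{xy} = 0 &:\quad (1-\pi)\, r_{xy} = r^{\#}
\;\Rightarrow\;
TR(r'_{xy} = 0) = 1 - \frac{r^{\#}}{r_{xy}}.
\end{align*}
```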
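17. Example of Thresholds for Replacement
- $TR(\pi = .5) = 2r^{\#} - r_{xy}$, reported as -.079.
- The correlation between interven and mloipe would have to be less than -.079 among 10th graders to alter the inference in a combined sample of 9th and 10th graders.
- $TR(r'_{xy} = 0) = 1 - r^{\#}/r_{xy}$, reported as .70.
- More than 70% of the 9th graders would have to be replaced with 10th graders for whom $r'_{xy} = 0$ to alter the inference in a combined sample.
- Note: with the $r^{\#} = .071$ and $r_{xy} = .195$ shown on the slide, the formulas give 2(.071) - .195 = -.053 and 1 - (.071/.195) = .64; the reported -.079 and .70 correspond to $r^{\#} = .058$.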
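A minimal sketch that evaluates both replacement thresholds for the slide's correlation of .195 at two candidate significance thresholds: the .071 printed on the slide, and .058, a value inferred here from the arithmetic (it is not stated in the source) that reproduces the reported -.079 and .70. Function names are hypothetical.

```python
def combined_r(r_obs: float, r_unobs: float, pi: float) -> float:
    """Correlation in a sample where a proportion pi is replaced by unobserved cases."""
    return (1 - pi) * r_obs + pi * r_unobs

def tr_half(r_obs: float, r_sharp: float) -> float:
    """Correlation needed in the replacement half (pi = .5) to reach the threshold."""
    return 2 * r_sharp - r_obs

def tr_zero(r_obs: float, r_sharp: float) -> float:
    """Proportion that must be replaced with zero-correlation cases to reach the threshold."""
    return 1 - r_sharp / r_obs

r_xy = 0.195
for r_sharp in (0.071, 0.058):
    half, zero = tr_half(r_xy, r_sharp), tr_zero(r_xy, r_sharp)
    # At either threshold, the combined correlation equals r_sharp.
    assert abs(combined_r(r_xy, half, 0.5) - r_sharp) < 1e-12
    assert abs(combined_r(r_xy, 0.0, zero) - r_sharp) < 1e-12
    print(r_sharp, round(half, 3), round(zero, 2))
```

The in-loop assertions confirm that each threshold places the combined correlation exactly at the significance boundary.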
18. In terms of the Counterfactual
- Bias in $r_{xy}$ (Winship and Morgan, ARS, 1999)
- = (difference in baseline Y between those who received the treatment and those who received the control)
- + (1 - % who received the treatment) × (difference in treatment effect for the treatment and control groups)
- = (presence of confounding variables)
- + (1 - % who received the treatment) × (non-additive treatment effect)
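The decomposition can be checked numerically on the mean-difference scale (rather than the correlation scale used on the slide). This is a sketch with entirely hypothetical potential-outcome means: `y1_*` is the mean outcome under treatment, `y0_*` under control, for the treated (`_t`) and control (`_c`) groups.

```python
def decompose(y1_t: float, y0_t: float, y1_c: float, y0_c: float, p_treated: float) -> dict:
    """Split the bias of the naive treated-minus-control difference into two terms."""
    effect_t = y1_t - y0_t                       # treatment effect among the treated
    effect_c = y1_c - y0_c                       # treatment effect among the controls
    ate = p_treated * effect_t + (1 - p_treated) * effect_c
    naive = y1_t - y0_c                          # what is actually observed
    return {
        "bias": naive - ate,
        "baseline_term": y0_t - y0_c,            # confounding: groups differ at baseline
        "nonadditive_term": (1 - p_treated) * (effect_t - effect_c),
    }

# Hypothetical means; 40% of the sample receives the treatment.
parts = decompose(y1_t=95.0, y0_t=84.0, y1_c=88.0, y0_c=80.0, p_treated=0.4)
print({name: round(value, 2) for name, value in parts.items()})
```

The bias equals the sum of the baseline term and the non-additive term for any choice of inputs, which is the identity the slide summarizes.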
19. Interpreting $TR(r'_{xy} = 0)$ in terms of the Counterfactual
- $TR(r'_{xy} = 0)$ can be interpreted as the proportion of $r_{xy}$ that must be due to bias such that an unbiased estimate, including counterfactual data, would not be statistically significant.
- In the example, more than 70% of $r_{xy}$ must be attributed to bias for an unbiased estimate not to be statistically significant.
- (Assumes potential outcomes are not correlated; increases if potential outcomes are correlated.)
20. Assumptions are the bridge between statistical and causal inference
[Diagram: Assumptions linking Statistical Inference to Causal Inference]
Cornfield, J., and Tukey, J. W. (1956, Dec.). Average Values of Mean Squares in Factorials. Annals of Mathematical Statistics, 27(4), 907-949.
21. In Donald Rubin's words
- "Nothing is wrong with making assumptions; on the contrary, such assumptions are the strands that join the field of statistics to scientific disciplines. The quality of these assumptions and their precise explication, not their existence, is the issue" (Rubin, 2004, page 345).
22. Conclusions
- Objections to moving from statistical to causal inference are framed in terms of violations of assumptions:
- No unobserved confounding variables
- Treatment has same effect for all
- Robustness indices quantify how much assumptions must be violated to alter the inference.
- No new causal inferences!
- Robustness indices merely quantify the terms of debate regarding causal inferences.
- Can be used with any threshold.
- Can be used (theoretically) for any t-ratio.
- Extension of sensitivity indices: robustness indices are a property of the original estimate.
23. Extensions
- To logistic regression
- See Imbens, Guido, "Sensitivity to Exogeneity Assumptions in Program Evaluation," Recent Advances in Econometric Methodology (126-132, especially 128).
- To multilevel models
- Can control for fixed effects
- With random effects, interpret t-ratios in terms of relationships at level 1 or 2 (with Mike Seltzer and Jin-ok Kim)
24. References
- Frank, K. A. (2000). Impact of a confounding variable on the inference of a regression coefficient. Sociological Methods & Research, 29(2), 147-194.
- Frank (to be resubmitted to SMR). Indices for the Robustness of Causal Inferences for the Counterfactual.
- Frank, K., and Min, K. Indices of Robustness for External Validity.
- Holland, P. W. (1986). Statistics and causal inference. Journal of the American Statistical Association, 81, 945-970.
- Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and non-randomized studies. Journal of Educational Psychology, 66, 688-701.
- Rubin, D. B. (2004). Teaching Statistical Inference for Causal Effects in Experiments and Observational Studies. Journal of Educational and Behavioral Statistics, 29(3), 343-368.
- Winship, C., and Morgan, S. (1999). The Estimation of Causal Effects from Observational Data. Annual Review of Sociology, 25, 659-707.
- Winship, C., and Sobel, M. (2004). Causal Inference in Sociological Studies. Chapter 21 in Handbook of Data Analysis (Hardy, Melissa, and Bryman, Alan, eds.). London: Sage Publications.
- On the Web
- http://www.wjh.harvard.edu/winship/cfa.html (Winship's portal)
- http://www.ets.org/research/dload/AERA_2004-Holland.pdf (recent Paul Holland)
- http://bayes.cs.ucla.edu/jp_home.html (Judea Pearl)