Title: Indices of the Robustness of Causal Inferences from Statistical Analyses


1
Indices of the Robustness of Causal Inferences
from Statistical Analyses
  • Kenneth A Frank
  • Michigan State University
  • AERA Stat Institute

2
Motivating Question
  • Assume you have a significant result.
  • Q. But have you controlled for ...?
  • An inference could always be altered by an
    unmeasured confound.
  • Q. Is the causal mechanism consistent across
    sub-populations?
  • If not, do you really understand the causal
    mechanisms?

3
Initial Regression Results for Mloipe (Amount of
Intended Exposure to Math Content)
  NAME         beta   SE(beta)        t
  INTERVEN   11.521      2.064    5.581
  PREGPA     -0.128      0.985   -0.130
  PVT        -0.036      0.065   -0.561
  LGINC      -0.760      1.122   -0.677
  PARED6      0.363      0.661    0.548
  FEMALE      1.548      1.617    0.957
  • INTERVEN: algebra 1 by 9th grade

4
Potential Confounding Variable
  • But those who took algebra I may have been more
    inclined to pursue higher level material anyway
    because of educational aspirations.

5
The Impact of a Confounding Variable on a
Regression Coefficient

6
What must be the Impact to Alter the Inference?
  • y: outcome (mloipe)
  • x: predictor of interest (on math track)
  • $r_{xy}$: the correlation between x (social capital)
    and y (use of computers), .26
  • $r_{x \cdot cv}$: correlation between x and an
    unmeasured confounding variable
  • $r_{y \cdot cv}$: correlation between y and an
    unmeasured confounding variable
  • $k = r_{x \cdot cv} \times r_{y \cdot cv}$: impact of the
    unmeasured confound
  • Question: What must k be to alter the inference
    regarding math track on mloipe?
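
To make the role of k concrete, here is a minimal sketch (not taken from the slides) based on the standard first-order partial correlation formula. The function names `impact` and `partial_corr` and the input values are hypothetical and chosen only for illustration.

```python
import math


def impact(r_x_cv: float, r_y_cv: float) -> float:
    """Impact of a confound: k = r_x.cv * r_y.cv."""
    return r_x_cv * r_y_cv


def partial_corr(r_xy: float, r_x_cv: float, r_y_cv: float) -> float:
    """Correlation between x and y after partialling out cv."""
    k = impact(r_x_cv, r_y_cv)
    return (r_xy - k) / math.sqrt((1 - r_x_cv**2) * (1 - r_y_cv**2))


# Hypothetical starting correlation; not a value from the talk.
r_xy = 0.30
for r_cv in (0.0, 0.2, 0.4):
    # Equal component correlations, so k = r_cv ** 2.
    print(r_cv, round(impact(r_cv, r_cv), 3), round(partial_corr(r_xy, r_cv, r_cv), 3))
```

As the component correlations grow, k grows and the partial correlation between x and y shrinks, which is why the question on this slide is framed in terms of how large k must be.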

7
Defining a Threshold for Inference
  • Define $r^{\#}$ as the value of r that is just
    statistically significant.
  • n is the sample size; q is the number of
    parameters estimated.
  • $r^{\#}$ can also be defined in terms of effect
    sizes or correlation coefficients.
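
One way to sketch such a threshold is through the usual identity between a t-ratio and a correlation, t = r·sqrt(df) / sqrt(1 - r²), which rearranges to r = t / sqrt(df + t²). In the sketch below, `df` stands for the residual degrees of freedom implied by n and q; how exactly the talk forms it from n and q is an assumption left open here. The critical value is a large-sample normal approximation rather than an exact t quantile, and the function name and the value `df=760` are hypothetical.

```python
import math
from statistics import NormalDist


def r_threshold(df: int, alpha: float = 0.05) -> float:
    """Smallest correlation that is just significant (two-sided, normal approximation)."""
    crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = .05
    return crit / math.sqrt(df + crit**2)


# Hypothetical degrees of freedom, for illustration only.
print(round(r_threshold(760), 3))
```

For small samples an exact t quantile would be the more careful choice; the normal approximation keeps the sketch dependency-free.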

8
Defining the Threshold for Impact
  • Assume $r_{x \cdot cv} = r_{y \cdot cv}$ (which maximizes the
    impact of the confounding variable).
  • Set $r_{xy|cv} = r^{\#}$ and solve for k to find the
    threshold for the impact of a confounding
    variable (TICV).
  • Thus the inference is altered if k is greater
    than the TICV.
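
The algebra behind this threshold can be sketched as follows, starting from the standard partial correlation formula. The signs are an assumption made here for simplicity: $r_{xy} > r^{\#} > 0$, with $r^{\#}$ denoting the just-significant correlation.

```latex
\begin{align*}
r_{xy|cv}
  &= \frac{r_{xy} - r_{x\cdot cv}\, r_{y\cdot cv}}
          {\sqrt{(1 - r_{x\cdot cv}^{2})(1 - r_{y\cdot cv}^{2})}} \\
  &= \frac{r_{xy} - k}{1 - k}
     \qquad \text{when } r_{x\cdot cv} = r_{y\cdot cv},\;
            k = r_{x\cdot cv}\, r_{y\cdot cv} \\[4pt]
\frac{r_{xy} - k}{1 - k} = r^{\#}
  &\;\Longrightarrow\;
  k = \frac{r_{xy} - r^{\#}}{1 - r^{\#}} = \mathrm{TICV}
\end{align*}
```

Because $(r_{xy} - k)/(1 - k)$ decreases as k increases (for $r_{xy} < 1$), any k above this value pushes the partial correlation below $r^{\#}$.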
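
9
Multivariate Extension, with Covariates
  • $k = r_{x \cdot cv|z} \times r_{y \cdot cv|z}$
  • Maximizing the impact with covariates z in the
    model implies [equation not preserved in the
    transcript] and [equation not preserved in the
    transcript].

10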
Application to Math Track and MLOIPE
  • $r_{interven \cdot mloipe|z} = .195$
  • TICV = .129, including multivariate correction
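
As a rough check on these numbers, here is a minimal sketch of the threshold without the multivariate correction. It uses the closed form k = (r_xy - r#) / (1 - |r#|), obtained by setting the partial correlation equal to r# with equal component correlations. The value r# = .071 is the one quoted in the sample-replacement example later in the talk and is assumed here to apply to this model; the function names are hypothetical.

```python
def ticv(r_xy: float, r_crit: float) -> float:
    """Impact threshold with no covariate correction (assumes r_xy > r_crit > 0)."""
    return (r_xy - r_crit) / (1 - abs(r_crit))


def partial_given_impact(r_xy: float, k: float) -> float:
    """Partial correlation when both component correlations equal sqrt(k)."""
    return (r_xy - k) / (1 - k)


r_xy, r_crit = 0.195, 0.071  # r_crit is an assumed input, see the lead-in
k = ticv(r_xy, r_crit)
print(round(k, 3))
# At k = TICV the partial correlation sits exactly at the threshold.
print(round(partial_given_impact(r_xy, k), 3))
```

The uncorrected value comes out somewhat above the reported .129, which includes a correction for the covariates z that this sketch does not attempt to reproduce.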

11
What must be the Impact to Alter the Inference?
  • If k > .129, then the inference would be altered.
  • If $r_{x \cdot cv} = r_{y \cdot cv}$, then each would have to be
    greater than $k^{1/2} = .37$ to alter the inference
    (with the multivariate correction, $r_{y \cdot cv} = .38$
    and $r_{x \cdot cv} = .34$).
  • Furthermore, correlations must be partialled for
    covariates z.
  • Impacts of existing covariates on math track are
    all less than .002.

12
Actual Impact of College Aspirations (CASP1)
  Variable    Parameter Estimate   Standard Error   t Value   Pr > |t|
  Intercept             82.01538          7.28308     11.26
  interven              11.67616          2.21234      5.28
  pregpa                -0.39604          1.06669     -0.37     0.7105
  pvt                   -0.01879          0.06339     -0.30     0.7670
  lginc                 -0.54225          1.07824     -0.50     0.6152
  pared6                 0.57486          0.67084      0.86     0.3918
  female                 1.93583          1.62987      1.19     0.2353
  casp1                  1.88547          0.89355      2.11     0.0352
  • Impact of CASP1 on Intervention = .093 × .111 = .01.
  • The impact needed to be larger than .129 to alter
    the inference.
  • Inference retained.
  • What about the next unmeasured confound?
  • TICV(Interven) = .120 is still a high threshold.
    At what point do we approach the confidence of a
    randomized experiment?
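
The comparison on this slide can be sketched in a few lines. The two correlations .093 and .111 and the threshold .129 are the slide's values; the slide does not say which correlation involves the predictor and which the outcome, and the product is the same either way. The function names are hypothetical.

```python
def impact(r_x_cv: float, r_y_cv: float) -> float:
    """Impact of a measured covariate: product of its two correlations."""
    return r_x_cv * r_y_cv


def alters_inference(observed_impact: float, threshold: float) -> bool:
    """The inference is altered only if the impact exceeds the threshold."""
    return observed_impact > threshold


casp1_impact = impact(0.093, 0.111)
print(round(casp1_impact, 3))
print(alters_inference(casp1_impact, 0.129))
```

The same comparison can be repeated for each measured covariate to see how far the strongest observed impact falls from the threshold.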

13
Impact threshold of a Suppressor with Covariates
  • Set $r^{\#}$ negative and statistically significant
    [equations not preserved in the transcript].
  • Note: if $r^{\#} = 0$, then TISV = TICV = $r_{xy|z}$; the
    impact must equal the original correlation to
    reduce the coefficient to zero.

14
Non-additive Effects
  • Q. Does the relationship apply in other
    populations?
  • What if we had analyzed a sample of 9th and 10th
    graders (at wave I) instead of just 9th graders?
  • What would the relationship have to be in an
    alternative sample to change the inference from a
    combined sample?
  • What would the relationship between interven and
    Mloipe have to be among 10th graders such that, if
    they were used to replace some 9th graders, the
    inference would be altered?

15
Correlation for Observed and Unobserved Data
  • Define $\pi$ as the proportion of the sample that is
    replaced with an alternate sample.
  • $r'_{xy}$ is the correlation in the unobserved data.
  • $R_{xy}$ is the combined correlation for the observed
    and unobserved data:
  • $R_{xy} = (1 - \pi)\, r_{xy} + \pi\, r'_{xy}$.
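
A minimal sketch of this weighted combination, with the replaced proportion written as `pi`. The observed correlation .195 is the value used elsewhere in the talk; the replacement correlations and the function name are hypothetical.

```python
def combined_corr(r_obs: float, r_unobs: float, pi: float) -> float:
    """Combined correlation when a proportion pi of the sample is replaced."""
    if not 0.0 <= pi <= 1.0:
        raise ValueError("pi must be a proportion between 0 and 1")
    return (1 - pi) * r_obs + pi * r_unobs


# Replace half the sample with cases showing weaker or reversed relationships.
for r_unobs in (0.195, 0.0, -0.1):
    print(r_unobs, round(combined_corr(0.195, r_unobs, 0.5), 4))
```

Like the slide's formula, the sketch treats the combined correlation as a simple weighted average, which presumes comparable means and variances in the two samples.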

16
Thresholds for Sample Replacement
  • Set $R_{xy} = r^{\#}$ and solve for $r'_{xy}$.
  • If half the sample is replaced ($\pi = .5$), the
    inference is altered if
  • $r'_{xy} < TR(\pi = .5)$.
  • If $r'_{xy} = 0$, the inference is altered if
  • $\pi > 1 - r^{\#}/r_{xy}$, the threshold for replacement
    $TR(r'_{xy} = 0)$.
  • Assumes means and variances are constant across
    samples; alternative calculations are available.
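
Both thresholds follow from the weighted-average form of the combined correlation. A short derivation, writing $\pi$ for the replaced proportion, $r'_{xy}$ for the correlation in the replacement data, and $r^{\#}$ for the just-significant correlation:

```latex
\begin{align*}
R_{xy} = (1-\pi)\, r_{xy} + \pi\, r'_{xy} = r^{\#}
  &\;\Longrightarrow\;
  r'_{xy} = \frac{r^{\#} - (1-\pi)\, r_{xy}}{\pi} \\[4pt]
\pi = .5
  &\;\Longrightarrow\;
  r'_{xy} = 2r^{\#} - r_{xy} = TR(\pi = .5) \\[4pt]
r'_{xy} = 0
  &\;\Longrightarrow\;
  (1-\pi)\, r_{xy} = r^{\#}
  \;\Longrightarrow\;
  \pi = 1 - \frac{r^{\#}}{r_{xy}} = TR(r'_{xy} = 0)
\end{align*}
```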

17
Example of Thresholds for Replacement
  • $TR(\pi = .5) = 2r^{\#} - r_{xy} = 2(.071) - .195 = -.079$.
  • The correlation between interven and mloipe would
    have to be less than -.079 to alter the inference
    in a combined sample of 9th and 10th graders.
  • $TR(r'_{xy} = 0) = 1 - r^{\#}/r_{xy} = 1 - (.071/.195) = .70$.
  • More than 70% of the 9th graders would have to be
    replaced with 10th graders for whom $r'_{xy} = 0$ to
    alter the inference in a combined sample.
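
The two thresholds are simple enough to evaluate directly. The sketch below (function names hypothetical) plugs in the printed correlation of .195. With the printed threshold value of .071 the formulas evaluate to about -.053 and .64, whereas the printed results of -.079 and .70 are reproduced by a threshold value of about .058, so one of the printed numbers may have been altered somewhere along the way. Both inputs are shown rather than guessing which was intended.

```python
def tr_half(r_crit: float, r_xy: float) -> float:
    """Replacement correlation that alters the inference when half the sample is replaced."""
    return 2 * r_crit - r_xy


def tr_zero(r_crit: float, r_xy: float) -> float:
    """Proportion that must be replaced with zero-correlation cases to alter the inference."""
    return 1 - r_crit / r_xy


r_xy = 0.195
for r_crit in (0.071, 0.058):  # printed value, then the value implied by the printed results
    print(r_crit, round(tr_half(r_crit, r_xy), 3), round(tr_zero(r_crit, r_xy), 2))
```

Either way the qualitative reading is the same: the relationship among the replacement cases would have to be negative, or well over half the sample would have to be replaced, before the inference changed.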

18
In terms of the Counterfactual
  • Bias in $r_{xy}$ (Winship and Morgan, ARS, 1999) =
    (difference in baseline Y between those who
    received the treatment and those who received the
    control)
    + (1 - % who received the treatment) ×
    (difference in treatment effect for the
    treatment and control groups)
  • = (presence of confounding variables)
    + (1 - % who received the treatment) ×
    (non-additive treatment effect)
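
The two-part decomposition can be checked numerically. The sketch below works with mean differences rather than correlations, which is a simplification made here, and every number in it is hypothetical. It verifies that the naive treated-minus-control difference exceeds the average treatment effect by exactly the baseline difference plus (1 - proportion treated) times the difference in treatment effects.

```python
def decompose(y1_t: float, y0_t: float, y1_c: float, y0_c: float, pi: float) -> dict:
    """Split the bias of a naive comparison into baseline and non-additive parts.

    y1_t, y0_t: mean potential outcomes (treated, untreated) for the treatment group
    y1_c, y0_c: mean potential outcomes (treated, untreated) for the control group
    pi: proportion of the population that received the treatment
    """
    att = y1_t - y0_t                  # effect among the treated
    atc = y1_c - y0_c                  # effect among the controls
    ate = pi * att + (1 - pi) * atc    # average effect in the population
    naive = y1_t - y0_c                # what the observed data compare
    return {
        "bias": naive - ate,
        "baseline": y0_t - y0_c,
        "nonadditive": (1 - pi) * (att - atc),
    }


# Hypothetical potential-outcome means, for illustration only.
parts = decompose(y1_t=12.0, y0_t=8.0, y1_c=9.0, y0_c=7.0, pi=0.4)
print(round(parts["bias"], 3), round(parts["baseline"] + parts["nonadditive"], 3))
```

The first term corresponds to confounding (groups that differ before treatment) and the second to a treatment effect that is not the same for everyone.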

19
Interpreting $TR(r'_{xy} = 0)$ in terms of the
Counterfactual
  • $TR(r'_{xy} = 0)$ can be interpreted as the proportion
    of bias in $r_{xy}$ that must occur such that an
    unbiased estimate, including counterfactual data,
    would not be statistically significant.
  • In the example, more than 70% of $r_{xy}$ must be
    attributed to bias for an unbiased estimate to
    fall short of statistical significance.
  • (Assumes potential outcomes are not correlated;
    the threshold increases if potential outcomes are
    correlated.)

20
Assumptions are the bridge between statistical
and causal inference
  • [Diagram labels: Assumptions; Statistical
    Inference; Causal Inference]
  • Cornfield, J., & Tukey, J. W. (1956, Dec.). Average
    Values of Mean Squares in Factorials. Annals of
    Mathematical Statistics, 27(4), 907-949.

21
In Donald Rubin's words
  • "Nothing is wrong with making assumptions; on
    the contrary, such assumptions are the strands
    that join the field of statistics to scientific
    disciplines. The quality of these assumptions and
    their precise explication, not their existence,
    is the issue" (Rubin, 2004, page 345).

22
Conclusions
  • Objections to moving from statistical to causal
    inference can be framed in terms of violations of
    assumptions:
  • No unobserved confounding variables
  • Treatment has the same effect for all
  • Robustness indices quantify how much the
    assumptions must be violated to alter an inference.
  • No new causal inferences!
  • Robustness indices merely quantify the terms of
    debate regarding causal inferences.
  • Can be used with any threshold.
  • Can be used (theoretically) for any t-ratio.
  • An extension of sensitivity analysis: the indices
    are a property of the original estimate.

23
Extensions
  • To Logistic Regression
  • See Imbens, Guido, "Sensitivity to Exogeneity
    Assumptions in Program Evaluation," Recent
    Advances in Econometric Methodology (126-132,
    especially 128)
  • To multilevel models
  • Can control for fixed effects
  • With random effects, interpret t-ratios in terms
    of relationships at level 1 or 2 (with Mike
    Seltzer and Jin-ok Kim)

24
References
  • Frank, K. A. (2000). Impact of a confounding
    variable on the inference of a regression
    coefficient. Sociological Methods & Research,
    29(2), 147-194.
  • Frank, K. A. (to be resubmitted to SMR). Indices
    for the Robustness of Causal Inferences for the
    Counterfactual.
  • Frank, K., & Min, K. Indices of Robustness for
    External Validity.
  • Holland, P. W. (1986). Statistics and causal
    inference. Journal of the American Statistical
    Association, 81, 945-970.
  • Rubin, D. B. (1974). Estimating causal effects of
    treatments in randomized and non-randomized
    studies. Journal of Educational Psychology, 66,
    688-701.
  • Rubin, D. B. (2004). Teaching Statistical
    Inference for Causal Effects in Experiments and
    Observational Studies. Journal of Educational and
    Behavioral Statistics, 29(3), 343-368.
  • Winship, C., & Morgan, S. (1999). The Estimation
    of Causal Effects from Observational Data. Annual
    Review of Sociology, 25, 659-707.
  • Winship, C., & Sobel, M. (2004). Causal
    Inference in Sociological Studies. Chapter 21 in
    Handbook of Data Analysis (M. Hardy & A. Bryman,
    Eds.). London: Sage Publications.
  • On the Web
  • http://www.wjh.harvard.edu/winship/cfa.html
    (Winship's portal)
  • http://www.ets.org/research/dload/AERA_2004-Holland.pdf
    (recent Paul Holland)
  • http://bayes.cs.ucla.edu/jp_home.html (Judea
    Pearl)