Title: Addressing Confounding Errors When Using Non-Experimental, Observational Data to Make Causal Inferences
1. Addressing Confounding Errors When Using Non-Experimental, Observational Data to Make Causal Inferences
- ERROR06 Conference
- Pamela Jo Johnson, Andrew Ward
- University of Minnesota
2. Introduction
- The focus of the presentation is on the problem of confounding bias in non-experimental, observational studies in social epidemiology.
- J. Michael Oakes and Jay S. Kaufman define social epidemiology as the study of how a society's innumerable social arrangements, past and present, yield differential exposures and thus differences in health outcomes among the persons who comprise the population.
- We believe that warranted causal inferences using non-experimental, observational data require careful and complete conceptual analysis; for this reason, there is much that philosophical approaches can contribute.
3. Non-Experimental Studies
- Many social phenomena are not amenable to experimental investigation because of ethical and complexity issues.
- Non-experimental studies may have random selection/sampling from a target population, e.g., the National Health Interview Survey (NHIS), which is a four-panel, stratified, multistage, cross-sectional household interview survey in which sampling and surveys are continuous throughout the year.
- Non-experimental studies lack random assignment: the units being investigated (e.g., people) do not each have a known probability of being assigned to a (manipulable) treatment/exposure whose effect we want to investigate.
- In contrast, random assignment (assuming ideal experimental conditions following the random assignment) can yield unbiased estimates of the average treatment effect.
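The contrast between random and non-random assignment can be illustrated with a small simulation. This is only a sketch on synthetic data: the confounder `x`, the true effect of 2.0, and the logistic exposure rule are all assumptions made for illustration, not anything taken from the NHIS or from the presentation's example.

```python
import numpy as np

def simulate(n=20000, seed=0):
    """Compare difference-in-means under random vs. confounded assignment."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)                  # hypothetical confounder
    y_c = 1.0 + x + rng.normal(size=n)      # potential outcome under c
    y_t = y_c + 2.0                         # potential outcome under t (assumed true effect = 2)

    # Random assignment: every unit has a known probability (0.5) of exposure.
    t_random = rng.random(n) < 0.5
    # Non-random assignment: the chance of exposure rises with the confounder.
    t_confounded = rng.random(n) < 1.0 / (1.0 + np.exp(-2.0 * x))

    def diff_in_means(t):
        y_obs = np.where(t, y_t, y_c)       # only one potential outcome is observed per unit
        return y_obs[t].mean() - y_obs[~t].mean()

    return diff_in_means(t_random), diff_in_means(t_confounded)

randomized_estimate, confounded_estimate = simulate()
print(f"randomized: {randomized_estimate:.2f}, confounded: {confounded_estimate:.2f}")
```

Under these assumptions the randomized comparison should land near the true effect, while the confounded comparison overstates it because exposed units have systematically higher `x`.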
4. A Central Problem for Non-Experimental, Observational Studies
- One of the central problems for non-experimental, observational studies is how to control, or somehow take account of, the possible bias that occurs because there is no random assignment of units (e.g., people) in a target population to treatments/exposures.
- What can we do to come closer to the goal of all epidemiological studies: an accurate estimation of the true effect of a treatment/exposure in a target population?
5. An Approach to the Problem
- Counterfactual Framework
- Causal Contrasts
- Propensity Score Matching
- Instrumental Variables
6. Counterfactual Framework
- Counterfactual framework:
- Framework for thinking about cause and effect
- Potential outcomes model (potential outcomes of differential exposures)
- Compare the potential outcomes that would occur under different levels of exposure for the same unit (e.g., person)
- Neyman-Rubin (Fisher, Holland, etc.) Model:
- Consider two variables, Yt(u) and Yc(u)
- Yt(u) is the value of the response (Y) if the unit (u) were exposed to t; Yc(u) is the value of the response if the same unit were exposed to c
- If we could simultaneously observe Yt(u) and Yc(u), then the Causal Contrast, Yt(u) - Yc(u), would tell us how much Y changed for unit u if treatment/exposure t was used instead of treatment/exposure c
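In symbols, the unit-level causal contrast and the two population averages referred to elsewhere in the slides (the average treatment effect and the average effect of the treatment on the treated) can be written as follows. The labels \delta(u), ATE and ATT, and the indicator T(u), are notation assumed here for convenience; the slides themselves only use Yt(u) and Yc(u).

```latex
\begin{align*}
\delta(u)    &= Y_t(u) - Y_c(u)
               && \text{causal contrast for unit } u \\
\mathrm{ATE} &= E\big[\,Y_t(u) - Y_c(u)\,\big]
               && \text{average over the target population} \\
\mathrm{ATT} &= E\big[\,Y_t(u) - Y_c(u) \mid T(u) = t\,\big]
               && \text{average over units actually exposed to } t
\end{align*}
```

For any single unit only one of Y_t(u) and Y_c(u) is observed, so none of these quantities can be read directly off the data.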
7. Fundamental Problem of Causal Inference - 1
- Given the Causal Contrast Yt(u) - Yc(u), Paul Holland (1986) writes: "Fundamental Problem of Causal Inference. It is impossible to observe the value of Yt(u) and Yc(u) on the same unit and, therefore, it is impossible to observe the effect of t on u."
- Put a bit differently (following Pearl, 2003), whereas association has to do with static relationships (the joint distribution of observed variable values), causation has to do with the effect of changing variable values; more specifically, changing the variable values for the same units during the same time period (hence the counterfactual character of causal inference).
8. Fundamental Problem of Causal Inference - 2
- When data are non-experimental, they are not the result of random assignment, and so the causal inference depends on a theory of the way that the data were generated, which goes beyond the data themselves (Pearl, 2003).
- It follows from this that causal inferences using only non-experimental data are not directly testable; without additional assumptions, the most that we can directly test for are correlations/associations, and not causal relationships.
- Thus, within a counterfactual framework, we need to find an observable substitute for the counterfactual which is identifiable with, and exchangeable for, the counterfactual.
9. Causal Contrasts
- Figure adapted from Maldonado & Greenland (2002). Estimating causal effects. Int J Epidemiol, 31(2), 422-429.
10. Confounding and Confounders
- In the causal contrast scenario from the previous slide, we say that the TARGET population experiences exposure distribution 1 (i.e., exposure to poverty) and that the SUBSTITUTE for the counterfactual target population experiences exposure distribution 0 (i.e., no exposure to poverty).
- CONFOUNDING occurs just in case E_{Low Poverty} / F_{Low Poverty} is not identical to (≠) A_{Low Poverty} / B_{Low Poverty}. When this happens, the substitute is an imperfect substitute for the counterfactual target population: the exchangeability of the substitute for the counterfactual target population is imperfect because they are not identical. Thus, confounding is a property of the assignment mechanism (Greenland, 1990).
- CONFOUNDER: while confounding occurs because of imperfect substitution, a confounder is a variable that explains, partly or completely, why confounding occurs.
11. The Problem of Confounding
- We cannot eliminate confounding in non-experimental studies by random assignment (with the caveat of natural experiments). In contrast, in experimental studies one can make the probability of severe confounding as small as preferred by increasing the sample size (Greenland, 1990).
- At the same time, within the counterfactual framework we need an appropriate observable substitute for the counterfactual target population that cannot be directly observed.
- Thus, we need a different way to provide some assurance that the observable substitute we select for the counterfactual target population is as closely exchangeable with the counterfactual as possible.
- In other words, we need something that permits us to mimic random assignment in experimental studies.
- This is what leads to a consideration of propensity scores.
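The claim that randomization makes severe confounding improbable as the sample grows can be checked by simulation. The sketch below is illustrative only: "severe" is operationalized, by assumption, as a gap of more than 0.25 standard deviations in the mean of a single standard-normal covariate between arms, and the function name is hypothetical.

```python
import numpy as np

def prob_severe_imbalance(n, threshold=0.25, n_trials=2000, seed=0):
    """Share of randomized trials of size n whose arms differ by more than
    `threshold` in the mean of one standard-normal covariate."""
    rng = np.random.default_rng(seed)
    severe = 0
    for _ in range(n_trials):
        x = rng.normal(size=n)                 # hypothetical baseline covariate
        treated = rng.permutation(n) < n // 2  # randomly assign half of the units
        gap = abs(x[treated].mean() - x[~treated].mean())
        severe += int(gap > threshold)
    return severe / n_trials

p_small = prob_severe_imbalance(20)
p_medium = prob_severe_imbalance(200)
p_large = prob_severe_imbalance(2000)
print(p_small, p_medium, p_large)
```

With these settings the share of badly imbalanced trials should fall steadily as `n` increases; no comparable guarantee exists when assignment is not random.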
12. What are Propensity Scores?
- Paul Rosenbaum and Donald Rubin (1983) define a propensity score as the conditional probability of assignment to a particular treatment given a vector of observed covariates.
- Propensity scores range from 0 to 1. In a randomized experiment, an equal-probability assignment mechanism assigns people to one of two (or more) distributions of treatment, so with two treatment groups each person will have a true propensity score of .5. In a non-experimental, observational study, propensity scores must be estimated.
- Two assumptions are made: (1) Stable Unit-Treatment Value Assumption: there is a unique value r_ti corresponding to unit i and treatment t. (2) Strongly Ignorable Treatment Assignment: the responses, r_ti, are conditionally independent of the treatment assignment, t, given the observed covariates, and at every value of the covariates the subjects have a positive probability of receiving each treatment.
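For a binary treatment, the definition and the strong ignorability assumption can be stated compactly. The symbols e(x), T and X are notation assumed here (T = 1 for treatment/exposure, X the vector of observed covariates), and r_1, r_0 denote a unit's responses under treatment and no treatment.

```latex
\begin{align*}
e(x) &= \Pr(T = 1 \mid X = x)
      && \text{propensity score} \\
(r_1, r_0) &\perp T \mid X
      && \text{responses independent of assignment given } X \\
0 &< e(x) < 1 \quad \text{for all } x
      && \text{positive probability of each treatment}
\end{align*}
```

In a two-arm randomized experiment with equal assignment probabilities, e(x) = 0.5 for every x, so both conditions hold by design.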
13. Generating Propensity Scores
- Propensity scores can be estimated using several methods, but the most commonly used method is logistic regression.
- The regression uses observable covariates, and, following Rubin and Thomas (1996), unless a variable can be excluded because there is a consensus that it is unrelated to the outcome or is not a proper covariate, it is advisable to include it in the propensity score model even if it is not statistically significant. Thus, Propensity Scores are Covariate-Promiscuous.
- When logistic regression is used, the observed covariates are the predictors and the treatment assignment (dummy coded 0 = no treatment/exposure, 1 = treatment/exposure) is the dependent variable.
- The predicted value (probability) is the propensity score, and each person in the target population will end up with a propensity score unless they have missing values on covariates.
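The estimation step described above might be sketched as follows. This is a minimal illustration, not the authors' code: the logistic regression is fitted with a hand-written Newton-Raphson loop in NumPy, the covariates and coefficients are synthetic, and the function names `fit_logistic` and `propensity_scores` are hypothetical.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25, ridge=1e-8):
    """Fit logit P(t = 1 | X) by Newton-Raphson; returns coefficients, intercept first."""
    Xd = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        gradient = Xd.T @ (t - p)
        hessian = (Xd * (p * (1.0 - p))[:, None]).T @ Xd + ridge * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hessian, gradient)
    return beta

def propensity_scores(X, t):
    """Predicted probability of exposure for each unit; NaN where a covariate is missing."""
    complete = ~np.isnan(X).any(axis=1)
    beta = fit_logistic(X[complete], t[complete])
    Xd = np.column_stack([np.ones(complete.sum()), X[complete]])
    scores = np.full(len(X), np.nan)
    scores[complete] = 1.0 / (1.0 + np.exp(-Xd @ beta))
    return scores, beta

# Synthetic example: two covariates, assumed true coefficients (0.5, 1.0, -0.5).
rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 2))
true_p = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * X[:, 0] - 0.5 * X[:, 1])))
t = (rng.random(5000) < true_p).astype(float)  # 0 = no exposure, 1 = exposure
X[0, 1] = np.nan                               # one unit with a missing covariate
scores, beta = propensity_scores(X, t)
```

The unit with the missing covariate receives no score, which mirrors the point made on the slide about missing values.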
14. Propensity Score Overlap - 1
15. Propensity Score Overlap - 2
16. Propensity Scores: An Example of NO Overlap
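Since the overlap figures are not reproduced here, a rough numerical check of overlap may help. The sketch below uses one simple convention, assumed for illustration: the region of common support is the interval between the larger of the two group minima and the smaller of the two group maxima. The function name and the toy scores are hypothetical.

```python
import numpy as np

def common_support(scores, exposed):
    """Return the (low, high) interval where both groups have scores, plus a
    mask of units inside it. Returns (None, all-False mask) when no overlap exists."""
    low = max(scores[exposed].min(), scores[~exposed].min())
    high = min(scores[exposed].max(), scores[~exposed].max())
    if low > high:
        return None, np.zeros(len(scores), dtype=bool)
    return (low, high), (scores >= low) & (scores <= high)

# Overlapping groups: unexposed scores span 0.10-0.50, exposed scores span 0.40-0.90.
scores = np.array([0.10, 0.30, 0.50, 0.40, 0.60, 0.90])
exposed = np.array([False, False, False, True, True, True])
support, inside = common_support(scores, exposed)

# No overlap: every exposed score exceeds every unexposed score.
no_support, none_inside = common_support(
    np.array([0.10, 0.20, 0.80, 0.90]),
    np.array([False, False, True, True]),
)
```

In the second case the groups are not comparable on the propensity score at all, which is the situation the "NO Overlap" slide title refers to.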
17. General Steps of the Propensity Score Methodology
- Estimate propensity scores for each causal contrast using logistic regression
- Assess overlap in propensity scores across exposure groups
- Match exposed to unexposed (counterfactual) subjects on propensity scores within calipers (i.e., a predetermined range of the exposed subject's estimated propensity score)
- Assess covariate balance across exposure groups (e.g., using standardized differences in the distribution of covariates across the exposure groups, where what is wanted is standardized differences < 10%)
- Estimate average causal effects from matched samples (Average Effect of the Treatment/Exposure on the Treated/Exposed)
- Bootstrap standard errors and confidence intervals
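The matching, balance, effect-estimation and bootstrap steps listed above can be sketched on synthetic data. Several things here are assumptions rather than the authors' procedure: the caliper width of 0.05, nearest-neighbor matching with replacement, reading the balance target as a standardized difference below 10 percent, and a simplified bootstrap that resamples matched pairs while holding the matches fixed (a fuller version would re-estimate scores and re-match in each replicate). The true propensity stands in for an estimated score to keep the example short.

```python
import numpy as np

def caliper_match(scores, exposed, caliper=0.05):
    """Match each exposed unit, with replacement, to the nearest unexposed unit
    whose score is within +/- caliper. Returns an array of (exposed, unexposed) indices."""
    unexposed_idx = np.flatnonzero(~exposed)
    pairs = []
    for i in np.flatnonzero(exposed):
        gaps = np.abs(scores[unexposed_idx] - scores[i])
        j = gaps.argmin()
        if gaps[j] <= caliper:          # exposed units with no match in range are dropped
            pairs.append((i, unexposed_idx[j]))
    return np.array(pairs, dtype=int).reshape(-1, 2)

def standardized_difference(x, pairs):
    """Standardized difference (percent) of covariate x between matched groups."""
    a, b = x[pairs[:, 0]], x[pairs[:, 1]]
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2.0)
    return 100.0 * (a.mean() - b.mean()) / pooled_sd

def att(y, pairs):
    """Average effect of exposure on the exposed, from matched pairs."""
    return (y[pairs[:, 0]] - y[pairs[:, 1]]).mean()

def bootstrap_att(y, pairs, n_boot=500, seed=0):
    """Bootstrap standard error and 95% percentile interval over matched pairs."""
    rng = np.random.default_rng(seed)
    diffs = y[pairs[:, 0]] - y[pairs[:, 1]]
    draws = rng.choice(diffs, size=(n_boot, len(diffs)), replace=True).mean(axis=1)
    return draws.std(ddof=1), np.percentile(draws, [2.5, 97.5])

# Synthetic data: one confounder x, assumed true exposure effect of 2.0.
rng = np.random.default_rng(42)
n = 4000
x = rng.normal(size=n)
score = 1.0 / (1.0 + np.exp(-x))        # stands in for an estimated propensity score
exposed = rng.random(n) < score
y = 1.0 + x + 2.0 * exposed + rng.normal(size=n)

pairs = caliper_match(score, exposed)
balance = standardized_difference(x, pairs)
effect = att(y, pairs)
std_error, interval = bootstrap_att(y, pairs)
```

Because matching is done with replacement, the same unexposed unit can appear in several pairs, which is one reason the simplified bootstrap should be read as illustrative only.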
18. Propensity Score Matching
- Propensity score estimation:
- Use of propensity scores reduces a collection of covariates to a single summary measure, which is conducive to matching
- → Propensity score overlap (if there is no overlap, the groups are not comparable, and so the subjects are not exchangeable)
- If covariates are missing (missing data on the propensity score predictors), a propensity score cannot be calculated.
- Matching on propensity scores:
- For example, match two infants with the same probability of exposure when in fact one was exposed and the other was not (we can match with replacement to handle cases where the numbers of exposed/treated and non-exposed/non-treated subjects differ)
- Matching on propensity scores will, in expectation, create balance on all covariates used to estimate them
- → The result is covariate balance across groups after matching (the observed concomitants that might affect the response are as similar as possible in the two groups)
19. Limitations of Propensity Scores - 1
- Data Source Limitations (limitations of the example):
- Linked Birth/Infant Death data
- Not collected for research purposes
- No data on individual/family poverty status
- No data on whether infant ever left the hospital
- Thus, good data collection techniques are needed
- Propensity Score Methodology Limitations:
- Restricting to common support may induce selection bias (i.e., excluding those with propensity scores in the tails for whom there is no overlap)
- Excluding subjects not having all the observed covariates used in forming the propensity scores can induce selection bias (case-wise deletion due to missing covariates is NOT unique to Propensity Score Estimation)
- Matching with replacement can result in a single unexposed subject being matched multiple times
20. Limitations of Propensity Scores - 2
- In addition to the Methodological Limitations already noted, recall two important assumptions of propensity score analyses:
- (1) Stable Unit-Treatment Value Assumption (SUTVA)
- (2) Strongly Ignorable Treatment Assignment
- The first assumption amounts to ignoring cases in which there are dynamic interaction effects between the covariates. Although SUTVA is implausible for most of the cases in which social epidemiologists are interested, it is not clear how to address violations of SUTVA, which seem to lead to computational intractability (Little and Rubin, 2000), and little research about this has been done (though see Blume and Durlauf, 2006, for suggestive ideas).
- The second assumption means that using propensity score matching does not account for bias due to UNOBSERVED covariates. However, structural equation modeling can be used to supplement propensity score matching.
21. Structural Equation Models - 1
- Once the propensity score matching is done, a logistic model is created in which the dependent variable, Y, is the outcome of interest (e.g., mortality, good health).
- Thus, what you really have (simplifying for the purposes of exposition), post-propensity score matching, is the equation Y = βX + ε.
- In this equation, X stands for the vector of covariates used in determining the propensity scores (including the dummy treatment variable), β is the corresponding vector of coefficients, Y is the outcome variable of interest (e.g., mortality), and ε is the random error term.
- The problem is that using propensity scores assumes that the observed covariates used to calculate propensity scores are sufficient to isolate (identify) the dependent variable in the equation. However, there may be unobserved confounding variables.
22. Structural Equation Models - 2
- Although there are several forms the influence of unobserved variables could take, suppose we focus on the case where there is a single omitted common cause of X and Y in the previous equation, so that the estimated effect of X on Y captures both the effect of X and the effect of the unobserved variable. Graphically this can be represented as: [diagram not reproduced]
23. Structural Equation Models - 3
- Failure to take account of the unobserved variable will result in a confounding bias (the unobserved variable is a confounder). Statisticians refer to this type of bias as spurious correlation.
- More specifically, the bias occurs because there is an unobserved variable that affects treatment/exposure assignment. This will in turn affect the creation of the treatment/exposure group and the substitute for the counterfactual group. This is precisely the kind of bias that the use of matched propensity scores was intended to eliminate amongst the observed covariates. Now, however, it recurs because of the (possible) presence of an unobserved covariate.
- Structural equations (specifically, Instrumental Variables Estimation) provide a way of dealing with this.
24. Structural Equation Models - 4
- What we want is an instrument (an instrumental variable, or IV) that satisfies the following requirements:
- (a) The instrumental variable is independent of the confounding variable (the unobserved variable).
- (b) The instrumental variable affects the assignment into the treatment/exposure group versus the non-treatment/non-exposure group; in other words, the instrumental variable is associated with the assignment variable (we will refer to that variable as V).
- (c) The instrumental variable satisfies (a) and (b) without affecting either Yt(u) or Yc(u); that is to say, the instrumental variable is independent of Y given V and the confounding variable.
25. Structural Equation Models - 5
- We can represent, graphically, the idea of an instrumental variable (where V stands for the treatment variable, X for the set of covariates used in the propensity score determination, Y for the outcome variable, and Z for the instrumental variable) as: [diagram not reproduced]
26. Structural Equation Models - 6
- Suppose (for the sake of simplicity) that there is a single instrumental variable, Z, and a single treatment variable, V; that SUTVA and Monotonicity (same direction of effect) hold; and that there is a nonzero causal effect of Z on V.
- Typically, Instrumental Variable Estimation (IVE) is used in the case of linear models. However, because we used logistic regression to regress Y on X (containing V), and because linear IVE produces consistent estimates only if the endogenous regression is linear, we need to use Non-linear IVE.
- This requires that the endogenous treatment regressor (by assumption, there is only one) is replaced in the estimator-defining equation by the appropriate non-linear instrument (that is, the instrument is the dependent variable).
27. Structural Equation Models - 7
- Because, by assumption, there is only one instrument per treatment variable, V, the model is just (as opposed to over-) identified, and the value of the non-linear, Generalized Method of Moments (GMM) estimate of the treatment coefficient can be calculated. Finally, this permits us to use the estimated value to calculate the estimated average effect of the treatment/exposure on the treated/exposed. (In effect, this is a non-linear extension of 2SLS, though matters are more complicated when there are multiple instruments.)
- What do instrumental variables show?
- If well chosen, the instrumental variable provides a way of accounting for confounding due to unobserved variables. Moreover, using GMM estimates in association with IVE permits their use with Propensity Score methods that also use logistic regression.
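The intuition behind instrumental variables is easiest to see in the linear case. The sketch below is plain linear two-stage least squares (2SLS) on synthetic data, not the non-linear GMM estimator described in the slides, and it uses a continuous treatment for simplicity. The data-generating coefficients, the unobserved confounder `u`, and the function names are all assumptions made for illustration.

```python
import numpy as np

def ols_slope(v, y):
    """Slope from an ordinary least squares regression of y on v (with intercept)."""
    design = np.column_stack([np.ones_like(v), v])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

def two_stage_least_squares(z, v, y):
    """Linear 2SLS with one instrument z, one treatment v and outcome y."""
    Z = np.column_stack([np.ones_like(z), z])
    # Stage 1: predict the treatment from the instrument.
    gamma, *_ = np.linalg.lstsq(Z, v, rcond=None)
    v_hat = Z @ gamma
    # Stage 2: regress the outcome on the predicted treatment.
    return ols_slope(v_hat, y)

rng = np.random.default_rng(7)
n = 50000
u = rng.normal(size=n)                      # unobserved confounder of V and Y
z = rng.normal(size=n)                      # instrument: independent of u, no direct path to y
v = 0.8 * z + u + rng.normal(size=n)        # treatment depends on instrument and confounder
y = 1.5 * v + 2.0 * u + rng.normal(size=n)  # assumed true effect of v on y is 1.5

naive_estimate = ols_slope(v, y)
iv_estimate = two_stage_least_squares(z, v, y)
print(f"naive OLS: {naive_estimate:.2f}, 2SLS: {iv_estimate:.2f}")
```

Under these assumptions the naive regression absorbs the effect of `u`, while the 2SLS estimate recovers something close to the assumed effect because `z` satisfies requirements (a) to (c) by construction.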
28. Structural Equation Models - 8
- Problems with the instrumental variable approach:
- One of the most difficult issues for using instrumental variables is the selection of the instrument(s). Following Moffitt (2003), we can identify four types that have been used in a number of different applications: environmental/ecological, demographic, twin and sibling, and natural experiments.
- In connection with the first point, if the instrumental variable is associated with other error terms or with other unmeasured confounders, use of the selected instrumental variable could increase confounding.
- IV corrections are large-sample corrections, and so standard errors can be large without large samples (standard errors can also be large if the instrument is weak).
- The IV approach is more common when the model of interest is linear. When the model is non-linear (e.g., logistic), estimation can be more difficult.
29. Final Conclusions and Remarks
- Propensity score matching only balances observed covariates. It does not directly address the case of non-observed covariates that are causally relevant. It is for this reason that propensity score balancing is combined with instrumental variables estimation.
- Causal inferences using non-experimental, observational data depend on a number of assumptions; this goes back to Pearl's observation that causal inference depends on a theory of the way that the data were generated, which goes beyond the data themselves (Pearl, 2003).
- To reiterate our starting claim, warranted causal inferences using non-experimental, observational data require careful and complete conceptual analysis; for this reason, there is much that philosophical approaches can contribute.