SAMSI Tutorial on Causal Inference

1
SAMSI Tutorial on Causal Inference
  • Miguel A. Hernán
  • Department of Epidemiology
  • Harvard School of Public Health
  • www.hsph.harvard.edu/causal

2
Causal inference: a central task of science
  • To estimate the causal effect of variable A on
    variable Y is a common goal of the sciences
  • Physics, chemistry, biology: experiments and
    observations
  • Epidemiology, economics, sociology: mostly
    observations

3
This tutorial covers
  • The conditions required for causal inference
  • The methods for causal inference under those
    conditions
  • Using a language and conceptual framework that is
    common to all sciences involved in causal
    inference from observational data

4
This tutorial does not cover
  • whether the conditions required for causal
    inference are met in a particular case
  • That is a subject-matter issue
  • Expert knowledge needed for causal inference

5
Causal inference in 2.5 hours? Need to focus
  • In focus: low-dimensional data, fixed (non
    time-varying) exposures, nonparametric analysis
    (not models)
  • Out of focus: high-dimensional data, time-varying
    exposures, dynamic and nondynamic regimes,
    parametric/semiparametric models
  • We'll discuss
  • low-dimensional data only
  • no sampling variability
  • If X = 1.6 and Y = 1.7, then X is not equal to Y
  • Equivalent to assuming that we work with the
    whole population (or a huge sample size)

6
Outline
  • Definition of causal effect
  • Estimation of causal effects in randomized
    experiments
  • Estimation of causal effects in observational
    studies

7
An intuitive definition of cause
  • Ian took the pill on Sept 1, 2003
  • Five days later, he died
  • Had Ian not taken the pill on Sept 1, 2003 (all
    other things being equal)
  • Five days later, he would have been alive
  • Did the pill cause Ian's death?

8
An intuitive definition of cause
  • Jim didn't take the pill on Sept 1, 2002
  • Five days later, he was alive
  • Had Jim taken the pill on Sept 1, 2002 (all
    other things being equal)
  • Five days later, he would have been alive
  • Did the pill cause Jim's survival?

9
Human reasoning for causal inference
  • We compare (often only mentally)
  • the outcome when action A is present with
  • the outcome when action A is absent
  • all other things being equal
  • If the two outcomes differ, we say that the
    action A has a causal effect
  • causative or preventive
  • In epidemiology, A is commonly referred to as
    exposure or treatment

10
Notation for actual data
  • Y = 1 if patient died, 0 otherwise
  • Y_i = 1, Y_j = 0 (i for Ian, j for Jim)
  • A = 1 if patient treated, 0 otherwise
  • A_i = 1, A_j = 0

11
Notation for ideal data
  • Y_{a=0} = 1: if the subject had not taken the
    pill, he would have died
  • Y_{i,a=0} = 0, Y_{j,a=0} = 0
  • Y_{a=1} = 1: if the subject had taken the pill, he
    would have died
  • Y_{i,a=1} = 1, Y_{j,a=1} = 0

12
Clarification
  • Upper-case letters for random variables
  • A, Y, Y_{a=0}, Y_{a=1}
  • Lower-case letters for possible values
    (realizations) of those variables
  • a is a possible value (0 or 1) of the random
    variable A
  • For our purposes, random variables are variables
    with different values for different individuals

13
(Individual) Causal effect
  • For Ian
  • Pill has a causal effect because
    Y_{i,a=1} = 1 ≠ Y_{i,a=0} = 0
  • For Jim
  • Pill does not have a causal effect because
    Y_{j,a=1} = Y_{j,a=0} = 0
  • Sharp causal null hypothesis holds if, for all
    subjects, Y_{a=1} = Y_{a=0}

14
Potential or counterfactual outcomes
  • Y_{a=0} and Y_{a=1}
  • Random variables
  • Amenable to mathematical treatment, e.g.,
    statistical models
  • One of them is the subject's potential outcome
    that would have been observed under an exposure
    value that the subject did not actually
    experience
  • Refers to a counter-to-the-fact situation

15
Consistency
  • Key assumption
  • One of the counterfactual outcomes is the
    subject's actual outcome under the exposure value
    that the subject actually experienced
  • Refers to an observed (factual) situation
  • If A_i = a then Y_{i,a} = Y_{i,A} = Y_i
  • Under consistency, a potential outcome Y_a is
    factual for some subjects and counterfactual for
    others
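
As a small illustration of consistency and of individual causal effects, the Python sketch below encodes the potential outcomes given for Ian and Jim on the notation slides. The `Subject` class and its method names are hypothetical, introduced only for this example; they are not part of the tutorial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subject:
    """Hypothetical container for one subject's ideal (counterfactual) data."""
    name: str
    a: int      # exposure actually received (A)
    y_a0: int   # potential outcome Y_{a=0}
    y_a1: int   # potential outcome Y_{a=1}

    def observed_outcome(self) -> int:
        # Consistency: if A = a, the observed Y equals the potential outcome Y_a.
        return self.y_a1 if self.a == 1 else self.y_a0

    def has_individual_effect(self) -> bool:
        # Individual causal effect: the two potential outcomes differ.
        return self.y_a1 != self.y_a0


# Values taken from the slides: Ian took the pill and died, Jim did not and survived.
ian = Subject("Ian", a=1, y_a0=0, y_a1=1)
jim = Subject("Jim", a=0, y_a0=0, y_a1=0)

for s in (ian, jim):
    print(s.name, "observed Y =", s.observed_outcome(),
          "| individual effect:", s.has_individual_effect())
```

Only one of `y_a0` and `y_a1` is ever observed for a real subject, which is why this table of ideal data cannot be filled in from actual data alone.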

16
Available data set
17
Fundamental problem of causal inference
  • Individual causal effects cannot be determined
  • except under extremely strong (and generally
    unreasonable) assumptions
  • because only one counterfactual outcome is
    observed
  • Causal inference as a missing data problem
  • Whether using a randomized experiment or an
    observational study
  • Need another definition of causal effect that
    requires weaker assumptions

18
First, more notation
  • Pr[Y_a = 1]
  • proportion of subjects that would have developed
    the outcome Y had all subjects in the population
    of interest received exposure value a
  • (Counterfactual) risk of Y_a
  • Unconditional or marginal probability
  • Calculated by using data from the whole
    population

19
(Population) Causal effect
  • In the population, exposure A has a causal effect
    on the outcome Y if Pr[Y_{a=1} = 1] ≠
    Pr[Y_{a=0} = 1]
  • Causal null hypothesis holds if Pr[Y_{a=1} = 1] =
    Pr[Y_{a=0} = 1]

20
Equivalent representations of the causal null
hypothesis
  • Pr[Y_{a=1} = 1] - Pr[Y_{a=0} = 1] = 0
  • Pr[Y_{a=1} = 1] / Pr[Y_{a=0} = 1] = 1
  • (Pr[Y_{a=1} = 1] / Pr[Y_{a=1} = 0]) /
    (Pr[Y_{a=0} = 1] / Pr[Y_{a=0} = 0]) = 1
  • Causal effect can be measured in many scales
  • causal risk difference, causal risk ratio, causal
    odds ratio, ...
  • Effect measures
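
The three scales can be sketched in a few lines of Python. This is a minimal example, assuming the two counterfactual risks are already known; the function name `effect_measures` and the input values are hypothetical and chosen only to show the null case.

```python
def effect_measures(risk_a1, risk_a0):
    """Causal effect measures from the two counterfactual risks.

    risk_a1 stands for Pr[Y_{a=1} = 1] and risk_a0 for Pr[Y_{a=0} = 1].
    Both must lie strictly between 0 and 1 for the ratios to be defined.
    """
    odds_a1 = risk_a1 / (1 - risk_a1)
    odds_a0 = risk_a0 / (1 - risk_a0)
    return {
        "risk_difference": risk_a1 - risk_a0,
        "risk_ratio": risk_a1 / risk_a0,
        "odds_ratio": odds_a1 / odds_a0,
    }


# Illustrative values only: equal counterfactual risks, i.e. the causal null.
null_case = effect_measures(0.5, 0.5)
print(null_case)
```

Under the causal null the risk difference is 0 and both ratios are 1, which is the sense in which the three representations are equivalent.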

21
Individual versus population causal effects
  • Individual causal effects cannot be determined
  • except under quite restrictive assumptions
  • Population causal effects can be determined under
  • no assumptions (ideal randomized studies)
  • strong assumptions (observational studies)
  • We'll refer to population causal effects only

22
Association and causation: more notation
  • Pr[Y = 1 | A = a]
  • proportion of subjects that developed the outcome
    Y among those who received exposure value a in
    the population
  • Risk of Y among the exposed/unexposed
  • Conditional probability
  • Calculated by using data from a subset of the
    population

23
Association
  • The exposure A and the outcome Y are associated
    if Pr[Y = 1 | A = 1] ≠ Pr[Y = 1 | A = 0]
  • No association = independence

24
Equivalent representations of independence
  • Pr[Y = 1 | A = 1] - Pr[Y = 1 | A = 0] = 0
  • Pr[Y = 1 | A = 1] / Pr[Y = 1 | A = 0] = 1
  • (Pr[Y = 1 | A = 1] / Pr[Y = 0 | A = 1]) /
    (Pr[Y = 1 | A = 0] / Pr[Y = 0 | A = 0]) = 1
  • Association can be measured in many scales
  • Associational risk difference, associational risk
    ratio, associational odds ratio, ...
  • Association measures

25
Again, crucial difference: association is not
causation
  • Association: different risk in two disjoint
    subsets of the population determined by the
    subjects' actual exposure value
  • Pr[Y = 1 | A = a] is the risk in subjects of the
    population that meet the condition "having
    actually received exposure level a"
  • Causation: different risk in the entire
    population under two exposure values
  • Pr[Y_a = 1] is the risk in all subjects of the
    population had they received the counterfactual
    exposure level a
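
The distinction can be made concrete with a toy population in which both potential outcomes are known for everyone. The eight subjects below are entirely hypothetical (they are not data from the tutorial) and were chosen so that no subject has an individual causal effect, yet the exposed and unexposed subsets have different observed risks.

```python
from fractions import Fraction

# Hypothetical population, one tuple per subject: (A, Y_{a=0}, Y_{a=1}).
# Every subject has Y_{a=0} == Y_{a=1}, so the sharp causal null holds.
population = (
    [(1, 1, 1)] * 3 + [(1, 0, 0)]      # exposed subjects
    + [(0, 0, 0)] * 3 + [(0, 1, 1)]    # unexposed subjects
)


def observed_outcome(a, y_a0, y_a1):
    """Consistency: the observed Y is the potential outcome under the received a."""
    return y_a1 if a == 1 else y_a0


def associational_risk(a):
    """Pr[Y = 1 | A = a]: risk in the subset that actually received a."""
    subset = [s for s in population if s[0] == a]
    return Fraction(sum(observed_outcome(*s) for s in subset), len(subset))


def counterfactual_risk(a):
    """Pr[Y_a = 1]: risk in the whole population had everyone received a."""
    return Fraction(sum(s[1 + a] for s in population), len(population))


associational_rr = associational_risk(1) / associational_risk(0)
causal_rr = counterfactual_risk(1) / counterfactual_risk(0)
print("associational RR:", associational_rr)  # 3
print("causal RR:", causal_rr)                # 1
```

The associational risk ratio compares two disjoint subsets, while the causal risk ratio compares the same eight subjects under two exposure values.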

27
An example of a causal concept: confounding
  • There is confounding when association is not
    causation
  • Confounding cannot be defined using associational
    (statistical) language
  • More about confounding later

28
Generalizations of counterfactual theory
  • a) Causal effects in a subset of the population
  • b) Nondichotomous outcome and exposure
  • c) Nondeterministic counterfactual outcomes
  • d) Interference
  • e) Time-varying exposures

29
Counterfactual theory can be generalized
regarding a) to d)
  • But no major conceptual insights are gained from
    these generalizations
  • For pedagogic reasons, we will stick to the
    simplified version
  • Causal effect in the entire population
  • Dichotomous variables
  • Deterministic counterfactuals
  • No interference

30
e) Time-varying exposures: counterfactual
theories
  • Neyman (1923)
  • Effects of point exposures in randomized
    experiments
  • Rubin (1974)
  • Effects of point exposures in randomized and
    observational studies
  • Robins (1986)
  • Effects of time-varying exposures in randomized
    and observational studies

31
Outline
  • Definition of causal effect
  • Estimation of causal effects in randomized
    experiments
  • Estimation of causal effects in observational
    studies

32
Association measures
  • The associational risk ratio
  • Pr[Y = 1 | A = 1] / Pr[Y = 1 | A = 0]
  • can be directly computed in any study
  • because Y is observed in all subjects of the
    population
  • Pr[Y = 1 | A = 1] and Pr[Y = 1 | A = 0] are
    observed risks

33
Effect measures
  • The causal risk ratio
  • Pr[Y_{a=1} = 1] / Pr[Y_{a=0} = 1]
  • cannot be directly computed (in general)
  • because Y_{a=1} and Y_{a=0} are unobserved in some
    subjects of the population
  • Pr[Y_{a=1} = 1] and Pr[Y_{a=0} = 1] are unobserved
    risks

34
Effect measures
  • can be computed using data from ideal randomized
    experiments
  • with no assumptions
  • (More rigorously, effect measures can be
    consistently estimated using data from ideal
    randomized experiments)
  • For now let's consider experiments with
    near-infinite sample sizes only

35
What is an ideal randomized experiment?
  • No loss to follow-up
  • Full compliance with (adherence to) assigned
    exposure or treatment
  • Double blind assignment

36
In ideal randomized experiments
  • Pr[Y_{a=1} = 1] is equal to Pr[Y = 1 | A = 1]
  • Pr[Y_{a=0} = 1] is equal to Pr[Y = 1 | A = 0]
  • Therefore the associational RR
  • Pr[Y = 1 | A = 1] / Pr[Y = 1 | A = 0]
  • is equal to the causal RR
  • Pr[Y_{a=1} = 1] / Pr[Y_{a=0} = 1]
  • We'll prove this equality holds because
  • experimental treatment assignment ensures
    consistency
  • randomization ensures exchangeability

37
Experimental treatment assignment
  • One (near-infinite) population
  • Divided into two groups
  • Group 1 and group 2
  • One group is treated and the other untreated

38
Consistency
  • The counterfactual outcome under treatment of the
    treated is their observed outcome
  • The counterfactual outcome under no treatment of
    the untreated is their observed outcome
  • The counterfactual outcomes are consistent with
    the observed outcomes
  • Consistency is a consequence of experimental
    treatment assignment

39
Randomization (I)
  • Membership in each group (1 or 2) is randomly
    assigned
  • e.g., by the flip of a coin
  • First option
  • Treat subjects in group 1, don't treat subjects
    in group 2
  • The risk is, say, Pr[Y = 1 | A = 1] = 0.57
  • Second option
  • Treat subjects in group 2, don't treat subjects
    in group 1
  • What is the value of the risk Pr[Y = 1 | A = 1]?

40
Randomization (II)
  • When group membership is randomly assigned,
    results are the same
  • whether group 1 treated, group 2 untreated
  • or vice versa
  • Both groups are comparable or exchangeable
  • Exchangeability is a consequence of randomization

41
Exchangeability
  • Subjects in group 1 would have had the same risk
    as those in group 2 had they received the
    treatment of those in group 2
  • The counterfactual risk in the treated equals the
    counterfactual risk in the untreated

42
Formal definition of exchangeability
  • Y_a is independent of A for all a, that is,
    Pr[Y_a = 1 | A = 1] = Pr[Y_a = 1 | A = 0] for all a
  • Exchangeability implies lack of confounding
  • Exchangeability is another causal concept that
    cannot be represented by associational
    (statistical) language

43
Proof: why Pr[Y = 1 | A = a] = Pr[Y_a = 1]?
  • Two steps
  • Pr[Y = 1 | A = a] = Pr[Y_a = 1 | A = a]
  • by consistency
  • Pr[Y_a = 1 | A = a] = Pr[Y_a = 1]
  • by exchangeability
  • Consistency and exchangeability ensured in ideal
    randomized studies

44
In an ideal randomized experiment
  • Association is causation because
  • experimental treatment assignment produces
    consistency
  • randomization produces exchangeability
  • We have a method for causal inference!
  • No need for adjustments of any sort
  • Assumption-free

45
Example: does heart transplant (A) increase
5-year survival (Y)?
  • Select a large population of potential recipients
    of a transplant
  • Get funding and IRB/Ethical approval
  • Randomly allocate them to either transplant
    (A = 1) or medical treatment (A = 0)
  • 5 years later, compute the associational RR
    Pr[Y = 1 | A = 1] / Pr[Y = 1 | A = 0]
  • that equals (consistently estimates) the causal RR
    Pr[Y_{a=1} = 1] / Pr[Y_{a=0} = 1]
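
A simulation can illustrate why the associational RR approximates the causal RR under marginal randomization. The sketch below is not from the tutorial: the population size, the seed, and the risks 0.3 and 0.6 are arbitrary assumptions, and the finite sample only stands in for the near-infinite population of the slides.

```python
import random

rng = random.Random(2024)  # fixed seed so the run is reproducible
n = 200_000                # large sample standing in for a near-infinite population

# Hypothetical potential outcomes; the 0.3 and 0.6 risks are arbitrary.
subjects = []
for _ in range(n):
    y_a0 = int(rng.random() < 0.3)
    y_a1 = int(rng.random() < 0.6)
    a = int(rng.random() < 0.5)     # randomization: a coin flip, unrelated to Y_a
    y = y_a1 if a == 1 else y_a0    # consistency: observed outcome
    subjects.append((a, y, y_a0, y_a1))


def observed_risk(a):
    """Pr[Y = 1 | A = a], computed among those who actually received a."""
    ys = [y for (x, y, _, _) in subjects if x == a]
    return sum(ys) / len(ys)


def counterfactual_risk(a):
    """Pr[Y_a = 1], computed in everyone (possible only in a simulation)."""
    return sum(s[2 + a] for s in subjects) / n


associational_rr = observed_risk(1) / observed_risk(0)
causal_rr = counterfactual_risk(1) / counterfactual_risk(0)
print("associational RR:", round(associational_rr, 3))
print("causal RR:       ", round(causal_rr, 3))
```

The two ratios differ only by sampling variability, which shrinks as `n` grows.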

46
Potential problems of real randomized experiments
  • a) Loss to follow-up
  • b) Noncompliance
  • c) Unblinding
  • d) Other: ethics, feasibility, cost

47
Consequence of problems a), b), c)
  • Exchangeability still holds in randomized
    experiments, but
  • available association may not be causation
    (loss to follow-up)
  • exposure is misclassified (noncompliance) or
    contaminated (unblinding)
  • Causal inference from real randomized studies may
    require assumptions and analytic methods similar
    to those for causal inference from observational
    studies

48
Conclusion
  • No clear-cut separation between randomized and
    observational studies
  • Observational studies are needed
  • In fact, most of human knowledge comes from
    observations, e.g., evolution theory, plate
    tectonics theory, hot coffee may cause burns
  • And so are methods for causal inference from
    observational data

49
Outline
  • Definition of causal effect
  • Estimation of causal effects in randomized
    experiments
  • Estimation of causal effects in observational
    studies

50
Conditions for causal inference: consistency and
exchangeability
  • In ideal randomized experiments association is
    causation because
  • experimental treatment assignment produces
    consistency
  • randomization produces exchangeability
  • But ideal randomized experiments are rare
  • We need observational studies
  • No experimental treatment assignment:
    consistency?
  • No randomization: exchangeability?

51
Consistency in observational studies
  • If no consistency, then counterfactuals are not
    well defined
  • If counterfactuals are not well defined then
    causal effects are not well defined
  • Can consistency not hold?
  • Let's see some examples of exposures in
    observational studies

52
SEX?
  • A certain chronic disease occurs more frequently
    in women (S = 1) than in men (S = 0)
  • Pr[Y = 1 | S = 1] > Pr[Y = 1 | S = 0]
  • Does sex have a causal effect on the risk of
    disease?
  • Pr[Y_{s=1} = 1] / Pr[Y_{s=0} = 1]

53
Quite vague question
  • The causal question (in English) would be
    something like
  • What would have been the risk had everybody been
    of female sex compared with
  • Wait a second, what do we mean by female sex?
  • carrying a pair of X chromosomes
  • having been brought up as a woman
  • high levels of estrogens between adolescence and
    menopause
  • ...?

54
Another vague question
  • What is the effect of obesity on mortality?
  • Compare what would happen if everybody were
    obese
  • If you don't think the question is vague, try to
    describe an experiment to replicate the results
    from an observational study on obesity and
    mortality
  • design a randomized experiment in which
    participants are randomly assigned to either
    obesity or non obesity
  • assume unlimited resources and no ethical
    constraints

55
Key question: what is the appropriate
intervention?
  • Many possible interventions could be used to
    assign participants to the non-obese group
  • Extreme exercise, starvation, surgery, genetic
    modification
  • Each may lead to a different outcome even if,
    when counterfactually applied to a given
    individual, they all would produce identical body
    weight
  • The counterfactual outcome under each exposure
    level is not well defined because the value of
    the counterfactual outcome may depend on the
    intervention used to manipulate the exposure

56
Lack of consistency leads to ill-defined causal
effects
  • Not a consequence of
  • ethical constraints
  • e.g., the effect of cigarette smoking can be well
    defined even if some of the hypothetical
    interventions involved are harmful
  • unfeasible interventions
  • e.g., the effect of long-term diet can be well
    defined even if some of the hypothetical
    interventions involved are impractical

57
Conclusion: being able to utter X does not entail
that X has a meaning
  • (The wishes of my running shoes?)
  • In observational studies, to give an unambiguous
    meaning to a causal question, we need to be able
    to describe a hypothetical intervention
  • walk 2 hours/day 7 days a week, eat 2000
    calories/day versus walk 0.5 hours, eat 3000
    calories/day
  • For some questions we have a common understanding
    of the intervention
  • Effect of heart transplant

58
A benefit of a formal definition of causal effects
  • Some interventions sound technically unfeasible
    (or plainly crazy) because formulating certain
    causal questions is not straightforward
  • A counterfactual approach to causal inference
    highlights the imprecision/vagueness of some
    causal questions
  • Effect of age, HDL-cholesterol, HIV viral load,
    socioeconomic status, ...
  • Formulating an appropriate causal question is a
    subject-matter issue

59
Conditions for causal inference: consistency and
exchangeability
  • We'll assume consistency holds in observational
    studies
  • That is, we'll assume exposures are well defined
  • What about exchangeability in observational
    studies?
  • First let's review exchangeability in ideal
    randomized studies

60
Exchangeability in ideal randomized experiments
  • Exchangeability is guaranteed: Y_a is independent
    of A for all a
  • Pr[Y_{a=1} = 1] is equal to Pr[Y = 1 | A = 1]
  • Pr[Y_{a=0} = 1] is equal to Pr[Y = 1 | A = 0]
  • Therefore the associational risk ratio
  • Pr[Y = 1 | A = 1] / Pr[Y = 1 | A = 0]
  • is equal to the causal risk ratio
  • Pr[Y_{a=1} = 1] / Pr[Y_{a=0} = 1]

61
But even in ideal randomized experiments
  • The equality between crude association measure
    and effect measure does not always hold
  • Association risk ratio not necessarily equal to
    causal risk ratio
  • Need to take into account the design of the
    experiment
  • Was randomization conditional or unconditional?
  • So far we have considered unconditional
    (marginal) randomization only

62
Consider the following ideal randomized experiment
  • 2 million subjects with heart disease
  • numbers divided by 100,000 in example
  • Variables
  • A = 1: heart transplant (exposure)
  • Y = 1: death (outcome)
  • L = 1: critical condition (prognostic factor)
  • Goal: to estimate the effect of A on Y
  • on the risk ratio scale

63
The data summarized in a table
64
The data summarized in a tree
66
Two designs could have produced these data
  • Design 1
  • randomly select and expose 65% of the individuals
  • unconditional randomization probability
  • Design 2
  • classify individuals as in critical (L = 1) or
    noncritical (L = 0) condition
  • Randomly select 75% of those with L = 1 and 50% of
    those with L = 0, and expose the selected
    individuals
  • randomization probabilities depend (are
    conditional) on the value of L

67
Implications of each design
  • Design 1
  • Exchangeability is guaranteed
  • Design 2
  • Greater proportion of L = 1 in the exposed
  • No exchangeability
  • But Design 2 is simply the combination of two
    Design 1 studies: one in L = 1 and another one in
    L = 0
  • Exchangeability is guaranteed in each Design 1
    substudy (i.e., conditional on L)

68
Design 2: conditional exchangeability
  • Within levels of L, exposed subjects would have
    had the same risk as unexposed subjects had they
    been unexposed, and vice versa
  • Counterfactual risk is the same in the exposed
    and the unexposed with the same value of L

69
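Formal definition of conditional exchangeability
  • Y_a is independent of A given L = l for all a,
    that is, Pr[Y_a = 1 | A = 1, L = l] =
    Pr[Y_a = 1 | A = 0, L = l] for all a and l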
  • Conditional exchangeability is equivalent to
    randomization within levels of L
  • It implies no confounding within levels of the
    variable L

70
Data analysis, Design 1: exchangeability
  • Counterfactual risk = observed risk
  • Pr[Y_a = 1] = Pr[Y = 1 | A = a]
  • Causal risk ratio = associational risk ratio
  • Pr[Y_{a=1} = 1] / Pr[Y_{a=0} = 1] = (7/13)/(3/7) =
    1.26
  • But 69% of the exposed versus 43% of the
    unexposed were in critical condition
  • Exchangeability does not hold
  • Data not generated under Design 1
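
The crude calculation can be reproduced in Python. The data table itself was not transcribed, so the cell counts below are a reconstruction from the figures quoted on the slides (20 subjects, 13 exposed, risks 7/13 and 3/7, 75% and 50% exposed in the two strata); treat them as an assumption. The names `cells`, `crude_risk`, and `proportion_critical` are hypothetical.

```python
from fractions import Fraction

# Reconstructed counts (an assumption): (L, A) -> (deaths, subjects)
cells = {
    (0, 0): (1, 4), (0, 1): (1, 4),
    (1, 0): (2, 3), (1, 1): (6, 9),
}


def crude_risk(a):
    """Pr[Y = 1 | A = a], ignoring L."""
    deaths = sum(d for (_, x), (d, _) in cells.items() if x == a)
    total = sum(n for (_, x), (_, n) in cells.items() if x == a)
    return Fraction(deaths, total)


def proportion_critical(a):
    """Pr[L = 1 | A = a]: share of subjects in critical condition."""
    total = sum(n for (_, x), (_, n) in cells.items() if x == a)
    return Fraction(cells[(1, a)][1], total)


crude_rr = crude_risk(1) / crude_risk(0)
print(crude_risk(1), crude_risk(0), round(float(crude_rr), 2))   # 7/13 3/7 1.26
print(round(float(proportion_critical(1)), 2),
      round(float(proportion_critical(0)), 2))                   # 0.69 0.43
```

The unequal share of critical patients in the two exposure groups is what rules out Design 1 as the source of these data.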

71
Data analysis, Design 2: conditional
exchangeability
  • Conditional counterfactual risk = conditional
    observed risk
  • Pr[Y_a = 1 | L = 0] = Pr[Y = 1 | A = a, L = 0]
  • Pr[Y_a = 1 | L = 1] = Pr[Y = 1 | A = a, L = 1]
  • Counterfactual risk in the population is the
    weighted average of the stratum-specific
    counterfactual risks
  • From basic probability theory: the marginal
    probability is the weighted average of the
    conditional probabilities

72
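Proof: Pr[Y = 1 | L = l, A = a] = Pr[Y_a = 1 | L = l]
  • Two steps
  • Pr[Y = 1 | L = l, A = a] = Pr[Y_a = 1 | L = l, A = a]
  • by consistency
  • Pr[Y_a = 1 | L = l, A = a] = Pr[Y_a = 1 | L = l]
  • by conditional exchangeability
  • Therefore Pr[Y_a = 1] is the sum over l of
    Pr[Y = 1 | L = l, A = a] Pr[L = l]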

73
Data analysis, Design 2: conditional
exchangeability
  • Pr[Y_{a=1} = 1] = (1/4)(0.4) + (2/3)(0.6) = 0.5
  • Pr[Y_{a=0} = 1] = (1/4)(0.4) + (2/3)(0.6) = 0.5
  • Pr[Y_{a=1} = 1] / Pr[Y_{a=0} = 1] = 0.5/0.5 = 1
  • What's the name of this method?
  • Standardization
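
The standardization arithmetic can be written as a short Python sketch. As before, the cell counts are reconstructed from the figures quoted on the slides rather than copied from the untranscribed table, and the function names are hypothetical.

```python
from fractions import Fraction

# Reconstructed counts (an assumption): (L, A) -> (deaths, subjects)
cells = {
    (0, 0): (1, 4), (0, 1): (1, 4),
    (1, 0): (2, 3), (1, 1): (6, 9),
}


def stratum_risk(a, l):
    """Pr[Y = 1 | A = a, L = l]."""
    deaths, n = cells[(l, a)]
    return Fraction(deaths, n)


def prob_l(l):
    """Pr[L = l] in the study population, which serves as the standard."""
    n_l = sum(n for (stratum, _), (_, n) in cells.items() if stratum == l)
    n_all = sum(n for (_, n) in cells.values())
    return Fraction(n_l, n_all)


def standardized_risk(a):
    """Weighted average of stratum-specific risks, weights Pr[L = l]."""
    return sum(stratum_risk(a, l) * prob_l(l) for l in (0, 1))


standardized_rr = standardized_risk(1) / standardized_risk(0)
print(standardized_risk(1), standardized_risk(0), standardized_rr)  # 1/2 1/2 1
```

Under conditional exchangeability the standardized risks equal the counterfactual risks, so the ratio printed last is the causal risk ratio.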

74
Standardization as a simulation
  • Standardization is the equivalent of simulating
    what would happen in the study population if
  • everybody had received certain exposure level a,
    and
  • the distribution of the covariate L were the same
    as its distribution in the standard population

75
In summary, in Design 1 and 2 randomized experiments
  • Randomization produces exchangeability (design 1)
    or conditional exchangeability (design 2)
  • In both cases, the causal effect can be easily
    calculated
  • Design 1: crude association measure
  • Design 2: standardized association measure
  • the g-formula for fixed exposure

76
What does this have to do with observational
studies?
  • Meet Design 3
  • Investigators do not intervene in the assignment
    of hearts but rather they observe which
    individuals happen to receive them
  • The data they observe are

78
In a Design 3 (observational) study
  • Absence of randomization implies that
    exchangeability is not guaranteed
  • In general,
  • Pr[Y_{a=1} = 1] is not equal to Pr[Y = 1 | A = 1]
  • Pr[Y_{a=0} = 1] is not equal to Pr[Y = 1 | A = 0]
  • Therefore the associational RR
  • Pr[Y = 1 | A = 1] / Pr[Y = 1 | A = 0]
  • is not generally equal to the causal RR
  • Pr[Y_{a=1} = 1] / Pr[Y_{a=0} = 1]

79
In a Design 3 (observational) study
  • Exchangeability is too strong an assumption!
  • The exposed and the unexposed are not generally
    comparable
  • e.g., individuals who receive a heart transplant
    may have a more severe disease than those who do
    not receive it
  • In general, Pr[Y_a = 1 | A = 1] is not equal to
    Pr[Y_a = 1 | A = 0]

80
In a Design 3 (observational) study
  • The investigators may believe that the exposed
    and the unexposed are exchangeable within levels
    of L
  • had exposed patients in critical condition stayed
    unexposed, they would have had the same mortality
    risk as those in critical condition who actually
    stayed unexposed (and vice versa)
  • And similarly for patients in noncritical
    condition
  • That is, the investigators may be willing to
    assume conditional exchangeability

81
Is this a reasonable assumption in observational
studies?
  • Consider only individuals with the same
    pre-exposure prognostic factors
  • Then the exposed and the unexposed may be
    exchangeable
  • e.g., among individuals with an ejection fraction
    of 40%, those who do and do not receive a heart
    transplant may be comparable
  • e.g., among individuals with CD4 count < 100,
    those who do and do not receive antiretroviral
    therapy may be comparable
  • This is sometimes reasonable
  • Especially if conditioning on many pre-exposure
    covariates L

82
The randomized experiment paradigm for
observational studies
  • An observational study (design 3) can be viewed
    as a randomized experiment (design 2) in which
  • the conditional probabilities of exposure are not
    chosen by the investigators
  • but can be estimated from the data
  • conditional exchangeability is not guaranteed
  • but only assumed based on the investigators'
    expert knowledge

83
In a Design 3 (observational) study, conditional
exchangeability
  • Necessary condition for causal inference
  • In fact, the weakest condition for causal
    inference
  • Under conditional exchangeability in all strata
    L = l, we can compute (consistently estimate) the
    causal risk ratio
  • Pr[Y_{a=1} = 1] / Pr[Y_{a=0} = 1]
  • Using standardization (g-formula)

84
In summary, in a Design 3 (observational) study
  • Causal effects can be calculated
  • under the assumption of conditional
    exchangeability within levels of the covariates
  • We have a method for causal inference from
    observational data that is not assumption-free
  • But the need to rely on this assumption is not
    THE problem

85
Can we check whether conditional exchangeability
holds?
  • Nope
  • This is THE problem
  • The assumption of conditional exchangeability is
    untestable
  • Even if there is conditional exchangeability,
    there is no way we can know it with certainty

86
Remember: conditional exchangeability
  • Y_a is independent of A given L = l for all a
  • Conditional exchangeability is equivalent to
    randomization within levels of L
  • It implies no unmeasured confounding within
    levels of the measured variables L
  • Data necessary to test this condition is, by
    definition, unavailable

87
That's why causal inference from observational
data is controversial
  • Expert knowledge can be used to enhance the
    plausibility of the assumption
  • measure as many relevant pre-exposure covariates
    as possible
  • Then one can only hope the assumption of
    conditional exchangeability is approximately true
  • (All we are saying is that there may be
    confounding due to unmeasured factors)

88
Identifiability and confounding
  • Design 1: exchangeability guaranteed
  • effect measures computed with A,Y data
  • the causal effect is identifiable given A,Y data
  • no confounding
  • Design 2: conditional exchangeability guaranteed
  • effect measures computed with L,A,Y data
  • the causal effect is identifiable given L,A,Y
    data
  • no unmeasured confounding
  • Design 3: conditional exchangeability assumed
  • effect measures computed with data L,A,Y?
  • the causal effect is not identifiable given data
    only
  • no unmeasured confounding?

89
Outline
  • Definition of causal effect
  • Estimation of causal effects in randomized
    experiments
  • Estimation of causal effects in observational
    studies
  • Standardization (g-formula)
  • Inverse probability weighting

90
Under conditional exchangeability
  • One can estimate causal effects in Design 2
    (randomized) and Design 3 (observational) studies
    by using standardization
  • We now describe another method to estimate causal
    effects: inverse probability weighting (IPW)

91
IPW: plan of action
  • YOU will compute the causal risk ratio using IPW
    in an observational study
  • i.e., you will compute Pr[Y_{a=1} = 1] /
    Pr[Y_{a=0} = 1]
  • under conditional exchangeability
  • We will prove that you were right

92
A simplified observational study
  • 2 million subjects with heart disease
  • Divided by 100,000 in numerical example
  • Variables
  • A = 1: heart transplant (exposure)
  • Y = 1: death (outcome)
  • L = 1: critical condition (prognostic factor)
  • Goal: to estimate the effect of A on Y
  • on the risk ratio scale

93
The data summarized in a table
94
The data summarized in a tree
95
Your goal
  • To compute the effect of a heart transplant on
    the risk of death using the causal risk ratio
    scale
  • Pr[Y_{a=1} = 1] / Pr[Y_{a=0} = 1]
  • Assuming conditional exchangeability within
    levels of L
  • First, compute Pr[Y_{a=0} = 1]
  • Second, compute Pr[Y_{a=1} = 1]

100
Data analysis in the pseudo-population
  • Pr[Y_{a=1} = 1] = 10 / 20 = 0.5
  • Pr[Y_{a=0} = 1] = 10 / 20 = 0.5
  • Causal risk ratio = 0.5 / 0.5 = 1
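
The pseudo-population behind these numbers can be rebuilt with a short Python sketch. The slides that constructed it were not transcribed, so this is a reconstruction: the cell counts are inferred from the figures quoted elsewhere in the tutorial, the weights are taken to be 1 / Pr[A = a | L = l] estimated from those counts, and all function names are hypothetical.

```python
from fractions import Fraction

# Reconstructed counts (an assumption): (L, A) -> (deaths, subjects)
cells = {
    (0, 0): (1, 4), (0, 1): (1, 4),
    (1, 0): (2, 3), (1, 1): (6, 9),
}


def weight(a, l):
    """W = 1 / Pr[A = a | L = l], estimated from the stratum counts."""
    n_la = cells[(l, a)][1]
    n_l = sum(n for (stratum, _), (_, n) in cells.items() if stratum == l)
    return Fraction(n_l, n_la)


def pseudo_population(a):
    """Weighted deaths and weighted size among subjects with A = a."""
    deaths = sum(cells[(l, a)][0] * weight(a, l) for l in (0, 1))
    size = sum(cells[(l, a)][1] * weight(a, l) for l in (0, 1))
    return deaths, size


def ipw_risk(a):
    deaths, size = pseudo_population(a)
    return deaths / size


print(pseudo_population(1), pseudo_population(0))
print(ipw_risk(1), ipw_risk(0), ipw_risk(1) / ipw_risk(0))  # 1/2 1/2 1
```

Each exposure group is weighted up to 20 pseudo-subjects, the size of the original population, with 10 deaths in each.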

101
Which assumption are you making?
  • Conditional exchangeability in the population:
    Y_a is independent of A given L = l for all a
  • exposure is randomized within levels of L
  • no unmeasured confounding within levels of the
    measured variable L
  • Within levels of L, the risk among the exposed if
    unexposed is the same as the risk among the
    unexposed in the population
  • and vice versa

102
Under conditional exchangeability
  • The observational study in the original
    population is a randomized experiment within
    levels of L
  • The study in the pseudo-population created by IPW
    is a randomized experiment
  • Exposed and unexposed subjects are
    (unconditionally!) exchangeable
  • Because they are the same individuals
  • Exposure is randomized, i.e., equally probable
    across levels of the covariate L
  • There is no confounding
  • In the pseudo-population, causal effects can be
    estimated as in a randomized experiment
  • No need for adjustments of any sort

103
You did it
  • You computed the causal risk ratio using inverse
    probability weighting
  • Right?

105
Inverse probability weights
  • Each individual in the population is weighted to
    create W individuals in the pseudo-population
  • The denominator of your W is (informally) the
    probability of having your observed treatment
    value given your L value
  • Not equal for all individuals with same L value
    because it depends on A value as well

106
Notational clarification
  • f_A(a) or f(a) is the probability density function
    (pdf) of the random variable A evaluated at the
    value a
  • For discrete A, f(a) = Pr[A = a]
  • We need to represent the probability that each
    subject had his/her own exposure level A
  • Pr[A = A] is unclear notation at best
  • f(A) is the pdf evaluated at the random argument
    A (exactly what we mean)

107
IPW as a simulation
  • Weighting is the equivalent of simulating what
    would happen in the study population if everybody
    had received certain exposure level a
  • (Hmm... That sounded vaguely familiar)
  • Individuals in the original population who
    received exposure level a are weighted to
    represent all individuals (regardless of exposure
    level) in the population
  • sample size of pseudo-population is equal to
    number of exposure levels times the size of
    original population

108
Too much of a coincidence?
  • IPW risk in the exposed was equal to the
    standardized risk in the exposed
  • IPW risk in the unexposed was equal to the
    standardized risk in the unexposed
  • Standardized risk ratio = IPW risk ratio
  • Is this true in general?
  • Yes

109
Proof for dichotomous Y and discrete A and L
  • Just algebra
  • By definition (consistency)
  • By assumption
  • Just algebra
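
The equations for this slide were not transcribed. One derivation consistent with the four listed steps, offered here as a reconstruction rather than as the author's slide, starts from the IPW mean and ends at the counterfactual risk. It assumes positivity, so that every division by f(a | l) is defined, and writes I(A = a) for the indicator that a subject received exposure level a.

```latex
\begin{align*}
\mathrm{E}\!\left[\frac{I(A=a)\,Y}{f(A \mid L)}\right]
  &= \sum_{l} \frac{\Pr[Y=1, A=a, L=l]}{f(a \mid l)} \\
  &= \sum_{l} \Pr[Y=1 \mid A=a, L=l]\,\Pr[L=l]
     && \text{(just algebra)} \\
  &= \sum_{l} \Pr[Y_a=1 \mid A=a, L=l]\,\Pr[L=l]
     && \text{(consistency)} \\
  &= \sum_{l} \Pr[Y_a=1 \mid L=l]\,\Pr[L=l]
     && \text{(conditional exchangeability)} \\
  &= \Pr[Y_a=1]
     && \text{(just algebra)}
\end{align*}
```

The second line is the standardized risk, so the chain shows both that the IPW risk equals the standardized risk and that, under the stated assumptions, both equal Pr[Y_a = 1].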

110
Standardization = IPW
  • When study population is used as the standard
  • Each method uses a different component of the
    joint distribution
  • IPW: f(A | L)
  • Standardization: f(L), f(Y | A, L)
  • But the methods are algebraically equivalent
  • The standardized risk ratio is equal to the IPW
    risk ratio
  • Both are equal to the causal risk ratio
  • Pr[Y_{a=1} = 1] / Pr[Y_{a=0} = 1]
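
The algebraic equivalence can also be checked numerically. The sketch below is an illustration, not the tutorial's own code: it compares the two estimators on randomly generated count tables (arbitrary sizes, fixed seed) using exact fractions, and on the counts reconstructed from the figures quoted on the slides. All names are hypothetical.

```python
import random
from fractions import Fraction


def stratum_size(cells, l):
    return sum(n for (stratum, _), (_, n) in cells.items() if stratum == l)


def standardized_risk(cells, a):
    """Uses f(L) and f(Y | A, L): sum over l of Pr[Y=1 | A=a, L=l] Pr[L=l]."""
    n_all = sum(n for (_, n) in cells.values())
    total = Fraction(0)
    for l in sorted({l for (l, _) in cells}):
        deaths, n_la = cells[(l, a)]
        total += Fraction(deaths, n_la) * Fraction(stratum_size(cells, l), n_all)
    return total


def ipw_risk(cells, a):
    """Uses f(A | L): risk in the pseudo-population weighted by 1 / Pr[A=a | L=l]."""
    deaths_w = Fraction(0)
    size_w = Fraction(0)
    for l in sorted({l for (l, _) in cells}):
        deaths, n_la = cells[(l, a)]
        w = Fraction(stratum_size(cells, l), n_la)
        deaths_w += deaths * w
        size_w += n_la * w
    return deaths_w / size_w


# Counts reconstructed from the slides (an assumption): (L, A) -> (deaths, subjects)
slide_cells = {(0, 0): (1, 4), (0, 1): (1, 4), (1, 0): (2, 3), (1, 1): (6, 9)}
print(standardized_risk(slide_cells, 1), ipw_risk(slide_cells, 1))

# Random tables with three levels of L; every cell has at least one subject.
rng = random.Random(0)
for _ in range(1000):
    cells = {}
    for l in range(3):
        for a in (0, 1):
            n = rng.randint(1, 50)
            cells[(l, a)] = (rng.randint(0, n), n)
    for a in (0, 1):
        assert standardized_risk(cells, a) == ipw_risk(cells, a)
print("standardized risk == IPW risk in every random table")
```

Exact equality holds only in this nonparametric setting, where each component is computed directly from the cell counts.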

111
Standardization = IPW only in nonparametric
settings
  • In real applications, sparse data don't allow us
    to compute the components of the joint
    distribution
  • if L includes 20 variables or a continuous
    variable
  • in the presence of time-varying exposures
  • We would need to estimate these components using
    (semi)parametric models
  • IPW: model to estimate f(A | L)
  • Standardization: models to estimate f(L),
    f(Y | A, L)

112
Something else: the positivity condition
  • In each level of L in the population, there must
    be exposed and unexposed individuals
  • If f(l) > 0 then f(a | l) > 0 for all a
  • conditional probabilities must be positive
  • IPW/Standardization cannot be used when the
    positivity condition is not met
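
A positivity check on tabulated data might look like the following sketch. The function name `positivity_violations` is hypothetical, the first table uses the subject counts reconstructed from the figures quoted on the slides, and the second table is an invented example in which nobody in critical condition is left unexposed.

```python
def positivity_violations(counts, exposure_levels=(0, 1)):
    """Return the (l, a) pairs with f(l) > 0 but f(a | l) = 0.

    counts maps (l, a) to the number of subjects in that cell.
    """
    violations = []
    for l in sorted({l for (l, _) in counts}):
        n_l = sum(counts.get((l, a), 0) for a in exposure_levels)
        if n_l == 0:
            continue  # f(l) = 0: the condition says nothing about this level
        for a in exposure_levels:
            if counts.get((l, a), 0) == 0:
                violations.append((l, a))
    return violations


# Subject counts reconstructed from the slides (an assumption).
slide_counts = {(0, 0): 4, (0, 1): 4, (1, 0): 3, (1, 1): 9}
# Invented example: every subject with L = 1 is exposed.
no_unexposed_critical = {(0, 0): 4, (0, 1): 4, (1, 0): 0, (1, 1): 12}

print(positivity_violations(slide_counts))            # []
print(positivity_violations(no_unexposed_critical))   # [(1, 0)]
```

In the second table the weight 1 / f(a | l) for unexposed critical patients is undefined, and the stratum-specific risk needed for standardization cannot be computed either.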

115
Generalization of standardization to time-varying
exposures?
  • G-formula (Robins 1986)
  • THE method to estimate causal effects in non
    parametric settings
  • Independently discovered by computer
    scientists/artificial intelligence researchers
  • Problems
  • For complex longitudinal data and/or continuous
    covariates it requires huge amounts of data
  • Computationally intensive
  • No parameter for null hypothesis

116
Generalization of IPW to time-varying exposures?
  • IPW has a direct generalization
  • Time-varying weights W(t)
  • Informally, the inverse of the probability of
    having your observed treatment history through t
    given your L history through t
  • Weights can be estimated using models and then
    standard models can be used in the
    pseudo-population
  • Marginal structural models (Robins 1998)
  • More by Butch Tsiatis

117
Conclusions (I): causal inference from
observational data is possible
  • under the assumptions of
  • Consistency
  • Conditional exchangeability
  • assumption of conditional randomization or of no
    unmeasured confounders
  • Positivity
  • using IPW/Standardization
  • Other assumptions are required for both
    observational and randomized data

118
Conclusions (II): causal inference from
observational data is risky
  • Because conditional exchangeability cannot be
    guaranteed or even tested
  • Expert knowledge can be used to enhance the
    plausibility of the assumption
  • measure as many relevant pre-exposure covariates
    as possible
  • Epidemiologists must be subject-matter experts or
    work with them
  • But one can only hope the assumption is
    approximately true