Estimating Causal Effects with Experimental Data - PowerPoint PPT Presentation

Author: Alan Manning. Last modified by: stepan. Created: 12/24/2005 2:23:05 PM.

1
Estimating Causal Effects with Experimental Data
2
Some Basic Terminology
  • Start with example where X is binary (though
    simple to generalize)
  • $X=0$ is control group
  • $X=1$ is treatment group
  • Causal effect sometimes called treatment effect
  • Randomization implies everyone has same
    probability of treatment

3
Why is Randomization Good?
  • If X allocated at random then know that X is
    independent of all pre-treatment variables in
    the whole wide world: an amazing claim, but true
  • Implies there cannot be a problem of omitted
    variables, reverse causality, etc.
  • On average, only reason for difference between
    treatment and control group is different receipt
    of treatment

4
Why is this useful? An Example: Racial Discrimination
  • Black men earn less than white men in US
      LOGWAGE          Coef.   Std. Err.        t
      -------------------------------------------
      BLACK        -.1673813    .0066708   -25.09
      NO_HS        -.2138331    .0077192   -27.70
      SOMECOLL      .1104148    .0049139    22.47
      COLLEGE       .4660205    .0048839    95.42
      AGE           .0704488    .0008552    82.38
      AGESQUARED   -.0007227    .0000101   -71.41
      _cons         1.088116    .0172715    63.00
  • Could be discrimination, or could be other factors
    unobserved by the researcher but observed by the
    employer
  • Hard to fully resolve with non-experimental data

5
An Experimental Design
  • Bertrand/Mullainathan, "Are Emily and Greg More
    Employable Than Lakisha and Jamal?", American
    Economic Review, 2004
  • Create fake CVs and send them in reply to job adverts
  • Allocate names at random to CVs: some given
    black-sounding names, others white-sounding

6
  • Outcome variable is call-back rates
  • Interpretation: not direct measure of racial
    discrimination, just effect of having a
    black-sounding name, which may have other
    connotations
  • But name uncorrelated by construction with other
    material on CV

7
The Treatment Effect
  • Want estimate of $E(y \mid X=1) - E(y \mid X=0)$

8
Estimating Treatment Effects: the Statistics
Course Approach
  • Take mean of outcome variable in treatment group
  • Take mean of outcome variable in control group
  • Take difference between the two
  • No problems, but
      • Does not generalize to where X is not binary
      • Does not directly compute standard errors

9
Estimating Treatment Effects: A Regression
Approach
  • Run regression
  • $y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
  • Proposition 2.2: The OLS estimator of $\beta_1$ is an
    unbiased estimator of the causal effect of X on
    y
  • Proof: Many ways to prove this but simplest way
    is perhaps
  • Proposition 1.1 says OLS estimates $E(y \mid X)$
  • $E(y \mid X=0) = \beta_0$ so OLS estimate of intercept is
    consistent estimate of $E(y \mid X=0)$
  • $E(y \mid X=1) = \beta_0 + \beta_1$ so OLS estimate of $\beta_1$ is
    consistent estimate of $E(y \mid X=1) - E(y \mid X=0)$
  • Hence can read off estimate of treatment effect
    from coefficient on X
  • Approach easily generalizes to where X is not
    binary
  • Also gives estimate of standard error
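
The claim that the regression coefficient on a binary X reproduces the difference in group means can be checked numerically. The following is a minimal sketch on simulated data; the sample size, the true effect of 2.0, and the helper names `ols_simple` and `difference_in_means` are assumptions made for illustration and do not come from the slides.

```python
import random
from statistics import mean

def ols_simple(x, y):
    """Return (intercept, slope) from an OLS regression of y on a constant and x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def difference_in_means(x, y):
    """Mean outcome in the treatment group minus mean outcome in the control group."""
    treated = [yi for xi, yi in zip(x, y) if xi == 1]
    control = [yi for xi, yi in zip(x, y) if xi == 0]
    return mean(treated) - mean(control)

# Simulated experiment (hypothetical numbers): true treatment effect is 2.0
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(1000)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 1) for xi in x]

b0, b1 = ols_simple(x, y)
diff = difference_in_means(x, y)
print("OLS slope:", b1)
print("Difference in means:", diff)
```

The two printed numbers agree up to floating-point error, and the intercept equals the control-group mean.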

10
Computing Standard Errors
  • Unless told otherwise regression package will
    compute standard errors assuming errors are
    homoskedastic, i.e. $\mathrm{Var}(\varepsilon_i \mid X_i) = \sigma^2$
  • Even if only interested in effect of treatment on
    mean, X may affect other aspects of distribution,
    e.g. variance
  • This will cause heteroskedasticity
  • Heteroskedasticity does not make OLS regression
    coefficients inconsistent but does make OLS
    standard errors inconsistent

11
Robust Standard Errors
  • Also called
      • Huber standard errors
      • White standard errors
      • Heteroskedasticity-consistent standard errors
  • Simple to use in practice e.g. in STATA
      . reg y x, robust
  • Statistics course approach
      • Get variance of estimate of mean of treatment and
        control group
      • Sum to give estimate of variance of difference in
        means
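
With a binary X the two routes coincide: the basic White (HC0) variance of the OLS slope equals the sum of the variances of the two group means. The sketch below checks this on simulated heteroskedastic data. The function names and simulated numbers are illustrative assumptions, and regression packages typically apply an additional finite-sample scaling, so their reported robust standard errors can differ slightly from this unscaled version.

```python
import math
import random
from statistics import mean, pvariance

def hc0_slope_variance(x, y):
    """White (HC0) variance estimate for the slope in an OLS regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # "Meat" of the sandwich: squared residuals weighted by squared deviations of x
    meat = sum((xi - x_bar) ** 2 * e ** 2 for xi, e in zip(x, resid))
    return meat / sxx ** 2

def variance_of_difference_in_means(x, y):
    """Variance of the treated mean plus variance of the control mean."""
    treated = [yi for xi, yi in zip(x, y) if xi == 1]
    control = [yi for xi, yi in zip(x, y) if xi == 0]
    # pvariance divides by the group size, which is what matches HC0 exactly
    return pvariance(treated) / len(treated) + pvariance(control) / len(control)

# Simulated data (hypothetical): treatment raises both the mean and the spread of y
rng = random.Random(1)
x = [1 if rng.random() < 0.3 else 0 for _ in range(500)]
y = [1.0 + 2.0 * xi + rng.gauss(0, 1.0 + 2.0 * xi) for xi in x]

robust_var = hc0_slope_variance(x, y)
stats_var = variance_of_difference_in_means(x, y)
print("Robust SE:", math.sqrt(robust_var))
print("Statistics course SE:", math.sqrt(stats_var))
```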

12
Bertrand/Mullainathan: Basic Results
13
Summary So Far
  • Econometrics very easy if all data comes from
    randomized controlled experiment
  • Just need to collect data on treatment/control
    and outcome variables
  • Just need to compare means of outcomes of
    treatment and control groups
  • Is data on other variables of any use at all?
  • Not necessary but useful

14
Including Other Regressors
  • Can get consistent estimate of treatment effect
    without worrying about other variables
  • Reason is that randomization ensures no problem
    of omitted variables bias
  • But there are reasons to include other
    regressors
      • Improved efficiency
      • Check for randomization
      • Improve randomization
      • Control for conditional randomization
      • Heterogeneity in treatment effects

15
The Uses of Other Regressors I: Improved
Efficiency
  • Don't just want consistent estimate of causal
    effect; also want low standard error (or high
    precision or efficiency)
  • Standard formula for variance of OLS
    estimate of $\beta$ is proportional to $\sigma^2/\mathrm{Var}(X)$
  • $\sigma^2$ comes from variance of residual in regression:
    $(1-R^2)\,\mathrm{Var}(y)$
  • Include more variables and $R^2$ rises; formal
    proof (Proposition 2.4) a bit more complicated
    but this is basic idea
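
The efficiency gain can be illustrated with a small simulation. The sketch below compares the conventional standard error on X with and without a pre-treatment covariate W; the data-generating numbers and the helper names `fit` and `slope_se` are assumptions for illustration. The regression including W is computed by partialling W out of both y and X (the Frisch-Waugh result), which gives the same coefficient on X as the full regression.

```python
import math
import random
from statistics import mean

def fit(x, y):
    """OLS of y on a constant and x; returns (slope, residuals)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    resid = [(yi - y_bar) - slope * (xi - x_bar) for xi, yi in zip(x, y)]
    return slope, resid

def slope_se(x, resid, n_params):
    """Conventional (homoskedastic) standard error of the slope."""
    x_bar = mean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sigma2 = sum(e ** 2 for e in resid) / (len(x) - n_params)
    return math.sqrt(sigma2 / sxx)

rng = random.Random(2)
n = 2000
x = [rng.randint(0, 1) for _ in range(n)]   # randomized treatment
w = [rng.gauss(0, 1) for _ in range(n)]     # pre-treatment covariate
y = [1.0 + 2.0 * xi + 3.0 * wi + rng.gauss(0, 1) for xi, wi in zip(x, w)]

# Short regression: y on X only
b_short, e_short = fit(x, y)
se_short = slope_se(x, e_short, n_params=2)

# Long regression: y on X and W, via partialling W out of both X and y
_, x_res = fit(w, x)
_, y_res = fit(w, y)
b_long, e_long = fit(x_res, y_res)
se_long = slope_se(x_res, e_long, n_params=3)

print("Without W:", b_short, se_short)
print("With W:   ", b_long, se_long)
```

Both regressions are centred on the same treatment effect, but the residual variance, and hence the standard error, is much smaller once W is included.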

16
The Uses of Other Regressors II: Check for
Randomization
  • Randomization can go wrong
      • Poor implementation of research design
      • Bad luck
  • If randomization done well then W should be
    independent of X; this is testable
      • Test for differences in W in treatment/control
        groups
      • Probit model for X on W
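
A minimal sketch of the first kind of check, a t-statistic for the difference in mean W between the two groups. The function name `balance_t_stat` and the toy data are hypothetical; the sample is deliberately constructed so that treated units have higher W, which is the pattern a failed randomization would produce.

```python
import math
from statistics import mean, variance

def balance_t_stat(x, w):
    """t-statistic for the difference in mean W between treatment and control."""
    w_treat = [wi for xi, wi in zip(x, w) if xi == 1]
    w_ctrl = [wi for xi, wi in zip(x, w) if xi == 0]
    diff = mean(w_treat) - mean(w_ctrl)
    # Unequal-variance standard error of the difference in means
    se = math.sqrt(variance(w_treat) / len(w_treat) + variance(w_ctrl) / len(w_ctrl))
    return diff / se

# Hypothetical sample in which treated units have visibly higher W
x = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
w = [5.0, 6.0, 7.0, 8.0, 9.0, 1.0, 2.0, 3.0, 4.0, 5.0]
t = balance_t_stat(x, w)
print("Balance t-statistic:", t)
```

A t-statistic far from zero suggests that W differs systematically between the groups, so the randomization deserves a closer look.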

17
The Uses of Other Regressors III: Improve
Randomization
  • Can also use W at stage of assigning treatment
  • Can guarantee that in your sample X and W are
    independent instead of it being just
    probabilistic
  • This is what Bertrand/Mullainathan do when
    assigning names to CVs

18
The Uses of Other Regressors IV: Adjust for
Conditional Randomization
  • This is case where must include W to get
    consistent estimates of treatment effects
  • Conditional randomization is where probability of
    treatment is different for people with different
    values of W, but random conditional on W
  • Why have conditional randomization?
      • May have no choice
      • May want to do it (cf. stratification)

19
An Example: Project STAR
  • Allocation of students to classes is random
    within schools
  • But small number of classes per school
  • This leads to following relationship between
    probability of treatment and number of kids in
    school

20
Controlling for Conditional Randomization
  • X can now be correlated with W
  • But, conditional on W, X independent of other
    factors
  • But must get functional form of relationship
    between y and W correct: matching procedures
  • This is not the case with (unconditional)
    randomization; see class exercise
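
The need to condition on W can be seen in a small simulation. The sketch below assumes a binary W, treatment probabilities of 0.8 and 0.2 depending on W, and a true effect of 2.0; all of these numbers and the helper `diff_in_means` are illustrative assumptions. Because W is binary here, comparing treated and control units within each value of W sidesteps the functional-form issue.

```python
import random
from statistics import mean

def diff_in_means(x, y):
    """Mean outcome of treated units minus mean outcome of control units."""
    treated = [yi for xi, yi in zip(x, y) if xi == 1]
    control = [yi for xi, yi in zip(x, y) if xi == 0]
    return mean(treated) - mean(control)

rng = random.Random(3)
n = 4000
w = [rng.randint(0, 1) for _ in range(n)]
# Probability of treatment depends on W (0.8 if W=1, 0.2 if W=0), random given W
x = [1 if rng.random() < (0.8 if wi == 1 else 0.2) else 0 for wi in w]
y = [1.0 + 2.0 * xi + 5.0 * wi + rng.gauss(0, 1) for xi, wi in zip(x, w)]

# Ignoring W mixes the treatment effect with the effect of W
naive = diff_in_means(x, y)

# Compare treated and control within each value of W, then average over W
adjusted = 0.0
for value in (0, 1):
    idx = [i for i in range(n) if w[i] == value]
    within = diff_in_means([x[i] for i in idx], [y[i] for i in idx])
    adjusted += within * len(idx) / n

print("Ignoring W:", naive)
print("Conditioning on W:", adjusted)
```

The unadjusted comparison is far from the true effect of 2.0, while the within-W comparison recovers it.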

21
Heterogeneity in Treatment Effects
  • So far have assumed causal (treatment) effect the
    same for everyone
  • No good reason to believe this
  • Start with case of no other regressors
  • $y_i = \beta_0 + \beta_{1i} X_i + \varepsilon_i$
  • Random assignment implies X independent of $\beta_{1i}$
  • Sometimes called random coefficients model

22
What treatment effect to estimate?
  • Would like to estimate causal effect for everyone;
    this is not possible: Holland's fundamental
    problem of causal inference
  • Can only hope to estimate some average
  • Average treatment effect (ATE): $E(\beta_{1i})$
  • Proposition 2.5: OLS estimates ATE
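
A sketch of why Proposition 2.5 holds, written in the notation of the random coefficients model $y_i = \beta_0 + \beta_{1i} X_i + \varepsilon_i$. It assumes, as randomization is meant to guarantee, that $X_i$ is independent of both $\beta_{1i}$ and $\varepsilon_i$; the formal proof in the course notes may proceed differently.

```latex
\begin{align*}
E(y_i \mid X_i = 1) &= \beta_0 + E(\beta_{1i} \mid X_i = 1) + E(\varepsilon_i \mid X_i = 1) \\
E(y_i \mid X_i = 0) &= \beta_0 + E(\varepsilon_i \mid X_i = 0) \\
E(y_i \mid X_i = 1) - E(y_i \mid X_i = 0)
  &= E(\beta_{1i} \mid X_i = 1)
   + \bigl[ E(\varepsilon_i \mid X_i = 1) - E(\varepsilon_i \mid X_i = 0) \bigr] \\
  &= E(\beta_{1i}) = \text{ATE}
\end{align*}
```

The last step uses independence twice: the bracketed term is zero, and conditioning on $X_i = 1$ does not change the mean of $\beta_{1i}$.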

23
Observable Heterogeneity
  • Full outcomes notation
  • Outcome if in control group
  • $y_{0i} = \gamma_0 W_i + u_{0i}$
  • Outcome if in treatment group
  • $y_{1i} = \gamma_1 W_i + u_{1i}$
  • Treatment effect is $(y_{1i} - y_{0i})$ and can be written
    as
  • $(y_{1i} - y_{0i}) = (\gamma_1 - \gamma_0) W_i + u_{1i} - u_{0i}$
  • Note treatment effect has observable and
    unobservable component
  • Can estimate as
      • Two separate equations
      • One single equation

24
Combining treatment and control groups into
single regression
  • We can write
  • $y_i = X_i y_{1i} + (1 - X_i) y_{0i}$
  • Combining outcomes equations leads to
  • $y_i = \gamma_0 W_i + (\gamma_1 - \gamma_0) W_i X_i + u_{0i} + (u_{1i} - u_{0i}) X_i$
  • Regression includes W and interactions of W with
    X; these are observable part of treatment effect
  • Note error likely to be heteroskedastic

25
Bertrand/Mullainathan
  • Different treatment effect for high and low
    quality CVs

26
Units of Measurement
  • Causal effect measured in units of experiment:
    not very helpful
  • Often want to convert causal effects to more
    meaningful units, e.g. in Project STAR what is
    effect of reducing class size by one child?

27
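Simple estimator of this would be
  • $\dfrac{\bar{y}_1 - \bar{y}_0}{\bar{S}_1 - \bar{S}_0}$
  • where S is class size and bars denote sample means
    in the treatment ($X=1$) and control ($X=0$) groups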
  • Takes the treatment effect on outcome variable
    and divides by treatment effect on class size
  • Not hard to compute but how to get standard
    error?

28
IV Can Do the Job
  • Can't run regression of y on S: S influenced by
    factors other than treatment status
  • But X is
      • Correlated with S
      • Uncorrelated with unobserved stuff (because of
        randomization)
  • Hence X can be used as an instrument for S
  • IV estimator has form (just-identified case)

29
The Wald Estimator
  • This will give estimate of standard error of
    treatment effect
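  • Where instrument is binary and no other
    regressors included, the IV estimate of slope
    coefficient can be shown to be
    $\hat{\beta}_{IV} = \dfrac{\bar{y}_1 - \bar{y}_0}{\bar{S}_1 - \bar{S}_0}$
    (bars denote sample means in the $X=1$ and $X=0$ groups)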
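
One way to see this result, sketched under the assumption that the just-identified IV slope is the ratio of sample covariances of the instrument X with y and with S. The symbol $p$ (the sample share with $X=1$) and the bar notation for group means are introduced here for the derivation and are not taken from the slides.

```latex
\begin{align*}
\widehat{\mathrm{Cov}}(X, y) &= \overline{Xy} - \bar{X}\,\bar{y}
  = p\,\bar{y}_1 - p\,\bigl[\,p\,\bar{y}_1 + (1-p)\,\bar{y}_0\,\bigr]
  = p(1-p)\,(\bar{y}_1 - \bar{y}_0) \\
\widehat{\mathrm{Cov}}(X, S) &= p(1-p)\,(\bar{S}_1 - \bar{S}_0) \\
\hat{\beta}_{IV} &= \frac{\widehat{\mathrm{Cov}}(X, y)}{\widehat{\mathrm{Cov}}(X, S)}
  = \frac{\bar{y}_1 - \bar{y}_0}{\bar{S}_1 - \bar{S}_0}
\end{align*}
```

The factor $p(1-p)$ cancels, leaving the treatment effect on the outcome divided by the treatment effect on class size.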

30
Partial Compliance
  • So far
      • In control group implies no treatment
      • In treatment group implies get treatment
  • Often things are not as clean as this
      • Treatment is an opportunity
      • Close substitutes available to those in control
        group
      • Implementation not perfect e.g. pushy parents

31
An Example: Moving to Opportunity
  • Designed to investigate the impact of living in
    bad neighbourhoods on outcomes
  • Gave some residents of public housing projects
    chance to move out
  • Two treatments
      • Voucher for private rental housing
      • Voucher for private rental housing restricted for
        use in good neighbourhoods
  • No-one forced to move so imperfect compliance:
    60% and 40% did use it

32
Some Terminology
  • Z denotes whether in control or treatment group:
    intention-to-treat
  • X denotes whether actually get treatment
  • With perfect compliance
      • $\Pr(X=1 \mid Z=1) = 1$
      • $\Pr(X=1 \mid Z=0) = 0$
  • With imperfect compliance
      • $1 > \Pr(X=1 \mid Z=1) > \Pr(X=1 \mid Z=0) > 0$

33
What Do We Want to Estimate?
  • Intention-to-Treat
  • $ITT = E(y \mid Z=1) - E(y \mid Z=0)$
  • This can be estimated in usual way
  • Treatment Effect on Treated
  • $TOT = \dfrac{E(y \mid Z=1) - E(y \mid Z=0)}{\Pr(X=1 \mid Z=1) - \Pr(X=1 \mid Z=0)}$

34
Estimating TOT
  • Can't use simple regression of y on Z
  • But should recognize TOT as Wald estimator
  • Can be estimated by regressing y on X using Z as
    instrument
  • Relationship between TOT and ITT:
    $TOT = ITT / [\Pr(X=1 \mid Z=1) - \Pr(X=1 \mid Z=0)]$
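
The link between ITT, TOT and the IV estimate can be sketched on simulated data with partial compliance. The numbers below (half of those offered the treatment take it up, nobody in the control group is treated, effect of 4.0 on those treated) and the helper names `group_mean` and `cov` are assumptions made for illustration.

```python
import random
from statistics import mean

def group_mean(values, group, g):
    """Mean of values over observations whose group indicator equals g."""
    return mean([v for v, gi in zip(values, group) if gi == g])

def cov(a, b):
    """Sample covariance (dividing by n)."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)

rng = random.Random(4)
n = 4000
z = [rng.randint(0, 1) for _ in range(n)]            # offered the treatment
x = [zi if rng.random() < 0.5 else 0 for zi in z]    # about half of those offered take it up
y = [1.0 + 4.0 * xi + rng.gauss(0, 1) for xi in x]   # effect of 4.0 on those treated

itt = group_mean(y, z, 1) - group_mean(y, z, 0)
take_up = group_mean(x, z, 1) - group_mean(x, z, 0)
tot_wald = itt / take_up
tot_iv = cov(z, y) / cov(z, x)   # IV slope of y on X with Z as instrument

print("ITT:", itt)
print("Take-up difference:", take_up)
print("TOT (Wald):", tot_wald)
print("TOT (IV):", tot_iv)
```

With take-up of about one half, the TOT is about twice the ITT, and the Wald ratio coincides with the IV slope.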

35
Most Important Results from MTO
  • No effects on adult economic outcomes
  • Improvements in adult mental health
  • Beneficial outcomes for teenage girls
  • Adverse outcomes for teenage boys

36
Sample results from MTO
  • TOT approximately twice the size of ITT
  • Consistent with 50% use of vouchers

37
IV with Heterogeneous Treatment Effects (More
Difficult)
  • If treatment effect same for everyone then TOT
    recovers this (obvious)
  • But what if treatment effect heterogeneous?
  • No simple answer to this question
  • Suppose model for treatment effect is

38
Proposition 2.6: The IV estimate for the
heterogeneous treatment case is a consistent
estimate of $E(\beta_{1i} \pi_i) / E(\pi_i)$, where $\pi_i$ is the difference in the
probability of treatment for individual i when in
treatment and control group
39
Interpretation
  • This is weighted average of treatment effects
  • Weights will vary with instrument; contrast
    with heterogeneous treatment case
  • Some cases in which can interpret IV estimate as
    ATE

40
How will IV estimate differ from ATE?
  • IV is ATE if no correlation between $\beta_{1i}$ and $\pi_i$
  • Previous formula says depends on covariance of
    $\beta_{1i}$ and $\pi_i$
  • In some situations can sign but not always
  • Example 1: no-one gets treatment in the absence
    of the programme so $\Pr(X_i=1 \mid Z_i=0) = 0$ and
    $\pi_i = \Pr(X_i=1 \mid Z_i=1)$
  • If those who get treatment when in the treatment
    group are those with the highest returns then
  • $IV > ATE$
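
Example 1 can be illustrated with a simulation. The sketch below assumes individual effects drawn from a normal distribution with mean 2.0, and an extreme form of selection in which only those with above-average returns take up the treatment when offered it; these choices and the helper `group_mean` are assumptions for illustration only.

```python
import random
from statistics import mean

def group_mean(values, group, g):
    """Mean of values over observations whose group indicator equals g."""
    return mean([v for v, gi in zip(values, group) if gi == g])

rng = random.Random(5)
n = 20000
beta = [rng.gauss(2.0, 1.0) for _ in range(n)]   # individual treatment effects
z = [rng.randint(0, 1) for _ in range(n)]        # random assignment
# Nobody is treated in the control group; in the treatment group only
# those with above-average returns take the treatment up
x = [1 if (zi == 1 and bi > 2.0) else 0 for zi, bi in zip(z, beta)]
y = [1.0 + bi * xi + rng.gauss(0, 1) for bi, xi in zip(beta, x)]

# Wald / IV estimate using assignment Z as instrument for treatment X
iv = (group_mean(y, z, 1) - group_mean(y, z, 0)) / (
    group_mean(x, z, 1) - group_mean(x, z, 0))
ate = mean(beta)
takers_mean = mean([bi for bi in beta if bi > 2.0])

print("ATE:", ate)
print("IV estimate:", iv)
print("Average effect among those who take up treatment:", takers_mean)
```

The IV estimate sits close to the average effect among those who take up the treatment, which here is above the ATE.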

41
  • Example 2: treatment is voluntary for those in
    the control group but compulsory for those in the
    treatment group
  • This implies $\Pr(X_i=1 \mid Z_i=1) = 1$ and
    $\pi_i = 1 - \Pr(X_i=1 \mid Z_i=0)$
  • If those who get treatment in control are those
    with highest returns then
  • $IV < ATE$

42
Angrist/Imbens Monotonicity Assumption
  • Case where IV estimate is not ATE
  • Assume that everyone moved in same direction by
    treatment: the monotonicity assumption
  • Then can show that IV is average of treatment
    effect for those whose behaviour changed by being
    in treatment group
  • They call this the Local Average Treatment Effect
    (LATE)

43
Problems with Experiments
  • Expense
  • Ethical Issues
  • Threats to Internal Validity
      • Failure to follow experiment
      • Experimental effects (Hawthorne effects)
  • Threats to External Validity
      • Non-representative programme
      • Non-representative sample
      • Scale effects

44
Conclusions on Experiments
  • Are gold standard of empirical research
  • Are becoming more common
  • Not enough of them to keep us busy
  • Study of non-experimental data can deliver useful
    knowledge
  • Some issues similar, others different