Estimating Causal Effects with Experimental Data - PowerPoint PPT Presentation

Author: Alan Manning. Last modified by: stepan. Created: 12/24/2005 2:23:05 PM.

1
Estimating Causal Effects with Experimental Data
2
Some Basic Terminology

Start with example where X is binary (though
simple to generalize)
X=0 is control group
X=1 is treatment group
Causal effect sometimes called treatment effect
Randomization implies everyone has same
probability of treatment

3
Why is Randomization Good?

If X allocated at random then know that X is
independent of all pre-treatment variables in
whole wide world
an amazing claim but true.
Implies there cannot be a problem of omitted
variables, reverse causality etc
On average, only reason for difference between
treatment and control group is different receipt
of treatment

4
Why is this useful? An Example: Racial
Discrimination

Black men earn less than white men in US
LOGWAGE        Coef.       Std. Err.      t
---------------------------------------------
BLACK        -.1673813    .0066708    -25.09
NO_HS        -.2138331    .0077192    -27.70
SOMECOLL      .1104148    .0049139     22.47
COLLEGE       .4660205    .0048839     95.42
AGE           .0704488    .0008552     82.38
AGESQUARED   -.0007227    .0000101    -71.41
_cons         1.088116    .0172715     63.00
Could be discrimination or other factors
unobserved by the researcher but observed by the
employer?
hard to fully resolve with non-experimental data

5
An Experimental Design

Bertrand/Mullainathan, "Are Emily and Greg More
Employable Than Lakisha and Jamal?", American
Economic Review, 2004
Create fake CVs and send them in reply to job adverts
Allocate names at random to CVs: some given
black-sounding names, others white-sounding

Outcome variable is call-back rate
Interpretation: not direct measure of racial
discrimination, just effect of having a
black-sounding name, which may have other
connotations
But name uncorrelated by construction with other
material on CV

7
The Treatment Effect

Want estimate of $E(y \mid X=1) - E(y \mid X=0)$

8
Estimating Treatment Effects: the Statistics
Course Approach

Take mean of outcome variable in treatment group
Take mean of outcome variable in control group
Take difference between the two
No problems but
Does not generalize to where X is not binary
Does not directly compute standard errors

9
Estimating Treatment Effects: A Regression
Approach

Run regression
$y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
Proposition 2.2: The OLS estimator of $\beta_1$ is an
unbiased estimator of the causal effect of X on
y
Proof: Many ways to prove this but simplest way
is perhaps
Proposition 1.1 says OLS estimates $E(y \mid X)$
$E(y \mid X=0) = \beta_0$, so OLS estimate of intercept is
consistent estimate of $E(y \mid X=0)$
$E(y \mid X=1) = \beta_0 + \beta_1$, so $\beta_1$ is consistent estimate of
$E(y \mid X=1) - E(y \mid X=0)$
Hence can read off estimate of treatment effect
from coefficient on X
Approach easily generalizes to where X is not
binary
Also gives estimate of standard error
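
A minimal sketch of the claim that, with a binary X, the regression coefficient reproduces the difference in means. The data are simulated under assumed values (true effect 0.5, seed 0); `ols_simple` is an illustrative helper, not part of the original slides.

```python
import random
from statistics import mean

def ols_simple(x, y):
    """Return (intercept, slope) from OLS of y on a constant and x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

rng = random.Random(0)  # fixed seed, illustrative only
x = [rng.randint(0, 1) for _ in range(1000)]         # randomized binary treatment
y = [1.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]   # assumed true effect of 0.5

b0, b1 = ols_simple(x, y)
mean_control = mean(yi for xi, yi in zip(x, y) if xi == 0)
mean_treated = mean(yi for xi, yi in zip(x, y) if xi == 1)
diff_in_means = mean_treated - mean_control
```

Here `b0` matches the control-group mean and `b1` matches `diff_in_means` up to floating-point error, which is the content of the proof sketch.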

10
Computing Standard Errors

Unless told otherwise regression package will
compute standard errors assuming errors are
homoskedastic, i.e. $Var(\varepsilon_i \mid X_i) = \sigma^2$
Even if only interested in effect of treatment on
mean, X may affect other aspects of distribution,
e.g. variance
This will cause heteroskedasticity
Heteroskedasticity does not make OLS regression
coefficients inconsistent but does make OLS
standard errors inconsistent

11
Robust Standard Errors

Also called
Huber standard errors
White standard errors
Heteroskedasticity-consistent standard errors
Simple to use in practice, e.g. in Stata:
. reg y x, robust
Statistics course approach
Get variance of estimate of mean of treatment and
control group
Sum to give estimate of variance of difference in
means
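
A sketch of how the two approaches line up when X is binary, under assumed simulated data in which treatment also raises the variance. The robust variance computed here is the basic White (HC0) form; regression packages typically apply a small-sample correction, so their reported numbers can differ slightly.

```python
import random
from math import sqrt
from statistics import mean, pvariance

rng = random.Random(1)
x = [rng.randint(0, 1) for _ in range(500)]
# Treatment shifts the mean and also raises the variance (heteroskedasticity).
y = [1.0 + 0.5 * xi + rng.gauss(0, 1.0 + 2.0 * xi) for xi in x]

treated = [yi for xi, yi in zip(x, y) if xi == 1]
control = [yi for xi, yi in zip(x, y) if xi == 0]

# Statistics-course approach: variance of each group mean, then sum.
se_diff_means = sqrt(pvariance(treated) / len(treated)
                     + pvariance(control) / len(control))

# Regression approach: OLS slope, residuals, and White (HC0) standard error.
x_bar, y_bar = mean(x), mean(y)
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
intercept = y_bar - slope * x_bar
resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
se_robust = sqrt(sum((xi - x_bar) ** 2 * e ** 2 for xi, e in zip(x, resid))) / sxx

# Conventional (homoskedastic) standard error, for comparison.
sigma2 = sum(e ** 2 for e in resid) / (len(x) - 2)
se_conventional = sqrt(sigma2 / sxx)
```

With a binary regressor, `se_robust` coincides with `se_diff_means`, while `se_conventional` generally does not when the group variances differ.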

12
Bertrand/Mullainathan: Basic Results
13
Summary So Far

Econometrics very easy if all data comes from
randomized controlled experiment
Just need to collect data on treatment/control
and outcome variables
Just need to compare means of outcomes of
treatment and control groups
Is data on other variables of any use at all?
Not necessary but useful

14
Including Other Regressors

Can get consistent estimate of treatment effect
without worrying about other variables
Reason is that randomization ensures no problem
of omitted variables bias
But there are reasons to include other
regressors
Improved efficiency
Check for randomization
Improve randomization
Control for conditional randomization
Heterogeneity in treatment effects

15
The Uses of Other Regressors I: Improved
Efficiency

Don't just want consistent estimate of causal
effect; also want low standard error (or high
precision or efficiency).
Standard formula for variance of OLS
estimate of $\beta$ is proportional to $\sigma^2 / Var(X)$
$\sigma^2$ comes from variance of residual in regression:
$(1-R^2) Var(y)$
Include more variables and $R^2$ rises; formal
proof (Proposition 2.4) a bit more complicated
but this is basic idea.
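
A rough illustration of the efficiency argument, using simulated data under assumed values (effect of X is 0.5, effect of a pre-treatment covariate W is 2). The two-regressor fit is obtained by partialling W out of both y and X (the Frisch-Waugh result), which keeps the sketch to standard-library Python; the helper names are illustrative.

```python
import random
from math import sqrt
from statistics import mean

def residualize(v, w=None):
    """Residuals from OLS of v on a constant and, if given, on w."""
    v_bar = mean(v)
    if w is None:
        return [vi - v_bar for vi in v]
    w_bar = mean(w)
    slope = (sum((wi - w_bar) * (vi - v_bar) for wi, vi in zip(w, v))
             / sum((wi - w_bar) ** 2 for wi in w))
    return [vi - v_bar - slope * (wi - w_bar) for vi, wi in zip(v, w)]

def slope_and_se(y_res, x_res, n_params):
    """Slope of y_res on x_res and its conventional standard error."""
    sxx = sum(xi ** 2 for xi in x_res)
    slope = sum(xi * yi for xi, yi in zip(x_res, y_res)) / sxx
    ssr = sum((yi - slope * xi) ** 2 for xi, yi in zip(x_res, y_res))
    return slope, sqrt(ssr / (len(y_res) - n_params) / sxx)

rng = random.Random(2)
n = 2000
x = [float(rng.randint(0, 1)) for _ in range(n)]   # randomized treatment
w = [rng.gauss(0, 1) for _ in range(n)]            # pre-treatment covariate
y = [1.0 + 0.5 * xi + 2.0 * wi + rng.gauss(0, 1) for xi, wi in zip(x, w)]

b_without_w, se_without_w = slope_and_se(residualize(y), residualize(x), 2)
b_with_w, se_with_w = slope_and_se(residualize(y, w), residualize(x, w), 3)
```

Both regressions target the same treatment effect, but `se_with_w` is smaller because W absorbs much of the residual variance.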

16
The Uses of Other Regressors II: Check for
Randomization

Randomization can go wrong
Poor implementation of research design
Bad luck
If randomization done well then W (the other
regressors) should be independent of X; this is testable
Test for differences in W in treatment/control
groups
Probit model for X on W
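
One possible form of the first check (differences in W across groups), sketched on simulated data. The "broken" assignment rule is a hypothetical failure invented for illustration; the probit alternative is not shown.

```python
import random
from math import sqrt
from statistics import mean, variance

def balance_t_stat(w, x):
    """t-statistic for equal means of w in treatment (x=1) and control (x=0)."""
    w1 = [wi for wi, xi in zip(w, x) if xi == 1]
    w0 = [wi for wi, xi in zip(w, x) if xi == 0]
    se = sqrt(variance(w1) / len(w1) + variance(w0) / len(w0))
    return (mean(w1) - mean(w0)) / se

rng = random.Random(3)
w = [rng.gauss(0, 1) for _ in range(1000)]
x_random = [rng.randint(0, 1) for _ in range(1000)]   # proper randomization
x_broken = [1 if wi > 0.3 else 0 for wi in w]         # hypothetical: assignment depends on W

t_random = balance_t_stat(w, x_random)
t_broken = balance_t_stat(w, x_broken)
```

A large absolute t-statistic, as for `t_broken`, signals that W differs systematically between groups, so either implementation failed or the draw was unlucky.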

17
The Uses of Other Regressors III: Improve
Randomization

Can also use W at stage of assigning treatment
Can guarantee that in your sample X and W are
independent instead of it being just
probabilistic
This is what Bertrand/Mullainathan do when
assigning names to CVs
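
A minimal sketch of using W at the assignment stage: randomize within each value of W so that the treated share is identical across strata. The strata labels and sizes are hypothetical, and this is not claimed to be the exact procedure used by Bertrand/Mullainathan.

```python
import random
from collections import defaultdict

def blocked_assignment(w, rng):
    """Assign treatment so that, within each value of w, half the units are treated."""
    by_stratum = defaultdict(list)
    for i, wi in enumerate(w):
        by_stratum[wi].append(i)
    x = [0] * len(w)
    for indices in by_stratum.values():
        rng.shuffle(indices)                       # random order within the stratum
        for i in indices[: len(indices) // 2]:     # first half gets treatment
            x[i] = 1
    return x

rng = random.Random(4)
w = ["high"] * 40 + ["low"] * 60    # hypothetical CV-quality strata
x = blocked_assignment(w, rng)
share_treated = {
    s: sum(xi for xi, wi in zip(x, w) if wi == s) / w.count(s) for s in set(w)
}
```

Because the treated share is 0.5 in every stratum, X and W are exactly unrelated in the sample rather than only in expectation.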

18
The Uses of Other Regressors IV: Adjust for
Conditional Randomization

This is case where must include W to get
consistent estimates of treatment effects
Conditional randomization is where probability of
treatment is different for people with different
values of W, but random conditional on W
Why have conditional randomization?
May have no choice
May want to do it (c.f. stratification)

19
An Example: Project STAR

Allocation of students to classes is random
within schools
But small number of classes per school
This leads to following relationship between
probability of treatment and number of kids in
school

20
Controlling for Conditional Randomization

X can now be correlated with W
But, conditional on W, X independent of other
factors
But must get functional form of relationship
between y and W correct (cf. matching procedures)
This is not the case with (unconditional)
randomization; see class exercise
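
A small constructed example of why W must be controlled for under conditional randomization. All numbers are hypothetical: treatment probability is 0.25 when W=0 and 0.75 when W=1, the true treatment effect is 1, and W raises the outcome by 2. Comparing within each value of W sidesteps the functional-form issue because W is discrete here.

```python
from statistics import mean

# Hypothetical cell counts: (W, X, number of units).
cells = [(0, 0, 75), (0, 1, 25), (1, 0, 25), (1, 1, 75)]
rows = [(w, x) for w, x, count in cells for _ in range(count)]
# Assumed outcome: treatment effect 1, effect of W equal to 2, no noise.
data = [(w, x, 1.0 * x + 2.0 * w) for w, x in rows]

def diff_in_means(records):
    """Mean outcome of treated minus mean outcome of controls."""
    treated = [y for _, x, y in records if x == 1]
    control = [y for _, x, y in records if x == 0]
    return mean(treated) - mean(control)

naive = diff_in_means(data)  # ignores W

# Condition on W: difference within each stratum, weighted by stratum size.
strata = sorted({w for w, _, _ in data})
adjusted = sum(
    diff_in_means([r for r in data if r[0] == w]) * sum(1 for r in data if r[0] == w)
    for w in strata
) / len(data)
```

The naive comparison gives 2 because treated units are disproportionately high-W, while the within-W comparison recovers the assumed effect of 1.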

21
Heterogeneity in Treatment Effects

So far have assumed causal (treatment) effect the
same for everyone
No good reason to believe this
Start with case of no other regressors
$y_i = \beta_0 + \beta_{1i} X_i + \varepsilon_i$
Random assignment implies X independent of $\beta_{1i}$
Sometimes called random coefficients model

22
What treatment effect to estimate?

Would like to estimate causal effect for everyone;
this is not possible: Holland's fundamental
problem of causal inference
Can only hope to estimate some average
Average treatment effect (ATE): $E(\beta_{1i})$
Proposition 2.5: OLS estimates ATE
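
One way to see why OLS recovers the ATE in the random coefficients model, sketched under the assumption that random assignment makes X independent of both the individual effect and the error term:

```latex
\begin{align*}
E(y_i \mid X_i = 1) - E(y_i \mid X_i = 0)
  &= \bigl[\beta_0 + E(\beta_{1i} \mid X_i = 1) + E(\varepsilon_i \mid X_i = 1)\bigr]
   - \bigl[\beta_0 + E(\varepsilon_i \mid X_i = 0)\bigr] \\
  &= E(\beta_{1i} \mid X_i = 1) \\
  &= E(\beta_{1i})
\end{align*}
```

The second line uses independence of the error from X, and the third uses independence of the individual effect from X.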

23
Observable Heterogeneity

Full outcomes notation
Outcome if in control group
$y_{0i} = \gamma_0 W_i + u_{0i}$
Outcome if in treatment group
$y_{1i} = \gamma_1 W_i + u_{1i}$
Treatment effect is $(y_{1i} - y_{0i})$ and can be written
as
$(y_{1i} - y_{0i}) = (\gamma_1 - \gamma_0) W_i + u_{1i} - u_{0i}$
Note treatment effect has observable and
unobservable component
Can estimate as
Two separate equations
One single equation

24
Combining treatment and control groups into
single regression

We can write
Combining outcomes equations leads to
Regression includes W and interactions of W with
X these are observable part of treatment effect
Note error likely to be heteroskedastic
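
The equations for this slide did not survive in the transcript. A reconstruction that follows from the two outcome equations, writing the control and treatment coefficients on W as $\gamma_0$ and $\gamma_1$ (the symbol is an assumption, since the original letter is lost):

```latex
\begin{align*}
y_i &= X_i \, y_{1i} + (1 - X_i) \, y_{0i} \\
    &= \gamma_0 W_i + (\gamma_1 - \gamma_0) X_i W_i + u_{0i} + X_i (u_{1i} - u_{0i})
\end{align*}
```

The error term $u_{0i} + X_i (u_{1i} - u_{0i})$ depends on X, which is why heteroskedasticity is likely.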

25
Bertrand/Mullainathan

Different treatment effect for high and low
quality CVs

26
Units of Measurement

Causal effect measured in units of experiment
not very helpful
Often want to convert causal effects to more
meaningful units e.g. in Project STAR what is
effect of reducing class size by one child

27
Simple estimator of this would be
$\dfrac{E(y \mid X=1) - E(y \mid X=0)}{E(S \mid X=1) - E(S \mid X=0)}$
where S is class size
Takes the treatment effect on outcome variable
and divides by treatment effect on class size
Not hard to compute but how to get standard
error?

28
IV Can Do the Job

Can't run regression of y on S: S influenced by
factors other than treatment status
But X is
Correlated with S
Uncorrelated with unobserved stuff (because of
randomization)
Hence X can be used as an instrument for S
IV estimator has form (just-identified case)

29
The Wald Estimator

This will give estimate of standard error of
treatment effect
Where instrument is binary and no other
regressors included the IV estimate of slope
coefficient can be shown to be
$\dfrac{E(y \mid X=1) - E(y \mid X=0)}{E(S \mid X=1) - E(S \mid X=0)}$
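
A numerical check of the Wald result on simulated data. The class-size numbers, the unobserved `ability` factor, and the assumed effect of -0.1 per extra child are all hypothetical; the just-identified IV slope is computed as the ratio of sample covariances.

```python
import random
from statistics import mean

def covariance(a, b):
    """Sample covariance (divisor n; the divisor cancels in the IV ratio)."""
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)

rng = random.Random(5)
n = 1000
x = [rng.randint(0, 1) for _ in range(n)]        # 1 = assigned to small class
ability = [rng.gauss(0, 1) for _ in range(n)]    # unobserved factor
# Class size S: smaller under treatment, also related to the unobserved factor.
s = [22 - 7 * xi + ai + rng.gauss(0, 1) for xi, ai in zip(x, ability)]
# Outcome: assumed true effect of -0.1 per extra child in the class.
y = [50 - 0.1 * si + 2 * ai + rng.gauss(0, 1) for si, ai in zip(s, ability)]

def group_mean(v, g):
    """Mean of v among units with treatment status g."""
    return mean(vi for vi, xi in zip(v, x) if xi == g)

wald = (group_mean(y, 1) - group_mean(y, 0)) / (group_mean(s, 1) - group_mean(s, 0))
iv = covariance(y, x) / covariance(s, x)
```

The two numbers agree up to floating-point error, and both sit near the assumed -0.1, whereas a direct regression of y on S would be contaminated by `ability`.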

30
Partial Compliance

So far:
In control group implies no treatment
In treatment group implies get treatment
Often things are not as clean as this
Treatment is an opportunity
Close substitutes available to those in control
group
Implementation not perfect e.g. pushy parents

31
An Example: Moving to Opportunity

Designed to investigate the impact of living in
bad neighbourhoods on outcomes
Gave some residents of public housing projects
chance to move out
Two treatments
Voucher for private rental housing
Voucher for private rental housing restricted for
use in good neighbourhoods
No-one forced to move, so imperfect compliance:
60% and 40% did use it

32
Some Terminology

Z denotes whether in control or treatment group:
intention-to-treat
X denotes whether actually get treatment
With perfect compliance
$\Pr(X=1 \mid Z=1) = 1$
$\Pr(X=1 \mid Z=0) = 0$
With imperfect compliance
$1 > \Pr(X=1 \mid Z=1) > \Pr(X=1 \mid Z=0) > 0$

33
What Do We Want to Estimate?

Intention-to-Treat
$ITT = E(y \mid Z=1) - E(y \mid Z=0)$
This can be estimated in usual way
Treatment Effect on Treated

34
Estimating TOT

Can't use simple regression of y on Z
But should recognize TOT as Wald estimator
Can be estimated by regressing y on X using Z as
instrument
Relationship between TOT and ITT
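
The formula for this relationship is missing from the transcript. Reading TOT as the Wald estimator suggests the ITT divided by the difference in take-up rates between treatment and control groups; a minimal sketch under that reading, with hypothetical numbers:

```python
def tot_from_itt(itt, take_up_treatment, take_up_control=0.0):
    """Wald-style estimate: ITT scaled by the difference in treatment take-up rates."""
    first_stage = take_up_treatment - take_up_control
    if first_stage <= 0:
        raise ValueError("take-up must be higher in the treatment group")
    return itt / first_stage

# Hypothetical numbers: ITT of 0.10, 50% take-up when offered, none in the control group.
example_tot = tot_from_itt(0.10, 0.5)
```

With take-up of one half and no treatment in the control group, the scaled estimate is twice the ITT.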

35
Most Important Results from MTO

No effects on adult economic outcomes
Improvements in adult mental health
Beneficial outcomes for teenage girls
Adverse outcomes for teenage boys

36
Sample results from MTO

TOT approximately twice the size of ITT
Consistent with 50% use of vouchers

37
IV with Heterogeneous Treatment Effects (More
Difficult)

If treatment effect same for everyone then TOT
recovers this (obvious)
But what if treatment effect heterogeneous?
No simple answer to this question
Suppose model for treatment effect is

38
Proposition 2.6: The IV estimate for the
heterogeneous treatment case is a consistent
estimate of $E(\beta_{1i} \pi_i) / E(\pi_i)$, where
$\pi_i = \Pr(X_i=1 \mid Z_i=1) - \Pr(X_i=1 \mid Z_i=0)$ is the difference in the
probability of treatment for individual i when in
treatment and control group
39
Interpretation

This is weighted average of treatment effects;
weights will vary with instrument (contrast
with heterogeneous treatment case)
Some cases in which can interpret IV estimate as
ATE

40
How will IV estimate differ from ATE

IV is ATE if no correlation between $\beta_{1i}$ and $\pi_i$
Previous formula says depends on covariance of
$\beta_{1i}$ and $\pi_i$
In some situations can sign but not always
Example 1: no-one gets treatment in the absence
of the programme so $\pi_i = \Pr(X_i=1 \mid Z_i=1)$
If those who get treatment when in the treatment
group are those with the highest returns then
IV > ATE

Example 2: treatment is voluntary for those in
the control group but compulsory for those in the
treatment group
This implies $\pi_i = 1 - \Pr(X_i=1 \mid Z_i=0)$
If those who get treatment in control are those
with highest returns then
IV < ATE
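
A toy calculation of the two examples, treating the IV limit as the weighted average $E(\beta_{1i} \pi_i) / E(\pi_i)$. The two equally sized types, their treatment effects, and their take-up probabilities are hypothetical values chosen only to show the direction of the gap.

```python
def iv_probability_limit(betas, pis):
    """Weighted average E(beta * pi) / E(pi) over equally likely types."""
    return sum(b * p for b, p in zip(betas, pis)) / sum(pis)

betas = [1.0, 3.0]               # hypothetical treatment effects for two types
ate = sum(betas) / len(betas)    # simple average of the two effects

# Example 1: nobody is treated in the control group, so pi_i = Pr(X=1 | Z=1);
# the high-return type takes up treatment more often.
iv_example_1 = iv_probability_limit(betas, [0.2, 0.8])

# Example 2: everybody is treated in the treatment group, so pi_i = 1 - Pr(X=1 | Z=0);
# the high-return type is more often treated even in the control group.
iv_example_2 = iv_probability_limit(betas, [1 - 0.2, 1 - 0.8])
```

In the first case the high-return type gets more weight and the IV limit exceeds the ATE; in the second the weights flip and it falls below.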

42
Angrist/Imbens Monotonicity Assumption

Case where IV estimate is not ATE
Assume that everyone moved in same direction by
treatment: monotonicity assumption
Then can show that IV is average of treatment
effect for those whose behaviour changed by being
in treatment group
They call this the Local Average Treatment Effect
(LATE)
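
In symbols, a sketch of the LATE result. The notation $X_i(z)$, meaning the treatment status individual i would have if assigned $Z = z$, is introduced here for illustration and does not appear in the slides; the equality assumes monotonicity together with random assignment of Z and no direct effect of Z on y.

```latex
\[
\text{LATE} = E\bigl(\beta_{1i} \mid X_i(1) > X_i(0)\bigr)
= \frac{E(y_i \mid Z_i = 1) - E(y_i \mid Z_i = 0)}
       {\Pr(X_i = 1 \mid Z_i = 1) - \Pr(X_i = 1 \mid Z_i = 0)}
\]
```

Individuals whose treatment status does not respond to Z get zero weight, so the IV estimate says nothing about their treatment effects.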

43
Problems with Experiments