Regression Discontinuity

Slides: 31

Provided by: garretchr

Learn more at: https://www.ocf.berkeley.edu

1
Regression Discontinuity

10/13/08

2
What is R.D.?

Regression--the econometric/statistical tool
social scientists use to analyze multivariate
correlations

$Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + e$

where Y is some sort of dependent variable,
alpha is a constant, the Xs are a bunch of
independent variables, the betas are
coefficients, and e is the error term.
3
Discontinuity

Some sort of arbitrary jump/change thanks to a
quirk in law or nature.
We're interested in the ones that make very
similar people get very dissimilar results.

4
Discontinuity Examples

PSAT/NMSQT
Basically the top 16,000 test-takers get a
scholarship.
A small difference in test score can mean a
discontinuous jump in scholarship amount.

5
Discontinuity Examples

School Class Size
Maimonides' Rule--No more than 40 kids in a class
in Israel.
40 kids in a school means one class of 40. 41
kids means two classes, with 20 and 21.
(Angrist & Lavy, QJE 1999)
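
The 40-to-41 jump can be written as a tiny function. This is a minimal sketch that assumes a school always opens the fewest classes allowed by the cap and splits kids as evenly as possible; the function name `predicted_class_size` is made up for illustration.

```python
def predicted_class_size(enrollment: int, cap: int = 40) -> float:
    """Average class size if a school opens the fewest classes of at most `cap` kids."""
    if enrollment <= 0:
        raise ValueError("enrollment must be positive")
    n_classes = (enrollment + cap - 1) // cap  # integer ceiling of enrollment / cap
    return enrollment / n_classes


for n in (39, 40, 41, 80, 81):
    # One extra kid at 41 (or 81) forces a new class and the average drops sharply.
    print(n, predicted_class_size(n))
```

Enrollment moves smoothly, but predicted class size drops from 40 to 20.5 between 40 and 41 kids. That drop is the discontinuity.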

6
Discontinuity Examples

Union Elections
If employees want to unionize, the NLRB holds an
election. 50% means the employer doesn't have to
recognize the union, and 50% + 1 means the
employer is required to bargain in good faith
with the union.
(DiNardo & Lee, QJE 2004)

7
Discontinuity Examples

U.S. House Elections
Incumbency advantage. If you're first past the
post in the previous election, even by just one
vote, you get a huge advantage in the next
election.
(David Lee, Journal of Econometrics 2007)

8
Discontinuity Examples

Air Pollution and Home Values
The Clean Air Act's National Ambient Air Quality
Standards say that if the geometric mean concentration
of 5 pollutant particulates is 75 micrograms per
cubic meter or greater, the county is classified as
non-attainment and is subject to much more
stringent regulation.
(Ken Chay, Michael Greenstone, JPE 2005)

9
Combine the R and the D

Run a regression based on a situation where
you've got a discontinuity.
Treat above-the-cutoff and below-the-cutoff like
the treatment and control groups from a
randomization.

10
Why are we doing this?

Why do we have to look for quirks like this?
Can't we just control for whatever we want using
OLS or some other line-fitting tool?
Just get a bunch of people's salaries and PSAT
scores. PSATs are X, income is Y, run a
regression in SPSS/Stata, or heck, even Excel,
and we have causal inference, right? Higher test
scores cause people to earn more later in life.

11
No.

The statistical methods we use are based on a lot
of assumptions. Importantly, the error term
(which is really full of things we can't measure,
the unobservables) is supposed to be
uncorrelated with the Xs and normally
distributed.
In reality, those conditions probably haven't been
met in any of the previous situations.
For example, class size is probably correlated
with some type of neighborhood quality.
Please turn to your neighbor and discuss what is
probably wrong with each of the previous 5
examples (PSAT, class size, union elections,
House elections, air pollution).

12
No.

The statistical methods we use are based on a lot
of assumptions. Importantly, the error term
(which is really full of things we can't measure,
the unobservables) is supposed to be
uncorrelated with the Xs and normally
distributed.
In reality, those conditions probably haven't been
met in any of the previous situations.
Higher PSAT kids might have higher ability.
Crowded classrooms might be in poorer schools.
Unionized workers might work for certain types of
firms.
Incumbent politicians might be better. They won
before, didn't they?
Pollution might be correlated with economic growth,
which could increase home values.
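
To see how an unobservable wrecks the naive regression, here is a small simulation of the PSAT story. Everything in it is an assumption made for illustration: "ability" is unobserved and drives both the score and income, and the score itself is set to have zero causal effect.

```python
import random


def ols_slope(x, y):
    """Slope from a one-variable OLS regression of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


rng = random.Random(0)
n = 20_000
TRUE_EFFECT = 0.0  # assumed: the test score itself does nothing for income

ability = [rng.gauss(0, 1) for _ in range(n)]   # the unobservable
psat = [a + rng.gauss(0, 1) for a in ability]   # score = ability + noise
income = [TRUE_EFFECT * s + 2.0 * a + rng.gauss(0, 1)
          for s, a in zip(psat, ability)]

naive = ols_slope(psat, income)
print(f"true effect of the score: {TRUE_EFFECT}, naive OLS slope: {naive:.2f}")
```

The naive slope comes out far from zero even though the score does nothing here, because ability sits in the error term and is correlated with X.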

13
Controlling for everything?

Focus on the Israeli schools for a second.
We can try to control for neighborhood poverty
level.
Does that solve the problem?
No.
If neighborhood poverty level is correlated with
the X of interest (class size), why would you
think it's safe to assume that the unobservables
aren't correlated? Have you really magically
controlled for every single thing that's
correlated with the X of interest? Probably not.
So let's find a bandwidth in which these things
are uncorrelated.

14
A Bandwidth of Randomness

Test scores aren't random, and neither is class
size, nor air pollution.
But is a kid in the 94.9th percentile really that
different from the 95th percentile kid?
Is a school with 40 kids that different from a
school with 41?
Right around the cutoff, there's a good chance
things are random.

15
No Sorting - Observables

But don't take my word for it. Look at the
averages of the observables in your below-cutoff
group, and the averages of the observables in the
above-cutoff group. Are they the same?
Hopefully, but maybe not.
Do people know about this cutoff? Are they doing
some endogenous sorting? When deciding where to
live, did good moms look for schools where their
kids would be the 41st kid? Did certain types of
polluters look for counties where they'd be
below the cutoff?
These things can be checked to some degree--look
at the average observables above and below the
cutoff.
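
A rough version of that check, as a sketch on made-up data. The helper `balance_table`, the column names, and the cutoff of 50 are all hypothetical. The simulated covariate trends smoothly with the score and has no jump at the cutoff.

```python
import random
from statistics import mean


def balance_table(rows, running, cutoff, bandwidth, covariates):
    """Mean of each covariate just below and just above the cutoff."""
    below = [r for r in rows if cutoff - bandwidth <= r[running] < cutoff]
    above = [r for r in rows if cutoff <= r[running] <= cutoff + bandwidth]
    return {c: (mean(r[c] for r in below), mean(r[c] for r in above))
            for c in covariates}


# Simulated data: family income rises smoothly with the score.
rng = random.Random(1)
rows = []
for _ in range(20_000):
    score = rng.uniform(0, 100)
    rows.append({"score": score,
                 "family_income": 30 + 0.2 * score + rng.gauss(0, 5)})

local = balance_table(rows, "score", 50, 2, ["family_income"])
everyone = balance_table(rows, "score", 50, 50, ["family_income"])
for label, table in (("within 2 points", local), ("whole sample", everyone)):
    lo, hi = table["family_income"]
    print(f"{label}: below={lo:.1f} above={hi:.1f} gap={hi - lo:.1f}")
```

In the whole sample the two groups look very different. Within a couple of points of the cutoff the gap mostly disappears, which is what you hope to see in real data. A large gap inside the narrow window would be a warning sign of sorting.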

16
No Sorting - Clumping

In addition to checking the observables on either
side of the cutoff, we should check the density
of the distribution. Is it unusually low/high
right around the cutoff?
If there's some abnormally large portion of
people right around the cutoff, it's quite
possible that you don't have random assignment.
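
A crude way to look for clumping is to count observations in a narrow window on each side of the cutoff. The sketch below uses simulated scores, and the cheating rule is purely hypothetical: eligibility is assumed to go to those below 50, and half of the people scoring just above get reported just below. This is only a count comparison, not a formal density test.

```python
import random


def counts_around_cutoff(values, cutoff, width):
    """Number of observations just below and just above the cutoff."""
    below = sum(1 for v in values if cutoff - width <= v < cutoff)
    above = sum(1 for v in values if cutoff <= v < cutoff + width)
    return below, above


rng = random.Random(2)
CUTOFF = 50
honest = [rng.uniform(0, 100) for _ in range(50_000)]


def maybe_cheat(v):
    """Hypothetical manipulation: nudge some near-misses under the cutoff."""
    if CUTOFF <= v < CUTOFF + 3 and rng.random() < 0.5:
        return CUTOFF - rng.uniform(0.001, 3)
    return v


manipulated = [maybe_cheat(v) for v in honest]

honest_below, honest_above = counts_around_cutoff(honest, CUTOFF, 3)
cheat_below, cheat_above = counts_around_cutoff(manipulated, CUTOFF, 3)
print("honest scores:      ", honest_below, honest_above)
print("manipulated scores: ", cheat_below, cheat_above)
```

With honest scores the two counts are about equal. With manipulation there is a pile-up on the eligible side and a hole on the other, and the people on either side of the cutoff are no longer comparable.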

17
No Sorting - Clumping

Dude, you're totally cheating. Please stop.
Emily Conover & Adriana Camacho, "Manipulation of
Social Program Eligibility"

18
GSP--Multiple Analyses

"Incentives to Learn," Ted Miguel, Michael
Kremer, Rebecca Thornton
Girls Scholarship Program, Busia, Kenya.
Randomize holding a scholarship competition
across schools in Busia and Teso districts.
Treatment: If a girl finishes in the top 15% in
her district on the end-of-year exam, she wins a
two-year scholarship.
Randomization Analysis: Does attending a school
with the competition make you work harder/improve
schooling outcomes?
RD Analysis: Does winning the award improve
schooling outcomes?

19
P-900 in Chile

"The Central Role of Noise in Evaluating
Interventions That Use Test Scores to Rank
Schools," Kenneth Y. Chay, Patrick J. McEwan,
Miguel Urquiola, AER 2005
Mean Reversion: Sophomore Slump, SI Cover Curse,
Heisman Trophy Curse, Madden Curse, and the same
thing in the opposite direction.
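
Mean reversion is easy to simulate. In this sketch (all numbers are invented) each school has a stable quality plus fresh noise in each test year, and there is no program at all. We "select" the bottom 20% on the first test and look at how much they gain on the second.

```python
import random
from statistics import mean

rng = random.Random(3)
n_schools = 5_000
quality = [rng.gauss(50, 5) for _ in range(n_schools)]   # stable, unobserved
before = [q + rng.gauss(0, 5) for q in quality]          # quality + one-off noise
after = [q + rng.gauss(0, 5) for q in quality]           # fresh noise, NO program

# Schools scoring in the bottom 20% on the first test get "selected".
threshold = sorted(before)[n_schools // 5]
selected = [i for i in range(n_schools) if before[i] < threshold]
others = [i for i in range(n_schools) if before[i] >= threshold]

gain = mean(after[i] - before[i] for i in selected)
gain_others = mean(after[i] - before[i] for i in others)
print(f"average gain, selected schools: {gain:.1f}")
print(f"average gain, everyone else:    {gain_others:.1f}")
```

The selected schools "improve" by several points with no treatment, simply because part of their low first score was bad luck. A before/after comparison of program schools would credit that to the program.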

20
THIS IS THE MOST AMAZING THING EVER!

HOLY CRAP! Look at the educational outcomes of
treatment schools in 1990, compared to those same
schools in 1988, before the program. AMAZING!
FANTABULOUS!

21
Oh, wait.

Hmm. That's kind of disappointing.

22
So how do we actually do this?

1. Draw two pretty pictures:
Eligibility criterion (test score, income, or
whatever) vs. Program Enrollment
Eligibility criterion vs. Outcome
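
Both pictures are usually drawn from binned averages rather than raw points. Here is a minimal sketch of the numbers behind the first picture, assuming a sharp design where everyone at or above a score of 50 enrolls. The helper `binned_means` and the data are made up. Plotting is left out, but the pairs can be fed to any plotting tool.

```python
import random
from collections import defaultdict
from statistics import mean


def binned_means(score, y, bin_width):
    """(left edge of bin, average of y in that bin) pairs, sorted by bin."""
    bins = defaultdict(list)
    for s, v in zip(score, y):
        bins[(s // bin_width) * bin_width].append(v)
    return sorted((left, mean(vals)) for left, vals in bins.items())


rng = random.Random(5)
score = [rng.uniform(0, 100) for _ in range(5_000)]
enrolled = [1 if s >= 50 else 0 for s in score]  # assumed sharp cutoff at 50

enrollment_by_bin = binned_means(score, enrolled, bin_width=10)
for left, share in enrollment_by_bin:
    print(f"score {left:>4.0f}-{left + 10:<4.0f} share enrolled: {share:.2f}")
```

The same helper works for the second picture: pass the outcome instead of the enrollment indicator and look for a jump at the same place.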

23
So how do we actually do this?

2. Run a simple regression.
(Yes, this is basically all we ever do, and the
stats programs we use can run the calculation in
almost any situation, but before we do it, it's
necessary to make sure the situation is
appropriate and draw the graphs so that we can
have confidence that our estimates are actually
causal.)
Outcome as a function of test score (or
whatever), with a binary (1 if yes, 0 if no)
variable for program enrollment.
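
A minimal sketch of that regression in Python with NumPy, on simulated data. The assumptions are that the design is sharp (enrollment is 1 exactly when the score is at or above the cutoff), that the underlying relationship is linear, and that the true jump is 5. The function name `rd_estimate` is made up.

```python
import numpy as np


def rd_estimate(score, outcome, cutoff):
    """Coefficient on the enrollment dummy from OLS of the outcome on
    a constant, the centered score, and the dummy."""
    x = np.asarray(score, dtype=float) - cutoff
    d = (x >= 0).astype(float)  # 1 if at/above the cutoff (sharp design assumed)
    X = np.column_stack([np.ones_like(x), x, d])
    beta, *_ = np.linalg.lstsq(X, np.asarray(outcome, dtype=float), rcond=None)
    return beta[2]


rng = np.random.default_rng(4)
score = rng.uniform(0, 100, 5_000)
TRUE_JUMP = 5.0  # assumed treatment effect at the cutoff
outcome = 20 + 0.3 * score + TRUE_JUMP * (score >= 50) + rng.normal(0, 2, score.size)

est = rd_estimate(score, outcome, cutoff=50)
print(f"estimated jump at the cutoff: {est:.2f} (true value {TRUE_JUMP})")
```

Centering the score at the cutoff is a convenience: the dummy's coefficient is then the size of the jump right at the cutoff.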

24
Is it really that simple?

Don't be silly.
You could totally have a situation where the
outcome is some sort of quadratic or cubic or nth-degree
polynomial function of the test score. Try
controlling for that. This is going to depend on
the situation and is somewhat arbitrary.
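
Here is a sketch of why the functional form matters. The simulated outcome is assumed to be a cubic in the score plus a true jump of 5 at the cutoff. The same regression is run with a linear control and with a cubic control. The names and numbers are invented for illustration.

```python
import numpy as np


def rd_poly(score, outcome, cutoff, degree):
    """Jump at the cutoff, controlling for a polynomial in the centered score."""
    x = np.asarray(score, dtype=float) - cutoff
    d = (x >= 0).astype(float)
    cols = [np.ones_like(x)] + [x ** p for p in range(1, degree + 1)] + [d]
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, np.asarray(outcome, dtype=float), rcond=None)
    return beta[-1]  # coefficient on the treatment dummy


rng = np.random.default_rng(6)
score = rng.uniform(0, 100, 5_000)
x = score - 50
TRUE_JUMP = 5.0
# Assumed data-generating process: cubic in the score, plus the jump.
outcome = 20 + 1e-4 * x ** 3 + TRUE_JUMP * (score >= 50) + rng.normal(0, 2, score.size)

linear = rd_poly(score, outcome, cutoff=50, degree=1)
cubic = rd_poly(score, outcome, cutoff=50, degree=3)
print(f"linear control: {linear:.2f}")
print(f"cubic control:  {cubic:.2f}  (true jump {TRUE_JUMP})")
```

With the wrong shape, the dummy soaks up curvature and the estimated jump is badly off. With the right shape it lands near the truth. In real data you don't know the right shape, which is why this step is a judgment call.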

25
Wait, somewhat arbitrary?

Yeah, lame, I know. Arbitrary is what we're trying
to avoid. But two things aren't universally
clear:
1. How wide a bandwidth around the cutoff are we
looking at?
We're really only confident in our estimate for
people that are close to the cutoff. This is a
LOCAL AVERAGE TREATMENT EFFECT. We can
confidently say that a school right around the
cutoff would improve average test scores by X if
it received the treatment, but we're not so
confident that already awesome schools would get
the same benefit.
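
One way to be upfront about the bandwidth choice is to report the estimate for several bandwidths. The sketch below does that on simulated data with an assumed cubic trend and a true jump of 5, using a simple linear control inside each window. The function name and the bandwidth values are arbitrary.

```python
import numpy as np


def rd_within_bandwidth(score, outcome, cutoff, bandwidth):
    """Linear RD estimate using only observations within `bandwidth` of the cutoff.
    Returns (estimated jump, number of observations used)."""
    x = np.asarray(score, dtype=float) - cutoff
    y = np.asarray(outcome, dtype=float)
    keep = np.abs(x) <= bandwidth
    x, y = x[keep], y[keep]
    d = (x >= 0).astype(float)
    X = np.column_stack([np.ones_like(x), x, d])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[2], int(keep.sum())


rng = np.random.default_rng(7)
score = rng.uniform(0, 100, 20_000)
TRUE_JUMP = 5.0
outcome = (20 + 1e-4 * (score - 50) ** 3 + TRUE_JUMP * (score >= 50)
           + rng.normal(0, 2, score.size))

results = {h: rd_within_bandwidth(score, outcome, 50, h) for h in (5, 10, 25, 50)}
for h, (est, n_used) in results.items():
    print(f"bandwidth {h:>2}: estimate {est:5.2f} using {n_used} observations")
```

Narrow windows lean less on the functional form but use fewer observations, so the estimate is noisier. Wide windows use everyone but depend heavily on getting the shape right.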

26
Wait, somewhat arbitrary?

2. Without the program, what shape of function
would there be naturally?
What sort of function do we throw in to control
for the fact that even if there were no National
Merit Semifinalist scholarship, smarter kids are
likely to earn more later in life?
The solution: SHOW YOUR WORK

27
You're Such a Phony.