Title: FINAL MEETING
1FINAL MEETING OTHER METHODS
2General conclusions on causal analyses
- Magic tool of ceteris paribus
- Regression is ceteris paribus by definition
- But the data need not be: they are just a subsample of the general population, and many other things confound
- Causal effects, i.e. cause and effect
- Propensity Score Matching
- Regression Discontinuity
- Fixed Effects
- Instrumental Variables
3If we cannot experiment..
Cross-sectional data
Panel data
IV
Propensity Score Matching DiD
Before After Estimators
Propensity Score Matching
Difference in Difference Estimators (DiD)
Regression Discontinuity Design
4Problems with causal inference
Confounding Influence (environment)
Treatment
Effect
5Instrumental Variables solution
Confounding Influence
Treatment
Outcome
Instrumental Variable(s)
6Fixed Effects Solution (DiD does pretty much the
same)
Fixed Influences
Confounding Influence
Treatment
Outcome
7Propensity Score Matching
Confounding Influence
Treatment
Treatment
Outcome
8Regression Discontinuity Design
Group that is key for this policy
Confounding Influence
Treatment
Effect
9A motivating story
- Today, women in Poland have on average 1.7 children
- About 50 years ago, women had 2.8 children
- Today's women are 6 times more educated than 50 years ago. Is the drop from 2.8 to 1.7 an effect of this educational change?
- Natural experiment: in 1960 the schooling obligation was extended by one year (11 to 12 years).
- THE SAME women born just before 1953 spent a year less in primary and secondary school than those born after 1953
- THE SAME?
- RD allows us to compare fertility (controlling for individual characteristics) for women born around 1953
10Regression Discontinuity Design
- Idea
- Focus your analyses on a group for which treatment was random (or rather independent)
- How to do it?
- Example: weaker students have lower grades, but are also frequently held back to repeat courses/years. If we give them extra classes, better students will outperform them anyway, so how do we test whether extra classes help?
- RDD will compare the performance of students just above and just below the threshold, so quite similar ones
- RDD will only work if people cannot prevent or encourage treatment by relocating themselves around the threshold
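The comparison "just above vs. just below the threshold" can be sketched on simulated data. Everything below is hypothetical: the variable names z, d, y, the cutoff at 0, the true jump of 2 and the bandwidth of 0.2 are assumptions chosen for illustration only.

```stata
* Hypothetical simulated data: sharp design with a cutoff at z = 0
clear
set seed 12345
set obs 5000
generate z = runiform(-1, 1)                // running (assignment) variable
generate byte d = (z >= 0)                  // treatment decided by the threshold alone
generate y = 1 + 0.5*z + 2*d + rnormal()    // assumed true jump at the cutoff: 2

* Keep only units close to the threshold (assumed bandwidth 0.2)
* and allow a different slope on each side of the cutoff
regress y i.d##c.z if abs(z) < 0.2
```

The coefficient on 1.d estimates the jump in y at the threshold. Far from the cutoff the two groups differ in z, which is why the sample is restricted to a narrow window.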
11Regression Discontinuity Design
- Advantages
- Really marginal effect
- Causal, if RDD well applied
- Disadvantages
- Sample size largely limited
- Only local character of estimates (marginal ≠ average)
- Problems
- How do we know how far away from the threshold we can go (bandwidth)?
- How do we know if the design is OK?
12Regression Discontinuity Design
- Application
- Trade-off between a narrow bandwidth (for the independence assumption) and a wide bandwidth (to increase sample size)
- One can try to find it empirically (fuzzy RD design)
- Y is the effect, p is the treatment probability
- Y+ and p+ are the effect and the probability just above the cut-off
- Y- and p- are the effect and the probability just below the cut-off
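With Y the outcome and p the treatment probability, each measured just above (+) and just below (-) the cut-off, the fuzzy RD effect is usually written as a local Wald ratio. A sketch in that notation follows; the symbols tau, D (treatment indicator), Z (assignment variable) and z_0 (cut-off) are assumptions introduced here, not taken from the slides.

```latex
\hat{\tau}_{\text{fuzzy}} = \frac{Y^{+} - Y^{-}}{p^{+} - p^{-}},
\qquad
Y^{\pm} = \lim_{z \to z_0^{\pm}} E[\,Y \mid Z = z\,],
\quad
p^{\pm} = \lim_{z \to z_0^{\pm}} \Pr(D = 1 \mid Z = z)
```

In a sharp design the treatment probability jumps from 0 to 1 at the cut-off, so the denominator equals 1 and the estimate reduces to the jump in the outcome.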
13Regression Discontinuity Design
14Regression Discontinuity Design
15Regression Discontinuity Design
16How to do this in STATA?
- First: download the package: ssc install rd
- Second: define your model
- rd outcomevar [treatmentvar] assignmentvar [if] [in] [weight] [, options]
- Third: there are some options
- mbw(numlist): multiples of the bandwidth in percent (default "100 50 200", which means we always do 50, 100 and 200)
- z0(real): sets the cutoff Z0 (treatment)
- ddens: asks for extra estimation of discontinuities in the Z density
- graph: draws the graphs we've seen automatically
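A minimal session putting these steps together might look as follows. It assumes internet access and that the example dataset votex (with outcome lne and assignment variable d) is distributed with the rd package, as its help file suggests; check help rd on your installation before relying on it.

```stata
* Assumption: rd is available from SSC and ships the votex example data
ssc install rd, replace
net get rd
use votex, clear

* Sharp design: outcome lne, assignment variable d, cutoff at 0
* Estimates at 50%, 100% and 200% of the default bandwidth, with graphs
rd lne d, mbw(50 100 200) z0(0) graph

* Same model, additionally testing for a jump in the density of d
rd lne d, ddens
```

Running the model at several bandwidth multiples is a quick way to see how sensitive the estimate is to the bandwidth choice.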
17Sample results in STATA - data
18Output from STATA
19Output from STATA - graph
20Output from STATA fuzzy version
gen byte ranwin=cond(uniform()<.1,1-win,win)
rd lne ranwin d, mbw(25(25)300) bdep ox
21One last thing ?
22A motivating story
23Some basic doubts of an empirical economist
- Compare similar to similar
- Keep statistical properties
- Understand beyond the average x
- Understand (and be independent of) outliers
24Robust estimators
- First flavour of robust: regression with the robust option
- Helps if the problem is not systematic
- Does not help if the problem is in the nature of the process (e.g. heterogeneity)
- Second flavour of robust: nonparametric estimators
- Complex from a mathematical point of view
- Takes longer to compute
- But very flexible
- -> Koenker (and his followers)
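The first flavour can be illustrated with Stata's built-in auto dataset. The choice of dataset and variables is an assumption made here for illustration only.

```stata
sysuse auto, clear

* Ordinary OLS with conventional standard errors
regress price weight length foreign

* Same coefficients, but heteroskedasticity-robust standard errors
regress price weight length foreign, vce(robust)
```

The robust option changes only the standard errors, not the coefficients. That is why it cannot help when the problem lies in the process itself, for example when the effect differs across the distribution of the outcome.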
25How to do this in STATA?
- Estimate at the median
- qreg y x
- Estimate at any other percentile
- qreg y x, quantile(q) where q is your percentile
- Estimate differences between different percentiles
- iqreg y x, quantile(.25 .75) reps(100)
- reps(100) sets the number of bootstrap replications used for the standard errors
What is bootstrap for?
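The three commands can be tried on Stata's built-in auto dataset. The dataset and the variables price, weight, length and foreign are assumptions chosen for illustration.

```stata
sysuse auto, clear
set seed 12345                  // makes the bootstrap draws reproducible

* Median regression
qreg price weight length foreign

* Regression at the 25th percentile
qreg price weight length foreign, quantile(.25)

* Difference between the 75th and 25th percentile regressions,
* with standard errors from 100 bootstrap replications
iqreg price weight length foreign, quantile(.25 .75) reps(100)
```

On the closing question: the bootstrap re-estimates the model on many samples drawn with replacement from the data, and uses the spread of those estimates as the standard error. This is convenient for quantile differences, where analytic standard errors are hard to derive.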
26Output from STATA
27Output from STATA
28Summarising all this crap
Confounding Influence (environment)
Treatment
???
Effect
29Problems
- Sample
- size
- heterogeneity
- Methods
- None is perfect
- The question is important
- Nonparametric methods (kernel in PSM, or QR) are robust, but robust is not a synonym for miraculous