Title: Review of Probability and Statistics
Instrumental Variables and Two Stage Least Squares
- y = b0 + b1x1 + b2x2 + . . . + bkxk + u
- x1 = p0 + p1z + p2x2 + . . . + pkxk + v
OUTLINE
- When do we need Instrumental Variables?
- What is an Instrumental Variable?
- An Example
- IV Estimation in the Simple RM
  - IV Estimator
  - Inference
  - Poor Instruments
  - Some Applications
- IV Estimation in the Multiple RM
  - 2SLS Estimator
  - Inference
  - An Application: Institutions and Development
- Addressing Errors-in-Variables with IV Estimation
- Testing for Endogeneity
- Testing for Overidentifying Restrictions
- 2SLS with Heteroscedasticity
- 2SLS with Serial Correlation
1. When do we need Instrumental Variables?
- Instrumental Variables (IV) estimation is used when your model has endogenous x's
- That is, whenever Cov(x,u) ≠ 0
- Thus, IV can be used to address the problem of omitted variable bias
- Additionally, IV can be used to solve the classic errors-in-variables problem
2. What is an Instrumental Variable?
- In order for a variable, z, to serve as a valid instrument for x, the following three conditions must be true:
- (1) The instrument must be exogenous
- That is, Cov(z,u) = 0
- (2) The instrument must be correlated with the endogenous variable x
- That is, Cov(z,x) ≠ 0
- (3) The instrument must not be a regressor in the equation for y, nor be perfectly correlated with the regressors in that equation.
What is an Instrumental Variable? (cont.)
- Conditions (2) and (3) are simple to verify. They can be tested using the data.
- For condition (2): test H0: p1 = 0 in x = p0 + p1z + v
- For condition (3): look at the R-squared from the regression of z on all the regressors other than x.
What is an Instrumental Variable? (cont.)
- Condition (1), Cov(z,u) = 0, is the key one, and it cannot be tested
- To justify that condition (1) holds we need a model with a clear interpretation of which variables are in the error term u.
- We have to use economic theory and common sense to decide if it makes sense to assume Cov(z,u) = 0
3. An Example
- Problem: estimate the effect of treatment (T) on outcome (Y), i.e., estimate b1 in
- Yi = b0 + b1 Ti + ui
- For simplicity, suppose
- Dichotomous treatment variable: T = 1 if treated, 0 otherwise
- Homogeneous treatment effect (b1)
- No other regressors.
3. An Example (cont.)
- For concreteness, suppose that we are interested in the effect of an investment subsidy on firms' capital investment.
- Ti is the binary variable that indicates whether a firm has applied for and has been granted the subsidy.
- Yi represents the firm's investment rate.
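3. An Example (cont.)
- OLS estimation yields the estimator b1_OLS = Cov(T,Y)/Var(T) (sample moments), which for a binary T is the difference in mean outcomes between treated and untreated firms.
- However, the key assumption for the consistency of the OLS estimator (no correlation between T and u) is unlikely to hold, because treatment is related to omitted factors in u that influence the outcome.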
Four Solutions to this Problem
- Randomized Controlled Trial
- Natural Experiments: find similar observations with different treatment for arbitrary reasons (e.g. regulatory rules, law changes). Difference-in-Difference estimates
- Control for Observable Differences
  - Attempt to condition on sufficient X's such that E(Tu) = 0
  - Then estimate directly by least squares
  - (1) Y = b0 + b1 T + X g + u
- Instrumental Variables (IV)
  - Suppose there exists an instrumental variable (Z) that is
  - (A1) correlated with treatment: E(Z T) ≠ 0
  - (A2) uncorrelated with the residual: E(Zu) = 0
Example: Simple IV
- For instance, in the investment subsidy example, suppose that only a randomly selected set of firms can apply for the subsidy.
- Let Z be the dummy variable (0,1) that indicates whether a firm can apply to obtain the subsidy or not.
- Because Z is purely random, it is not related to u.
- However, Z should be correlated with T, because to be granted the subsidy (T = 1) it is necessary to be eligible (Z = 1).
Example: Simple IV (cont.)
- Then, we have the following moment conditions:
- E(ui) = 0, which implies E(Yi - b0 - b1 Ti) = 0
- E(Zi ui) = 0, which implies E[Zi (Yi - b0 - b1 Ti)] = 0
- Using the method of moments, we estimate b0 and b1 using the sample moment conditions associated with the previous population moment conditions.
Example: Simple IV (cont.)
- In this example, this estimator (IV) is
- b1_IV = [mean(Y | Z=1) - mean(Y | Z=0)] / [mean(T | Z=1) - mean(T | Z=0)]
- (Difference in mean outcomes)/(difference in treatment rate)
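The ratio above can be checked on simulated data. The following is a minimal sketch, not the analysis behind the slides: the data-generating process, the unobserved factor named `ability`, and the true effect of 2.0 are all assumptions made for illustration.

```python
import numpy as np

def wald_iv(y, t, z):
    """IV estimate with a binary instrument z:
    (difference in mean outcomes) / (difference in treatment rates)."""
    y, t, z = map(np.asarray, (y, t, z))
    diff_outcome = y[z == 1].mean() - y[z == 0].mean()
    diff_treat = t[z == 1].mean() - t[z == 0].mean()
    return diff_outcome / diff_treat

rng = np.random.default_rng(0)
n = 200_000
z = rng.integers(0, 2, n)       # random eligibility (hypothetical)
ability = rng.normal(size=n)    # unobserved factor that ends up in u
# Only eligible firms can be treated, and better firms are more likely to be
t = ((z == 1) & (ability + rng.normal(size=n) > 0)).astype(int)
y = 1.0 + 2.0 * t + ability + rng.normal(size=n)   # assumed true effect: 2.0

ols = y[t == 1].mean() - y[t == 0].mean()   # OLS slope for a binary regressor
iv = wald_iv(y, t, z)
print(f"OLS: {ols:.3f}   IV: {iv:.3f}")
```

In this setup OLS overstates the effect because treated firms have higher `ability`, while the IV ratio stays close to the assumed value.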
4. IV Estimation in the Simple RM
- For y = b0 + b1x + u, and given our assumptions:
- Cov(z,y) = b1Cov(z,x) + Cov(z,u), and Cov(z,u) = 0, so
- b1 = Cov(z,y) / Cov(z,x)
- Therefore, given a random sample of (x,y,z), by the LLN a consistent estimator of b1 is the IV estimator
- b1_IV = sum[(zi - zbar)(yi - ybar)] / sum[(zi - zbar)(xi - xbar)]
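As an illustration of the covariance ratio, here is a short sketch on simulated data. The coefficients in the data-generating process (0.5, 0.8, and a true b1 of 2.0) are assumptions chosen only so that x is endogenous and z is a valid instrument.

```python
import numpy as np

def iv_slope(y, x, z):
    """Sample analogue of Cov(z,y) / Cov(z,x)."""
    zc = z - z.mean()
    return (zc @ (y - y.mean())) / (zc @ (x - x.mean()))

rng = np.random.default_rng(1)
n = 100_000
z = rng.normal(size=n)                       # instrument, independent of u
u = rng.normal(size=n)                       # structural error
x = 0.5 * z + 0.8 * u + rng.normal(size=n)   # x is endogenous: Cov(x,u) != 0
y = 1.0 + 2.0 * x + u                        # assumed true b1 = 2.0

b1_iv = iv_slope(y, x, z)
b1_ols = iv_slope(y, x, x)   # OLS is the special case z = x
print(f"OLS: {b1_ols:.3f}   IV: {b1_iv:.3f}")
```

Using x as its own instrument reproduces the OLS slope, which makes the comparison between the two estimators a one-line change.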
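Inference with IV Estimation
- The homoskedasticity assumption in this case is E(u^2|z) = s^2 = Var(u)
- The asymptotic variance of the IV estimator is s^2 / (n s_x^2 r_xz^2), where s_x^2 is the variance of x and r_xz is the correlation between x and z
- As in the OLS case, given the asymptotic variance, we can estimate the standard error: se(b1_IV) = sqrt[ s_hat^2 / (SST_x R^2_xz) ]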
Comparison of IV and OLS standard errors
- Standard error in the IV case differs from OLS only in the R-squared from regressing x on z
- Since R-squared < 1, IV standard errors are larger
- However, IV is consistent, while OLS is inconsistent, when Cov(x,u) ≠ 0
- The stronger the correlation between z and x, the smaller the IV standard errors
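The role of the R-squared from regressing x on z can be seen in a small simulation. This is a sketch under assumed values: the first-stage slopes 1.0 ("strong") and 0.2 ("weak") are hypothetical, and the standard error uses the textbook homoskedastic formula sqrt[s^2 / (SST_x R^2_xz)].

```python
import numpy as np

def iv_simple(y, x, z):
    """IV slope, its homoskedastic standard error, and R^2 of x on z."""
    n = len(y)
    zc, xc = z - z.mean(), x - x.mean()
    b1 = (zc @ (y - y.mean())) / (zc @ xc)
    b0 = y.mean() - b1 * x.mean()
    resid = y - b0 - b1 * x                 # IV residuals
    sigma2 = resid @ resid / (n - 2)
    sst_x = xc @ xc
    r2_xz = (zc @ xc) ** 2 / ((zc @ zc) * sst_x)
    se = np.sqrt(sigma2 / (sst_x * r2_xz))
    return b1, se, r2_xz

rng = np.random.default_rng(2)
n = 20_000
z, u, e = rng.normal(size=(3, n))
x_strong = 1.0 * z + 0.8 * u + e   # strong first stage (assumed)
x_weak = 0.2 * z + 0.8 * u + e     # weak first stage (assumed)

b_s, se_s, r2_s = iv_simple(1.0 + 2.0 * x_strong + u, x_strong, z)
b_w, se_w, r2_w = iv_simple(1.0 + 2.0 * x_weak + u, x_weak, z)
print(f"strong: b1={b_s:.3f} se={se_s:.4f} R2={r2_s:.3f}")
print(f"weak:   b1={b_w:.3f} se={se_w:.4f} R2={r2_w:.3f}")
```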
Poor Instruments
- We have a poor instrument when z and x are weakly correlated.
- The problem of weak instruments is not just that the variance of the IV estimator is much larger than the variance of OLS.
- A more serious problem is that the IV estimator can have a large asymptotic bias even if z and u are only moderately correlated.
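Poor Instruments (cont.)
- We can compare the asymptotic bias in OLS and IV:
- IV: plim b1_IV = b1 + [Corr(z,u)/Corr(z,x)] (s_u/s_x)
- OLS: plim b1_OLS = b1 + Corr(x,u) (s_u/s_x)
- Prefer IV if |Corr(z,u)/Corr(z,x)| < |Corr(x,u)|, that is, if |Corr(z,x)| > |Corr(z,u)/Corr(x,u)|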
Some Applications of IV: Estimating Treatment Effects in AMI
- McClellan, M., B. McNeil and J. Newhouse, JAMA, 1994.
- "Does More Intensive Treatment of Acute Myocardial Infarction Reduce Mortality?"
  - Data: Medicare claims data, elderly with heart attack (AMI), 1987-91
  - Treatment: Cardiac Catheterization (marker for aggressive care)
  - Outcome: Survival to 1 day, 30 days, 90 days, etc.
  - Instrument: Is the nearest hospital a catheterization hospital?
- Differential Distance
  - (distance to nearest cath) - (distance to nearest non-cath)
  - based on zip code of residence and zip code of hospital
Poor Instruments (cont.)
- Suppose that Corr(z,x) = 0.10 (which in fact is larger than in many applications).
- Then, the IV estimator has a smaller bias than the OLS estimator only if Corr(z,u) is at least 10 times smaller than Corr(x,u).
- Suppose that Corr(x,u) = 0.10 and that Corr(z,u) = 0.01.
- Then, the IV estimator has a smaller bias than the OLS estimator only if Corr(z,x) > 0.10.
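The arithmetic in these examples follows from the ratio of the two asymptotic biases, |Corr(z,u)/Corr(z,x)| / |Corr(x,u)|. A small sketch (the function names are hypothetical helpers, not part of the slides):

```python
def iv_bias_ratio(corr_zx, corr_zu, corr_xu):
    """|asymptotic bias of IV| divided by |asymptotic bias of OLS|.
    Values below 1 favour IV, values above 1 favour OLS."""
    return abs(corr_zu / corr_zx) / abs(corr_xu)

def min_relevance(corr_zu, corr_xu):
    """|Corr(z,x)| at which the IV and OLS biases are equal;
    IV has the smaller bias only above this threshold."""
    return abs(corr_zu / corr_xu)

# Slide example: Corr(x,u) = 0.10, Corr(z,u) = 0.01
threshold = min_relevance(0.01, 0.10)
better = iv_bias_ratio(0.20, 0.01, 0.10)   # instrument above the threshold
worse = iv_bias_ratio(0.05, 0.01, 0.10)    # instrument below the threshold
print(threshold, better, worse)
```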
Is Differential Distance a Good Instrument?
- 1. Correlated with treatment (Cath)? Yes.
  - 26.2% get Cath if nearest hospital is a Cath hospital
  - 19.5% get Cath if nearest hospital is not a Cath hospital
- 2. Uncorrelated with unobserved patient severity? Never sure! But unrelated to observable patient severity in claims
Major Findings of McClellan et al.
- 1. Least squares dramatically overstates the treatment effect, because Cath is associated with fewer risk factors.
  - 1-year mortality is 30 percentage points lower (17% vs. 47%) if Cath
  - OLS estimate is 24 percentage points, adjusting for observable risk factors
- 2. IV estimates suggest Cath is associated with a 5-10 percentage point reduction in mortality, nearly all in the 1st day.
Validating McClellan et al.
- Recent work replicates and validates earlier work using
  - 1. more comprehensive control variables
  - 2. alternative instruments
- McClellan and Noguchi, 1998 (Tables 1 and 2 below)
- Geppert, McClellan and Staiger, 2001 (Table 4 below)
- Data from the Cooperative Cardiovascular Project (CCP)
  - Chart data for approx. 180,000 AMI patients from 1994-95
  - Linked Medicare claims data
- Treatments and outcomes of AMI in the elderly as in earlier work
- Instruments
  - (1) Differential distance
  - (2) Variation in hospital Cath rate (more than 4,000 dummies)
Key Validation Questions
- Are severity measures unobserved in claims data uncorrelated with the instrument (differential distance)?
- Are OLS results closer to IV with more extensive controls?
- Are IV results robust to more extensive controls?
- Are IV results robust to alternative instruments?
(Slides 25-27: no transcript available)
Conclusions of Validation
- Measured individual covariates can be used to assess the bias of alternative methods for estimating treatment effects with observational data.
- Methods that attempt to adjust for observable differences are quite sensitive to the use of more detailed chart data, and yield biased estimates of treatment effects in commonly available datasets.
- IV methods for evaluating AMI treatment are not sensitive to the use of more detailed chart data, and appear to have minimal bias.
Growing number of applications of IV using a variety of instruments
- Geography as an instrument (distance, rivers, small area variation)
- Legal/political institutions as an instrument (laws, election dynamics)
- Administrative rules as an instrument (wage/staffing rules, reimbursement rules, eligibility rules)
- Naturally occurring randomization (draft, birth timing, lottery, roommate assignment, weather)
Example: Angrist & Krueger
- Data: US Census 5% PUMS
- Sample: 329,509 men born 1930-39
- Model: ln(earnings) = b Education + X g + e
- Instruments: (Quarter of Birth) x (Year of birth), (Quarter of Birth) x (State of birth)
- Number of instruments: 178
- First-stage F: 1.869 (p-value: 0.000)
Example: Angrist & Krueger (cont.)
Estimates of b (coefficient on Education):
  Estimator                       Estimate   95% Confidence interval
  OLS                             0.063      (0.062, 0.063)
  2SLS                            0.081      (0.060, 0.102)
  2SLS with random instruments    0.060      (0.031, 0.089)
  LIML                            0.098      (0.068, 0.128)
  Valid confidence interval (Anderson-Rubin): (-0.015, 0.240)
Example: Geppert, McClellan & Staiger
- Use between-hospital variation in treatment intensity (e.g. cath rate) as instrument to estimate treatment effects
- Equivalent to using more than 4,000 hospital dummies as instruments
- But instruments are weak: 1st-stage F-statistic is 10-25
  - 2SLS estimates have small bias (1/F) towards OLS
  - 2SLS SEs are too small (many instruments, modest F)
  - LIML SEs should be okay
- Using hierarchical structure, we develop an alternative GMM estimation procedure to correct estimates and SEs (asymptotically equivalent to LIML, but simpler)
- Cath effects similar to McClellan et al., but more precisely estimated
4. IV Estimation in the Multiple RM
- IV estimation can be extended to the multiple regression case.
- Call the model we are interested in estimating the structural model.
- Our problem is that one or more of the variables are endogenous.
- We need an instrument for each endogenous variable
IV Estimation in the Multiple RM (cont.)
- Write the structural model as
- y1 = b0 + b1y2 + b2z1 + u1
- where y2 is endogenous and z1 is exogenous.
- Let z2 be the instrument, so Cov(z2,u1) = 0 and
- y2 = p0 + p1z1 + p2z2 + v2
- where p2 ≠ 0
- This reduced form equation regresses the endogenous variable on all exogenous ones
Two Stage Least Squares (2SLS)
- It's possible to have multiple instruments
- Consider our original structural model, and let
- y2 = p0 + p1z1 + p2z2 + p3z3 + v2
- Here we're assuming that both z2 and z3 are valid instruments: they do not appear in the structural model and are uncorrelated with the structural error term, u1
2SLS: Best Instrument
- Could use either z2 or z3 as an instrument
- The best instrument is a linear combination of all of the exogenous variables, y2* = p0 + p1z1 + p2z2 + p3z3
- We can estimate y2* by regressing y2 on z1, z2 and z3; call this the first stage
- If we then substitute the fitted value y2-hat for y2 in the structural model, we get the same coefficient as IV
More on 2SLS
- While the coefficients are the same, the standard errors from doing 2SLS by hand are incorrect (the second-stage residuals are computed with y2-hat instead of y2), so let Stata do it for you.
- The method extends to multiple endogenous variables: we need to be sure that we have at least as many excluded exogenous variables (instruments) as there are endogenous variables in the structural equation
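The two stages, and the reason the by-hand standard errors are wrong, can be sketched with plain NumPy. This is an illustrative implementation under homoskedasticity, not the Stata routine; the simulated coefficients are assumptions, and `se_naive` is what an OLS second stage on the fitted values would report.

```python
import numpy as np

def tsls(y, X, Z):
    """2SLS. X: structural regressors (constant, endogenous, exogenous).
    Z: all exogenous variables (constant, included exogenous, instruments)."""
    # First stage: project every column of X on Z
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress y on the fitted values
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    n, k = X.shape
    xtx_inv_diag = np.diag(np.linalg.inv(X_hat.T @ X_hat))
    resid = y - X @ beta              # correct: uses the original X
    resid_naive = y - X_hat @ beta    # by-hand second stage uses X_hat
    se = np.sqrt(xtx_inv_diag * (resid @ resid) / (n - k))
    se_naive = np.sqrt(xtx_inv_diag * (resid_naive @ resid_naive) / (n - k))
    return beta, se, se_naive

rng = np.random.default_rng(3)
n = 50_000
z1, z2, z3, u1, e = rng.normal(size=(5, n))
v2 = 0.8 * u1 + e                                   # makes y2 endogenous
y2 = 1.0 + 0.5 * z1 + 0.7 * z2 + 0.7 * z3 + v2      # reduced form (assumed)
y1 = 1.0 + 2.0 * y2 + 1.0 * z1 + u1                 # structural, b1 = 2.0

const = np.ones(n)
X = np.column_stack([const, y2, z1])
Z = np.column_stack([const, z1, z2, z3])
beta, se, se_naive = tsls(y1, X, Z)
print("b1:", beta[1], "se:", se[1], "naive se:", se_naive[1])
```

The point estimates are identical either way; only the residuals used for the error variance differ, which is why the two sets of standard errors disagree.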
Institutions as the fundamental cause of long-run growth
- D. Acemoglu, S. Johnson, J. Robinson (2004), with some additions
- Theoretical Framework
- Economic Institutions and Income Differences
- Natural Experiments
- The Colonial Origins of Comparative Development: An Empirical Investigation (2001)
- Why do Institutions differ?
- Sources of inefficiencies
- Political implications
- Summary
Theoretical Framework
Economic Institutions and Income Differences
- Economic institutions (vs. geography and culture) as the fundamental cause of different patterns of economic growth
- Good economic institutions (to simplify and focus the discussion): institutions that provide security of property rights and relatively equal access to economic resources to a broad cross-section of society
Economic Institutions and Income Differences
- [Figure: Average Protection Against Risk of Expropriation 1985-95 and log GDP per capita 1995]
Economic Institutions and Income Differences
- Do secure property rights cause prosperity?
- Problems with making such an inference:
  - It could be reverse causation
  - It could be a problem of omitted variable bias
- What can we do?
  - Look for a natural experiment
  - Find a source of variation in economic institutions that should have no direct effect on economic outcomes
Natural Experiment: The Korean Experiment
- At the time of separation:
  - Approximately the same GDP per capita
  - Few geographic and cultural distinctions
- North: followed the model of Soviet socialism and the Chinese Revolution in abolishing private property. Economic decisions not mediated by the market
- South: system of private property, with markets and private incentives to develop the economy
- [Figure: GDP per capita in North and South Korea 1950-98]
Natural Experiment: The Korean Experiment
- The only possible explanation for the radically different economic experience: their very different INSTITUTIONS
- Need to look at a larger-scale natural experiment in institutional divergence!
Natural Experiment: The Colonial Experiment
- Europeans imposed different sets of institutions in different parts of the globe
- The Reversal of Fortune
  - The nation states that coincide today with the boundaries of the prosperous empires of 1500 (Incas, Aztecs) are among the poorer societies today!
  - The less developed civilisations of North America and Australia are now much richer than those in the land of the Incas and Aztecs
- [Figure: log GDP per capita in 1995 and log Population Density in 1500]
The Colonial Experiment
- Institutions hypothesis of the Reversal of Fortune
  - Densely settled, relatively developed places: worse institutions
  - Sparsely settled areas: better institutions
- Why?
  - Europeans introduced or maintained a resource-extraction economy in densely settled areas (where they could exploit the population)
  - They protected their own rights in sparsely settled areas, where Europeans were the majority
The Colonial Origins of Comparative Development: An Empirical Investigation
- An unfavourable disease environment reduced the attractiveness of European settlement
- Settler mortality as an exogenous source of variation in the subsequent path of institutional development, used to pin down the causal effect of economic institutions on prosperity
- Assumption: this variable has no impact on current income levels other than through economic institutions established during the colonial period
- Measure: mortality rate faced by Europeans (primarily soldiers, sailors and bishops)
The Colonial Origins of Comparative Development: An Empirical Investigation
- Hypothesis:
  - (potential) settler mortality -> settlements -> early institutions -> current institutions -> current performance
- Empirical Results
  - Institutions cause growth!
5. Addressing Errors-in-Variables with IV
- Remember the classical errors-in-variables problem, where we observe x1 instead of x1*
- Where x1 = x1* + e1, and e1 is uncorrelated with x1* and x2
- If there is a z such that Corr(z,u) = 0 and Corr(z,x1) ≠ 0, then IV will remove the measurement error bias (u here contains the measurement error, so z must also be uncorrelated with e1)
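A common choice of z in this setting is a second, independently mismeasured report of the same variable. The sketch below assumes exactly that, drops x2 for brevity, and uses unit variances so that OLS is attenuated by one half; all of these are illustrative assumptions.

```python
import numpy as np

def slope(y, x, z):
    """Cov(z,y)/Cov(z,x) in the sample; z = x gives the OLS slope."""
    zc = z - z.mean()
    return (zc @ (y - y.mean())) / (zc @ (x - x.mean()))

rng = np.random.default_rng(4)
n = 100_000
x_star = rng.normal(size=n)                   # true, unobserved regressor
x1 = x_star + rng.normal(size=n)              # observed: x1 = x1* + e1
z = x_star + rng.normal(size=n)               # second noisy measure (assumed)
y = 1.0 + 2.0 * x_star + rng.normal(size=n)   # assumed true slope: 2.0

b_ols = slope(y, x1, x1)   # attenuated towards zero
b_iv = slope(y, x1, z)     # z is correlated with x1 but not with e1
print(f"OLS: {b_ols:.3f}   IV: {b_iv:.3f}")
```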
6. Testing for Endogeneity
- Since OLS is preferred to IV if we do not have an endogeneity problem, we'd like to be able to test for endogeneity
- If we do not have endogeneity, both OLS and IV are consistent
- The idea of the Hausman test is to see if the estimates from OLS and IV are different.
Testing for Endogeneity (cont.)
- While it's a good idea to see if IV and OLS have different implications, it's easier to use a regression test for endogeneity
- If y2 is endogenous, then v2 (from the reduced form equation) and u1 (from the structural model) will be correlated
- The test is based on this observation
Testing for Endogeneity (cont.)
- Save the residuals from the first stage
- Include the residual in the structural equation (which of course has y2 in it)
- If the coefficient on the residual is statistically different from zero, reject the null of exogeneity
- If there are multiple endogenous variables, jointly test the residuals from each first stage
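The steps above can be sketched for one endogenous variable as follows. The OLS helper, the simulated model, and the use of conventional (non-robust) standard errors are assumptions for illustration.

```python
import numpy as np

def ols(y, X):
    """OLS coefficients, conventional standard errors, and residuals."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    n, k = X.shape
    cov = np.linalg.inv(X.T @ X) * (resid @ resid) / (n - k)
    return beta, np.sqrt(np.diag(cov)), resid

def endogeneity_t(y1, y2, z1, z2):
    """t statistic on the first-stage residual added to the structural eq."""
    const = np.ones(len(y1))
    Z = np.column_stack([const, z1, z2])
    _, _, v2_hat = ols(y2, Z)                      # save first-stage residuals
    X = np.column_stack([const, y2, z1, v2_hat])   # structural eq. + residual
    beta, se, _ = ols(y1, X)
    return beta[3] / se[3]

rng = np.random.default_rng(5)
n = 5_000
z1, z2, u1, e = rng.normal(size=(4, n))
y2_endog = 1.0 + 0.5 * z1 + 0.7 * z2 + 0.8 * u1 + e   # correlated with u1
y2_exog = 1.0 + 0.5 * z1 + 0.7 * z2 + e               # not correlated with u1

t_endog = endogeneity_t(1.0 + 2.0 * y2_endog + z1 + u1, y2_endog, z1, z2)
t_exog = endogeneity_t(1.0 + 2.0 * y2_exog + z1 + u1, y2_exog, z1, z2)
print(f"t (endogenous y2): {t_endog:.2f}   t (exogenous y2): {t_exog:.2f}")
```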
7. Testing Overidentifying Restrictions
- If there is just one instrument for our endogenous variable, we can't test whether the instrument is uncorrelated with the error
- We say the model is just identified
- If we have multiple instruments, it is possible to test the overidentifying restrictions to see if some of the instruments are correlated with the error
Testing Overidentifying Restrictions (cont.)
- Estimate the structural model using IV and obtain the residuals
- Regress the residuals on all the exogenous variables and obtain the R-squared to form nR^2
- Under the null that all instruments are uncorrelated with the error, LM = nR^2 ~ chi-square(q), where q is the number of extra instruments
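A minimal sketch of the nR^2 statistic with two instruments for one endogenous variable (so q = 1). The simulated design, in which z3 is made invalid by adding it to the structural error, is an assumption; 3.841 is the 5% critical value of the chi-square distribution with 1 degree of freedom.

```python
import numpy as np

def proj(A, B):
    """Fitted values from regressing B (vector or matrix) on A."""
    return A @ np.linalg.lstsq(A, B, rcond=None)[0]

def overid_stat(y, X, Z):
    """n * R^2 from regressing the 2SLS residuals on all exogenous variables."""
    X_hat = proj(Z, X)
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    u_hat = y - X @ beta                       # 2SLS residuals
    fitted = proj(Z, u_hat)
    ssr = ((u_hat - fitted) ** 2).sum()
    sst = ((u_hat - u_hat.mean()) ** 2).sum()
    return len(y) * (1.0 - ssr / sst)

rng = np.random.default_rng(6)
n = 5_000
z1, z2, z3, eps, e = rng.normal(size=(5, n))
const = np.ones(n)
y2 = 1.0 + 0.5 * z1 + 0.7 * z2 + 0.7 * z3 + 0.8 * eps + e
X = np.column_stack([const, y2, z1])
Z = np.column_stack([const, z1, z2, z3])

y_valid = 1.0 + 2.0 * y2 + z1 + eps   # both instruments valid
y_invalid = y_valid + 0.5 * z3        # z3 now enters the error term

CHI2_1_5PCT = 3.841
lm_valid = overid_stat(y_valid, X, Z)
lm_invalid = overid_stat(y_invalid, X, Z)
print(f"valid: {lm_valid:.2f}   invalid: {lm_invalid:.2f}")
```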
8. Testing for Heteroskedasticity
- When using 2SLS, we need a slight adjustment to the Breusch-Pagan test
- Get the residuals from the IV estimation
- Regress these residuals squared on all of the exogenous variables in the model (including the instruments)
- Test for the joint significance
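One way to carry out the joint test is the LM form, n times the R-squared of the auxiliary regression, which is used here in place of the F statistic. The sketch assumes three exogenous variables (so the 5% chi-square critical value with 3 degrees of freedom, 7.815, applies) and a hypothetical error variance that rises with z1.

```python
import numpy as np

def proj(A, B):
    """Fitted values from regressing B (vector or matrix) on A."""
    return A @ np.linalg.lstsq(A, B, rcond=None)[0]

def bp_lm_2sls(y, X, Z):
    """LM statistic: n * R^2 from regressing squared 2SLS residuals on Z."""
    X_hat = proj(Z, X)
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    u2 = (y - X @ beta) ** 2                  # squared 2SLS residuals
    fitted = proj(Z, u2)
    ssr = ((u2 - fitted) ** 2).sum()
    sst = ((u2 - u2.mean()) ** 2).sum()
    return len(y) * (1.0 - ssr / sst)

rng = np.random.default_rng(7)
n = 5_000
z1, z2, z3, eps, e = rng.normal(size=(5, n))
const = np.ones(n)
y2 = 1.0 + 0.5 * z1 + 0.7 * z2 + 0.7 * z3 + 0.8 * eps + e
X = np.column_stack([const, y2, z1])
Z = np.column_stack([const, z1, z2, z3])

u_homo = eps                      # constant variance
u_het = eps * np.exp(0.5 * z1)    # variance depends on z1 (assumed form)

CHI2_3_5PCT = 7.815
lm_homo = bp_lm_2sls(1.0 + 2.0 * y2 + z1 + u_homo, X, Z)
lm_het = bp_lm_2sls(1.0 + 2.0 * y2 + z1 + u_het, X, Z)
print(f"homoskedastic: {lm_homo:.2f}   heteroskedastic: {lm_het:.2f}")
```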
9. Testing for Serial Correlation
- When using 2SLS, we need a slight adjustment to the test for serial correlation
- Get the residuals from the IV estimation
- Re-estimate the structural model by 2SLS, including the lagged residuals, and using the same instruments as originally; test the coefficient on the lagged residual
- If serial correlation is found, we can do 2SLS on a quasi-differenced model, using quasi-differenced instruments
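The test can be sketched for AR(1) errors as below. The lagged residual is treated as exogenous (it acts as its own instrument), the standard errors are the conventional homoskedastic ones, and the AR coefficient of 0.5 in the simulation is an assumption.

```python
import numpy as np

def tsls(y, X, Z):
    """2SLS coefficients, homoskedastic standard errors, and residuals."""
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    beta = np.linalg.lstsq(X_hat, y, rcond=None)[0]
    resid = y - X @ beta
    n, k = X.shape
    cov = np.linalg.inv(X_hat.T @ X_hat) * (resid @ resid) / (n - k)
    return beta, np.sqrt(np.diag(cov)), resid

def serial_corr_t(y1, y2, z1, z2):
    """t statistic on the lagged 2SLS residual in the re-estimated model."""
    const = np.ones(len(y1))
    X = np.column_stack([const, y2, z1])
    Z = np.column_stack([const, z1, z2])
    _, _, u_hat = tsls(y1, X, Z)
    lag = u_hat[:-1]                          # u_hat at t-1
    Xa = np.column_stack([X[1:], lag])        # structural eq. + lagged resid
    Za = np.column_stack([Z[1:], lag])        # same instruments + lagged resid
    beta, se, _ = tsls(y1[1:], Xa, Za)
    return beta[-1] / se[-1]

rng = np.random.default_rng(8)
n = 2_000
z1, z2, eps, e = rng.normal(size=(4, n))
u_ar = np.zeros(n)
for t in range(1, n):
    u_ar[t] = 0.5 * u_ar[t - 1] + eps[t]      # AR(1) errors (assumed)

def simulate(u):
    y2 = 1.0 + 0.5 * z1 + 0.7 * z2 + 0.8 * u + e
    return 1.0 + 2.0 * y2 + z1 + u, y2

t_ar = serial_corr_t(*simulate(u_ar), z1, z2)
t_iid = serial_corr_t(*simulate(eps), z1, z2)
print(f"t (AR(1) errors): {t_ar:.2f}   t (iid errors): {t_iid:.2f}")
```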