Title: Econometric Analysis of Panel Data
1. Econometric Analysis of Panel Data
- William Greene
- Department of Economics
- Stern School of Business
2. Econometric Analysis of Panel Data
- 8. Instrumental Variables Estimation
3. Structure and Regression
4. Agenda
- Single equation instrumental variable estimation
- Exogeneity
- Instrumental Variable (IV) Estimation
- Two Stage Least Squares (2SLS)
- Generalized Method of Moments (GMM)
- Panel data
- Fixed effects
- Hausman and Taylor's formulation
- Application
- Arellano/Bond/Bover framework
5. Exogeneity
6. The Effect of Education on LWAGE
7. What Influences LWAGE?
8. An Exogenous Influence
9. An Experimental Treatment Effect
10. The Measurement Error Problem
How general is this result?
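11. The Endogeneity Problem
- Regression: $y = \beta x + \varepsilon$. Changes in $x$ are associated with changes in $y$, but not $\varepsilon$.
- $dy/dx$ is measured by $\mathrm{Cov}(x,y)$; $dx/dx$ is measured by $\mathrm{Var}(x)$. $dy/dx = \beta\,dx/dx + d\varepsilon/dx = \beta$, so $\beta = \mathrm{Cov}(x,y)/\mathrm{Var}(x)$.
- If $x$ is correlated with $\varepsilon$, then changes in $x$ are associated with changes in $\varepsilon$. There are now two sources of change in $y$: the direct change in $x$, and the change in $\varepsilon$ associated with the change in $x$.
- $dy/dx$ measured in the data is not equal to $\beta\,dx/dx = \beta$. $dy/dx = \beta\,dx/dx + d\varepsilon/dx = \beta + d\varepsilon/dx$.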
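The argument on this slide can be checked numerically. The following is a minimal sketch on simulated data, not part of the original slides: the data-generating process, the seed, and all variable names are assumptions chosen for illustration. It shows that the slope measured by Cov(x,y)/Var(x) equals β plus the contamination term Cov(x,ε)/Var(x).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta = 1.0  # true slope (assumed for the illustration)

# A shared unobservable u makes x and eps correlated (hypothetical design).
u = rng.normal(size=n)
x = rng.normal(size=n) + u
eps = rng.normal(size=n) + 0.5 * u
y = beta * x + eps

var_x = np.var(x, ddof=1)
slope = np.cov(x, y)[0, 1] / var_x    # what least squares measures
bias = np.cov(x, eps)[0, 1] / var_x   # the d(eps)/dx term in the data

print(f"measured slope = {slope:.3f}, beta + bias = {beta + bias:.3f}")
```

In this design Var(x) = 2 and Cov(x,ε) = 0.5, so the measured slope settles near 1.25 rather than the true value of 1.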
12. Instrumental Variables
- Instrumental variable $z$: associated with changes in $x$, not with $\varepsilon$.
- $dy/dz = \beta\,dx/dz + d\varepsilon/dz$. The second term is 0.
- $\beta = \mathrm{Cov}(y,z)/\mathrm{Cov}(x,z)$. This is the IV estimator.
- Example: Corporate earnings in year t. $\text{Earnings}(t) = \beta\,\text{R\&D}(t) + \varepsilon(t)$. R&D(t) responds directly to Earnings(t), and thus to $\varepsilon(t)$. A likely valid instrumental variable would be R&D(t-1), which probably does not respond to current-year shocks to earnings.
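To see the IV estimator at work, here is a small simulation sketch. It is not from the slides: the coefficients, the seed, and the variable names are illustrative assumptions. The instrument z moves x but is unrelated to ε, so Cov(y,z)/Cov(x,z) should land near β while the least squares slope does not.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
beta = 1.0  # true slope (assumed)

z = rng.normal(size=n)                  # instrument: moves x, unrelated to eps
u = rng.normal(size=n)                  # unobservable shared by x and eps
x = 0.8 * z + u + rng.normal(size=n)
eps = 0.5 * u + rng.normal(size=n)
y = beta * x + eps

b_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)   # Cov(x,y)/Var(x)
b_iv = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]   # Cov(y,z)/Cov(x,z)

print(f"OLS = {b_ols:.3f}, IV = {b_iv:.3f}, true beta = {beta}")
```

The IV ratio only works here because Cov(x,z) is far from zero by construction; with a weakly related instrument the denominator is small and the ratio becomes erratic.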
13. The First IV Study (Snow, J., On the Mode of Communication of Cholera, 1855)
- London cholera epidemic, ca. 1853-4
- Cholera = f(Water Purity, u) + ε.
- Effect of water purity on cholera?
- Purity = f(cholera-prone environment (poor, garbage in streets, rodents, etc.)). Regression does not work.
- Two London water companies: Lambeth and Southwark
- [Figure: main sewage discharge]
Paul Grootendorst, "A Review of Instrumental Variables Estimation of Treatment Effects," http://individual.utoronto.ca/grootendorst/pdf/IV_Paper_Sept6_2007.pdf
14. IV Estimation
- Cholera = f(Purity, u) + ε
- Z = water company
- Cov(Cholera, Z) = δ Cov(Purity, Z)
- Z is randomly mixed in the population (two full sets of pipes) and uncorrelated with behavioral unobservables, u
- Cholera = α + δ Purity + u + ε
- Purity = Mean + random variation + λu
- Cov(Cholera, Z) = δ Cov(Purity, Z)
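Because Z here is a two-valued indicator (which company supplies the household), the covariance ratio Cov(Cholera,Z)/Cov(Purity,Z) reduces to a ratio of differences in group means. The sketch below illustrates that identity on simulated data. Everything in it is an assumption for illustration: the coding of Z, the continuous "cholera" outcome, the coefficient values, and the seed are hypothetical and are not Snow's data.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
delta = -2.0   # hypothetical effect of purity on the cholera outcome

z = rng.integers(0, 2, size=n)    # 1 = cleaner-water company (assumed coding)
u = rng.normal(size=n)            # unobserved environment and behavior
purity = 1.0 + 1.5 * z - 0.7 * u + rng.normal(size=n)
cholera = 5.0 + delta * purity + u + rng.normal(size=n)

# IV estimate written as a covariance ratio ...
cov_ratio = np.cov(cholera, z)[0, 1] / np.cov(purity, z)[0, 1]
# ... and as a ratio of differences in means between the two companies.
wald = (cholera[z == 1].mean() - cholera[z == 0].mean()) / (
    purity[z == 1].mean() - purity[z == 0].mean()
)

print(f"covariance ratio = {cov_ratio:.4f}, ratio of mean differences = {wald:.4f}")
```

The two numbers agree to rounding error in any sample, and both are close to the assumed δ because Z is assigned independently of u in this design.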
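15. Instrumental Variable Estimation
- One "problem" variable: the "last" one
- $y_{it} = \beta_1 x_{1it} + \beta_2 x_{2it} + \dots + \beta_K x_{Kit} + \varepsilon_{it}$
- $E[\varepsilon_{it} \mid x_{Kit}] \neq 0$. ($= 0$ for all others)
- There exists a variable $z_{it}$ such that
- $E[x_{Kit} \mid x_{1it}, x_{2it}, \dots, x_{K-1,it}, z_{it}] = g(x_{1it}, x_{2it}, \dots, x_{K-1,it}, z_{it})$. In the presence of the other variables, $z_{it}$ "explains" $x_{Kit}$.
- $E[\varepsilon_{it} \mid x_{1it}, x_{2it}, \dots, x_{K-1,it}, z_{it}] = 0$. In the presence of the other variables, $z_{it}$ and $\varepsilon_{it}$ are uncorrelated.
- A projection interpretation: In the projection $x_{Kit} = \theta_1 x_{1it} + \theta_2 x_{2it} + \dots + \theta_{K-1} x_{K-1,it} + \theta_K z_{it}$, $\theta_K \neq 0$.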
16. Least Squares
17. The IV Estimator
18. A Moment Based Estimator
19. Consistency and Asymptotic Normality of the IV Estimator
20. Least Squares Revisited
21. Comparing OLS and IV
22. Cornwell and Rupert Data
Cornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 Years
Variables in the file are
EXP = work experience, EXPSQ = EXP^2
WKS = weeks worked
OCC = occupation, 1 if blue collar
IND = 1 if manufacturing industry
SOUTH = 1 if resides in south
SMSA = 1 if resides in a city (SMSA)
MS = 1 if married
FEM = 1 if female
UNION = 1 if wage set by union contract
ED = years of education
LWAGE = log of wage = dependent variable in regressions
These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988, pp. 149-155. See Baltagi, page 122 for further analysis. The data were downloaded from the website for Baltagi's text.
23. Wage Equation with Endogenous Weeks
$\log \text{Wage} = \beta_1 + \beta_2\,\text{Exp} + \beta_3\,\text{ExpSq} + \beta_4\,\text{OCC} + \beta_5\,\text{South} + \beta_6\,\text{SMSA} + \beta_7\,\text{WKS} + \varepsilon$
Weeks worked is believed to be endogenous in this equation. We use the marital status dummy variable MS as an exogenous variable. Wooldridge Condition (5.3), $\mathrm{Cov}[\text{MS}, \varepsilon] = 0$, is assumed.
Auxiliary regression: For MS to be a valid instrumental variable, in the regression of WKS on [1, EXP, EXPSQ, OCC, South, SMSA, MS], MS must significantly "explain" WKS.
A projection interpretation: In the projection $x_{itK} = \theta_1 x_{1it} + \theta_2 x_{2it} + \dots + \theta_{K-1} x_{K-1,it} + \theta_K z_{it}$, $\theta_K \neq 0$. (One normally doesn't check the variables in this fashion.)
24. Auxiliary Projection
----------------------------------------------------------------------------
Ordinary least squares regression
LHS=WKS    Mean = 46.81152
----------------------------------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.   P[|Z|>z]  Mean of X
----------------------------------------------------------------------------
Constant  45.4842872     .36908158       123.236    .0000
EXP       .05354484      .03139904       1.705      .0881     19.8537815
EXPSQ     -.00169664     .00069138       -2.454     .0141     514.405042
OCC       .01294854      .16266435       .080       .9366     .51116447
SOUTH     .38537223      .17645815       2.184      .0290     .29027611
SMSA      .36777247      .17284574       2.128      .0334     .65378151
MS        .95530115      .20846241       4.583      .0000     .81440576
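The relevance requirement rests on the MS row of the auxiliary regression output above. As a quick arithmetic check, the sketch below recomputes the t ratio from the reported coefficient and standard error. With a single instrument and conventional standard errors, the first-stage F statistic for the instrument is the square of that t ratio. A widely used rule of thumb, which does not come from these slides, treats a first-stage F below about 10 as a warning sign of a weak instrument.

```python
coef_ms = 0.95530115   # coefficient on MS in the regression of WKS (reported above)
se_ms = 0.20846241     # its reported standard error

t_ms = coef_ms / se_ms
f_ms = t_ms ** 2       # single instrument: first-stage F equals t squared

print(f"t = {t_ms:.3f}, first-stage F = {f_ms:.1f}")
```

This reproduces the reported t ratio of 4.583 and gives an F of roughly 21.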
25. Application: IV for WKS in Rupert
----------------------------------------------------------------
Ordinary least squares regression
Residuals   Sum of squares     = 678.5643
Fit         R-squared          = .2349075
            Adjusted R-squared = .2338035
----------------------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.   P[|Z|>z]
----------------------------------------------------------------
Constant  6.07199231     .06252087       97.119     .0000
EXP       .04177020      .00247262       16.893     .0000
EXPSQ     -.00073626     .546183D-04     -13.480    .0000
OCC       -.27443035     .01285266       -21.352    .0000
SOUTH     -.14260124     .01394215       -10.228    .0000
SMSA      .13383636      .01358872       9.849      .0000
WKS       .00529710      .00122315       4.331      .0000
26. Application: IV for WKS in Rupert
----------------------------------------------------------------
LHS=LWAGE   Mean               = 6.676346
            Standard deviation = .4615122
Residuals   Sum of squares     = 13853.55
            Standard error of e = 1.825317
Fit         R-squared          = -14.64641
            Adjusted R-squared = -14.66899
Not using OLS or no constant. Rsqd & F may be < 0.
----------------------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.   P[|Z|>z]
----------------------------------------------------------------
Constant  -9.97734299    3.59921463      -2.772     .0056
EXP       .01833440      .01233989       1.486      .1373
EXPSQ     -.799491D-04   .00028711       -.278      .7807
OCC       -.28885529     .05816301       -4.966     .0000
SOUTH     -.26279891     .06848831       -3.837     .0001
SMSA      .03616514      .06516665       .555       .5789
WKS       .35314170      .07796292       4.530      .0000
OLS-------------------------------------------------------------
WKS       .00529710      .00122315       4.331      .0000
27. Generalizing the IV Estimator - 1
28. Generalizing the IV Estimator - 2
29. Generalizing the IV Estimator
30. The Best Set of Instruments
31. Two Stage Least Squares
32. 2SLS Estimator
33. 2SLS Algebra
34. A General Result for IV
- We defined a class of IV estimators by the set of variables
- The minimum variance (most efficient) member in this class is 2SLS (Brundy and Jorgenson (1971)) (rediscovered JW, 2000, p. 96-97)
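As a concrete reading of the 2SLS result, here is a minimal sketch of the estimator in matrix form. It is an illustration under stated assumptions rather than the slides' own implementation: the function name `two_stage_least_squares`, the simulated data, and the choice to divide the residual sum of squares by n (some software divides by n - K) are all assumptions.

```python
import numpy as np

def two_stage_least_squares(y, X, Z):
    """2SLS: project X on Z, then use the fitted values as instruments."""
    # First stage: X_hat = Z (Z'Z)^{-1} Z'X, done for all columns at once.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: b = (X_hat'X)^{-1} X_hat'y.
    b = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)
    # Residuals use the ACTUAL X, not the predicted X.
    e = y - X @ b
    s2 = e @ e / len(y)                      # assumption: divide by n
    V = s2 * np.linalg.inv(X_hat.T @ X_hat)  # conventional 2SLS covariance
    return b, np.sqrt(np.diag(V))

# Simulated example with one endogenous regressor and two instruments.
rng = np.random.default_rng(3)
n = 50_000
z1, z2, u = rng.normal(size=(3, n))
x_endog = 0.6 * z1 + 0.4 * z2 + u + rng.normal(size=n)
y = 1.0 + 0.5 * x_endog + 0.8 * u + rng.normal(size=n)

X = np.column_stack([np.ones(n), x_endog])
Z = np.column_stack([np.ones(n), z1, z2])
b_2sls, se_2sls = two_stage_least_squares(y, X, Z)
print(b_2sls, se_2sls)
```

Running a literal second-stage OLS of y on the fitted values gives the same coefficients but the wrong standard errors, because its residuals are computed from the predicted X; that is why the function computes residuals from the actual X.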
35. Inference with IV Estimators
36. Robust estimation of VC
Predicted X
Actual X
37. 2SLS vs. Robust Standard Errors
--------------------------------------------------
Robust Standard Errors
--------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.
--------------------------------------------------
B_1       45.4842872     4.02597121      11.298
B_2       .05354484      .01264923       4.233
B_3       -.00169664     .00029006       -5.849
B_4       .01294854      .05757179       .225
B_5       .38537223      .07065602       5.454
B_6       .36777247      .06472185       5.682
B_7       .95530115      .08681261       11.000
--------------------------------------------------
2SLS Standard Errors
--------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.
--------------------------------------------------
B_1       45.4842872     .36908158       123.236
B_2       .05354484      .03139904       1.705
B_3       -.00169664     .00069138       -2.454
B_4       .01294854      .16266435       .080
B_5       .38537223      .17645815       2.184
B_6       .36777247      .17284574       2.128
B_7       .95530115      .20846241       4.583
38. Weak Instruments
39. Weak Instruments (cont.)
40. Testing for Endogeneity(?)
41. Regression Based Endogeneity Test
42. Testing Endogeneity of WKS
(1) Regress WKS on 1,EXP,EXPSQ,OCC,SOUTH,SMSA,MS. U=residual, WKSHAT=prediction
(2) Regress LWAGE on 1,EXP,EXPSQ,OCC,SOUTH,SMSA,WKS, U or WKSHAT
----------------------------------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.   P[|Z|>z]  Mean of X
----------------------------------------------------------------------------
Constant  -9.97734299    .75652186       -13.188    .0000
EXP       .01833440      .00259373       7.069      .0000     19.8537815
EXPSQ     -.799491D-04   .603484D-04     -1.325     .1852     514.405042
OCC       -.28885529     .01222533       -23.628    .0000     .51116447
SOUTH     -.26279891     .01439561       -18.255    .0000     .29027611
SMSA      .03616514      .01369743       2.640      .0083     .65378151
WKS       .35314170      .01638709       21.550     .0000     46.8115246
U         -.34960141     .01642842       -21.280    .0000     -.341879D-14
----------------------------------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.   P[|Z|>z]  Mean of X
----------------------------------------------------------------------------
Constant  -9.97734299    .75652186       -13.188    .0000
EXP       .01833440      .00259373       7.069      .0000     19.8537815
EXPSQ     -.799491D-04   .603484D-04     -1.325     .1852     514.405042
OCC       -.28885529     .01222533       -23.628    .0000     .51116447
SOUTH     -.26279891     .01439561       -18.255    .0000     .29027611
SMSA      .03616514      .01369743       2.640      .0083     .65378151
WKS       .00354028      .00116459       3.040      .0024     46.8115246
WKSHAT    .34960141      .01642842       21.280     .0000     46.8115246
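The two versions of step (2) are linked by simple algebra, since WKS = WKSHAT + U. In the output above, the coefficient on WKS when U is added (.35314170) matches the IV estimate, the coefficients on U and WKSHAT are equal with opposite signs, and .00354028 + .34960141 is again the IV estimate up to rounding. The sketch below reproduces those identities on simulated data; the data-generating process, the seed, and the helper `ols` are assumptions for illustration only.

```python
import numpy as np

def ols(y, X):
    """Least squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(4)
n = 20_000
w, z, u = rng.normal(size=(3, n))
x = 0.5 * w + 0.7 * z + u + rng.normal(size=n)        # endogenous regressor
y = 1.0 + 0.3 * w + 0.5 * x + 0.8 * u + rng.normal(size=n)
one = np.ones(n)

# Step (1): regress x on the exogenous variable and the instrument.
Z = np.column_stack([one, w, z])
x_hat = Z @ ols(x, Z)      # prediction (the WKSHAT analogue)
u_hat = x - x_hat          # residual (the U analogue)

# Step (2): add either the residual or the prediction to the equation.
b_resid = ols(y, np.column_stack([one, w, x, u_hat]))
b_pred = ols(y, np.column_stack([one, w, x, x_hat]))

# Direct IV/2SLS estimate for comparison.
X = np.column_stack([one, w, x])
X_hat = np.column_stack([one, w, x_hat])
b_iv = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

print(b_resid[2], b_pred[2] + b_pred[3], b_iv[2])
```

The test itself is the t ratio on the added variable: a coefficient on the residual that is clearly different from zero is evidence against exogeneity of x.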
43. General Test for Endogeneity
44. GMM Estimation Orthogonality Conditions
45. GMM Estimation - 1
46. GMM Estimation - 2
47. IV Estimation
48. An Optimal Weighting Matrix
49. The GMM Estimator
50. GMM Estimation
51. Application - 2SLS
----------------------------------------------------------------------------
Two stage least squares regression
LHS=LWAGE   Mean                = 6.676346
Model size  Parameters          = 7
            Degrees of freedom  = 4158
Residuals   Sum of squares      = 638.5818
            Standard error of e = .4054646
Fit         R-squared           = .2279527
Not using OLS or no constant. Rsqd & F may be < 0.
Instruments for WKS are MS,UNION,ED
----------------------------------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.   P[|Z|>z]  Mean of X
----------------------------------------------------------------------------
Constant  6.41895193     .29411045       21.825     .0000
EXP       .04227684      .00251697       16.797     .0000     19.8537815
EXPSQ     -.00075045     .560650D-04     -13.385    .0000     514.405042
OCC       -.27411851     .01290268       -21.245    .0000     .51116447
SOUTH     -.14000277     .01415810       -9.889     .0000     .29027611
SMSA      .13594785      .01375050       9.887      .0000     .65378151
WKS       -.00222272     .00634746       -.350      .7262     46.8115246
52. Application - GMM
NAMELIST ; X = one,exp,expsq,occ,south,smsa,wks $
NAMELIST ; Z = one,exp,expsq,occ,south,smsa,ms,union,ed $
2SLS ; LHS = lwage ; RHS = X ; INST = Z $
NLSQ ; FCN = lwage-b1'x
     ; LABELS = b1,b2,b3,b4,b5,b6,b7
     ; START = b ; INST = Z ; PDS = 0 $
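The commands above first compute 2SLS and then start the GMM estimator from those estimates. A rough Python analogue of that two-step sequence for a linear model is sketched below. It is an assumption-laden illustration, not a reproduction of the LIMDEP routine: the function name `linear_gmm_two_step`, the White-type weighting matrix built from first-step residuals, and the simulated data are all hypothetical choices.

```python
import numpy as np

def linear_gmm_two_step(y, X, Z):
    """Two-step linear GMM with a heteroscedasticity-robust weighting matrix."""
    # Step 1: 2SLS, i.e. GMM with weighting matrix (Z'Z)^{-1}.
    A = X.T @ Z @ np.linalg.inv(Z.T @ Z)
    b1 = np.linalg.solve(A @ Z.T @ X, A @ Z.T @ y)
    e1 = y - X @ b1
    # Step 2: W = sum_i e_i^2 z_i z_i' from the first-step residuals.
    W = (Z * e1[:, None] ** 2).T @ Z
    B = X.T @ Z @ np.linalg.inv(W)
    b2 = np.linalg.solve(B @ Z.T @ X, B @ Z.T @ y)
    # Estimated covariance of b2: [X'Z W^{-1} Z'X]^{-1}.
    se = np.sqrt(np.diag(np.linalg.inv(B @ Z.T @ X)))
    # Criterion e(b)'Z W^{-1} Z'e(b) evaluated at the second-step estimates.
    m = Z.T @ (y - X @ b2)
    q = m @ np.linalg.solve(W, m)
    return b1, b2, se, q

# Simulated heteroscedastic data: one endogenous regressor, two instruments.
rng = np.random.default_rng(5)
n = 20_000
w, z1, z2, u = rng.normal(size=(4, n))
x = 0.5 * w + 0.6 * z1 + 0.4 * z2 + u + rng.normal(size=n)
eps = (0.8 * u + rng.normal(size=n)) * np.sqrt(0.5 + w ** 2)
y = 1.0 + 0.3 * w + 0.5 * x + eps

X = np.column_stack([np.ones(n), w, x])
Z = np.column_stack([np.ones(n), w, z1, z2])
b_2sls, b_gmm, se_gmm, q = linear_gmm_two_step(y, X, Z)
print(b_2sls, b_gmm, se_gmm, q)
```

With valid instruments, as in this simulated design, the criterion q is small relative to a chi-squared distribution with (number of instruments minus number of parameters) degrees of freedom.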
53. GMM Estimates
---------------------------------------------
Instrumental Variables (NL2SLS)
GMM Estimator - Lags = 0 Periods
Value of the GMM criterion:
e(b)tZ inv(ZtWZ) Zte(b) = 537.3916
Sum of functions f(x,b) = -221.9274
Estimation problem for 7 parameters.
Sample size is 4165 observations.
---------------------------------------------
----------------------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.   P[|Z|>z]
----------------------------------------------------------------
B1        6.98356848     .27787583       25.132     .0000
B2        .04080996      .00259196       15.745     .0000
B3        -.00075277     .588278D-04     -12.796    .0000
B4        -.24671927     .01276555       -19.327    .0000
B5        -.14393303     .01461203       -9.850     .0000
B6        .14449428      .01358328       10.638     .0000
B7        -.01346160     .00601608       -2.238     .0252
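The reported criterion value can be put on a chi-squared scale. The sketch below assumes, and this is an assumption rather than something the output states, that the criterion is the usual overidentification statistic, asymptotically chi-squared with degrees of freedom equal to the number of instruments minus the number of parameters. The instrument list has 9 variables (one, exp, expsq, occ, south, smsa, ms, union, ed) and the model has 7 parameters, so there are 2 degrees of freedom, for which the chi-squared upper tail probability has the closed form exp(-x/2).

```python
import math

n_instruments = 9     # one,exp,expsq,occ,south,smsa,ms,union,ed
n_parameters = 7      # b1,...,b7
df = n_instruments - n_parameters

criterion = 537.3916  # value of the GMM criterion reported above

# With 2 degrees of freedom, P(chi-squared > x) = exp(-x / 2).
critical_95 = -2.0 * math.log(0.05)
p_value = math.exp(-criterion / 2.0)

print(f"df = {df}, 5% critical value = {critical_95:.3f}, p-value = {p_value:.3g}")
```

Under that reading, 537.39 is far above the 5% critical value of about 5.99, which would cast doubt on the validity of the instrument set or the specification.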
54. GMM vs. 2SLS
----------------------------------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.   P[|Z|>z]  Mean of X
----------------------------------------------------------------------------
TWO STAGE LEAST SQUARES
Constant  6.41895193     .29411045       21.825     .0000
EXP       .04227684      .00251697       16.797     .0000     19.8537815
EXPSQ     -.00075045     .560650D-04     -13.385    .0000     514.405042
OCC       -.27411851     .01290268       -21.245    .0000     .51116447
SOUTH     -.14000277     .01415810       -9.889     .0000     .29027611
SMSA      .13594785      .01375050       9.887      .0000     .65378151
WKS       -.00222272     .00634746       -.350      .7262     46.8115246
GENERALIZED METHOD OF MOMENTS WITH HETEROSCEDASTICITY
B1        6.98356848     .27787583       25.132     .0000
B2        .04080996      .00259196       15.745     .0000
B3        -.00075277     .588278D-04     -12.796    .0000
B4        -.24671927     .01276555       -19.327    .0000
B5        -.14393303     .01461203       -9.850     .0000
B6        .14449428      .01358328       10.638     .0000
B7        -.01346160     .00601608       -2.238     .0252
55. Testing the Overidentifying Restrictions
56. GMM Estimation
57. Inference About the Parameters
58. Specification Test Based on the Criterion
59. Extending the Form of the GMM Estimator to Nonlinear Models
60. A Nonlinear Conditional Mean
61. Nonlinear Regression/GMM
NAMELIST ; X = one,exp,expsq,occ,south,smsa,wks $
NAMELIST ; Z = one,exp,expsq,occ,south,smsa,ms,union,ed $
? Get initial values to use for optimal weighting matrix
NLSQ ; LHS = lwage ; FCN = exp(b1'x)
     ; LABELS = b1,b2,b3,b4,b5,b6,b7
     ; INST = Z ; START = 7_0 $
? GMM using previous estimates to compute weighting matrix (GMM)
NLSQ ; FCN = lwage-exp(b1'x)
     ; LABELS = b1,b2,b3,b4,b5,b6,b7
     ; START = b ; INST = Z ; PDS = 0 $
62. Nonlinear Wage Equation Estimates
----------------------------------------------------------------
Instrumental Variables (NL2SLS)
Residuals   Sum of squares = 690.1236
----------------------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.   P[|Z|>z]
----------------------------------------------------------------
B1        1.87846422     .04366585       43.019     .0000
B2        .00640621      .00038344       16.707     .0000
B3        -.00011391     .851508D-05     -13.378    .0000
B4        -.04106490     .00193784       -21.191    .0000
B5        -.02095476     .00215243       -9.735     .0000
B6        .02065632      .00208495       9.907      .0000
B7        -.00075796     .00094230       -.804      .4212
---------------------------------------------
Instrumental Variables (NL2SLS)
Value of the GMM criterion:
e(b)tZ inv(ZtWZ) Zte(b) = 530.2036
---------------------------------------------
----------------------------------------------------------------
Variable  Coefficient    Standard Error  b/St.Er.   P[|Z|>z]
----------------------------------------------------------------
B1        1.95181383     .04066956       47.992     .0000
B2        .00612089      .00039476       15.506     .0000
B3        -.00011276     .894520D-05     -12.606    .0000
B4        -.03671530     .00190644       -19.259    .0000
B5        -.02133895     .00221339       -9.641     .0000
B6        .02166118      .00205502       10.541     .0000
B7        -.00218988     .00088200       -2.483     .0130
Value of the GMM criterion for the linear model (Basis for a test?):
e(b)tZ inv(ZtWZ) Zte(b) = 537.3916
63. IV for Panel Data