1
Sašo Polanec, Faculty of Economics, University of Ljubljana
saso.polanec_at_ef.uni-lj.si
  • Econometrics II: Panel Data Analysis
  • MSc
  • Summer 2007

2
Econometrics II
  • Sašo Polanec
  • Faculty of Economics,
  • University of Ljubljana
  • These lecture notes were created by William
    Greene

3
Econometrics II
  • Lecture 6
  • Panel Data

4
Motivation for Panel Data Sets
  • Cross-sectional data sets suffer from an
    important limitation
  • each unit is observed in only one time period
  • hence inference about model parameters is based
    only on the cross-sectional heterogeneity of units
  • Parameter estimates based on cross-sectional
    data may be biased
  • when variables are omitted from the model and
    these omitted variables are correlated with the
    included explanatory variables
  • Parameter estimates based on cross-sectional
    data may be inefficient
  • when there is an unobserved variance component
    specific to individual units
  • Repeated cross-sections or panel data may allow
    us to deal with both of these problems

5
Types of Panel Data Sets
  • Longitudinal data (number of different units, N,
    is large; time dimension, T, is short)
  • Business surveys (in Slovenia, AJPES data for the
    period 1989-2006; Amadeus database for EU
    countries)
  • Household panel surveys (in Slovenia and other
    countries)
  • Panel Study of Income Dynamics (PSID)
  • Cross section time series (T and N both large)
  • Grunfeld's investment data
  • Penn World Tables
  • Financial data by firm, year (T large and N
    relatively small)
  • CAPM: $r_{it} - r_{ft} = \beta_i (r_{mt} - r_{ft}) + \varepsilon_{it}$,
    $i = 1, \dots,$ many; $t = 1, \dots,$ many
  • Exchange rate data, essentially infinite T, large
    N
  • Effects: $\beta_i = \beta + v_i$

6
Terms of Art
  • Cross sectional vs. time series variation
  • Is inference based on one or the other dimension?
    Or both in panels?
  • Heterogeneity between units of observation
  • What are different types of heterogeneity?
  • Group effects vs. individual effects
  • Fixed effects and/or random effects
  • Are there substantive differences between these
    effects?
  • Is it possible to tell them apart in observed
    data?

7
Panel Data
  • Rotating panels: typical household surveys
  • E.g. Spanish income study
    (http://www.cemfi.es/albarran/0008r.pdf)
  • Efficiency analysis: "Efficiency measurement in
    rotating panel data," Heshmati, A., Applied
    Economics, 30, 1998, pp. 919-930
  • Hierarchical (nested) data sets: student outcomes,
    by year, district, school, teacher

8
Balanced and Unbalanced Panels
  • Distinction between two types of panel data
  • Balanced panels: all units (e.g. firms,
    individuals) have the same number of time
    observations
  • Unbalanced panels: units have different numbers
    of time observations (more frequent; e.g. firms
    enter and exit, people enter the labor market and
    earn wages, while others exit the labor market and
    receive pensions)
  • A notation to help with mechanics
  • $z_{i,t}$, $i = 1, \dots, N$; $t = 1, \dots, T_i$
  • Mathematical and notational convenience
  • Number of observations in a balanced panel: $NT$
  • Unbalanced: $\sum_{i=1}^{N} T_i$

9
Benefits of Panel Data
  • Time and individual variation in behavior
    unobservable in cross sections or aggregate time
    series
  • Observable and unobservable individual
    heterogeneity
  • Rich hierarchical structures (two-way panels vs.
    three way or nested panels)
  • Dynamics in economic behavior (inference on time
    variation of parameters)

10
Fixed and Random Effects
  • Unobserved individual-specific effects in
    regression: $E[y_{it} \mid x_{it}, c_i]$
  • Notation
  • Linear specification
  • Fixed Effects: $E[c_i \mid X_i] = g(X_i)$; effects are
    correlated with included variables. Common:
    $Cov[x_{it}, c_i] \neq 0$
  • Random Effects: $E[c_i \mid X_i] = \mu$; effects are
    uncorrelated with included variables. If $X_i$
    contains a constant term, $\mu = 0$ (without loss of
    generality).
  • Common: $Cov[x_{it}, c_i] = 0$, but $E[c_i \mid X_i] = \mu$ is
    needed for the full model

11
Convenient Notation
  • Fixed Effects
  • Individual specific constant terms.
  • Random Effects
  • Compound (composed) disturbance
  • error components.

12
Assumptions for Asymptotics
  • Convergence of moments involving cross section
    Xi.
  • N increasing, T or Ti assumed fixed.
  • Fixed T asymptotics (see text, p. 175)
  • Time series characteristics are not relevant (may
    be nonstationary)
  • If T is also growing, need to treat as
    multivariate time series.
  • Ranks of matrices. X must have full column rank.
    ($X_i$ may not, if $T_i < K$.)
  • Strict exogeneity and dynamics. If $x_{it}$ contains
    $y_{i,t-1}$ then $x_{it}$ cannot be strictly exogenous. $x_{it}$
    will be correlated with the unobservables in
    period t-1. (To be revisited later.)
  • Empirical characteristics of microeconomic data

13
The Pooled Regression using OLS
  • Presence of omitted effects (general form,
    unbalanced sample)
  • Potential bias/inconsistency of OLS depends on
    whether the effects are fixed or random
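The bias of pooled OLS under correlated effects can be illustrated with a minimal sketch. The four observations, the effect values and the helper `ols_slope` are all hypothetical and chosen only so that the unobserved effect rises with the regressor; they are not taken from the lecture data.

```python
# Hypothetical toy panel: 2 units, 2 periods each, true slope beta = 1.
# The unit effect c_i is larger for the unit with larger x, so Cov[x_it, c_i] != 0.
x = [0.0, 1.0, 2.0, 3.0]   # unit 1 (periods 1-2), then unit 2 (periods 1-2)
c = [0.0, 0.0, 4.0, 4.0]   # unobserved individual effects
beta = 1.0
y = [beta * xi + ci for xi, ci in zip(x, c)]   # no idiosyncratic noise


def ols_slope(x, y):
    """Slope from a simple regression of y on x with a constant."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Pooled OLS ignores c_i and attributes its variation to x.
pooled_slope = ols_slope(x, y)
print(pooled_slope)
```

In this constructed example the pooled slope lies well above the true value of 1, because the omitted effect is absorbed by the regressor. If the effects were uncorrelated with x, the pooled slope would not be distorted in this way.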

14
OLS with fixed individual effects: illustration
of bias
15
Application 1: Cornwell and Rupert Data
Cornwell and Rupert Returns to Schooling Data,
595 Individuals, 7 Years
Variables in the file are
  • EXP = years of work experience
  • WKS = number of weeks worked in a given year
  • OCC = occupation, 1 if blue collar, 0 otherwise
  • IND = 1 if manufacturing industry, 0 otherwise
  • SOUTH = 1 if resides in south, 0 otherwise
  • SMSA = 1 if resides in a city (SMSA), 0 otherwise
  • MS = 1 if married, 0 otherwise
  • FEM = 1 if female, 0 otherwise
  • UNION = 1 if wage set by union contract, 0 otherwise
  • ED = years of education (fixed over time in this data set)
  • BLK = 1 if individual is black, 0 otherwise
  • LWAGE = log of wage, the dependent variable in regressions
These data were analyzed in Cornwell, C. and Rupert,
P., "Efficient Estimation with Panel Data: An
Empirical Comparison of Instrumental Variable
Estimators," Journal of Applied Econometrics, 3,
1988, pp. 149-155. See Baltagi, page 122 for
further analysis. The data were downloaded from
the website for Baltagi's text.
16
Application 1 Cornwell and Rupert Data (cont.)

17
Application 1: Cornwell and Rupert Data (cont.)
  • Stata needs to know that we are dealing with
    panel data
  • The original data do not have a panel structure:
    there is no variable for the time period and no
    variable for the individual unit (person)
  • To create them, we use the following
    lines (exploiting the balanced panel structure)
  • gen id = int(_n/7 - 0.01) + 1
  • bysort id: gen year = _n
  • Next we tell Stata that these variables are the
    cross-sectional and time dimensions and use the
    command that describes our balanced panel
  • iis id
  • tis year
  • xtdes
  • xt is the prefix of all panel data
    commands in Stata
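The row-number arithmetic behind the id and year variables can be checked with a small sketch. It assumes, as the slide does, that the file is sorted by person with exactly 7 consecutive rows per person; the function name `panel_index` is hypothetical.

```python
ROWS_PER_PERSON = 7  # T = 7 yearly observations per individual (balanced panel)


def panel_index(row_number, periods=ROWS_PER_PERSON):
    """Map a 1-based row number (Stata's _n) to a 1-based (id, year) pair."""
    person_id = (row_number - 1) // periods + 1
    year = (row_number - 1) % periods + 1
    return person_id, year


# Index for the rows of the first two individuals.
index = [panel_index(n) for n in range(1, 15)]
print(index[6], index[7])
```

Integer division is used here instead of the `int(_n/7 - 0.01) + 1` trick; for 595 individuals observed over 7 years both expressions assign the same id to every row.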


18
Application 1: Cornwell and Rupert Data (cont.)
  • Let's first estimate a pooled wage regression using
    the standard OLS method (ignoring the panel
    structure)


19
Using First Differences
  • Fixed and random effects share the general
    specification
  • Eliminating the heterogeneity
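A sketch of the algebra, assuming the linear specification $y_{it} = x_{it}'\beta + c_i + \varepsilon_{it}$ for both the fixed and the random effects model: subtracting the equation for period t-1 from the equation for period t removes the time-invariant effect $c_i$.

```latex
\begin{aligned}
y_{it}     &= x_{it}'\beta + c_i + \varepsilon_{it}, \\
y_{i,t-1}  &= x_{i,t-1}'\beta + c_i + \varepsilon_{i,t-1}, \\
\Delta y_{it} = y_{it} - y_{i,t-1}
           &= (\Delta x_{it})'\beta + \Delta\varepsilon_{it}.
\end{aligned}
```

Any regressor that is constant over time for an individual is differenced away together with $c_i$, so its coefficient cannot be estimated from the differenced equation.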

20
First Differences in Stata: Wages
  • First differences are easily created in Stata by
    writing the following lines (example for the logarithm
    of the real wage, lwage)
  • bysort id (year): gen lwage_1 = lwage[_n-1]
  • gen dlwage = lwage - lwage_1
  • Estimation of the regression equation gives
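The lag-and-difference step itself can be sketched outside Stata as follows. The rows are hypothetical values, not the Cornwell and Rupert data, and `first_differences` is an illustrative helper; the point is that the lag is taken within each id, so the first period of every individual has no difference.

```python
# Hypothetical rows of (id, year, lwage).
rows = [
    (1, 1, 5.5), (1, 2, 5.75), (1, 3, 6.0),
    (2, 1, 6.0), (2, 2, 6.5), (2, 3, 6.25),
]


def first_differences(rows):
    """Return (id, year, dlwage); dlwage is None in each unit's first period."""
    out = []
    previous = {}  # last lwage seen for each id
    for unit, year, lwage in sorted(rows):   # sort by id, then by year
        lag = previous.get(unit)              # plays the role of lwage[_n-1] within id
        out.append((unit, year, None if lag is None else lwage - lag))
        previous[unit] = lwage
    return out


diffs = first_differences(rows)
for row in diffs:
    print(row)
```

The `None` entries correspond to the missing values Stata generates for the first observation of each individual, which are then dropped from the differenced regression.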

21
OLS with First Differences
  • With strict exogeneity of $(X_i, c_i)$, OLS
    regression of $\Delta y_{it}$ on $\Delta x_{it}$ is unbiased and
    consistent but inefficient.

GLS is unpleasantly complicated. In order to
compute a first step estimator of $\sigma_\varepsilon^2$ we would
use fixed effects. We should just stop there.
Or, use OLS in first differences and use
Newey-West with one lag.
22
Two Periods
  • With two periods and strict exogeneity,
  • This is a classical regression model. If there
    are no regressors,

23
Estimation with Fixed Effects
  • The fixed effects model
  • $c_i$ is arbitrarily correlated with $x_{it}$ but
    $E[\varepsilon_{it} \mid X_i, c_i] = 0$
  • Dummy variable representation

24
Assumptions for the FE Model
  • $y_i = X_i\beta + d_i\alpha_i + \varepsilon_i$, for each individual

$E[c_i \mid X_i] = g(X_i)$. Effects are correlated with
included variables. Common: $Cov[x_{it}, c_i] \neq 0$
25
Notation

26
Estimating the Fixed Effects Model
  • The FEM is a plain vanilla regression model but
    with many independent variables
  • Least squares is unbiased, consistent, efficient,
    but inconvenient if N is large.

27
Estimating FE model in Stata with OLS
  • Stata allows us to estimate the model with
    dummies using the xi prefix (interaction
    expansion), which creates dummies just
    for the estimation of the model
  • xi: reg lwage ed exp occ smsa ms fem wks ind
    union i.id
  • This estimation procedure is resource intensive
    for two reasons
  • its demand for memory is substantial: matsize
    (the maximum number of variables in the model)
    needs to be increased substantially
  • computation time increases with the number of
    observations and may be prohibitive

28
Application 1: Cornwell and Rupert Data (cont.)
  • Results for the wage data: after
    controlling for individual differences, the estimated
    coefficient on the number of years of schooling (ed)
    increases

29
Useful Analysis of Variance Notation
Total variation = Within groups variation +
Between groups variation
30
Application 1: Analysis of Variance for Wages
  • The decomposition of the total variation of log wage into
    within and between variation can be obtained
    in Stata with the command xtsum (see also
    xttab)
  • xtsum lwage
  • Stata reports standard deviations instead of sums
    of squares. To calculate sums of
    squares, we need to square the Std. Dev. and
    multiply by N. (72.8 percent is between and 27.1
    percent is within variation.)
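The decomposition of total variation into within and between parts can be sketched directly from sums of squares. The six values below are hypothetical and do not reproduce the xtsum output for lwage; the sketch only illustrates that the two components add up to the total.

```python
# Hypothetical balanced panel: values[i][t] for N = 3 units and T = 2 periods.
values = [[1.0, 3.0], [4.0, 6.0], [8.0, 8.0]]

n_obs = sum(len(unit) for unit in values)
grand_mean = sum(sum(unit) for unit in values) / n_obs
unit_means = [sum(unit) / len(unit) for unit in values]

# Total: deviations of every observation from the grand mean.
total_ss = sum((v - grand_mean) ** 2 for unit in values for v in unit)
# Within: deviations of every observation from its own unit mean.
within_ss = sum(
    (v - m) ** 2 for unit, m in zip(values, unit_means) for v in unit
)
# Between: deviations of unit means from the grand mean, weighted by T_i.
between_ss = sum(
    len(unit) * (m - grand_mean) ** 2 for unit, m in zip(values, unit_means)
)

between_share = between_ss / total_ss
within_share = within_ss / total_ss
print(total_ss, within_ss, between_ss)
```

The shares computed this way play the same role as the between and within percentages quoted for the wage data.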

31
The Within Transformation Removes the Fixed
Effects
32
Fixed Effects Estimator
33
Fixed Effects Estimator (cont.)
34
Fixed Effects Estimator (cont.)
35
Least Squares Dummy Variable (LSDV) Estimator
  • b is obtained by within groups least squares
    (group mean deviations)
  • Normal equations for a are $D'Xb + D'Da = D'y$. Hence,
  • $a = (D'D)^{-1}D'(y - Xb)$
  • Notes: This is simple algebra; the estimator is
    just OLS.
  • Least squares is an estimator, not a
    model. (Repeat twice.)
  • Note what $a_i$ is when $T_i = 1$. Follow
    this with $y_{it} - a_i - x_{it}'b = 0$ if
    $T_i = 1$.
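The two LSDV steps can be sketched for a single regressor. The panel below is hypothetical and noise-free, so the estimates reproduce the values used to build it; with one regressor, $a = (D'D)^{-1}D'(y - Xb)$ reduces to the group mean of $y_{it} - x_{it}b$.

```python
# Hypothetical noise-free panel with one regressor: y_it = a_i + 1.5 * x_it.
panel = {
    "A": {"x": [1.0, 2.0, 3.0], "y": [11.5, 13.0, 14.5]},  # built with a_A = 10
    "B": {"x": [4.0, 6.0], "y": [4.0, 7.0]},               # built with a_B = -2
}


def mean(values):
    return sum(values) / len(values)


# Step 1: within-groups least squares on deviations from group means.
sxy = 0.0
sxx = 0.0
for unit in panel.values():
    x_bar, y_bar = mean(unit["x"]), mean(unit["y"])
    sxy += sum((x - x_bar) * (y - y_bar) for x, y in zip(unit["x"], unit["y"]))
    sxx += sum((x - x_bar) ** 2 for x in unit["x"])
b = sxy / sxx

# Step 2: individual constants, a_i = mean_i(y) - b * mean_i(x).
a = {name: mean(unit["y"]) - b * mean(unit["x"]) for name, unit in panel.items()}
print(b, a)
```

The panel is deliberately unbalanced (three periods for A, two for B) to show that the group-mean formulas do not require equal $T_i$. A unit with a single observation would contribute nothing to step 1, and its $a_i$ would fit that observation exactly.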

36
Inference About OLS
  • Assume strict exogeneity: $Cov[\varepsilon_{it}, (x_{js}, c_j)] = 0$.
    Every disturbance in every period for each person
    is uncorrelated with variables and effects for
    every person and across periods.
  • Now, it's just least squares in a classical
    linear regression model.
  • Asy.Var[b]

37
Application: Cornwell and Rupert, LSDV results
(cont.)
  • In Stata we can estimate the parameters of the
    model using command xtreg
  • xtreg lwage ed exp occ smsa ms fem wks ind union,
    fe

38
Comments on results
  • Variables that do not change over time are
    dropped (ed: education; fem: female dummy)
  • $R^2$ within is larger than between: differences
    over time for individuals are more important for
    explaining variation of wages than differences
    between individuals (the fraction of variance due to
    the individual effects $u_i$ is 97.9 percent)
  • The Wald test for exclusion of fixed effects,
    F(594, 3566) = 33.81, is large and its exact probability
    is P = 0.0000. It is calculated according to the
    standard formula
  • Interpretation of Coefficients
  • E.g. one additional year of experience increases
    the wage by approximately 9.6 percent ($b_{exp} = 0.096$).

39
The Random Effects Model
  • The random effects model
  • $c_i$ is uncorrelated with $x_{it}$ for all t
  • $E[c_i \mid X_i] = 0$
  • $E[\varepsilon_{it} \mid X_i, c_i] = 0$
  • Note that this is different from fixed effects,
    where $E[c_i \mid X_i] = g(X_i)$

40
Error Components Model
  • Generalized Regression Model

41
Notation
42
Notation
43
Convergence of Moments
44
Random vs. Fixed Effects
  • Random Effects
  • Small number of parameters
  • Efficient estimation
  • Objectionable orthogonality assumption ($c_i \perp X_i$)
  • Fixed Effects
  • Robust: generally consistent
  • Large number of parameters

45
Ordinary Least Squares
  • Standard results for OLS in a RE model are
  • Consistent (large sample property)
  • and
  • Unbiased (small sample property)
  • but
  • Inefficient (OLS variance is too large)
  • True Variance

46
Estimating the Variance for OLS
47
Mechanics
48
Cornwell and Rupert Data (cont.)
Cornwell and Rupert Returns to Schooling Data,
595 Individuals, 7 Years
Variables in the file are
  • EXP = work experience, EXPSQ = EXP^2
  • WKS = weeks worked
  • OCC = occupation, 1 if blue collar
  • IND = 1 if manufacturing industry
  • SOUTH = 1 if resides in south
  • SMSA = 1 if resides in a city (SMSA)
  • MS = 1 if married
  • FEM = 1 if female
  • UNION = 1 if wage set by union contract
  • ED = years of education
  • BLK = 1 if individual is black
  • LWAGE = log of wage, the dependent variable in regressions
These data were analyzed in Cornwell, C. and Rupert, P.,
"Efficient Estimation with Panel Data: An
Empirical Comparison of Instrumental Variable
Estimators," Journal of Applied Econometrics, 3,
1988, pp. 149-155. See Baltagi, page 122 for
further analysis. The data were downloaded from
the website for Baltagi's text.
49
OLS Results: Extended model with all rhs variables
Pooled OLS (random effects ignored)
50
OLS Results (Robust SE): Extended model with all
rhs variables
Pooled OLS (using the robust option, which gives
Huber/White standard errors). These standard
errors are typically larger (not always: compare
smsa) and the t-statistics lower.
51
Generalized Least Squares
52
GLS (cont.)
53
Estimators for the Variances
54
Feasible GLS
x does not contain a constant term in the
preceding.
55
Practical Problems with FGLS
x does not contain a constant term in the
preceding.
56
Computing Variance Estimators
57
Estimation of Random Effects model
58
Testing for Effects: Lagrange Multiplier Test
59
Application of LM test to RE estimation
Following the xtreg command with the re option, we
can perform the Breusch and Pagan (1980) LM test
with the command xttest0. Below we see a strong
rejection of the null hypothesis of no random effects.
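For a balanced panel the Breusch and Pagan statistic can be computed from pooled OLS residuals as $LM = \frac{NT}{2(T-1)}\left[\frac{\sum_i (\sum_t e_{it})^2}{\sum_i \sum_t e_{it}^2} - 1\right]^2$, which is compared with a chi-squared distribution with one degree of freedom. The sketch below is a minimal illustration with hypothetical residuals; the function name is an assumption and this is not Stata's own implementation.

```python
def breusch_pagan_lm(residuals):
    """LM statistic for H0: Var[u_i] = 0, from pooled OLS residuals.

    residuals[i][t] is the residual of unit i in period t (balanced panel).
    """
    n_units = len(residuals)
    n_periods = len(residuals[0])
    if any(len(unit) != n_periods for unit in residuals):
        raise ValueError("this sketch assumes a balanced panel")
    sum_of_squared_unit_sums = sum(sum(unit) ** 2 for unit in residuals)
    sum_of_squares = sum(e ** 2 for unit in residuals for e in unit)
    ratio = sum_of_squared_unit_sums / sum_of_squares
    return n_units * n_periods / (2 * (n_periods - 1)) * (ratio - 1) ** 2


# Hypothetical residuals: the same sign in both periods for each unit,
# as an unmodelled individual effect would produce.
persistent = [[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]]
# Hypothetical residuals with no within-unit pattern.
unrelated = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]

lm_persistent = breusch_pagan_lm(persistent)
lm_unrelated = breusch_pagan_lm(unrelated)
print(lm_persistent, lm_unrelated)
```

The statistic grows when residuals of the same individual move together across periods, which is what a random individual effect implies.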
60
Hausman Test for FE vs. RE
Hausman test helps us to decide between fixed and
random effects specification
61
Hausman Test for Effects
β does not contain the constant term in the
preceding.
62
Computing the Hausman Statistic
β does not contain the constant term in the
preceding.
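A minimal sketch of the computation, assuming the usual form of the statistic, $H = (b_{FE} - b_{RE})'[\mathrm{Var}(b_{FE}) - \mathrm{Var}(b_{RE})]^{-1}(b_{FE} - b_{RE})$, for two slope coefficients. All numbers are hypothetical and the function name is illustrative; in practice Stata's hausman command does this for the full coefficient vector.

```python
def hausman_2x2(b_fe, b_re, v_fe, v_re):
    """Hausman statistic for two slope coefficients (constant excluded)."""
    d0 = b_fe[0] - b_re[0]
    d1 = b_fe[1] - b_re[1]
    # Difference of the covariance matrices, Var[b_FE] - Var[b_RE].
    m00 = v_fe[0][0] - v_re[0][0]
    m01 = v_fe[0][1] - v_re[0][1]
    m10 = v_fe[1][0] - v_re[1][0]
    m11 = v_fe[1][1] - v_re[1][1]
    det = m00 * m11 - m01 * m10
    if det <= 0 or m00 <= 0:
        raise ValueError("covariance difference is not positive definite")
    # Quadratic form d' M^{-1} d, using the closed-form inverse of a 2x2 matrix.
    return (d0 * (m11 * d0 - m01 * d1) + d1 * (-m10 * d0 + m00 * d1)) / det


# Hypothetical estimates and covariance matrices.
b_fe = [0.5, 0.25]
b_re = [0.25, 0.25]
v_fe = [[0.0625, 0.0], [0.0, 0.0625]]
v_re = [[0.03125, 0.0], [0.0, 0.03125]]

h = hausman_2x2(b_fe, b_re, v_fe, v_re)
print(h)
```

Under the null that both estimators are consistent, the statistic is compared with a chi-squared distribution whose degrees of freedom equal the number of coefficients compared, here two. In finite samples the covariance difference may fail to be positive definite, which is why the sketch checks for it.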
63
Hausman Test for Wages: Application
The procedure for obtaining Hausman test results:
  • xtreg lwage ed exp occ smsa ms fem wks ind union, fe
  • est store fixed
  • xtreg lwage ed exp occ smsa ms fem wks ind union, re
  • hausman fixed
Results: reject the null of no difference in
coefficients; FE is the adequate specification.
64
Appendix Random Effects Algebra (1)
65
Appendix Random Effects Algebra (2)
66
Appendix Random Effects Algebra (2, cont.)
67
William H. Greene, Stern Business School, New York
University
  • 8. Instrumental Variables Estimation in Panel Data

68
Structure and Regression
69
Exogeneity
70
The Measurement Error Problem
How general is this result?
71
Instrumental Variable
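  • One "problem" variable: the last one
  • $y_{it} = \beta_1 x_{1it} + \beta_2 x_{2it} + \dots + \beta_K x_{Kit} + \varepsilon_{it}$
  • $E[\varepsilon_{it} \mid x_{Kit}] \neq 0$. (0 for all others)
  • There exists a variable $z_{it}$ such that
  • $E[x_{Kit} \mid x_{1it}, x_{2it}, \dots, x_{K-1,it}, z_{it}] = g(x_{1it},
    x_{2it}, \dots, x_{K-1,it}, z_{it})$
  • In the presence of the other variables, $z_{it}$
    explains $x_{Kit}$
  • $E[\varepsilon_{it} \mid x_{1it}, x_{2it}, \dots, x_{K-1,it}, z_{it}] = 0$
  • In the presence of the other variables, $z_{it}$
    and $\varepsilon_{it}$ are uncorrelated.
  • A projection interpretation: in the projection
  • $x_{Kit} = \theta_1 x_{1it} + \theta_2 x_{2it} + \dots + \theta_{K-1} x_{K-1,it} + \theta_K z_{it}$,
  • $\theta_K \neq 0$.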
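The two conditions on the instrument (it explains the problem variable and is uncorrelated with the disturbance) can be illustrated with one regressor and one instrument. The data are hypothetical and built so that both conditions hold exactly; `cross_moment` is an illustrative helper, and the simple ratio estimator shown is the one-regressor special case of IV.

```python
# Hypothetical data: z is uncorrelated with the disturbance e, but x is not,
# because x = z + v and e = v.
z = [1.0, -1.0, 1.0, -1.0]   # instrument
v = [1.0, 1.0, -1.0, -1.0]   # source of endogeneity, orthogonal to z
x = [zi + vi for zi, vi in zip(z, v)]
e = v[:]                     # disturbance, correlated with x through v
beta = 2.0
y = [beta * xi + ei for xi, ei in zip(x, e)]


def cross_moment(a, b):
    """Sum of cross products of deviations from means."""
    a_bar = sum(a) / len(a)
    b_bar = sum(b) / len(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b))


b_ols = cross_moment(x, y) / cross_moment(x, x)  # picks up Cov[x, e]
b_iv = cross_moment(z, y) / cross_moment(z, x)   # uses only variation in x driven by z
print(b_ols, b_iv)
```

In this constructed example OLS overstates the slope while the IV ratio returns the value used to generate the data. If the instrument barely moved x, the denominator of the IV ratio would be close to zero and the estimate would become unstable.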

72
Least Squares
73
The IV Estimator
74
A Moment Based Estimator
75
Consistency and Asymptotic Normality of the IV
Estimator
76
Least Squares Revisited
77
Comparing OLS and IV
78
Application Cornwell and Rupert Data
79
Wage Equation with Endogenous Weeks
logWage = β1 + β2 Exp + β3 ExpSq + β4 OCC + β5 South
+ β6 SMSA + β7 WKS + ε
Weeks worked is believed to be endogenous in this
equation. We use the Marital Status dummy variable
MS as an exogenous variable.
Condition (5.3): Cov[MS, ε] = 0 is assumed.
Auxiliary regression: in the regression
of WKS on 1, EXP, EXPSQ, OCC, South, SMSA, MS,
MS significantly explains WKS.
A projection interpretation: in the projection
$x_{itK} = \theta_1 x_{1it} + \theta_2 x_{2it} + \dots + \theta_{K-1} x_{K-1,it} + \theta_K z_{it}$,
$\theta_K \neq 0$. (One normally doesn't check the variables
in this fashion.)
80
Auxiliary Projection (5.5)
81
Application: IV for WKS in Rupert
82
Application: IV for WKS in Rupert
83
IV for Panel Data: Fixed Effects Example
84
Comments on results
  • The first stage results suggest that marital
    status is a good instrument for the
  • The first stage results of xtivreg suggest that
    marital status is not a statistically significant
    variable for explaining the number of weeks
    worked, as it changes very infrequently.
  • In the presence of weak instruments, this is no
    surprise.

85
The Panel Data Case: Hausman-Taylor model
86
Hausman and Taylor
87
Hausman and Taylor
88
HT's FGLS Estimator
89
HT's FGLS Estimator (cont.)
90
HT's 4 STEP IV Estimator
91
Stata code for Hausman-Taylor
The Stata command for estimating a model
with the Hausman-Taylor structure
is xthtaylor. The difficulty with this method is
finding adequate instruments for both sets of
endogenous variables. In practice this turns out
to be a hard task.