Panel Data - PowerPoint PPT Presentation

1
Panel Data
  • Course: Applied Econometrics
  • Lecturer: Zhigang Li

2
Outline
  • Panel Data
  • Fixed-effects vs. random-effects
  • First-differencing or fixed-effects
  • Strict Exogeneity Assumption

3
Panel Data (or Longitudinal Data)
  • A typical panel data set has both a
    cross-sectional dimension and a time series
    dimension. In particular, the same
    cross-sectional units (e.g. individuals,
    families, firms, cities, states) are observed
    over time.
  • Panel data differ from independent cross
    sections pooled across time, where a different
    sample is drawn in each period. The latter can
    be estimated by pooled OLS, a simple extension
    of OLS.

4
Large N or Large T?
  • N is the number of cross-sectional units and
    T is the number of time periods.
  • Small N and small T (of little use)
  • Large N and small T (traditional panel data)
      • N is large enough for the Law of Large Numbers
        to apply while T is not.
      • Convenient to use if cross-sectional units are
        independent.
  • Small N and large T
      • T is large enough for the Law of Large Numbers
        to apply while N is not.
      • Autocorrelation has to be addressed.
  • Large N and large T (still under exploration)

5
Fixed Effects Panel-data Model (individual-specific intercepts)
  • $y_{it} = \beta_0 + \beta_t + \beta_1 x_{it1} + \beta_2 x_{it2} + a_i + u_{it}$
  • Strict Exogeneity Assumption
      • $\mathrm{Cov}(X_{it}, u_{is}) = 0$ for all t and s
      • This rules out dynamic models, which have lagged
        dependent variables (e.g. $y_{i,t-1}$) as
        explanatory variables. Models with lags of the
        independent variables as explanatory variables
        are still fine.
  • The effects of time-constant independent
    variables cannot be estimated directly because
    they are absorbed into $a_i$.
  • $\beta_t$ (time-specific intercepts) controls for
    shocks common to all agents in period t.

6
Names
  • The individual-specific intercept $a_i$ may be
    called a fixed effect or unobserved
    heterogeneity.
  • The term $u_{it}$ is called the idiosyncratic error.
  • The sum $a_i + u_{it}$ is often called the composite
    error.
  • If $\mathrm{Cov}(X_{it}, a_i)$ is nonzero but the pooled
    OLS method is used, estimates of all parameters
    might be biased. This bias is called
    heterogeneity bias.
  • A balanced panel has observations for the same
    time periods for all individuals. Otherwise,
    the panel is unbalanced.
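
Heterogeneity bias can be illustrated with a small simulation. The sketch below is not from the slides: the sample sizes, the true slope of 1.0, and the 0.8 loading of the regressor on $a_i$ are all hypothetical choices. It compares pooled OLS with an estimator that first subtracts each unit's time average.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta = 500, 4, 1.0                 # hypothetical panel size and true slope

a = rng.normal(size=(N, 1))              # unobserved heterogeneity a_i
x = 0.8 * a + rng.normal(size=(N, T))    # regressor correlated with a_i (assumed)
u = rng.normal(size=(N, T))              # idiosyncratic error u_it
y = beta * x + a + u                     # composite error is a_i + u_it


def slope(x, y):
    """OLS slope of y on x with an intercept, using all N*T observations."""
    xc = x.ravel() - x.mean()
    yc = y.ravel() - y.mean()
    return (xc @ yc) / (xc @ xc)


# Pooled OLS ignores a_i, so Cov(x_it, a_i) != 0 contaminates the slope.
pooled = slope(x, y)

# Subtracting each unit's time average removes a_i before estimation.
within = slope(x - x.mean(axis=1, keepdims=True),
               y - y.mean(axis=1, keepdims=True))

print(f"true slope {beta:.2f}, pooled OLS {pooled:.2f}, demeaned {within:.2f}")
```

With this design the pooled estimate should sit well above the true slope, while the demeaned estimate should be close to it.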

7
Random Effects Models
  • $y_{it} = \beta_0 + \beta_t + \beta_1 x_{it1} + \beta_2 x_{it2} + a_i + u_{it}$
  • Key assumption
      • $a_i$ is uncorrelated with each explanatory
        variable in all time periods.
  • Difference between RE and FE estimators
      • In FE, we effectively control for $a_i$ using
        dummy variables.
      • In RE, $a_i$ is omitted and is part of the
        disturbance.
      • RE estimates are more efficient (or more
        precise) if the RE assumption is valid.

8
Random Effects Models (continued)
  • Difference between RE and pooled OLS
      • Since $a_i$ is in the error term, observations
        over time are correlated for the same
        individual i.
      • In the RE approach, this correlation over time
        is removed with a GLS (generalized least
        squares) transformation.
      • In pooled OLS, the GLS correction is not used.
  • Hausman test
      • Compare the RE and FE estimates. If the
        estimates are very different, the RE assumption
        is probably invalid and FE has to be used.
        Otherwise, RE is more efficient.
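
The logic behind the Hausman comparison can be sketched numerically. The code below is an illustration under stated assumptions, not the formal test: it uses the standard quasi-demeaning form of the RE GLS transformation, with the weight theta computed from the variances used in the simulation (both set to 1) rather than estimated. The function name `fe_and_re` and all parameter values are hypothetical.

```python
import numpy as np


def fe_and_re(corr, seed=0, N=2000, T=5, beta=1.0):
    """Return (FE, RE) slope estimates from one simulated panel.

    corr is the loading of x_it on a_i; corr = 0 satisfies the RE assumption.
    """
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(N, 1))                  # a_i with variance 1
    x = corr * a + rng.normal(size=(N, T))
    y = beta * x + a + rng.normal(size=(N, T))   # u_it with variance 1

    def slope(xt, yt):
        xc = xt.ravel() - xt.mean()
        yc = yt.ravel() - yt.mean()
        return (xc @ yc) / (xc @ xc)

    xbar = x.mean(axis=1, keepdims=True)
    ybar = y.mean(axis=1, keepdims=True)

    # FE: full demeaning. RE: quasi-demeaning with
    # theta = 1 - sqrt(sigma_u^2 / (sigma_u^2 + T * sigma_a^2)),
    # evaluated at the known simulation variances (an assumption of this sketch).
    theta = 1.0 - np.sqrt(1.0 / (1.0 + T))
    fe = slope(x - xbar, y - ybar)
    re = slope(x - theta * xbar, y - theta * ybar)
    return fe, re


fe_ok, re_ok = fe_and_re(corr=0.0)     # RE assumption holds
fe_bad, re_bad = fe_and_re(corr=0.8)   # a_i correlated with x
print(f"RE valid:   FE {fe_ok:.3f}  RE {re_ok:.3f}")
print(f"RE invalid: FE {fe_bad:.3f}  RE {re_bad:.3f}")
```

When the RE assumption holds, the two estimates should nearly coincide; when it fails, RE should drift away from FE. A formal Hausman test scales this difference by the difference of the estimated variances, which this sketch omits.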

9
Estimation of the Fixed-effect Panel Data Model
  • Fixed-effects (or Within) Estimator
      • Each variable is demeaned (i.e. the time
        average for the same cross-sectional unit is
        subtracted from it).
  • Dummy Variable Regression (i.e. put in a dummy
    variable for each cross-sectional unit, along
    with the other explanatory variables). This may
    cause estimation difficulty when N is large.
  • First-difference Estimator
      • Each variable is differenced once over time, so
        we are effectively estimating the relationship
        between changes in the variables.
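
The three estimators above can be compared on simulated data. This is a minimal sketch with hypothetical sizes and a hypothetical true slope of 2.0; it omits time-specific intercepts for brevity. The within and dummy-variable estimates of the slope should agree up to rounding, while the first-difference estimate is a different estimator and will generally differ slightly.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, beta = 50, 4, 2.0                    # hypothetical sizes and true slope
a = rng.normal(size=(N, 1))                # fixed effects a_i
x = a + rng.normal(size=(N, T))            # regressor correlated with a_i
y = beta * x + a + rng.normal(size=(N, T))

# (1) Within estimator: demean each variable by unit, then run OLS.
xd = (x - x.mean(axis=1, keepdims=True)).ravel()
yd = (y - y.mean(axis=1, keepdims=True)).ravel()
b_within = (xd @ yd) / (xd @ xd)

# (2) Dummy variable regression: x plus one dummy per unit.
# Rows of x.ravel() are ordered unit by unit, matching the Kronecker layout.
D = np.kron(np.eye(N), np.ones((T, 1)))    # (N*T, N) matrix of unit dummies
X = np.column_stack([x.ravel(), D])
b_lsdv = np.linalg.lstsq(X, y.ravel(), rcond=None)[0][0]

# (3) First-difference estimator: regress changes in y on changes in x.
dx = np.diff(x, axis=1).ravel()
dy = np.diff(y, axis=1).ravel()
b_fd = (dx @ dy) / (dx @ dx)

print(f"within {b_within:.4f}, dummy-variable {b_lsdv:.4f}, FD {b_fd:.4f}")
```

The dummy matrix here has N columns, which shows why the dummy-variable route becomes costly when N is large, whereas demeaning needs no extra regressors.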

10
First Differencing or Fixed-Effect?
  • Theoretically, when N is large and T is small but
    greater than 2, FE is more efficient when the
    $u_{it}$ are serially uncorrelated, while FD is more
    efficient when $u_{it}$ follows a random walk.
  • When T is large and N is small:
      • FD has an advantage for processes with large
        positive autocorrelation. FE is more sensitive
        to nonnormality, heteroskedasticity, and serial
        correlation in the idiosyncratic errors.
      • On the other hand, FE is less sensitive to
        violation of the strict exogeneity assumption.
        So FE is preferred when the processes are
        weakly dependent over time.

11
With Classical Measurement Errors
  • When T > 2, the measurement error bias of the FE
    estimator may be smaller than that of the FD
    approach but larger than that of OLS.
    (Griliches and Hausman, 1986)
  • Natural IV for measurement error: lagged
    dependent variables

12
Violation of the Strict Exogeneity Assumption
  • Parameter estimates are inconsistent; a natural
    experiment approach (e.g. IV) is needed.

13
With Strict Exogeneity and Dependent Observations
  • Parameter estimates are consistent.
  • Standard error estimates could still be biased
    because of:
      • Cross-sectional correlation or serial
        correlation (over time) in the error terms
      • Heteroskedasticity

14
Possible Solutions (Need Large N and Zero Cross-Sectional Correlation)
  • Heteroskedasticity
      • Use White robust standard errors
  • Autocorrelation
      • Group the sample time dimension into two
        periods and apply the first-difference
        estimator (needs large N). (This performs best
        in the D-in-D setting studied by Bertrand et
        al., 2004.)
      • Clustered robust errors
      • Newey-West standard errors (which also account
        for heteroskedasticity)
  • Cross-sectional Correlations
      • Clustered robust errors
15
Clustered Standard Errors
  • Key Assumption
      • Correlations within a cluster (a group of
        firms, a region, different years for the same
        firm, different years for the same region) are
        the same for different observations.
  • Procedure
      • Identify clusters using economic theory
        (clustered by industry, year, or industry and
        year).
      • Let the computer calculate clustered standard
        errors.
      • Try different ways of defining clusters and see
        how the estimated standard errors are affected.
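
What the computer calculates can be sketched with the usual sandwich formula. The code below is an illustrative implementation, not the slides' own: the function name `ols_with_se`, the simulated design, and the omission of any finite-sample correction are all assumptions. Statistical packages typically apply a degrees-of-freedom adjustment, so their numbers will differ slightly.

```python
import numpy as np


def ols_with_se(X, y, clusters):
    """OLS coefficients with conventional and cluster-robust standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    e = y - X @ b

    # Conventional: s^2 (X'X)^{-1}, valid only for i.i.d. errors.
    se_conv = np.sqrt(np.diag(XtX_inv) * (e @ e) / (n - k))

    # Cluster-robust "meat": sum over clusters g of (X_g' e_g)(X_g' e_g)'.
    meat = np.zeros((k, k))
    for g in np.unique(clusters):
        s = X[clusters == g].T @ e[clusters == g]
        meat += np.outer(s, s)
    se_clu = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))  # no small-sample correction
    return b, se_conv, se_clu


rng = np.random.default_rng(2)
G, m = 100, 10                               # hypothetical: 100 clusters of 10
clusters = np.repeat(np.arange(G), m)
# Both the regressor and the error share a cluster-level component (assumed).
x = np.repeat(rng.normal(size=G), m) + 0.5 * rng.normal(size=G * m)
e = np.repeat(rng.normal(size=G), m) + rng.normal(size=G * m)
y = 1.0 + 2.0 * x + e
X = np.column_stack([np.ones(G * m), x])

b, se_conv, se_clu = ols_with_se(X, y, clusters)
print("slope:", b[1], "conventional SE:", se_conv[1], "clustered SE:", se_clu[1])
```

In this design the clustered standard error of the slope should come out noticeably larger than the conventional one. Changing the `clusters` array is a direct way to try different cluster definitions and see how the standard errors respond.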

16
Unbalanced Panels
  • If a panel data set is unbalanced for reasons
    uncorrelated with $u_{it}$, the consistency of the
    FE estimator is not affected.
  • The attrition problem: if an unbalanced panel is
    the result of a selection process related to
    $u_{it}$, an endogeneity problem is present and
    needs to be dealt with using a correction method.