Title: State Dependence, Unobserved Heterogeneity and Non-Stationarity in Panel Data
1 State Dependence, Unobserved Heterogeneity and
Non-Stationarity in Panel Data
2 Importance of Longitudinal and Panel Data
- They provide scope for extending control for
variables that have been omitted from the
analysis: differencing provides a simple way of
removing time-constant effects (omitted and
observed) from the analysis.
- Much of human behaviour is influenced by previous
behaviour and outcomes, that is, there is
'positive feedback' (e.g. the McGinnis (1968)
'axiom of cumulative inertia').
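As a minimal sketch of the differencing point above, the following simulation generates a two-wave panel in which a time-constant omitted effect (alpha) is correlated with the regressor x. All names and parameter values are illustrative assumptions, not taken from any study cited here. Pooled least squares on the levels is biased, while least squares on first differences removes alpha.

```python
import random


def simulate_panel(n_subjects=2000, beta=1.0, seed=0):
    """Two-wave panel: y = beta*x + alpha + noise, with x correlated with alpha."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_subjects):
        alpha = rng.gauss(0.0, 1.0)  # time-constant omitted effect
        waves = []
        for _ in range(2):
            x = alpha + rng.gauss(0.0, 1.0)  # regressor depends on alpha
            y = beta * x + alpha + rng.gauss(0.0, 1.0)
            waves.append((x, y))
        panel.append(waves)
    return panel


def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs (with intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


panel = simulate_panel()

# Levels: alpha is left in the error term and is correlated with x.
pooled_slope = ols_slope(
    [x for waves in panel for x, _ in waves],
    [y for waves in panel for _, y in waves],
)

# First differences: alpha cancels because it does not vary over time.
diff_slope = ols_slope(
    [waves[1][0] - waves[0][0] for waves in panel],
    [waves[1][1] - waves[0][1] for waves in panel],
)

print(f"pooled OLS slope:       {pooled_slope:.3f}")
print(f"first-difference slope: {diff_slope:.3f}")
```

With the assumed true slope of 1.0, the pooled estimate should sit near 1.5 in large samples (the bias is cov(x, alpha)/var(x) = 1/2 under these settings), while the first-difference estimate should be close to 1.0.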
3 Heckman (2001) Nobel Prize speech
- A frequently noted empirical regularity in the
analysis of unemployment data is that those who
were unemployed in the past (or have worked in the
past) are more likely to be unemployed (or
working) in the future.
- Is this due to a causal effect of being
unemployed (or working), or is it a manifestation
of a stable trait?
4 SD, H and NS
- Inference about feedback effects (state
dependence, SD) is particularly prone to bias if
the additional variation due to omitted variables
(a stable trait) is ignored.
- With dependence upon the previous outcome, the
explanatory variables representing the previous
outcome will, for structural reasons, normally be
correlated with omitted explanatory variables
(unobserved heterogeneity, H), so their estimated
effects will be biased under conventional
modelling methods.
- The situation is further complicated by changes
in the scale and relative importance of the
systematic relationships over time
(non-stationarity, NS).
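The bias described above can be illustrated with a small simulation. This is a sketch under assumed values (a logistic model with a normal random effect of standard deviation 1.5, four occasions, no covariates), not the depression data. The responses are generated with no state dependence at all, yet the naive transition proportions make the previous outcome look strongly predictive.

```python
import math
import random


def simulate_heterogeneous(n_subjects=5000, n_occasions=4, sd=1.5, seed=1):
    """Binary sequences with a subject-specific effect and NO true state dependence."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        u = rng.gauss(0.0, sd)          # unobserved heterogeneity (H)
        p = 1.0 / (1.0 + math.exp(-u))  # same success probability at every occasion
        data.append([1 if rng.random() < p else 0 for _ in range(n_occasions)])
    return data


def transition_rates(data):
    """Observed P(y_i = 1 | y_{i-1} = 1) and P(y_i = 1 | y_{i-1} = 0), pooled over subjects."""
    counts = {0: [0, 0], 1: [0, 0]}  # previous state -> [n transitions, n ending in 1]
    for seq in data:
        for prev, curr in zip(seq, seq[1:]):
            counts[prev][0] += 1
            counts[prev][1] += curr
    return counts[1][1] / counts[1][0], counts[0][1] / counts[0][0]


p_after_1, p_after_0 = transition_rates(simulate_heterogeneous())
print(f"P(1 | previous 1) = {p_after_1:.3f}")
print(f"P(1 | previous 0) = {p_after_0:.3f}")
```

The gap between the two proportions is produced entirely by the ignored random effect, which is why a lagged response fitted without allowance for H tends to overstate state dependence.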
5 Initial Conditions
- Most observational schemes for collecting panel
and other longitudinal data commence with the
process already under way.
- They will therefore tend to have an informative
start: the initial observed response is typically
dependent upon pre-sample outcomes and unobserved
variables.
6 Depression Example
- One-year panel study of depression and
help-seeking behaviour in Los Angeles (Morgan et
al., 1983).
- Adults were interviewed during the spring and
summer of 1979 and re-interviewed at 3-monthly
intervals.
- A respondent was classified as depressed if they
scored > 16 on a 20-item list of symptoms.
7 Depression Example
8 Depression Example
- Depression is difficult to overcome, suggesting
that state dependence might explain at least some
of the observed temporal dependence, although it
remains an empirical issue whether true contagion
extends over three months.
- We might also expect seasonal effects due to the
weather.
- What is the relative importance of state
dependence (1st order Markov), non-stationarity
(seasonal effects) and unobserved heterogeneity?
9 Likelihood
- Subject-specific unobserved effects are
integrated out.
- Acknowledges the possibility that the multilevel
effects can depend on the regressors.
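To make "integrated out" concrete, here is a minimal sketch of one subject's marginal likelihood for a binary response. The model form (logistic link, standard normal random effect multiplied by a scale parameter) and the function names are assumptions made for illustration. A plain midpoint rule stands in for the Gaussian quadrature normally used for this kind of integral.

```python
import math


def logistic(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))


def normal_pdf(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def marginal_likelihood(ys, etas, scale, n_grid=400, width=8.0):
    """Integrate prod_i P(y_i | eta_i + scale*u) against N(0, 1) by the midpoint rule.

    ys    : 0/1 responses for one subject
    etas  : fixed-effect linear predictors, one per occasion
    scale : standard deviation of the subject-specific effect
    """
    step = 2.0 * width / n_grid
    total = 0.0
    for k in range(n_grid):
        u = -width + (k + 0.5) * step  # midpoint of the k-th cell
        conditional = 1.0
        for y, eta in zip(ys, etas):
            p = logistic(eta + scale * u)
            conditional *= p if y == 1 else 1.0 - p
        total += conditional * normal_pdf(u) * step
    return total


# Hypothetical subject: four occasions, the same linear predictor each time.
example = marginal_likelihood([1, 1, 0, 1], [-0.5, -0.5, -0.5, -0.5], scale=1.0)
print(example)
```

Setting scale to zero reduces the integral to the ordinary product of Bernoulli probabilities, which gives a quick check that the integration is behaving.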
10 Linear predictor of the model
- Also change g(.) to acknowledge 1st order SD
11 Initial Condition
- The data window usually samples an ongoing
process, and the information collected on the
initial observation rarely contains all of the
pre-sample response sequence and its determinants
back to inception.
- We need a model for the initial observed response.
- Joint likelihood with a common random effect
12 Conditional Likelihood
- If we omit the 1st term on the RHS of this
formulation, we have conditioning on the initial
response.
- Because the data window interrupts an ongoing
process, the initial observation will, in part, be
determined by H, and the simplification may induce
inferential error.
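For reference, the joint likelihood behind this discussion can be sketched as follows. The notation is an assumption chosen to match the surrounding text (occasion i = 1, ..., T, subject j, random effect u with density φ, covariates x, and first order dependence through the previous response); it is not copied from the original slide.

```latex
L(\theta \mid y) = \prod_{j} \int
  f(y_{1j} \mid x_{1j}, u)\,
  \prod_{i=2}^{T} f(y_{ij} \mid y_{i-1,j}, x_{ij}, u)\,
  \phi(u)\, du
```

In this sketch, dropping the first factor f(y_{1j} | x_{1j}, u) gives the likelihood conditional on the initial response. That is harmless only if y_{1j} carries no information about u, which is unlikely when the process was already under way before the data window opened.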
13 Conditional Likelihood (1) Naïve Model
14 Depression Conditional Likelihood (1)
15 Depression example Usual Conditional Likelihood
- The coefficient on the lagged response (s.lag1)
is 0.94558 (s.e. 0.13563), which is highly
significant.
- The scale parameter is of marginal significance,
suggesting a nearly homogeneous first order
model.
- Can we trust this inference?
16 Conditional Likelihood (2) random effect
dependent on initial response, Wooldridge
17 Conditional Likelihood (2)
18 Conditional Likelihood (2)
- The coefficients on the constant and the
time-constant covariates will be confounded.
- For binary initial responses only one parameter
is needed for y1j, but for the linear model and
count data, polynomials in y1j may be needed to
account more fully for the dependence.
19 Depression Conditional Likelihood (1)
20 Depression example (2)
- This has the lagged response s.lag1 estimate at
0.43759E-01 (0.15898), which is not significant,
while the initial response s1 estimate 1.2873
(0.19087) and the scale parameter estimate
0.88018 (0.12553) are highly significant.
- There is also a big improvement in the
log-likelihood over the model without s1 of
73.62 for 1 df.
- This model has no time-constant covariates to be
confounded by the auxiliary model, and it suggests
that depression is a zero order process.
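The significance statements for the two conditional-likelihood fits can be checked with simple estimate-to-standard-error ratios. The sketch below uses only the figures quoted in the text; the function and variable names are illustrative.

```python
def wald_z(estimate, std_error):
    """Ratio of an estimate to its standard error."""
    return estimate / std_error


# (estimate, s.e.) pairs quoted for the two conditional-likelihood fits
naive_lag = (0.94558, 0.13563)  # s.lag1 in the model without s1
with_s1 = {
    "s.lag1": (0.43759e-01, 0.15898),
    "s1": (1.2873, 0.19087),
    "scale": (0.88018, 0.12553),
}

z_naive_lag = wald_z(*naive_lag)
z_with_s1 = {name: wald_z(est, se) for name, (est, se) in with_s1.items()}

print(f"s.lag1 without s1: z = {z_naive_lag:.2f}")
for name, z in z_with_s1.items():
    print(f"{name} with s1: z = {z:.2f}")
```

The ratio for the lagged response falls from about 6.97 to about 0.28 once s1 enters the model. The ratio for the scale parameter is only a rough guide, because a variance component is tested on the boundary of its parameter space.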
21 Modelling the Initial Conditions
- Use the same random effect in the initial and
subsequent responses, e.g. Crouchley and Davies
(2001).
- Use a one-factor decomposition for the initial
and subsequent responses, e.g. Heckman (1981a),
Stewart (2007).
- Use different (but correlated) random effects for
the initial and subsequent responses.
- Embed the Wooldridge (2005) approach in joint
models for the initial and subsequent responses.
22 (1) Same random effect common scale parameter
likelihood
23 (1) Same random effect common scale parameter
likelihood
- To set this model up in Sabre 5.0 we combine the
linear predictors by using dummy variables so
that for
24 Depression Same Scale Joint Likelihood (1)
25 Depression Joint Likelihood (1)
- The coefficient of r2s.lag1 is 0.70228E-01
(0.14048), suggesting that there is no state
dependence in these data, while the scale
coefficient 1.0372 (0.10552) suggests
heterogeneity.
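The dummy-variable set-up used to combine the linear predictors of the joint model can be pictured as a stacked data layout. Sabre's own commands are not reproduced here; the sketch below only illustrates the layout in plain Python. The indicator names r1 and r2 and the interaction name r2_lag1 are assumptions, loosely modelled on the r2s.lag1 term in the output.

```python
def stack_joint(responses):
    """Stack one subject's 0/1 responses for a joint initial/subsequent model.

    r1 flags the initial response (i = 1), r2 flags subsequent responses
    (i > 1), and r2_lag1 is the lagged response, switched on only when r2 = 1.
    """
    rows = []
    for i, y in enumerate(responses):
        initial = i == 0
        rows.append(
            {
                "y": y,
                "r1": 1 if initial else 0,
                "r2": 0 if initial else 1,
                "r2_lag1": 0 if initial else responses[i - 1],
            }
        )
    return rows


# Hypothetical subject observed on four occasions.
for row in stack_joint([1, 0, 1, 1]):
    print(row)
```

Multiplying each covariate by r1 or r2 in this way lets a single stacked model carry separate linear predictors for the initial and the subsequent responses while sharing one random effect per subject.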
26 (2) Same random effect but with different scale
parameters
- This model can be derived from a one-factor
decomposition of the random effects for the
initial and subsequent observations; for its use
in this context see Heckman (1981a) and Stewart
(2007).
- The likelihood is like that for the common scale
parameter model, except that for i = 1
- For the rest we have
27 Stewart (2007) parameterization
- for i = 1
- for i > 1
- As in the common scale and all the joint models
28 Depression Different Scale, Joint Likelihood (2)
29 (3) Different (but correlated) random effects
Likelihood
30 (3) Different (but correlated) random effects
- The scale parameter for the initial response is
not identified in the presence of the correlation
ρ in the binary or linear models.
- So in these models we hold it at the same value
as that of the subsequent responses.
- Again
31 Depression Different (but correlated) random
effects (3)
32 (4) Depression Different (but correlated) random
effects
- Note that the log likelihood is exactly the same
as for the previous model.
- The scale2 parameter from the previous model has
the same value as the scale parameter of the
current model.
- The lagged response r2s.lag1 has an estimate of
0.50313E-01 (0.15945), which is not significant.
- The correlation between the random effects (corr)
has estimate 0.97089 (0.10093), which is very
close to 1, suggesting that the common random
effects, zero order, single scale parameter model
is to be preferred.
33 (4) Embed the Wooldridge (2005) approach in the
joint models
- We can include the initial response in the linear
predictors of the subsequent responses of any of
the joint models, but for simplicity we will use
the single random effect, single scale parameter
model.
- The likelihood for this model is
34 (4) Embed the Wooldridge (2005) approach in the
joint models
- Linear Predictors
- For all i
35 Depression Wooldridge (2005) approach in the
joint models (4)
36 (4) Embed the Wooldridge (2005) approach in the
joint models
- In this joint model both the lagged response
r2s.lag1 estimate of 0.61490E-01 (0.15683) and
the baseline/initial response effect r2s.base
estimate of -0.33544E-01 (0.26899) are
non-significant.
- If we fit a standard zero order model with
dummy variables for seasons 2, 3 and 4 to all
the data, we get
37 0 Order Model
38 0 Order Model
- This model, without any state dependence,
suggests that the worst seasons are t3 (autumn)
and t4 (winter).
- This model also has a good fit to the data: the
log L(Data) = -1141.54, so the ChiSq (goodness of
fit to the data) is 3.119 (10 df).
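The goodness-of-fit statistic quoted above can be turned into a p-value without any statistical library, because the chi-square upper tail has a closed form when the degrees of freedom are even. The helper name below is illustrative, and the calculation assumes the statistic is referred to a chi-square distribution on 10 df.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate, valid for even df only.

    Uses the closed form exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k  # builds (x/2)^k / k! from the previous term
        total += term
    return math.exp(-half) * total


p_value = chi2_sf_even_df(3.119, 10)
print(f"p-value = {p_value:.3f}")
```

The p-value is about 0.98, so on this reading there is no evidence of lack of fit for the zero order model.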
39 Other link functions
- State dependence can also occur in Poisson and
linear models. For a linear model example, see
Baltagi and Levin (1992) and Baltagi (2005).
These data concern the demand for cigarettes per
capita by state for 46 American States.
- We have found first order state dependence in the
Poisson data of Hall et al. (1984) and Hall,
Griliches and Hausman (1986). The data refer to
the number of patents awarded to 346 firms each
year from 1975 to 1979.
40 Exercises
- Exercise FOL1: Trade Union Membership 1980-1987 of
young males, Wooldridge (2005)
- Exercise FOL2: Probit model of union membership of
females, Stewart (2006)
- Exercise FOL3: Binary Response, Female Labour
Force Participation in the UK