SC968: Panel Data Methods for Sociologists - PowerPoint PPT Presentation

Provided by: mariai

1
SC968 Panel Data Methods for Sociologists
  • Random coefficients models

2
Overview
  • Random coefficients models
  • Continuous data
  • Binary data
  • Growth curves

3
Random coefficients models
  • Also known as:
  • Multilevel models (software: MLwiN,
    http://www.cmm.bristol.ac.uk/)
  • Hierarchical models (software: HLM,
    http://www.ssicentral.com/)

4
Random coefficients models for continuous outcomes
5
Random coefficients models
  • We started off with fixed effects models that
    pooled data across many waves of panel data
  • Then we allowed intercepts to vary for each
    individual using random effects models
  • We can also allow the coefficients for the
    independent variables to vary for each individual
  • These models are called random coefficients or
    random slopes models

6
Possible combinations of slopes and intercepts
  • Constant slopes, constant intercept: the
    assumptions required for this model are unlikely
    to hold
  • Constant slopes, varying intercepts: the fixed
    effects model
  • Varying slopes, constant intercept: unlikely to
    occur
  • Varying slopes, varying intercepts: separate
    regression for each individual
7
Partitioning unexplained variance in a random
coefficients model
8
Random coefficients model for continuous data
(Equation not transcribed; labelled terms: fixed coefficients, random coefficients, residual)
9
Random coefficients model for continuous data
(Equation not transcribed; labelled terms: fixed intercept, fixed slope, random intercept, random slope, random error)
10
Worked example
  • Random 20% sample from BHPS
  • Waves 1 - 15
  • Ages 21 to 59
  • Outcome: GHQ Likert scores
  • Explanatory variable: household income last month
    (logged)

11
Random coefficients model example
  • where
  • yij = GHQ score for subject i at wave j,
    j = 1, ..., J
  • xij = logged household income in the month to
    wave j
  • β1 = mean slope
  • bi = subject-specific random deviation from
    mean slope
  • ui = subject-specific random intercept
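
The model equation on this slide did not survive transcription. A form consistent with the definitions above and with the term labels on slide 9 might look like the following. The symbols β0 (fixed intercept) and εij (random error) are assumed names, since the transcript does not give them.

```latex
\begin{aligned}
y_{ij} &= \beta_0 + \beta_1 x_{ij} + u_i + b_i x_{ij} + \varepsilon_{ij} \\
       &= (\beta_0 + u_i) + (\beta_1 + b_i)\, x_{ij} + \varepsilon_{ij}
\end{aligned}
```

The second line groups the terms so that each subject i has its own intercept (β0 + ui) and its own slope (β1 + bi).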

12
Linear random coefficients model
  • Stata output

13
. xtmixed hlghq1 lnfihhmn || pid: lnfihhmn, mle cov(unstr) variance

Mixed-effects ML regression                     Number of obs      =     18541
Group variable: pid                             Number of groups   =      2508

                                                Obs per group: min =         1
                                                               avg =       7.4
                                                               max =        15

                                                Wald chi2(1)       =     28.11
Log likelihood = -55286.004                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
      hlghq1 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    lnfihhmn |  -.4015666    .075741    -5.30   0.000    -.5500162    -.253117
       _cons |   14.40864   .5917387    24.35   0.000     13.24885    15.56843
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
pid: Unstructured            |
               var(lnfihhmn) |   2.073304   .3231129      1.527594    2.813961
                  var(_cons) |   144.5579   19.37199      111.1664    187.9793
         cov(lnfihhmn,_cons) |  -16.63265   2.488823     -21.51065   -11.75465
-----------------------------+------------------------------------------------
               var(Residual) |   18.10746   .2092684      17.70191     18.5223
------------------------------------------------------------------------------
LR test vs. linear regression:       chi2(3) =  5099.77   Prob > chi2 = 0.0000

Annotations on the command:
  • lnfihhmn (before ||): fixed effect
  • lnfihhmn (after pid:): random slopes
  • cov(unstr): estimates covariance between all
    random effects; least restrictive model
14
(Stata output repeated from slide 13, with the fixed part highlighted)
  • Fixed coefficient: lnfihhmn = -.4015666
  • Fixed intercept: _cons = 14.40864
15
(Stata output repeated from slide 13, with the random part highlighted)
  • Random slope: var(lnfihhmn) = 2.073304
  • Random intercept: var(_cons) = 144.5579
  • Covariation between random intercept and random
    slope: cov(lnfihhmn,_cons) = -16.63265
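
The covariance between the random intercept and random slope is easier to read as a correlation. The sketch below is not part of the original slides: it converts the three estimates reported in the slide 13 output, and the function name `correlation` is hypothetical.

```python
import math

# Variance components copied from the xtmixed output on slide 13
var_slope = 2.073304             # var(lnfihhmn)
var_intercept = 144.5579         # var(_cons)
cov_slope_intercept = -16.63265  # cov(lnfihhmn,_cons)


def correlation(cov, var_a, var_b):
    """Correlation implied by a covariance and the two variances."""
    return cov / math.sqrt(var_a * var_b)


corr = correlation(cov_slope_intercept, var_slope, var_intercept)
print(round(corr, 2))
```

The result is strongly negative (about -0.96): subjects with higher intercepts tend to have more negative income slopes.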
16
Post estimation predictions
  • Stata output

17
Post estimation predictions: random coefficients
. predict re_slope re_int, reffects
----------------------------------------
        pid       re_int     re_slope
----------------------------------------
  1.  10019057   -4.833731    .4040526
  4.  10028005   -5.241494    .3758182
 16.  10042571   -1.442705    .1409662
 17.  10051538   -2.836288     .305106
 35.  10059377    5.487209   -.5189736
----------------------------------------
18
Predicted individual regression lines
. gen intercept = _b[_cons] + re_int
. gen slope = _b[lnfihhmn] + re_slope
---------------------------------
     pid   intercept       slope
---------------------------------
10019057    9.574909     .002486
10028005    9.167147   -.0257484
10042571    12.96594   -.2606004
10051538    11.57235   -.0964606
10059377    19.89585   -.9205403
---------------------------------
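
As a check on the arithmetic, the subject-specific lines can be rebuilt by adding each predicted random effect to the corresponding fixed estimate. This is a minimal sketch using the values listed on slides 13 and 17; the function name `individual_line` is hypothetical.

```python
# Fixed part from the xtmixed output on slide 13
b_cons = 14.40864
b_lnfihhmn = -0.4015666

# Predicted random effects from slide 17: pid -> (re_int, re_slope)
random_effects = {
    10019057: (-4.833731, 0.4040526),
    10028005: (-5.241494, 0.3758182),
    10042571: (-1.442705, 0.1409662),
    10051538: (-2.836288, 0.305106),
    10059377: (5.487209, -0.5189736),
}


def individual_line(re_int, re_slope):
    """Return (intercept, slope) for one subject: fixed estimate plus random effect."""
    return b_cons + re_int, b_lnfihhmn + re_slope


lines = {pid: individual_line(*re) for pid, re in random_effects.items()}
for pid, (intercept, slope) in lines.items():
    print(pid, round(intercept, 5), round(slope, 7))
```

Small differences in the last digit relative to the slide are expected, because the inputs here are the rounded values printed in the output.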
19
Partitioning unexplained variance in a random
coefficients model
20
Calculating the variance partition coefficient
  • Random intercepts model

VPC = Between variance / Total variance
(i.e. Total = Between + Within)
21
Calculating the variance partition coefficient
  • Random slopes model
  • At the intercept, xij = 0
  • So, the VPC for the random slopes model reduces
    to the same as the random intercepts model
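
The formula on this slide was not transcribed. For a model with one random slope, the usual form of the variance partition coefficient treats the between-subject variance as a quadratic function of the covariate. The symbols below are assumed names: σ²u for the random intercept variance, σ²b for the random slope variance, σub for their covariance, and σ²e for the residual variance.

```latex
\mathrm{VPC}(x_{ij}) =
  \frac{\sigma_u^2 + 2\sigma_{ub}\,x_{ij} + \sigma_b^2\,x_{ij}^2}
       {\sigma_u^2 + 2\sigma_{ub}\,x_{ij} + \sigma_b^2\,x_{ij}^2 + \sigma_e^2},
\qquad
\mathrm{VPC}(0) = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
```

Setting xij = 0 removes the slope terms, which is why the expression collapses to the random intercepts version.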

22
Variance partition coefficient for our example
Tentative interpretation: least variability in
GHQ for those on average incomes
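
The chart on this slide was not transcribed. The sketch below shows how such a curve could be computed from the variance components in the slide 13 output, assuming the standard quadratic variance function for a single random slope. The income values in the loop are illustrative choices, not values from the slides.

```python
# Variance components from the xtmixed output on slide 13
VAR_INTERCEPT = 144.5579   # var(_cons)
VAR_SLOPE = 2.073304       # var(lnfihhmn)
COV = -16.63265            # cov(lnfihhmn,_cons)
VAR_RESIDUAL = 18.10746    # var(Residual)


def between_variance(x):
    """Between-subject variance at logged income x (quadratic in x)."""
    return VAR_INTERCEPT + 2 * COV * x + VAR_SLOPE * x ** 2


def vpc(x):
    """Share of unexplained variance that lies between subjects at x."""
    between = between_variance(x)
    return between / (between + VAR_RESIDUAL)


# Logged income at which between-subject variance is smallest
# (vertex of the quadratic)
x_min = -COV / VAR_SLOPE

for x in (6.0, 7.0, x_min, 9.0):  # illustrative values of logged income
    print(round(x, 2), round(vpc(x), 3))
```

With these estimates the minimum falls at a logged income of roughly 8.0, and the VPC rises on either side of it. Whether that point matches average income depends on the sample mean, which the transcript does not report.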
23
Random coefficients models
  • Categorical outcomes

24
Random coefficients model for binary data
  • where
  • βk is the mean coefficient or fixed
    effect of covariate k
  • bik is a subject-specific random deviation
    from mean coefficient
  • ui is a subject-specific random intercept with
    mean zero
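
The equation itself is missing from the transcript. One form consistent with the definitions above is a logit link applied to the same linear predictor as in the continuous case. The names β0 (fixed intercept) and xijk (value of covariate k for subject i at wave j) are assumptions.

```latex
\operatorname{logit}\bigl\{\Pr(y_{ij} = 1)\bigr\}
  = \beta_0 + u_i + \sum_{k} (\beta_k + b_{ik})\, x_{ijk}
```

With a single covariate the sum has one term, giving a subject-specific intercept β0 + ui and a subject-specific slope β1 + bi on the log-odds scale.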

25
Logistic random coefficients example
  • where
  • yij = binary GHQ score for subject i at wave j,
    j = 1, ..., J
  • xij = employment status in wave j
  • β1 = mean slope
  • bi = subject-specific random deviation from
    mean slope
  • ui = subject-specific random intercept

26
Worked example
  • Random 20% sample
  • 15 waves of BHPS
  • Ages 21 to 59
  • Outcome: GHQ binary scores (psychological
    morbidity cases: hlghq2 > 2)
  • Explanatory variable: employment status (jbstat
    recoded to employed/unemployed/olf, where olf is
    out of the labour force)

27
Logistic random coefficients model
  • Stata output

28
(No Transcript)
29
No constant term with odds ratios
30
No random residual for logit model
31
(No Transcript)
32
Random coefficients models for development over
time
33
Growth curve models
  • Models change over time as a continuous
    trajectory
  • Suitable for research questions such as
  • What is the trajectory for the population?
  • Are there distinct trajectories for each
    respondent?
  • If individuals have distinct trajectories, what
    variables predict these individual trajectories?

34
Linear growth curve model
  • Individual growth curves
  • t = 0 at baseline and 1, 2, 3, ..., T in
    successive waves
  • Mean population growth curve
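
The two equations on this slide were not transcribed. A sketch that reuses the random coefficients notation from earlier in the deck, with time t in place of the covariate, might read as follows. The symbols β0, β1, ui, bi and εit are carried over as assumptions.

```latex
\begin{aligned}
\text{Individual growth curve:}\quad
  y_{it} &= (\beta_0 + u_i) + (\beta_1 + b_i)\, t + \varepsilon_{it} \\
\text{Mean population growth curve:}\quad
  \mathrm{E}(y_{it}) &= \beta_0 + \beta_1 t
\end{aligned}
```

Because t = 0 at baseline, β0 is the mean outcome at baseline and β1 is the mean change per wave.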

35
Worked example
  • Random 20% sample from BHPS
  • Waves 1 - 15
  • All respondents over 16 years
  • Outcome: self-rated health (hlstat)
  • Linear growth: wave 1 to wave 15

36
(No Transcript)
37
(No Transcript)
38
Slope (change in health over time)
Intercept (mean health at baseline)
39
Individual differences in health change
Individual differences in baseline health
40
(No Transcript)
41
Adding time invariant covariates
42
Interacting gender with time
43
Adding time varying covariates
44
Beyond linear change
  • Polynomial trajectories
  • Quadratic or cubic trajectories
  • Piecewise linear trajectories
  • Exponential trajectories
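
Two of the trajectory shapes listed above differ only in which time terms enter the model. The sketch below builds those terms for a quadratic and a one-knot piecewise linear trajectory. The function names and the knot position are hypothetical, chosen only for illustration.

```python
def quadratic_terms(t):
    """Time terms for a quadratic trajectory: t and t squared."""
    return [t, t ** 2]


def piecewise_terms(t, knot):
    """Time terms for a piecewise linear trajectory with one knot.

    The coefficient on the first term is the slope before the knot;
    the coefficient on the second is the change in slope after it.
    """
    return [t, max(0, t - knot)]


KNOT = 5  # hypothetical knot, not taken from the slides
waves = range(0, 15)  # t = 0 at baseline, then 14 further waves

design = [(t, quadratic_terms(t), piecewise_terms(t, KNOT)) for t in waves]
for t, quad, piece in design[:8]:
    print(t, quad, piece)
```

Each list would supply the time columns of the model, with random coefficients allowed on whichever terms are expected to vary between individuals.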

45
Non linear growth curves
46
Specifying time
  • Metrics of time
  • Wave of assessment
  • Chronological age
  • Time before/after an event
  • Individually varying values of time

47
Age-period-cohort
  • A cohort is defined by their age in a particular
    period
  • Impossible to separate age, period and cohort
    effects
  • But any pair of the 3 factors are independent
  • If wave is our metric of time, then usual to
    control for age at baseline (i.e. cohort)
  • If age is our metric of time, then can control
    for period (i.e. wave) effects using dummy
    variables
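
The identification problem above comes from an exact identity: age equals period minus cohort. The sketch below uses made-up birth years and survey years (not BHPS data) to show that any two of the three values fix the third, while a panel that samples several cohorts still observes different ages in the same period.

```python
# Hypothetical respondents: (year of birth, survey year)
observations = [
    (1950, 1991), (1950, 1992), (1950, 1993),
    (1960, 1991), (1960, 1992), (1960, 1993),
    (1970, 1991), (1970, 1992), (1970, 1993),
]

rows = [
    {"cohort": born, "period": year, "age": year - born}
    for born, year in observations
]

# Knowing any two of the three values fixes the third exactly
assert all(r["age"] == r["period"] - r["cohort"] for r in rows)

# Within one cohort, age and period move in lockstep
for r in rows[:3]:
    print(r)

# Across cohorts, the same period covers several ages
ages_in_1991 = sorted(r["age"] for r in rows if r["period"] == 1991)
print(ages_in_1991)  # [21, 31, 41]
```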

48
Accelerated panel designs
  • Respondents of varying age sampled at same
    time-point then followed for several years
  • Period and age effects not completely confounded
    as in a birth cohort design
  • Assumption: after controlling for period, growth
    curves for each cohort overlap and form a smooth
    curve

49
Example of an accelerated panel design
  • Random 20% sample from BHPS
  • Waves 1 - 15
  • All respondents over 16 years
  • Outcome: self-rated health (hlstat)
  • Quadratic growth by age
  • Controlling for period effects (wave)

50
(No Transcript)
51
(No Transcript)
52
(No Transcript)
53
When are random coefficients not necessary?
  • When the number of units at level 2 is small:
    use dummy variables with OLS
  • When you want to correct for correlation between
    observations but are not interested in variation
    between and within subjects: use a fixed effects
    model
  • When correlation within subjects is small:
    using OLS when the ICC is small will give similar
    estimates

54
Finally.
  • Random coefficients models can take a very long
    time to estimate
  • Can speed things up by collapsing data and using
    frequency weights
  • My personal recommendation is to use MLwiN
  • Excellent online training material
  • Easier to build up model step-by-step
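
The collapsing idea mentioned above can be sketched as follows: identical rows are replaced by one row plus a count, and the count is then used as a frequency weight. The records are invented for illustration, and how the weights are passed to a particular package is not covered in the transcript.

```python
from collections import Counter

# Hypothetical records: (binary outcome, employment status)
records = [
    (0, "employed"), (0, "employed"), (1, "employed"),
    (1, "unemployed"), (1, "unemployed"), (0, "olf"),
    (0, "employed"),
]

# One row per distinct pattern, plus how many records share it
collapsed = [
    {"outcome": y, "status": s, "freq": n}
    for (y, s), n in sorted(Counter(records).items())
]
for row in collapsed:
    print(row)

# The frequency weights must add back up to the original row count
assert sum(row["freq"] for row in collapsed) == len(records)
```

Collapsing only helps when many rows share identical values, which is most likely with categorical covariates.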