Title: Multivariate 2 Level Generalised Linear Models
2. Multivariate 2 Level Generalised Linear Models
- We now introduce the superscript r to distinguish
the different models, variates, random effects, etc.
of a multivariate response.
- Cameron and Trivedi (1988) use various forms of
overdispersed Poisson model to study the
relationship between type of health insurance and
various responses that measure the demand for
health care, e.g. the number of consultations with a
doctor or specialist and the number of
prescriptions.
- An event history example occurs in modelling the
sequence of months of job vacancies, which last
until they are either successfully filled or
withdrawn from the market. These data lead to a
correlated competing risks model, as the firm
effects are present in both the filled and lapsed
durations.
3. Multivariate 2 Level Generalised Linear Models
- A trivariate example is the joint (simultaneous
equation) modelling of wages, training and
promotion of individuals over time, as recorded in
a panel survey such as the British Household
Panel Survey (BHPS).
- Joint modelling of simultaneous responses allows
us to disentangle the direct effects of the
different responses on each other from any
correlation that occurs in the random effects.
- Without a multivariate multilevel GLM for complex
social processes like these, we risk inferential
errors.
4. Multivariate 2 Level Generalised Linear Models
- The multivariate GLM is obtained from the
univariate GLM (using superscripts). Each response
r has its own
- scale parameter
- conditional mean
- variance
- linear predictor
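As a hedged sketch of what these four labels refer to, the standard exponential-family form of a two-level GLM can be written with the superscript r as follows. The subscripts i (level-1 observation) and j (level-2 case), the functions b and c, the link g, and the restriction to a single random intercept per response are notational assumptions made here, not taken from the slides.

```latex
\begin{align*}
f\left(y^{r}_{ij} \mid \theta^{r}_{ij}, \phi^{r}\right)
  &= \exp\left\{
       \frac{y^{r}_{ij}\,\theta^{r}_{ij} - b^{r}\!\left(\theta^{r}_{ij}\right)}{\phi^{r}}
       + c^{r}\!\left(y^{r}_{ij}, \phi^{r}\right)
     \right\}
  && \text{(scale parameter } \phi^{r}\text{)} \\
\mathrm{E}\left[y^{r}_{ij} \mid \theta^{r}_{ij}\right]
  &= \mu^{r}_{ij} = \left(b^{r}\right)'\!\left(\theta^{r}_{ij}\right)
  && \text{(conditional mean)} \\
\mathrm{Var}\left[y^{r}_{ij} \mid \theta^{r}_{ij}\right]
  &= \phi^{r}\left(b^{r}\right)''\!\left(\theta^{r}_{ij}\right)
  && \text{(variance)} \\
g^{r}\!\left(\mu^{r}_{ij}\right) = \eta^{r}_{ij}
  &= \left(\mathbf{x}^{r}_{ij}\right)^{\top}\boldsymbol{\beta}^{r} + u^{r}_{0j}
  && \text{(linear predictor)}
\end{align*}
```

For a Poisson response with log link, for example, b(\theta) = \exp(\theta) and \phi = 1, so the conditional mean and variance are both \exp(\eta).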
5. Multivariate 2 Level Generalised Linear Models: Likelihood
- The distribution of the random effects is a
multivariate Normal distribution of dimension R
with mean zero and an R x R variance-covariance
structure.
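For illustration only, the likelihood implied by this description has the following general shape, assuming one random intercept per response and conditional independence of the responses given those intercepts. The symbols f, \Sigma, \mathbf{u}_{0j} and the indices i (observation) and j (case) are notation chosen here, not taken from the slides.

```latex
\begin{equation*}
L\left(\boldsymbol{\beta}, \boldsymbol{\phi}, \Sigma \mid \mathbf{y}\right)
= \prod_{j} \int \cdots \int
  \left[ \prod_{i} \prod_{r=1}^{R}
         f\left(y^{r}_{ij} \mid \mathbf{x}^{r}_{ij}, u^{r}_{0j}, \phi^{r}\right)
  \right]
  f\left(\mathbf{u}_{0j}\right) \, d\mathbf{u}_{0j},
\qquad
\mathbf{u}_{0j} = \left(u^{1}_{0j}, \ldots, u^{R}_{0j}\right)^{\top}
\sim N_{R}\left(\mathbf{0}, \Sigma\right).
\end{equation*}
```

In the bivariate case (R = 2), \Sigma has diagonal entries \sigma_{1}^{2} and \sigma_{2}^{2} and off-diagonal entry \rho\sigma_{1}\sigma_{2}, so \rho is the correlation between the two random intercepts.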
6. Example: Bivariate Poisson model
- The Cameron and Trivedi (1988, 1998) example is
from the Australian Health Survey for 1977-1978.
- We use a version of the Cameron and Trivedi
(1988) data set (called racd.tab) for a bivariate
model. In this example each sampled individual
contributes only one pair of responses (dvisits,
prescrib).
- Data description
- Number of observations (rows): 5190
- Number of variables (columns): 21
7. Poisson Model Example C6
- Variables
- sex: 1 if respondent is female, 0 if male
- age: respondent's age in years divided by 100
- agesq: age squared
- income: respondent's annual income in Australian
dollars divided by 1000
- levyplus: 1 if respondent is covered by private
health insurance fund for private patient in
public hospital (with doctor of choice), 0
otherwise
- freepoor: 1 if respondent is covered by
government because low income, recent immigrant,
unemployed, 0 otherwise
- freerepa: 1 if respondent is covered free by
government because of old-age or disability
pension, or because invalid veteran or family of
deceased veteran, 0 otherwise
- illness: number of illnesses in past 2 weeks,
with 5 or more coded as 5
- actdays: number of days of reduced activity in
past two weeks due to illness or injury
- hscore: respondent's general health
questionnaire score using Goldberg's method; a high
score indicates bad health
- chcond1: 1 if respondent has chronic
condition(s) but not limited in activity, 0
otherwise
- chcond2: 1 if respondent has chronic
condition(s) and limited in activity, 0 otherwise
- dvisits: number of consultations with a doctor
or specialist in the past 2 weeks
- nondocco: number of consultations with
non-doctor health professionals (chemist,
optician, physiotherapist, social worker,
district community nurse, chiropodist or
chiropractor) in the past 2 weeks
- hospadmi: number of admissions to a hospital,
psychiatric hospital, nursing or convalescent
home in the past 12 months (up to 5 or more
admissions, which is coded as 5)
- hospdays: number of nights in a hospital, etc.
during most recent admission, in past 12 months
- medicine: total number of prescribed and
nonprescribed medications used in past 2 days
- prescribe: total number of prescribed
medications used in past 2 days
8. Bivariate Poisson Model Example C6
9. Demo example c6
10. Results: Bivariate Model (racd.dat)
- This shows different levels of overdispersion in
the different responses and a large correlation
between the random intercepts.
- If we had not been interested in obtaining the
correlation between the responses, we could have
done a separate analysis of each response and
made adjustments to the SEs.
- This is legitimate here as there are no
simultaneous direct effects (e.g. visits on
prescribe) in this model.
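The two features reported here, response-specific overdispersion and correlation between the responses, can be reproduced in a small simulation. The sketch below is not the Sabre fit and does not use the racd data: the intercepts, random-effect standard deviations and correlation are made-up illustrative values, and `simulate_bivariate_poisson` is a hypothetical helper name.

```python
import numpy as np


def simulate_bivariate_poisson(n, beta, sigma, rho, seed=0):
    """Draw n pairs of Poisson counts that share correlated Normal random intercepts.

    beta  : two log-scale intercepts (illustrative values, not estimates)
    sigma : standard deviations of the two random intercepts
    rho   : correlation between the two random intercepts
    """
    rng = np.random.default_rng(seed)
    cov = np.array([
        [sigma[0] ** 2, rho * sigma[0] * sigma[1]],
        [rho * sigma[0] * sigma[1], sigma[1] ** 2],
    ])
    # One pair of random intercepts per individual.
    u = rng.multivariate_normal(np.zeros(2), cov, size=n)
    # Log link: conditional mean of each count given its random intercept.
    mu = np.exp(np.asarray(beta) + u)
    return rng.poisson(mu)


counts = simulate_bivariate_poisson(
    n=200_000, beta=(-1.0, -0.5), sigma=(1.0, 0.7), rho=0.8
)

# A plain Poisson response has variance/mean = 1; values above 1 indicate overdispersion.
dispersion_ratio = counts.var(axis=0) / counts.mean(axis=0)
# Correlation between the two observed counts, induced only by the random intercepts.
count_corr = np.corrcoef(counts.T)[0, 1]

print("variance/mean ratio per response:", dispersion_ratio)
print("correlation between the two counts:", count_corr)
```

A larger random-intercept standard deviation gives a larger variance-to-mean ratio for that response, and setting `rho=0` removes the correlation between the counts, which is the situation in which separate analyses of each response lose nothing.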
11. Bivariate linear and probit Example L9
- The data we use are a version of the NLSY data
used in various Stata manuals (to illustrate the
xt commands). The data are for young women who
were aged 14-26 in 1968.
- The women were surveyed each year from 1970 to
1988, except for 1974, 1976, 1979, 1981, 1984 and
1986.
- We have removed records with missing values on
one or more of the response and explanatory
variables we want to use in our analysis of the
joint determinants of wages and trade union
membership.
- There are 4132 women (idcode) with between 1 and
12 years of observation in which they were in
employment (i.e. not in full-time education) and
earning more than $1/hour but less than
$700/hour.
- The direct effect of trade union membership on
wages is dealt with by including trade union
membership as a covariate in the linear predictor
of the wage equation.
12. Path Diagrams
- This picture shows the dependence between trade
union membership and wages. There are no
multilevel random effects affecting either wages
or trade union membership. This model can be
estimated by any software that estimates basic
GLMs.
- This picture also shows the dependence
between trade union membership and wages. This
time there are multilevel random effects
affecting both wages and trade union membership.
However, the multilevel random effects are
independent. This model can be estimated by any
software that estimates multilevel GLMs, by
treating the wage and trade union models as
independent.
13. Path Diagrams
- This picture shows the dependence between trade
union membership and wages. This time there is a
correlation between the multilevel random effects
affecting wages and trade union membership, shown
by the curved line linking them together. This
model can be estimated by Sabre as a bivariate
multilevel GLM by allowing for a correlation
between the wage and trade union responses at
each wave of the panel.
14. Bivariate linear and probit Example L9
- Data description
- Number of observations: 18995
- Number of cases (women): 4132
- Variables include
- ln_wage: ln(wage/GNP deflator) in a particular
year
- black: 1 if woman is black, 0 otherwise
- msp: 1 if woman is married and spouse is present,
0 otherwise
- grade: years of schooling completed (0-18)
- not_smsa: 1 if woman was living outside a standard
metropolitan statistical area (smsa), 0
otherwise
- south: 1 if the woman was living in the South, 0
otherwise
- union: 1 if a member of a trade union, 0
otherwise
- tenure: job tenure in years (0-26)
- age: respondent's age
- age2: age * age
15. Bivariate linear and probit Example L9
- We take ln_wage (linear model) and union (probit
link) as the response variables and model each of
them with a random intercept and a range of
explanatory variables.
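One way to write this joint model down is sketched below. The notation is assumed here rather than taken from the slides: superscript 1 marks the wage equation and 2 the union equation, i indexes the survey year and j the woman, and \delta is the direct effect of union membership, which the deck handles by including union as a covariate in the wage equation.

```latex
\begin{align*}
\text{ln\_wage}_{ij}
  &= \left(\mathbf{x}^{1}_{ij}\right)^{\top}\boldsymbol{\beta}^{1}
     + \delta\,\text{union}_{ij} + u^{1}_{0j} + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N\left(0, \sigma^{2}_{\varepsilon}\right), \\
\Pr\left(\text{union}_{ij} = 1 \mid u^{2}_{0j}\right)
  &= \Phi\left(\left(\mathbf{x}^{2}_{ij}\right)^{\top}\boldsymbol{\beta}^{2} + u^{2}_{0j}\right), \\
\begin{pmatrix} u^{1}_{0j} \\ u^{2}_{0j} \end{pmatrix}
  &\sim N_{2}\left(
     \begin{pmatrix} 0 \\ 0 \end{pmatrix},
     \begin{pmatrix}
       \sigma_{1}^{2} & \rho\,\sigma_{1}\sigma_{2} \\
       \rho\,\sigma_{1}\sigma_{2} & \sigma_{2}^{2}
     \end{pmatrix}
   \right).
\end{align*}
```

Setting \rho = 0 gives two independent random-intercept models, and setting \sigma_{1} = \sigma_{2} = 0 as well gives the homogeneous models without random effects.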
16. Demo: Bivariate linear and probit Example L9
- Model for tunion on its own
- Model for ln_wage on its own
- Then estimate a joint model allowing for the
overdispersion in ln_wage and tunion and a
correlation between them.
- The log wage equation also contains union as an
explanatory variable.
17. Bivariate linear and probit Example L9
- This shows different levels of overdispersion in
the different responses and a positive
correlation between the random intercepts.
- The estimated effect of trade union membership in
the wage equation changes from its value in the
homogeneous model.
- Sabre 5.0 can model up to 3 different panel
responses simultaneously.
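A small simulation can illustrate why the union coefficient moves once the random intercepts are allowed to correlate. This is a sketch under stated assumptions, not the Sabre analysis: the data are simulated rather than taken from the NLSY file, all parameter values are invented, pooled least squares stands in for the homogeneous wage model, and `simulate_panel` and `pooled_ols_union_effect` are hypothetical helper names.

```python
import numpy as np


def simulate_panel(n_cases, n_waves, delta, rho, seed=0):
    """Simulate union membership (probit) and log wages with correlated random intercepts.

    delta : true direct effect of union membership on log wages (illustrative)
    rho   : correlation between the wage and union random intercepts
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    # One pair of random intercepts per woman, repeated over her waves.
    u = rng.multivariate_normal(np.zeros(2), cov, size=n_cases)
    u_wage = np.repeat(u[:, 0], n_waves)
    u_union = np.repeat(u[:, 1], n_waves)
    n_obs = n_cases * n_waves
    # Probit response: union = 1 when the latent variable is positive.
    union = (u_union + rng.standard_normal(n_obs) > 0).astype(float)
    # Linear response with a direct union effect and its own random intercept.
    ln_wage = 1.5 + delta * union + 0.5 * u_wage + 0.3 * rng.standard_normal(n_obs)
    return union, ln_wage


def pooled_ols_union_effect(union, ln_wage):
    """Union coefficient from pooled least squares, ignoring all random effects."""
    design = np.column_stack([np.ones_like(union), union])
    coef, *_ = np.linalg.lstsq(design, ln_wage, rcond=None)
    return coef[1]


true_delta = 0.10
est_independent = pooled_ols_union_effect(*simulate_panel(4000, 5, true_delta, rho=0.0))
est_correlated = pooled_ols_union_effect(*simulate_panel(4000, 5, true_delta, rho=0.6))

print("true direct effect:", true_delta)
print("pooled estimate, independent random intercepts:", est_independent)
print("pooled estimate, correlated random intercepts:", est_correlated)
```

With independent random intercepts the pooled estimate stays close to the true direct effect. With positively correlated intercepts it absorbs part of that correlation, which is the confounding a joint model with a correlation parameter is meant to separate out.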
18. Exercise
- There is an exercise to accompany this section:
the bivariate linear and logit exercise L10.