Title: Alternative methods for estimating systems of health equations
1 Alternative methods for estimating systems of
(health) equations
Silvia, Andrew and Casey
2 Motivating example
From Balia and Jones' (2005) analysis of
jointly determined dichotomous outcomes for
health, mortality and lifestyles.
What are we motivating? A couple of things:
- A generalisable method of estimating (at least
reduced-form) systems of equations in which the
data-generation process(es) can be identified at
each equation.
- A method through which information can be
retrieved from data via multivariate modelling.
Information on health, mortality and lifestyles
(eating breakfast, sleeping well, non-obese
weight, non-smoking, drinking alcohol prudently
and exercising) is taken from the original
(1984-5) Health and Lifestyle Survey, together
with the most recent follow-up (2005).
3 Behavioural problem
In Balia and Jones (2005), individuals are
assumed to maximise simultaneously the utility
function
where t-th period utility is determined by
- the vector of l (in this case 6) lifestyles, Olt,
- health Ht,
- exogenous variables XU, and
- unobserved µU.
Time preferences are also affected by βt, with
mortality risk pt.
4 Econometric problem
This formulation of individual utility gives 3
elements to be estimated: health, lifestyles and
mortality. In reduced form, the system to be
estimated is
Each of self-assessed health (SAH, 1 = good),
mortality (1 = dead) and the lifestyles, breakfast
(1 = eats one), obese (1 = not), non-smoker
(1 = is), prudent drinker (1 = is), sleeping well
(1 = is) and exercise (1 = undertaken), is
dichotomous or dichotomised.
5 Statistical problem
Estimated according to unobserved latent
variables yim, yih and yil, this presents us
with a fairly standard multivariate binary choice
model
The usual responses to this problem are the
multivariate probit or logit.
Ours will be a multivariate copula.
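For reference, a fairly standard way of writing a latent-variable multivariate binary choice model is sketched below. This is the textbook form, not necessarily the exact specification on the slide; the regressor vector x_i, coefficients β_j and errors ε_ij are assumed names.

```latex
% Assumed notation: i indexes individuals, j indexes the 8 equations
% (mortality m, health h, lifestyles l_1,...,l_6).
\[
  y_{ij}^{*} = x_i'\beta_j + \varepsilon_{ij},
  \qquad
  y_{ij} = \mathbf{1}\!\left[\, y_{ij}^{*} > 0 \,\right],
  \qquad
  j \in \{m, h, l_1, \dots, l_6\}.
\]
% The multivariate probit takes (\varepsilon_{im}, \dots, \varepsilon_{il_6})
% to be jointly normal; the copula approach instead chooses each marginal
% distribution of \varepsilon_{ij} separately and ties them together with C.
```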
6 Copulas I: revision
For every joint distribution H(x1,x2) whose
random variables are individually distributed
F1(x1), F2(x2) there exists a copula C such that
H(x1,x2) = C(F1(x1),F2(x2))
C is a function of F1(x1), F2(x2), not of x1, x2.
For continuous margins the copula is unique, and
it uniquely represents the association between
X1 and X2.
In practice there are many candidate copulas
C(F1(x1), F2(x2)) but only one true joint
distribution H(x1,x2), just as there are many
candidate forms for F1(x1) and F2(x2) in a
univariate framework.
7 Copulas II: example
Consider the Farlie-Gumbel-Morgenstern family.
The bivariate FGM copula/distribution is given by
The essential property of the copula is that
each Fi(xi) can be of any type. A secondary
property is the separation of each Fi(xi) from
the dependence parameter θ. This is so for the
density also, given by
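As a small illustration of both properties, the sketch below uses the standard bivariate FGM form, C(u,v) = uv[1 + θ(1-u)(1-v)] with density c(u,v) = 1 + θ(1-2u)(1-2v) and |θ| <= 1. The two margins (logistic and exponential) are arbitrary choices made here for illustration, and all function names are hypothetical.

```python
import math


def fgm_copula(u, v, theta):
    """Bivariate FGM copula C(u, v) = u*v*[1 + theta*(1-u)*(1-v)]."""
    if not -1.0 <= theta <= 1.0:
        raise ValueError("theta must lie in [-1, 1]")
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def fgm_density(u, v, theta):
    """FGM copula density c(u, v) = 1 + theta*(1-2u)*(1-2v)."""
    return 1.0 + theta * (1.0 - 2.0 * u) * (1.0 - 2.0 * v)


# Two margins of different types (illustrative assumptions only).
def logistic_cdf(x):
    return 1.0 / (1.0 + math.exp(-x))


def exponential_cdf(x, rate=1.0):
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def joint_cdf(x1, x2, theta):
    """H(x1, x2) = C(F1(x1), F2(x2)).

    The margins enter only through their CDF values, so theta is kept
    separate from whatever parameters the margins have.
    """
    return fgm_copula(logistic_cdf(x1), exponential_cdf(x2), theta)


if __name__ == "__main__":
    # Letting x2 grow recovers the first margin: H(x1, inf) = F1(x1).
    print(joint_cdf(0.0, 50.0, 0.5), logistic_cdf(0.0))
```

Swapping either margin for another CDF leaves `fgm_copula` untouched, which is the separation the slide refers to.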
8 Multivariate Copulas
In (very) basic statistical terms there are 4-5
multivariate copula (types) available to us: the
multivariate FGM, multivariate Archimedean,
Gaussian/t, Mixture-of-Powers and
Mixture-of-Max-ID.
In practical terms, however:
- The FGM and the Archimedean n-copulas are too
limiting in their capture of dependence.
- The Mixture-of-Powers copula is a combination of
limited and unworkable in dimensions this high.
By elimination, our analysis will be confined to
the Gaussian, t and Mixture-of-Max-ID copulas.
9 Gaussian and t Copulas
The Gaussian copula is based on the so-called
method of inversion. It is the copula C such
that
Thus
We still retain autonomy over each marginal
distribution type, and dependence is captured
generally by Pearson's correlation between
functions of the random variables (rather than
the random variables themselves).
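For reference, the inversion construction is usually written as below. This is the standard textbook form with assumed notation (R a correlation matrix, Φ_R the n-variate standard normal CDF with correlation R, Φ^{-1} the univariate standard normal quantile function); the slide's own notation may have differed.

```latex
\[
  C(u_1, \dots, u_n; R)
    = \Phi_R\!\left( \Phi^{-1}(u_1), \dots, \Phi^{-1}(u_n) \right),
\]
\[
  H(x_1, \dots, x_n)
    = \Phi_R\!\left( \Phi^{-1}\!\big(F_1(x_1)\big), \dots,
                     \Phi^{-1}\!\big(F_n(x_n)\big) \right).
\]
% The entries of R are correlations between the transformed quantities
% \Phi^{-1}(F_j(X_j)), not between the X_j themselves.
```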
10 Gaussian and t Copulas
Similarly the t-copula is H(.) such that
Essentially this is the same. Thanks to the
nature of the t-distribution, however, the
multivariate t-copula (or distribution) is a mite
narrower, with more tail dependence; or rather,
it captures tail dependence better if it is
there.
11 Mixture of Max-ID Copulas
The Mixture of Max-ID copula has two essential
elements: the mixture part and the Max-ID part.
Max-infinite divisibility is equivalent to total
positivity of order 2 (TP2): a non-negative
function g is TP2 if, for all x1 < y1 and
x2 < y2 with x, y ∈ R,
g(x1, x2)g(y1, y2) > g(x1, y2)g(y1, x2)
I.e. we demand strict monotonicity.
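The TP2 condition can be checked numerically on a grid, as in the minimal sketch below. It uses the FGM copula CDF as a convenient test function (for which the condition holds exactly when θ > 0); the helper names are hypothetical. Passing on a finite grid is necessary, not sufficient, so this is a sanity check, not a proof.

```python
from itertools import combinations


def is_strictly_tp2(g, grid):
    """Check g(x1,x2)*g(y1,y2) > g(x1,y2)*g(y1,x2) for all x1<y1, x2<y2 on grid."""
    pts = sorted(set(grid))
    for x1, y1 in combinations(pts, 2):      # every pair with x1 < y1
        for x2, y2 in combinations(pts, 2):  # every pair with x2 < y2
            if not g(x1, x2) * g(y1, y2) > g(x1, y2) * g(y1, x2):
                return False
    return True


def fgm_cdf(theta):
    """Return the bivariate FGM copula CDF for a given theta."""
    return lambda u, v: u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


grid = [i / 10 for i in range(1, 10)]        # interior points 0.1, ..., 0.9
print(is_strictly_tp2(fgm_cdf(0.5), grid))   # True: positive dependence
print(is_strictly_tp2(fgm_cdf(-0.5), grid))  # False: negative dependence
```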
12 Mixture of Max-ID Copulas
The mixture is a method by which Laplace
transforms are used to construct multivariate
distributions from univariate (the mixture of
powers) or lower-dimensional multivariate (the
mixture of Max-ID) ones.
For some CDF M of a positive random variable α,
with Laplace transform φ, the multivariate
distribution function F can be written in terms
of some max-ID distribution H, thus
Although we don't need to know this, necessarily.
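For the curious, the construction is usually written as below. This is the standard mixture form found in the copula literature, given here with assumed symbols (α for the mixing variable, φ for the Laplace transform of M); it may not match the slide's notation exactly.

```latex
\[
  F = \int_0^\infty H^{\alpha} \, dM(\alpha)
    = \varphi\!\left( -\log H \right),
  \qquad
  \varphi(s) = \int_0^\infty e^{-s\alpha} \, dM(\alpha).
\]
% The second equality uses H^{\alpha} = \exp\{-\alpha(-\log H)\}.
% Max-infinite divisibility is what guarantees that H^{\alpha} is a valid
% distribution function for every \alpha > 0, so the mixture is well defined.
```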
13 Mixed-max-ID Frank copula
The mixed distribution is then
The mixture-of-max-ID copula used in this
analysis is a mixture of the Frank copula
Using Joe's Laplace transform A
14 Mixed-max-ID Frank copula
Finally the copula is
Although we are more likely to use
15 (Non-)normality I
The reasons for going to all this trouble are
(potentially) several (two, at least). One is
control over univariate random variation; the
other is control over multivariate behaviour.
These arguments tend to be relative to
(multivariate) normality. Although the normality
assumption is fairly robust to departures from
normality, it is less so in higher dimensions.
In systems of (structural) equations, such as
Balia and Jones', multivariate non-normality can
also lead to erroneous rejection of some models
within the structure.
As well as protecting against this, sometimes the
information at hand suggests non-normality (in
dichotomous-variable estimation, we may have
reason to prefer a logit over a probit for some
margins).
16 (Non-)normality II
In a multivariate model, the nature of the
multivariate association is of (more or less)
importance:
- even entirely normally distributed margins may
not necessarily have a multivariate normal
distribution,
- many (non-normal) distributions do not extend to
a multivariate form (e.g. the Beta),
- in our analysis, we may prefer the logit but
face its multivariate correlation restrictions,
- or want the complementary log-log's asymmetry
but have no multivariate distribution.
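To make the asymmetry point concrete, here is a small sketch comparing the three binary-response links mentioned in these slides. A symmetric link satisfies F(z) + F(-z) = 1 for every z; the logit and probit do, the complementary log-log does not. Function names are illustrative.

```python
import math
from statistics import NormalDist


def logit_cdf(z):
    """Logistic CDF: the inverse logit link."""
    return 1.0 / (1.0 + math.exp(-z))


def probit_cdf(z):
    """Standard normal CDF: the inverse probit link."""
    return NormalDist().cdf(z)


def cloglog_cdf(z):
    """Inverse complementary log-log link, 1 - exp(-exp(z))."""
    return 1.0 - math.exp(-math.exp(z))


for name, F in [("logit", logit_cdf),
                ("probit", probit_cdf),
                ("cloglog", cloglog_cdf)]:
    # Symmetric links give exactly 1 here; the cloglog gives more than 1.
    print(name, F(1.0) + F(-1.0))
```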
17 Marginal distributions I
In fact we find no consistent results favouring
univariate probits in our multivariate model
18 Marginal distributions III
Nor do we consider a multivariate (latent)
logistic framework preferable
19 Marginal distributions II
20 Marginal distributions IV
21 Marginal distributions V
Ergo we would like to accommodate H(ym, yh, yl1,
yl2, yl3, yl4, yl5, yl6) such that
F1(ym): logit
F2(yh): complementary log-log
F3(yl1): complementary log-log
F4(yl2): logit
F5(yl3): logit
F6(yl4): complementary log-log
F7(yl5): logit
F8(yl6): complementary log-log
22 Univariate estimates I
What will this mean for results and
interpretations?
Pr(drinking prudently): marginal effects
23 Univariate estimates II
What will this mean for results and
interpretations?
Pr(drinking prudently): p-values
24 Estimating copulas
Ordinarily, the method of maximum likelihood is
to solve (∂L/∂β1, ..., ∂L/∂βn, ∂L/∂θ) = 0.
We use the method of Inference Functions for
Margins (IFM) to solve (∂L1/∂β1 = 0, ...,
∂Ln/∂βn = 0, ∂L/∂θ = 0).
This is then a 2-step procedure in which we
- first maximise each L1(β1), ..., Ln(βn),
retrieving ML estimates;
- then maximise L(β1, ..., βn, θ), holding the βs
at their first-step estimates, to get the ML
estimate of θ.
This facilitates the use of the Gaussian/t
copulas, as well as simplifying analysis of
ordinary copulas.
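The two steps can be sketched on a toy problem: two binary outcomes, intercept-only margins, and an FGM copula standing in for the copulas used in the analysis (chosen here only because it has a closed form). The cell counts are made up for illustration and all names are hypothetical.

```python
import math


def fgm_copula(u, v, theta):
    return u * v * (1.0 + theta * (1.0 - u) * (1.0 - v))


def cell_probs(p1, p2, theta):
    """Joint probabilities of (y1, y2) given P(y1=1)=p1 and P(y2=1)=p2."""
    u, v = 1.0 - p1, 1.0 - p2          # F1(0) and F2(0)
    p00 = fgm_copula(u, v, theta)      # P(y1=0, y2=0) = C(F1(0), F2(0))
    return {(0, 0): p00, (0, 1): u - p00,
            (1, 0): v - p00, (1, 1): 1.0 - u - v + p00}


def ifm_fit(counts):
    """IFM on a 2x2 table; counts[(y1, y2)] is the number of observations."""
    n = sum(counts.values())
    # Step 1: maximise each marginal likelihood on its own. For an
    # intercept-only binary margin the ML estimate is the sample proportion.
    p1 = (counts[(1, 0)] + counts[(1, 1)]) / n
    p2 = (counts[(0, 1)] + counts[(1, 1)]) / n

    # Step 2: hold the margins fixed and maximise the joint log-likelihood
    # over the dependence parameter only (crude grid search over [-1, 1]).
    def loglik(theta):
        probs = cell_probs(p1, p2, theta)
        return sum(c * math.log(probs[cell])
                   for cell, c in counts.items() if c > 0)

    grid = [i / 100 for i in range(-100, 101)]
    theta_hat = max(grid, key=loglik)
    return p1, p2, theta_hat


# Made-up counts, for illustration only.
counts = {(0, 0): 30, (0, 1): 20, (1, 0): 20, (1, 1): 30}
print(ifm_fit(counts))  # (0.5, 0.5, 0.8)
```

With covariates, step 1 would be an ordinary univariate logit or complementary log-log fit per equation, and step 2 would use the fitted probabilities for each observation in place of p1 and p2.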
25 Estimating the Gaussian
In accordance with the method of IFM, the
predicted probabilities (of choice) from each
margin are used
e.g.
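As a bivariate illustration of how predicted probabilities feed a Gaussian copula, the sketch below turns two marginal probabilities and a correlation ρ into the four joint cell probabilities. The bivariate normal CDF is computed by a simple Simpson's rule, which is a stand-in chosen to keep the example dependency-free and not the routine used in the analysis; all function names are hypothetical.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal


def bvn_cdf(a, b, rho, n=2000, lower=-8.0):
    """P(Z1 <= a, Z2 <= b) for a standard bivariate normal, |rho| < 1.

    Simpson's rule applied to the integral over x in [lower, a] of
    phi(x) * Phi((b - rho*x) / sqrt(1 - rho^2)); n must be even.
    """
    if a <= lower:
        return 0.0
    s = math.sqrt(1.0 - rho * rho)

    def f(x):
        return STD.pdf(x) * STD.cdf((b - rho * x) / s)

    h = (a - lower) / n
    total = f(lower) + f(a)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lower + i * h)
    return total * h / 3.0


def gaussian_copula(u, v, rho):
    """C(u, v) = Phi_2(Phi^{-1}(u), Phi^{-1}(v); rho), for 0 < u, v < 1."""
    return bvn_cdf(STD.inv_cdf(u), STD.inv_cdf(v), rho)


def joint_binary_probs(p1, p2, rho):
    """Cell probabilities for (y1, y2) from predicted P(y1=1), P(y2=1)."""
    u, v = 1.0 - p1, 1.0 - p2              # marginal CDFs evaluated at 0
    p00 = gaussian_copula(u, v, rho)
    return {(0, 0): p00, (0, 1): u - p00,
            (1, 0): v - p00, (1, 1): 1.0 - u - v + p00}


if __name__ == "__main__":
    print(joint_binary_probs(0.3, 0.6, 0.5))
```

The predicted probabilities p1 and p2 can come from any margin (logit, complementary log-log, ...); only ρ is left to estimate in the second step.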
26 Estimating the t
The t-copula is similarly determined
e.g.
27 Estimating the Max-ID I
The Max-ID copula requires a 3-step procedure,
which is achieved with some statistical
sleight-of-hand
Also a survival function for mortality.
28 Estimating the Max-ID II
For this:
- First estimate the margins, as before.
- Then estimate the bivariate margins separately,
replacing θ with some constant (> 1).
- Finally, replace θ as a random variable, with
numerical values for ?, and get the ML estimate
of global dependence.
What are the implications for bootstrapping the
standard errors of β? Unknown.
Bootstrapping may not be employed; analytic
derivatives will be available (maybe).
29 Comparing models I
Now the use of information criteria is more
informative.
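Because the candidate copula models are not nested in one another, information criteria are a natural way to rank them. A minimal sketch of the two usual criteria follows; the log-likelihoods and parameter counts are placeholders invented for illustration, not results from this analysis.

```python
import math


def aic(loglik, k):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion: k log n - 2 log L."""
    return k * math.log(n) - 2 * loglik


# Hypothetical fits: name -> (maximised log-likelihood, number of parameters).
fits = {"model_a": (-5120.0, 60), "model_b": (-5105.0, 61)}
n_obs = 1000  # hypothetical sample size

# Lower is better for both criteria.
best_by_aic = min(fits, key=lambda m: aic(*fits[m]))
best_by_bic = min(fits, key=lambda m: bic(*fits[m], n_obs))
print(best_by_aic, best_by_bic)
```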
30 Comparing models II
31 Univariate estimates II
What will this mean for results and
interpretations?
Pr(drinking prudently): p-values
32 Further analysis
Going to try out-of-sample predictions.
Copulas allow us to get away with a lot; can we
get away with more? Such as:
- hazard models for death,
- ordered models for health,
- count models for lifestyles.
33 Marginal distributions IV
34 Marginal distributions II