Title: BOULDER WORKSHOP STATISTICS REVIEWED: LIKELIHOOD MODELS
1 BOULDER WORKSHOP STATISTICS REVIEWED: LIKELIHOOD MODELS
2 PRE-HISTORY (STATISTICS 101)
- The binomial distribution gives the probabilities of the various possible numbers of successful outcomes in a fixed number of discrete trials, where all trials have the same probability of success.
- The probability that X is equal to a particular value x (x = 0, 1, 2, ..., n) is given by
  P(X = x) = (n choose x) p^x (1 − p)^(n−x)
  (a small computational sketch follows at the end of this slide).
- Useful for genetics (e.g. transmission versus
non-transmission of an allele)!
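A minimal computational sketch (not from the original slides) of the binomial probabilities above, using scipy; the choice of n = 10 transmissions with p = 0.5 is a hypothetical illustration of the allele-transmission example.

from scipy.stats import binom

# Hypothetical illustration: n = 10 transmissions from a heterozygous parent,
# each transmitted with probability p = 0.5 under Mendelian segregation.
n, p = 10, 0.5
for x in range(n + 1):
    # P(X = x) = (n choose x) * p**x * (1 - p)**(n - x)
    print(x, binom.pmf(x, n, p))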
3 LIKELIHOOD
- Focus on the probability (likelihood) of the data, as a function of the model parameters.
- E.g. Sham (1998) uses Neel and Schull (1954) data on opalescent dentine in a random sample of 112 offspring of an affected parent, of whom 52 were affected and 60 normal. Is this compatible with the hypothesis of a rare autosomal dominant? Does the observed proportion (0.464) differ from the expected proportion of 0.5?
- The likelihood function for the segregation ratio p is
  L(p) = (112 choose 52) p^52 (1 − p)^60
4 MAXIMUM LIKELIHOOD ESTIMATION
- Find the maximum value of the likelihood function in the range 0 ≤ p ≤ 1. In more difficult problems it is usual to maximize the log-likelihood, since this is computationally more convenient and also maximizes the likelihood.
- In this simple case, the maximum likelihood estimate (MLE) of p is r/n.
- For the opalescent dentine data, ignoring the constant term involving n and r (which does not vary as a function of p), the log-likelihood function is
  ln L(p) = 52 ln(p) + 60 ln(1 − p)
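As a check on the closed-form MLE r/n, here is a short sketch (not from the original slides) that maximizes the opalescent dentine log-likelihood numerically; log_lik is a function name introduced here for illustration.

import numpy as np
from scipy.optimize import minimize_scalar

r, n = 52, 112   # affected and total offspring in the opalescent dentine data

def log_lik(p):
    # ln L(p) = 52 ln(p) + 60 ln(1 - p), ignoring the constant binomial term
    return r * np.log(p) + (n - r) * np.log(1 - p)

res = minimize_scalar(lambda p: -log_lik(p), bounds=(1e-6, 1 - 1e-6), method="bounded")
print("numerical MLE:", res.x)     # approximately 0.464
print("closed form r/n:", r / n)   # 0.464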
5 Figure 2.1: Log-likelihood function of the segregation ratio p for the opalescent dentine data (from Sham, 1998)
6 LIKELIHOOD-RATIO STATISTIC
- The likelihood-ratio statistic is twice the difference between the log-likelihood of the data at the MLE (i.e. at p = r/n), ln L1, and the log-likelihood of the data at the hypothesized value of 0.5, ln L0:
  χ² = 2(ln L1 − ln L0)
- For the opalescent dentine example, χ² = 2[ln L(r/n) − ln L(0.5)] ≈ 0.57 (see the sketch below).
- The likelihood-ratio statistic in this case is distributed as chi-square on one degree of freedom, hence non-significant.
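A sketch (not from the original slides) of the likelihood-ratio test above, reproducing the value of about 0.57 and its chi-square p-value.

import numpy as np
from scipy.stats import chi2

r, n = 52, 112
p_hat, p0 = r / n, 0.5

def log_lik(p):
    # log-likelihood of the segregation ratio, up to a constant
    return r * np.log(p) + (n - r) * np.log(1 - p)

lr = 2 * (log_lik(p_hat) - log_lik(p0))
print("likelihood-ratio statistic:", lr)   # about 0.57
print("p-value:", chi2.sf(lr, df=1))       # clearly non-significant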
7 MATRIX ALGEBRA BASICS
- A^-1 = inverse of A
- A' or A^T = transpose of A
- |A| = determinant of A
- A × B = A postmultiplied by B
- An r × c matrix A and a c × r matrix B are conformable for multiplication, since the number of columns of A equals the number of rows of B; the resulting matrix is r × r.
- tr(A) = trace of matrix A
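A quick numpy illustration (not part of the original slides) of the operations listed above; the matrices A, B, R and C are arbitrary examples.

import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
B = np.array([[1.0, 4.0],
              [2.0, 5.0]])

print(np.linalg.inv(A))    # A^-1, inverse of A
print(A.T)                 # A' (or A^T), transpose of A
print(np.linalg.det(A))    # |A|, determinant of A
print(A @ B)               # A postmultiplied by B
print(np.trace(A))         # tr(A), trace of A

# Conformability: an r x c matrix times a c x r matrix gives an r x r matrix.
R = np.ones((2, 3))
C = np.ones((3, 2))
print((R @ C).shape)       # (2, 2)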
8 HISTORY (MX INTRODUCTORY WORKSHOP)
- Maximum-likelihood estimation using linear covariance structure models, e.g. fitting models to twin data.
- Let p be the number of observed variables, the expected covariance matrix be E, and the expected vector of means be μ, where E and μ are functions of q free parameters to be estimated from the data. Let x_1, x_2, ..., x_n denote the observed data vectors. Assuming that the observed variables follow a multivariate normal distribution, the log-likelihood of the observed data is given by
  ln L = Σ_i [ −(p/2) ln(2π) − (1/2) ln|E| − (1/2)(x_i − μ)' E^-1 (x_i − μ) ]
- This is the formula used for maximum-likelihood model-fitting to raw continuous data, assuming a multivariate normal distribution (often −2 ln L is estimated). It requires that we provide (see the sketch at the end of this slide):
  (a) a model for the expected covariance matrix, e.g. in terms of additive genetic, shared and non-shared environmental variance components, that will vary as a function of relationship;
  (b) a model for the expected means; in the simplest applications we might estimate a separate mean that might differ by gender, or possibly by twin-pair zygosity.
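A sketch (not part of the original slides) of the raw-data log-likelihood above; raw_data_log_lik is a name introduced here, and the data x, mean vector mu and expected covariance E would come from the model being fitted.

import numpy as np

def raw_data_log_lik(x, mu, E):
    """Multivariate-normal log-likelihood of raw data, summed over the rows of x."""
    n, p = x.shape
    E_inv = np.linalg.inv(E)
    sign, logdet = np.linalg.slogdet(E)
    dev = x - mu
    # quadratic form (x_i - mu)' E^-1 (x_i - mu), one value per row
    quad = np.einsum('ij,jk,ik->i', dev, E_inv, dev)
    return np.sum(-0.5 * (p * np.log(2 * np.pi) + logdet + quad))

# tiny illustration with simulated data and a 'true' model
rng = np.random.default_rng(0)
x = rng.multivariate_normal(mean=[0.0, 0.0], cov=np.eye(2), size=100)
print(-2 * raw_data_log_lik(x, np.zeros(2), np.eye(2)))   # -2 ln L, as often reported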
9 EXAMPLE
- MZ pairs:
  E = | VA+VC+VE   VA+VC    |
      | VA+VC      VA+VC+VE |,    μ = (m, m)'
- DZ pairs:
  E = | VA+VC+VE      0.5·VA+VC  |
      | 0.5·VA+VC     VA+VC+VE   |,    μ = (m, m)'
- where m is the estimate of the population mean, and VA, VC and VE are the additive genetic, shared environmental and non-shared environmental variances, all estimated jointly from the data (a numerical sketch follows below).
- Compare e.g. ln L1 for the VA + VC + VE model with ln L0 for the VC + VE model:
  2(ln L1 − ln L0) is distributed as chi-square on one degree of freedom, as before.
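A sketch (not from the original slides) of the expected covariance and mean structures above for a single trait measured on both twins; the variance-component values used below are hypothetical.

import numpy as np

def expected_cov(VA, VC, VE, g):
    """Expected twin-pair covariance matrix; g = 1 for MZ pairs, 0.5 for DZ pairs."""
    v = VA + VC + VE      # within-person variance
    c = g * VA + VC       # cross-twin covariance
    return np.array([[v, c],
                     [c, v]])

def expected_means(m):
    return np.array([m, m])

VA, VC, VE, m = 0.4, 0.2, 0.2, 0.0       # hypothetical parameter values
print(expected_cov(VA, VC, VE, g=1.0))   # MZ expectation
print(expected_cov(VA, VC, VE, g=0.5))   # DZ expectation
print(expected_means(m))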
10 HISTORY (MX INTRODUCTORY WORKSHOP) - II
- In practice, most applications in the MX Introductory Workshop fitted models to summary covariance matrices.
- We can simplify (e.g. Sham, p. 238):
  (1)  Σ_i (x_i − μ)' E^-1 (x_i − μ) = n [ tr(E^-1 S) + (m − μ)' E^-1 (m − μ) ]
  (2)  ln L = −(n/2) [ p ln(2π) + ln|E| + tr(E^-1 S) + (m − μ)' E^-1 (m − μ) ]
- where m is the vector of sample means of the p observed variables and S is the observed covariance matrix. For the simple applications in the Introductory Workshop we had no predictions for the means structure, so we can saturate that component of the model (i.e. estimate a separate mean for every observed mean), which is equivalent to deleting the term (m − μ)' E^-1 (m − μ) in (2). Thus the log-likelihood of the observed data becomes
  (3)  ln L1 = −(n/2) [ p ln(2π) + ln|E| + tr(E^-1 S) ]
- Under a saturated model, which equates every element of E to the corresponding element of S (i.e. a perfect-fit model), the log-likelihood is
  ln LS = −(n/2) [ p ln(2π) + ln|S| + p ]
11 HISTORY (MX INTRODUCTORY WORKSHOP) - II (ctd)
- Thus the likelihood-ratio test of the fitted model against the saturated model becomes
  (4)  χ² = 2(ln LS − ln L1) = n [ ln|E| − ln|S| + tr(E^-1 S) − p ]
- For a multiple-group problem, sum this statistic over the groups (see the sketch below).
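A sketch (not from the original slides) of equation (4) as a function: lr_fit is a name introduced here, E is the model-implied covariance matrix, S the observed matrix and n the group's sample size; for a multiple-group problem the returned values are summed over groups. The model-implied matrix and sample size used in the illustration are hypothetical.

import numpy as np

def lr_fit(E, S, n):
    """Likelihood-ratio fit statistic n[ ln|E| - ln|S| + tr(E^-1 S) - p ] for one group."""
    p = S.shape[0]
    logdet_E = np.linalg.slogdet(E)[1]
    logdet_S = np.linalg.slogdet(S)[1]
    return n * (logdet_E - logdet_S + np.trace(np.linalg.solve(E, S)) - p)

# illustration with the observed MZ covariance matrix from the script below
S = np.array([[0.7247, 0.5891],
              [0.5891, 0.7915]])
E = np.array([[0.76, 0.59],       # hypothetical model-implied matrix
              [0.59, 0.76]])
print(lr_fit(E, S, n=100))        # n = 100 is a placeholder sample size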
12 Analysis of Australian BMI data - young female MZ twin pairs - MX DIY version
- Analysis of Australian BMI data - young female MZ twin pairs - MX DIY version   ! title line
- DA NG=1 NI=2 NO=0
- begin matrices;
- A LO 1 1 FR ! Additive genetic variance
- C LO 1 1 FR ! Shared environmental variance
- E LO 1 1 FR ! Non-shared environmental variance
- M FU 2 2 ! This will be observed MZ covariance matrix
- D FU 2 2 ! This will be observed DZ covariance matrix
- g fu 1 1 ! coefficient of 0.5 for DZ pairs
- n fu 1 1 ! sample size for MZ pairs (female in this illustration)
- k fu 1 1 ! sample size for DZ pairs (female in this illustration)
- p fu 1 1 ! order of matrices (i.e. number of variables = 2 in this case)
- end matrices;
- mat g 0.5
- mat p 2
- mat m
- 0.7247 0.5891
- 0.5891 0.7915
13 Analysis of Australian BMI data - young female MZ twin pairs - MX DIY version (ctd)
- BEGIN ALGEBRA;
- t = n + k;                                             ! total sample size
- U = (A+C+E | A+C _ A+C | A+C+E);                       ! Expected MZ covariance matrix
- V = (A+C+E | g*A+C _ g*A+C | A+C+E);                   ! Expected DZ covariance matrix
- H = n*(\ln(\det(U)) - \ln(\det(M)) + \tr(U~*M) - p);   ! fit-function for MZ group
- J = k*(\ln(\det(V)) - \ln(\det(D)) + \tr(V~*D) - p);   ! fit-function for DZ group
- F = H + J;
- END ALGEBRA;
- bo 0.01 1.0 e(1,1)          ! bounds on the free parameters
- bo 0.0 1.0 c(1,1) a(1,1)
- CO F                        ! compute the user-defined fit function F
- option user df=6            ! user-defined fit function; degrees of freedom supplied by the user
- end
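For comparison, a numpy/scipy sketch (not part of the original Mx script) of the same two-group fit function minimized over VA, VC and VE. The MZ covariance matrix is the one supplied above; the DZ covariance matrix and both sample sizes below are placeholders, since those values are not shown on these slides.

import numpy as np
from scipy.optimize import minimize

M = np.array([[0.7247, 0.5891],
              [0.5891, 0.7915]])    # observed MZ covariance matrix (from the script above)
D = np.array([[0.75, 0.30],
              [0.30, 0.80]])        # placeholder DZ covariance matrix
n_mz, n_dz, p = 500, 500, 2         # placeholder sample sizes; p = number of variables

def group_fit(E, S, n):
    # n * ( ln|E| - ln|S| + tr(E^-1 S) - p ), as in equation (4)
    return n * (np.linalg.slogdet(E)[1] - np.linalg.slogdet(S)[1]
                + np.trace(np.linalg.solve(E, S)) - p)

def total_fit(params):
    VA, VC, VE = params
    U = np.array([[VA + VC + VE, VA + VC],
                  [VA + VC, VA + VC + VE]])              # expected MZ covariance matrix
    V = np.array([[VA + VC + VE, 0.5 * VA + VC],
                  [0.5 * VA + VC, VA + VC + VE]])        # expected DZ covariance matrix
    return group_fit(U, M, n_mz) + group_fit(V, D, n_dz)

res = minimize(total_fit, x0=[0.3, 0.2, 0.2],
               bounds=[(0.0, 1.0), (0.0, 1.0), (0.01, 1.0)])
print("estimates (VA, VC, VE):", res.x, "fit:", res.fun)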
14 PRE-HISTORY (STATISTICS 101) - II
- LINEAR REGRESSION requires weaker assumptions than linear covariance structure models: it does not assume a multivariate normal distribution, only homoscedastic residuals. It is flexible for handling selective sampling schemes in which we oversample extreme values of the predictor variable(s).
- We can fit linear regression models by maximum likelihood, e.g. using MX (see the sketch below).
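A sketch (not from the original slides) of fitting a simple linear regression by maximum likelihood under normal, homoscedastic residuals; the data are simulated purely for illustration, and the same model could equally be set up in MX.

import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
x = rng.uniform(-2.0, 2.0, size=200)                 # predictor values
y = 1.0 + 0.5 * x + rng.normal(0.0, 0.8, size=200)   # simulated outcome

def neg_log_lik(theta):
    b0, b1, log_sigma = theta
    sigma = np.exp(log_sigma)                 # keep the residual SD positive
    resid = y - (b0 + b1 * x)
    return np.sum(0.5 * np.log(2 * np.pi * sigma**2) + resid**2 / (2 * sigma**2))

res = minimize(neg_log_lik, x0=[0.0, 0.0, 0.0])
print("ML estimates (intercept, slope, ln sigma):", res.x)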
15 HISTORY (MX INTRODUCTORY WORKSHOP) - III
- Definition variables are an option in MX when fitting to raw data that allows us to model the effects of some variables as fixed effects, modeling their contribution to the expected means.
- A simple example: controlling for linear or polynomial regression of a quantitative measure on age. We don't want to model the covariance structure with age (which probably has a rectangular distribution!). A sketch of the idea follows below.
- Definition variables are variables whose values may vary from individual to individual and that can be read into matrices.
- An important example: genotypes at a given locus or set of loci.
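A sketch (not Mx syntax, and not from the original slides) of the idea behind definition variables: each individual's own covariate value (here, age) enters the expected mean as a fixed effect, while the covariance structure is left free of it; all names and values are purely illustrative.

import numpy as np

def expected_means(m, beta_age, ages):
    """Pair-specific expected means: population mean plus a linear regression on age."""
    return m + beta_age * np.asarray(ages)

# hypothetical values: a twin pair both aged 30, population mean 0, age effect 0.02 per year
print(expected_means(m=0.0, beta_age=0.02, ages=[30.0, 30.0]))

# the same mechanism can carry measured genotypes into the means model,
# e.g. expected_means(m, beta_locus, genotype_scores)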