Title: Fitting models to data
1 Fitting models to data II (The Basics of
Maximum Likelihood Estimation)
2 The Principle of ML Estimation
- We wish to select the values for the parameters
so that the probability that the model generated
(is responsible for) the data is as high as
possible.
- Taken another way, if we have two candidate sets
of parameters and the probability that one
generated the data is ten times that of the other,
we would naturally prefer the former.
- OK, so how do we define this probability?
3 The Likelihood Function
- What we need to compute is the likelihood
function
- If we have a discrete set of hypotheses / set of
parameter vectors, then
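One common way to write the likelihood is sketched below. The symbols are assumptions rather than the slide's own notation: $D$ for the data, $\theta$ for a parameter vector, $\theta_1, \dots, \theta_m$ for a discrete set of candidate parameter vectors, and $f$ for the density of a single observation $y_i$. The product form holds only when the observations are independent.

```latex
% Likelihood of a parameter vector given the data
L(\theta \mid D) = P(D \mid \theta)

% Evaluated for each member of a discrete set of hypotheses
L(\theta_j \mid D) = P(D \mid \theta_j), \qquad j = 1, \dots, m

% Independent observations y_1, ..., y_n
L(\theta \mid y_1, \dots, y_n) = \prod_{i=1}^{n} f(y_i \mid \theta)
```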
4 A First Example
- We observe Y = 6 and know that the observation
process is based on the equation
- Given Y = 6, the likelihood function is normal
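A minimal sketch of this example, assuming the observation equation has the form Y = theta + error with normally distributed error. Both that form and the standard deviation of 1 are assumptions made for illustration, and `likelihood` is a hypothetical helper name. It evaluates the likelihood of two candidate parameter values for the same observation.

```python
import math

Y_OBS = 6.0   # the observation quoted in the example
SIGMA = 1.0   # assumed known standard deviation (not given in the source)

def likelihood(theta, y=Y_OBS, sigma=SIGMA):
    """Normal density of the observation y, viewed as a function of theta."""
    z = (y - theta) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

# Compare two candidate parameter values given the same data.
ratio = likelihood(6.0) / likelihood(4.0)
print(f"L(theta=6) / L(theta=4) = {ratio:.3f}")
```

Under these assumptions the data are held fixed and only theta varies, so the ratio says how strongly the observation favours one candidate over the other.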
5 A First Example - II
[Figure: panels labelled Y = 4 and Y = 6]
Note: the likelihood is a function of the parameter and not the data;
we are given the data.
6 Multiple Data Sources
- If we have multiple data sources (e.g. CPUE and survey
data for Cape Hake), we can establish a
likelihood for each data source. The likelihood
for the two data sources combined is the product
of the likelihoods for each data source.
- Note: We often work with the logarithm of the
likelihood function, i.e. $\ln L = \ln L_1 + \ln L_2$,
where $L_1$ and $L_2$ are the likelihoods for the two data sources.
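The sketch below illustrates combining two data sources on the log scale. It assumes normal errors with known standard deviations for both sources, and every number in it is hypothetical (these are not Cape Hake data). The function names are illustrative only.

```python
import math

def normal_log_density(obs, pred, sigma):
    """Log of the normal density of one observation around its prediction."""
    z = (obs - pred) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def log_likelihood(observations, predictions, sigma):
    """Sum of log densities, i.e. the log of the product of densities."""
    return sum(normal_log_density(o, p, sigma)
               for o, p in zip(observations, predictions))

# Hypothetical observations and model predictions, for illustration only.
cpue_obs, cpue_pred = [1.2, 0.9, 1.1], [1.0, 1.0, 1.0]
survey_obs, survey_pred = [410.0, 380.0], [400.0, 400.0]

ln_l_cpue = log_likelihood(cpue_obs, cpue_pred, sigma=0.2)
ln_l_survey = log_likelihood(survey_obs, survey_pred, sigma=25.0)

# Product of likelihoods corresponds to a sum of log-likelihoods.
ln_l_total = ln_l_cpue + ln_l_survey
product = math.exp(ln_l_cpue) * math.exp(ln_l_survey)
print(ln_l_total, math.log(product))
```

Working with the sum also avoids numerical underflow, which the raw product can suffer once many small density values are multiplied together.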
7 Likelihood Estimation
- Identify the questions.
- Identify the data sources.
- Select alternative models.
- Select appropriate likelihood functions for each
data source.
- Find the values for the parameters that maximize
the likelihood function (hence Maximum Likelihood
Estimation).
8 Finding the Maximum Likelihood Estimates
The best estimate is 6, because this value of the parameter
leads to the maximum likelihood.
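As a rough illustration of finding the maximum, the sketch below scans a grid of candidate parameter values and keeps the one with the highest log-likelihood. It assumes a single observation Y = 6 with normal error and a standard deviation of 1. The standard deviation, the grid range and the name `log_likelihood` are assumptions; in practice a numerical optimizer would usually replace the grid.

```python
import math

Y_OBS = 6.0   # observation from the example
SIGMA = 1.0   # assumed standard deviation (not given in the source)

def log_likelihood(theta):
    """Normal log-likelihood of Y_OBS for a candidate theta (constants dropped)."""
    return -0.5 * ((Y_OBS - theta) / SIGMA) ** 2

# Candidate values 0.0, 0.1, ..., 12.0
candidates = [i / 10 for i in range(121)]
best = max(candidates, key=log_likelihood)
print("maximum likelihood estimate:", best)
```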
9 Therefore.
We need to know which probability density
functions to use for which data types.
- The probability distributions encountered most
commonly are:
- Normal / multivariate normal
- t
- Log-normal
- Poisson
- Negative binomial
- Beta
- Binomial / multinomial
You need to know when to use each distribution
and its functional form (up to any normalizing
constants).
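10 The Normal and t-distributions
- The density functions for the normal and
t-distributions are
$f_N(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$
$f_t(x) = \frac{\Gamma\left(\frac{k+1}{2}\right)}{\sqrt{k\pi}\,\sigma\,\Gamma\left(\frac{k}{2}\right)} \left[1 + \frac{(x-\mu)^2}{k\sigma^2}\right]^{-(k+1)/2}$
- $\mu$ is the mean
- $\sigma$ is the standard deviation (the scale parameter for the t)
- k is the degrees of freedom.
- We use these distributions when the data are the
sum of terms. The t-distribution allows account
to be taken of small sample sizes (< 30).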
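A small sketch comparing the two densities, written with the standard library only. The function names are illustrative, and the t density is the usual location-scale form with scale `sigma` and `k` degrees of freedom. It shows the t-distribution putting more weight in the tails when k is small and approaching the normal as k grows.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def t_pdf(x, mu, sigma, k):
    """Location-scale t density with k degrees of freedom."""
    z = (x - mu) / sigma
    # Work on the log scale so large k does not overflow the gamma function.
    log_norm = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
                - 0.5 * math.log(k * math.pi) - math.log(sigma))
    return math.exp(log_norm - 0.5 * (k + 1) * math.log1p(z * z / k))

# Density three scale units from the centre, for increasing degrees of freedom.
for k in (3, 10, 30, 1000):
    print(k, t_pdf(3.0, 0.0, 1.0, k), normal_pdf(3.0, 0.0, 1.0))
```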
12 Key Point with Normal Likelihood
Let us say we wish to fit the model
assuming normally distributed errors, i.e.
The likelihood function is therefore
Taking logarithms and multiplying by -1 gives
This implies that if you assume
normally-distributed errors, the parameter estimates will be
identical to those from least squares.
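One way to write out this argument, under assumed notation: observations $y_i$, a generic model prediction $f(x_i; \theta)$, and a common error standard deviation $\sigma$. None of these symbols is taken from the slide.

```latex
% Model with normally distributed errors
y_i = f(x_i; \theta) + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)

% Likelihood for n independent observations
L(\theta, \sigma) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma}
    \exp\left( -\frac{\left(y_i - f(x_i; \theta)\right)^2}{2\sigma^2} \right)

% Negative log-likelihood
-\ln L(\theta, \sigma) = n \ln \sigma + \frac{n}{2} \ln(2\pi)
    + \frac{1}{2\sigma^2} \sum_{i=1}^{n} \left(y_i - f(x_i; \theta)\right)^2
```

For any fixed $\sigma$, only the last term depends on $\theta$, so minimizing $-\ln L$ over $\theta$ is the same as minimizing the sum of squared residuals.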
13 Time for an Example!
- We wish to fit the Dynamic Schaefer model to the
bowhead census data.
- q is assumed to be 1 here because the surveys
provide absolute indices of abundance.
- We have information on the trend in abundance
from 1978-93 (an increase of 3.2% per annum, SD
0.76%, based on 8 data points).
- We have an estimate of abundance for 1993 of 7800
(SD 564).
14 How to Deal with this Example!
- The model
- The likelihood function is the product of a
normal likelihood (for the abundance estimate)
and a t-likelihood (for the trend). Ignoring
constants independent of the model parameters
- We take logs, multiply by minus one and minimize
to find the estimates for K and r.
- Note that we can ignore any constants: why?
- The t-distribution is chosen for the slope: why?
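The objective function described above might be sketched as follows. Several things here are assumptions rather than the source's own choices: the usual form of the dynamic Schaefer model, B[t+1] = B[t] + r B[t] (1 - B[t]/K) - C[t]; 6 degrees of freedom for the t term (8 data points minus 2 regression parameters); and the function names. No catch series is included, so the sketch only scores a given pair of model predictions. The values 7710 and 2.95 are the outcome values reported in the source.

```python
import math

# Values quoted in the example.
N_1993_OBS, N_1993_SD = 7800.0, 564.0   # abundance estimate for 1993 and its SD
SLOPE_OBS, SLOPE_SD = 3.2, 0.76         # trend for 1978-93 and its SD
DF = 8 - 2  # assumed degrees of freedom: 8 points minus 2 regression parameters

def schaefer_step(b, r, k, catch):
    """One year of the dynamic Schaefer model (assumed standard form)."""
    return b + r * b * (1.0 - b / k) - catch

def neg_log_likelihood(b_1993_pred, slope_pred):
    """Normal term for abundance plus t term for the trend, constants dropped."""
    normal_term = (b_1993_pred - N_1993_OBS) ** 2 / (2.0 * N_1993_SD ** 2)
    t_term = 0.5 * (DF + 1) * math.log1p(
        (slope_pred - SLOPE_OBS) ** 2 / (DF * SLOPE_SD ** 2))
    return normal_term + t_term

# Score the model predictions reported as the outcome of the fit.
print(neg_log_likelihood(7710.0, 2.95))
```

A full fit would project the population with `schaefer_step` from catches and candidate values of r and K, then pass the predicted 1993 abundance and 1978-93 slope to `neg_log_likelihood` inside a minimizer. Dropping the constants does not move the minimum, because they add the same amount for every (r, K).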
15 The Outcome
B_1993 = 7710; slope (1978-93) = 2.95
16 The Lognormal distribution
- The density function
$f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}} \exp\left(-\frac{(\ln x - \ln \mu)^2}{2\sigma^2}\right)$
- $\mu$ is the median (not the mean)
- $\sigma$ is the standard deviation of the logarithm
(approximately the coefficient of variation of
x when $\sigma$ is small).
- The lognormal distribution is used extensively in
fisheries assessments because x is always larger
than zero; this is true for most data sources
(CPUE, survey indices, estimates of death rates,
etc.)
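A brief sketch of a lognormal negative log-likelihood, parameterized by the median as above. The index values are hypothetical, sigma is treated as known for simplicity, and the name `lognormal_nll` is illustrative. It checks numerically that, for fixed sigma, the best-fitting median is the geometric mean of the data.

```python
import math

def lognormal_nll(data, median, sigma):
    """Negative log-likelihood of positive data under a lognormal distribution."""
    total = 0.0
    for x in data:
        z = (math.log(x) - math.log(median)) / sigma
        total += (math.log(x) + math.log(sigma)
                  + 0.5 * math.log(2.0 * math.pi) + 0.5 * z * z)
    return total

# Hypothetical positive index values, for illustration only.
index = [0.8, 1.1, 1.6, 0.9, 1.3]
sigma = 0.3  # assumed known

# Geometric mean: the exponential of the mean of the logs.
geo_mean = math.exp(sum(math.log(x) for x in index) / len(index))
for m in (0.9 * geo_mean, geo_mean, 1.1 * geo_mean):
    print(round(m, 3), round(lognormal_nll(index, m, sigma), 3))
```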
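17 The Multivariate Normal-I
- The density function
$f(\mathbf{x}) = (2\pi)^{-d/2}\, |\Sigma|^{-1/2} \exp\left(-\tfrac{1}{2} (\mathbf{x}-\boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x}-\boldsymbol{\mu})\right)$
- $\boldsymbol{\mu}$ is the vector of means.
- $\Sigma$ is the variance-covariance matrix.
- d is the length of the vector.
- This isn't nearly as bad as it looks.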
18 The Multivariate Normal-II
- We use the multivariate normal when the data
points are correlated (e.g. surveys with common
correction factors). For example, for bowheads
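To make the effect of correlation concrete, here is a sketch of the two-dimensional case written out by hand. The pair of survey estimates, their standard deviations and the correlation of 0.5 are all hypothetical, and the function names are illustrative. With zero correlation the result reduces to the sum of two ordinary normal log densities.

```python
import math

def mvn2_log_density(x, mean, sd, rho):
    """Log density of a bivariate normal with standard deviations sd and correlation rho."""
    z1 = (x[0] - mean[0]) / sd[0]
    z2 = (x[1] - mean[1]) / sd[1]
    one_minus_r2 = 1.0 - rho * rho
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu) for a 2x2 covariance matrix.
    quad = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / one_minus_r2
    # log|Sigma| = log(sd1^2 * sd2^2 * (1 - rho^2))
    log_det = 2.0 * math.log(sd[0]) + 2.0 * math.log(sd[1]) + math.log(one_minus_r2)
    return -math.log(2.0 * math.pi) - 0.5 * log_det - 0.5 * quad

def normal_log_density(x, mean, sd):
    """Log density of a univariate normal."""
    z = (x - mean) / sd
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * z * z

# Hypothetical pair of survey estimates that share a correction factor.
obs, pred, sd = (7200.0, 7900.0), (7000.0, 7600.0), (500.0, 550.0)
independent = mvn2_log_density(obs, pred, sd, rho=0.0)
correlated = mvn2_log_density(obs, pred, sd, rho=0.5)
print(independent, correlated)
```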
19 Readings
- Hilborn and Mangel (1997), Chapter 7
- Haddon (2001), Chapter 4