Title: Fitting models to data
1Fitting models to data IV (Yet more on Maximum
Likelihood Estimation)
2The Poisson Distribution
- The density function:
  $P(k \mid rt) = \dfrac{(rt)^{k} e^{-rt}}{k!}$
- rt is the expected number of events (r is a rate
and t is time).
- k is the number of discrete events (count data).
- The Poisson distribution has only one parameter
(rt), which is both the mean and the variance.
However, the variance is often larger than would
be expected under the Poisson model, so assume
this model with care; better still, look at the
negative binomial distribution first!
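The statement that rt is both the mean and the variance can be checked numerically. The following is a minimal sketch, assuming the standard Poisson probability function P(k) = (rt)^k e^(-rt) / k!; the values of r and t are made up for illustration.

```python
import math


def poisson_pmf(k, rt):
    """Probability of k events when rt events are expected."""
    return math.exp(k * math.log(rt) - rt - math.lgamma(k + 1))


rt = 0.5 * 4.0  # hypothetical rate r = 0.5 per hour and t = 4 hours
ks = range(0, 100)  # the tail beyond k = 100 is negligible for this rt

mean = sum(k * poisson_pmf(k, rt) for k in ks)
var = sum((k - mean) ** 2 * poisson_pmf(k, rt) for k in ks)
print(f"mean = {mean:.4f}, variance = {var:.4f}")
```

Both sums should come out equal to rt, which is the property that real count data often violate.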
3Poisson Model Example-I
100 longline sets are observed and the following
data are collected. What is the rate (numbers
per set) at which seabirds are captured?
4Poisson Model Example-II
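- The log-likelihood function (after removal of
constants) is given by
  $\ln L(r) = \sum_i \left( k_i \ln r - r \right)$
  where k_i is the number of seabirds captured in set i.
- This equation is maximized at r = 0.69.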
- How else could we have obtained the same estimate
for r?
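As a minimal sketch of this kind of fit, the Python below maximizes the Poisson log-likelihood (constants dropped, one unit of effort per set) over a grid of r values and compares the result with the sample mean. The counts are made up for illustration and are NOT the longline data from the lecture; the grid search is just one simple way to locate the maximum.

```python
import math

# Hypothetical bird counts per set (illustrative only, not the lecture's data).
counts = [0, 1, 0, 2, 1, 0, 0, 3, 1, 0]


def poisson_loglik(r, data):
    """Poisson log-likelihood with t = 1, dropping the constant -sum(ln k!)."""
    return sum(k * math.log(r) - r for k in data)


# Crude grid search over r = 0.001, 0.002, ..., 5.000.
grid = [i / 1000 for i in range(1, 5001)]
r_hat = max(grid, key=lambda r: poisson_loglik(r, counts))

sample_mean = sum(counts) / len(counts)
print(f"grid MLE = {r_hat:.3f}, sample mean = {sample_mean:.3f}")
```

For these counts the grid maximum coincides with the sample mean, which is the closed-form maximum likelihood estimate of a Poisson rate when every observation has the same effort.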
5The Negative Binomial Distribution-I
- The negative binomial distribution extends the
Poisson distribution by allowing the rate
parameter to be a (gamma) random variable.
- R is the expected number of observations
(discrete or continuous).
- k is an overdispersion parameter.
- Note
6The Negative Binomial Distribution-II
- The mean of the negative binomial distribution is
R.
- The variance of the negative binomial
distribution is $R + R^2/k$.
- The negative binomial distribution collapses to
the Poisson distribution as $k \to \infty$.
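The relationship between the two distributions can be illustrated numerically. The sketch below assumes the usual mean/overdispersion form of the negative binomial, P(x) = Γ(k+x) / (Γ(k) x!) · (k/(k+R))^k · (R/(k+R))^x, with variance R + R²/k; the symbol x for the observed count and the numbers used are assumptions made for this example.

```python
import math


def poisson_logpmf(x, mean):
    """Log of the Poisson probability of observing x events."""
    return x * math.log(mean) - mean - math.lgamma(x + 1)


def negbin_logpmf(x, mean, k):
    """Log of the negative binomial probability (mean R, overdispersion k)."""
    return (
        math.lgamma(k + x) - math.lgamma(k) - math.lgamma(x + 1)
        + k * math.log(k / (k + mean))
        + x * math.log(mean / (k + mean))
    )


def negbin_variance(mean, k):
    """Variance R + R^2 / k of the negative binomial distribution."""
    return mean + mean ** 2 / k


R, x = 2.0, 3  # hypothetical expected count and observed count
for k in (1.0, 10.0, 100.0, 1e4):
    gap = abs(negbin_logpmf(x, R, k) - poisson_logpmf(x, R))
    print(f"k={k:g}: variance={negbin_variance(R, k):.4f}, log-pmf gap={gap:.5f}")
```

As k grows, the variance shrinks toward the mean R and the gap between the two log-probabilities shrinks toward zero.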
7The Negative Binomial Distribution-III
8The Negative Binomial Distribution-IV
- Consider the case in which we monitor the catch
of a given species (in number) as a function of
fishing effort. If the catch occurs randomly per
unit time we would expect the catch to be Poisson
distributed with mean (and variance) equal to the
product of the fishing duration and a rate of
capture.
- For this problem, we apply the Poisson model and
the negative binomial model.
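For the Poisson half of this comparison the rate estimate has a closed form. A short derivation, assuming the catches c_i are independent Poisson counts with mean r t_i, where t_i is the fishing duration for observation i (the symbols c_i and t_i are notation introduced here, not taken from the slides):

```latex
\begin{aligned}
\ln L(r) &= \sum_i \left[ c_i \ln(r t_i) - r t_i - \ln(c_i!) \right], \\
\frac{d \ln L}{d r} &= \frac{1}{r}\sum_i c_i - \sum_i t_i = 0
\quad\Longrightarrow\quad
\hat{r} = \frac{\sum_i c_i}{\sum_i t_i}.
\end{aligned}
```

Under these assumptions the estimated rate is simply total catch divided by total fishing duration. The negative binomial fit generally has no such closed form and needs a numerical search over both parameters.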
9The Negative Binomial Distribution-V
k is 84 for this fit: good evidence that the
Poisson model is adequate, since the negative
binomial approaches the Poisson as k becomes large.
10The Negative Binomial Distribution-VI
The data are now (really) overdispersed relative
to a Poisson distribution. The estimates are
again identical, but the negative binomial
indicates lesser precision than the Poisson.
11Overdispersion
- Overdispersion implies that the variance of the
data is greater than that expected under the
distribution assumed (e.g. Poisson: variance = mean).
- If the data are overdispersed but this is
ignored, you are overweighting the data (i.e.
underestimating their uncertainty).
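A quick way to look for overdispersion relative to the Poisson is to compare the sample variance with the sample mean. The sketch below uses made-up counts and treats the variance-to-mean ratio only as a rough diagnostic, not as a formal test.

```python
from statistics import mean, variance

# Hypothetical counts per sampling unit (illustrative only).
counts = [0, 0, 1, 0, 7, 0, 2, 0, 0, 12, 1, 0]

m = mean(counts)
v = variance(counts)  # sample variance (n - 1 denominator)
dispersion = v / m    # near 1 for Poisson-like data, above 1 if overdispersed

print(f"mean={m:.2f}, variance={v:.2f}, variance/mean={dispersion:.2f}")
```

A ratio well above 1, as for these counts, suggests that Poisson-based standard errors would be too small.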
12Likelihood Cheat sheet
Data?
- Continuous. Can be negative?
  - Yes: Normal / t
  - No: lognormal / gamma
- Discrete. Number of outcomes?
  - 2: Binomial
  - Many: Poisson / Negative binomial / Multinomial
13Fitting Miscellany-II
- Robustness
- In many cases, the assumptions underlying the
likelihood function are wrong: some data points
are too unlikely. Such data points are outliers.
- Outliers can either be left out of the analysis
or the likelihood can be robustified to reduce
their influence.
- Robustification includes minimizing the median
residual, leaving out the largest residuals, and
downweighting large residuals.
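Two of these ideas can be sketched for the simplest possible model, a single location parameter. The data, the helper names `trimmed_estimate` and `downweighted_estimate`, and the particular weight function are all assumptions made for illustration; they are one possible choice, not the method used in the lecture.

```python
from statistics import mean, median

# Hypothetical observations of one quantity; the last value is an outlier.
obs = [9.8, 10.1, 10.3, 9.9, 10.0, 10.2, 25.0]


def trimmed_estimate(data, n_drop):
    """Mean after leaving out the n_drop points farthest from the median."""
    centre = median(data)
    kept = sorted(data, key=lambda y: abs(y - centre))[: len(data) - n_drop]
    return mean(kept)


def downweighted_estimate(data, scale, n_iter=20):
    """Iteratively reweighted mean; weights shrink as residuals grow."""
    est = median(data)
    for _ in range(n_iter):
        weights = [1.0 / (1.0 + ((y - est) / scale) ** 2) for y in data]
        est = sum(w * y for w, y in zip(weights, data)) / sum(weights)
    return est


print("plain mean:  ", round(mean(obs), 2))
print("trimmed:     ", round(trimmed_estimate(obs, 1), 2))
print("downweighted:", round(downweighted_estimate(obs, scale=0.5), 2))
```

The plain mean is pulled toward the outlier, while the trimmed and downweighted estimates stay close to the bulk of the data.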
14Fitting Miscellany-III
- Contradictory data
- All probability statements are based on the
assumptions of the model and likelihood function,
and these may be wrong!
- Often when we have two (or more) data sources
they disagree!
- The problem is that (at least) one data source is
not measuring what we think it is.
- Solutions:
- Include some probability that each index does not
tell us anything; and
- Run separate assessments for each index in turn.
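The first solution can be sketched as a mixture likelihood: with probability 1 - p an observation follows the model, and with probability p it is uninformative and gets a flat density. This is a minimal sketch under stated assumptions: normal errors, the mixture applied per observation rather than per index, and hypothetical names and values (`p_noise`, `flat_density`, the data).

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of a normal distribution with mean mu and sd sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_negloglik(obs, pred, sigma, p_noise, flat_density):
    """Negative log-likelihood; each point is uninformative with prob p_noise."""
    total = 0.0
    for y, mu in zip(obs, pred):
        like = (1.0 - p_noise) * normal_pdf(y, mu, sigma) + p_noise * flat_density
        total -= math.log(like)
    return total


# Hypothetical index values and model predictions; the last point disagrees.
obs = [1.0, 1.2, 0.9, 5.0]
pred = [1.0, 1.1, 1.0, 1.0]

for p_noise in (0.0, 0.05):
    nll = mixture_negloglik(obs, pred, sigma=0.2, p_noise=p_noise, flat_density=0.1)
    print(f"p_noise={p_noise}: negative log-likelihood = {nll:.2f}")
```

With p_noise = 0 the contradictory point dominates the total; with a small positive p_noise its contribution is capped at about -ln(p_noise × flat_density), so it can no longer drive the fit.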
15Contradictory data (northern cod)
The northern cod dilemma: two abundance indices,
one increasing (and relatively precise), the
other not (and noisy). To pool or not to pool!
16Additional Readings
- Chen, Y. and Fournier, D. 1999. Can. J. Fish. Aquat.
Sci. 56: 1525-1533.
- Fournier, D.A., Hampton, J. and Sibert, J.R. 1998.
Can. J. Fish. Aquat. Sci. 55: 2105-2116.
- Schnute, J.T. and Hilborn, R. 1993. Can. J. Fish.
Aquat. Sci. 50: 1916-1927.