Title: Fitting models to data
1 Fitting models to data I (Sum of Squares)
2 What do we mean by Fitting to Data and why do it?
- Fitting to data provides the basis for:
  - setting the values for the parameters of a model and hence computing the values for the state variables.
  - evaluating whether a model can mimic the existing data adequately (if it can't, perhaps we should eliminate it).
  - comparing different hypotheses (represented by the models that fit the data adequately, at least to some extent).
  - assessing the amount of uncertainty.
3 Keep in Mind
- Models: the parameters, state variables and forcing functions.
- Data: the information we want to use to specify the values for the parameters.
4 Fitting to data: a generic approach
- We define a function that measures the difference between the data we observed and what the model says we should have observed. This function measures the goodness of fit of the model to the data.
- We select the values for the parameters so that the difference is as small as possible (i.e. the parameters that allow the model to mimic the data best).
- Fitting models therefore involves selecting the function and then minimizing it.
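The two steps (select a function, then minimize it) can be sketched in Python. This is a minimal illustration only: the three observations, the one-parameter constant model and the names `model` and `ssq` are hypothetical, and the minimization simply tries a list of candidate values.

```python
# Hypothetical observations; not data from the lecture.
observed = [2.0, 3.0, 7.0]

def model(c, n):
    """Simplest possible model: predicts the same value c for every observation."""
    return [c] * n

def ssq(c):
    """Goodness of fit: sum of squared differences between data and model."""
    predicted = model(c, len(observed))
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))

# Minimize by trying the candidate parameter values 0.0, 0.1, ..., 10.0.
candidates = [i / 10 for i in range(101)]
best_c = min(candidates, key=ssq)
print(best_c, ssq(best_c))
```

For this particular model the value that minimizes the sum of squares is the mean of the observations, which gives a quick check that the search behaves as expected.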
5 An Example: The Bowhead Census Data-I
- The model we want to fit to these data is the exponential growth model. It has two parameters: P1978 and r.
- How do we choose the values for these parameters?
6 An Example: The Bowhead Census Data-II
- Answer: We select the values to minimize the sum of squared differences function, SSQ.
- SSQ is a function of P1978 and r. We can find the values that minimize SSQ using several techniques (coming soon).
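Written out, the function being minimized might look as follows. This is a sketch under assumptions: the continuous-time form of the exponential model is assumed (the discrete form, with $(1+r)^{y-1978}$ in place of the exponential, is equally plausible), and the symbols $P_y^{\text{obs}}$ for the census estimate in year $y$ and $\hat{P}_y$ for the model prediction are notation introduced here.

```latex
\hat{P}_y = P_{1978}\, e^{\,r\,(y - 1978)}, \qquad
SSQ(P_{1978}, r) = \sum_y \left( P_y^{\text{obs}} - \hat{P}_y \right)^2
```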
7 An Example: The Bowhead Census Data-II
[Figure: the sum of squares surface, drawn as contours of equal values of SSQ. Axis labels: "N1979" and "Slope - r"; the plotted region runs from (0, 0) to (0.08, 6000).]
The best fit values are N1978 = 4877 and r = 0.0326.
8 An Example: The Bowhead Census Data-III
- We can define the differences after log-transformation (this weights relative differences equally).
- The best fit values for N1978 and r are now 4717 and 0.0359. These differ slightly from those obtained before: transformations of the data can impact the results.
- Note that for this case, we actually did a linear regression.
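Why the log-transformed fit is a linear regression: if the model is exponential in time, its logarithm is a straight line in time, with intercept ln N1978 and slope r. A minimal sketch follows, assuming the continuous exponential form. The census series is hypothetical (generated without noise from N1978 = 5000 and r = 0.03 so that the answer is known), and `fit_log_linear` is an illustrative name.

```python
import math

# Hypothetical census series; not the bowhead data.
# Generated without noise from N1978 = 5000, r = 0.03 so the answer is known.
years = [1978, 1980, 1982, 1985, 1988]
counts = [5000.0 * math.exp(0.03 * (y - 1978)) for y in years]

def fit_log_linear(years, counts, base_year=1978):
    """Least-squares line through (year - base_year, ln count).

    Returns (N_base, r): the intercept back-transformed to abundance, and the slope.
    """
    x = [y - base_year for y in years]
    z = [math.log(c) for c in counts]
    n = len(x)
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar
    return math.exp(intercept), slope

n1978, r = fit_log_linear(years, counts)
print(round(n1978, 1), round(r, 4))
```

With real, noisy data the estimates would generally differ from those of the untransformed fit, because the two sums of squares weight the observations differently.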
9 An Example: The Bowhead Census Data-IV
- To fit the logistic growth model, we just replace the model used to calculate the predicted values with the logistic model.
- Transformations:
  - None: equal weight to absolute differences.
  - Log: equal weight to relative differences (i.e. 1 vs 2 is weighted equally to 100 vs 200).
- We often assume a log-transformation because the scale of the data is usually arbitrary.
10 Computational Methods
- Direct search (using a sum of squares surface for 1-3 parameters).
- Analytic methods (differentiate SSQ with respect to the parameters and solve the resultant equations; try this for a linear regression).
- Non-linear optimization methods.
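A direct search can be sketched as an exhaustive evaluation of SSQ over a grid of parameter values. The sketch below assumes the continuous exponential model with two parameters; the data are hypothetical (generated without noise from N0 = 5000 and r = 0.03), and the grid ranges and the name `grid_search` are illustrative choices.

```python
import math

# Hypothetical data, generated without noise from N0 = 5000, r = 0.03.
years = [1978, 1980, 1982, 1985, 1988]
counts = [5000.0 * math.exp(0.03 * (y - 1978)) for y in years]

def ssq(n0, r):
    """Sum of squared (untransformed) differences for the exponential model."""
    return sum((c - n0 * math.exp(r * (y - 1978))) ** 2
               for y, c in zip(years, counts))

def grid_search(n0_values, r_values):
    """Evaluate SSQ at every grid node and return (ssq, n0, r) at the lowest one."""
    return min((ssq(n0, r), n0, r) for n0 in n0_values for r in r_values)

n0_grid = [4000 + 50 * i for i in range(41)]    # 4000, 4050, ..., 6000
r_grid = [i / 1000 for i in range(81)]          # 0.000, 0.001, ..., 0.080
best_ssq, best_n0, best_r = grid_search(n0_grid, r_grid)
print(best_n0, best_r, best_ssq)
```

The number of grid nodes grows as the product of the grid sizes, which is why this approach is practical only for a small number of parameters.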
11 Fitting the Dynamic Schaefer Model to Cape Hake data
- Model assumptions:
  - Dynamics are deterministic and governed by the (discrete) logistic equation. No immigration, emigration, etc.
  - The initial biomass (1917) equaled the carrying capacity B0 (K).
  - Catch-rate (CPUE) is linearly proportional to mid-year biomass.
  - The catch rates are log-transformed before being included in the SSQ.
- Note: These assumptions formed the basis for the actual assessments for this stock in the early 1980s!
12 The Equations for this Model
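One set of equations consistent with the stated model assumptions is sketched below. The notation is an assumption made here: $B_y$ is the biomass at the start of year $y$, $C_y$ the catch during year $y$, $I_y$ the observed CPUE and $\hat{I}_y$ the model-predicted CPUE. Approximating mid-year biomass by the average of two consecutive start-of-year biomasses is also an assumption.

```latex
\begin{aligned}
B_{y+1} &= B_y + r B_y \left( 1 - \frac{B_y}{K} \right) - C_y, \qquad B_{1917} = K \\
\hat{I}_y &= q \, \frac{B_y + B_{y+1}}{2} \\
SSQ &= \sum_y \left( \ln I_y - \ln \hat{I}_y \right)^2
\end{aligned}
```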
Which are the state variables, parameters, forcing functions and data?
This example is already quite complicated: we can't use a direct search method, so we used the EXCEL Solver function, which implements a non-linear optimization algorithm.
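The quantity handed to a non-linear optimizer is the SSQ as a function of the parameters. A minimal Python sketch of that objective is given below; it is not the spreadsheet used in the lecture. The catches are hypothetical, the CPUE series is generated without noise from known parameter values, mid-year biomass is approximated as the mean of consecutive start-of-year biomasses, and all function names are illustrative.

```python
import math

def project_biomass(r, k, catches):
    """Discrete logistic (Schaefer) projection starting at B = K.

    Returns start-of-year biomasses, one more entry than there are catches.
    """
    biomass = [k]
    for c in catches:
        b = biomass[-1]
        # Floor at a small positive value so the logarithm below stays defined.
        biomass.append(max(b + r * b * (1.0 - b / k) - c, 1e-6))
    return biomass

def mid_year(biomass):
    """Mid-year biomass, approximated as the mean of consecutive start-of-year values."""
    return [(b0 + b1) / 2.0 for b0, b1 in zip(biomass, biomass[1:])]

def ssq_log_cpue(r, k, q, catches, cpue):
    """Sum of squared differences between log observed and log predicted CPUE."""
    predicted = [q * b for b in mid_year(project_biomass(r, k, catches))]
    return sum((math.log(o) - math.log(p)) ** 2 for o, p in zip(cpue, predicted))

# Hypothetical catches, and CPUE generated without noise from known parameter values.
catches = [50.0, 80.0, 120.0, 150.0, 150.0, 130.0, 100.0, 90.0]
true_r, true_k, true_q = 0.3, 1000.0, 0.01
cpue = [true_q * b for b in mid_year(project_biomass(true_r, true_k, catches))]

ssq_at_truth = ssq_log_cpue(true_r, true_k, true_q, catches, cpue)
ssq_elsewhere = ssq_log_cpue(0.2, 1200.0, 0.01, catches, cpue)
print(ssq_at_truth, ssq_elsewhere)   # zero at the generating values, larger elsewhere
```

An optimizer would repeatedly call `ssq_log_cpue` with trial values of r, K and q, moving towards the values that give the smallest SSQ.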
14 The Fit to the Cape Hake CPUE Data
r = 0.316, K = 1652, q = 0.0121, B93/K = 0.44
15 Adding Auxiliary Information
- There are survey data for Cape Hake; these data provide an alternative index of abundance. How do we deal with this information?
  - Run the assessment using each data source in turn.
  - Combine the two sources of data into one SSQ; the SSQ contributions then have to be weighted.
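One simple way to combine the two contributions is a single weight w between 0 and 1. The sketch below assumes that form; which data source receives w and which receives 1 - w is an arbitrary choice made here, and the function name and the SSQ values are hypothetical.

```python
def combined_ssq(ssq_cpue, ssq_survey, w):
    """Weighted total: w on the CPUE contribution, (1 - w) on the survey contribution."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie between 0 and 1")
    return w * ssq_cpue + (1.0 - w) * ssq_survey

# Hypothetical SSQ contributions at one set of parameter values.
cpue_only = combined_ssq(2.0, 6.0, 1.0)     # survey data ignored
survey_only = combined_ssq(2.0, 6.0, 0.0)   # CPUE data ignored
equal = combined_ssq(2.0, 6.0, 0.5)         # both sources weighted equally
print(cpue_only, survey_only, equal)
```

Setting w to 0 or 1 reproduces the fits to each data source in turn, so the two options in the list above are the end points of the same weighting scheme.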
16 Sensitivity to the Weight
17 An Interpretation
- Adding new information should have improved our understanding of the situation; it didn't.
- Clearly the two types of data are contradictory: they are giving different signals.
- How to select the weights becomes of considerable importance. Often it is good to fit models for each data type in turn (w = 0 and w = 1 in this case).
18 Diagnostics
This is a case when the sum of squares surface is very complicated and unhelpful.
- Check for patterns in the residuals.
- Look for influential data points / outliers.
- Examine sensitivity to changing the data.
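Two of these checks can be sketched with very little code: counting runs of same-signed residuals (few, long runs suggest a systematic pattern) and locating the largest absolute residual as a candidate outlier. The observed and predicted values below are hypothetical, and the function names are illustrative; a large residual is only a first hint, not a full influence analysis.

```python
def residuals(observed, predicted):
    """Observed minus predicted, one value per data point."""
    return [o - p for o, p in zip(observed, predicted)]

def count_sign_runs(res):
    """Number of runs of consecutive residuals with the same sign (zeros ignored)."""
    signs = [r > 0 for r in res if r != 0]
    if not signs:
        return 0
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def largest_residual_index(res):
    """Index of the observation with the largest absolute residual."""
    return max(range(len(res)), key=lambda i: abs(res[i]))

# Hypothetical observed and model-predicted values.
observed = [10.0, 12.0, 15.0, 14.0, 13.0, 11.0]
predicted = [11.0, 12.5, 13.0, 13.0, 13.5, 13.0]
res = residuals(observed, predicted)
print(res, count_sign_runs(res), largest_residual_index(res))
```

Sensitivity to the data can be examined in the same spirit by refitting the model with one observation left out at a time and comparing the resulting parameter estimates.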
19 Review of Model Fitting-I
- Identify the questions.
- Identify the data and the alternative hypotheses.
- Build a set of alternative models.
- Select transformations and weightings and build the sum of squares function.
- Fit the models to the data.
- Check the diagnostics and reject unacceptable models.
20 Review of Model Fitting - II
- Sum of squares allows us to estimate model parameters, BUT:
  - How to quantify uncertainty?
  - How to compare models that fit the data adequately?
- We need Maximum Likelihood methods.
21 Readings
- Hilborn and Mangel (1997), Chapter 7
- Haddon (2001), Chapter 3