Fitting models to data

Provided by: PaulB136

1
Fitting models to data I (Sum of Squares)
  • Fish 458, Lecture 7

2
What do we mean by fitting to data, and why do
it?
  • Fitting to data provides the basis for:
      • setting the values for the parameters of a model
        and hence computing the values for the state
        variables.
      • evaluating whether a model can mimic the existing
        data adequately (if it can't, perhaps we should
        eliminate it).
      • comparing different hypotheses (represented by
        the models that fit the data adequately, at least
        to some extent).
      • assessing the amount of uncertainty.

3
Keep in Mind
  • Models: the parameters, state variables and
    forcing functions.
  • Data: information we want to use to specify the
    values for the parameters.

4
Fitting to data: a generic approach
  • We define a function that measures the
    difference between the data we observed and
    what the model says we should have observed. This
    function measures the goodness of fit of the
    model to the data.
  • We select the values for the parameters so that
    the difference is as small as possible (i.e. the
    parameters that allow the model to mimic the data
    best).
  • Fitting models therefore involves selecting the
    function and then minimizing it.
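
The two steps above (choose a goodness-of-fit function, then minimize it) can be sketched in a few lines of Python. This is only an illustration: the straight-line model, the made-up observations and the names `sum_of_squares` and `fit_by_search` are assumptions introduced here, not part of the lecture.

```python
def sum_of_squares(params, model, observations):
    """Goodness of fit: squared differences between data and model predictions."""
    return sum((obs - model(x, params)) ** 2 for x, obs in observations)


def fit_by_search(model, observations, candidates):
    """Return the candidate parameter set with the smallest sum of squares."""
    return min(candidates, key=lambda p: sum_of_squares(p, model, observations))


def line(x, params):
    """Hypothetical straight-line model y = a + b * x, used only for illustration."""
    a, b = params
    return a + b * x


# Made-up observations generated from a = 1, b = 2 (not real data).
observations = [(0, 1.0), (1, 3.0), (2, 5.0), (3, 7.0)]

# Candidate values 0, 0.5, ..., 3 for each parameter.
candidates = [(a / 2, b / 2) for a in range(0, 7) for b in range(0, 7)]

best = fit_by_search(line, observations, candidates)
print(best, sum_of_squares(best, line, observations))
```

Searching a fixed list of candidates is the crudest possible minimizer; it is used here only because it keeps the link between "function" and "minimization" visible.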

5
An Example: The Bowhead Census Data - I
  • The model we want to fit to these data is the
    exponential growth model. It has two
    parameters, P1978 and r.
How do we choose the values for these parameters?
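
One common way to write the exponential growth model is given below. The exact form is an assumption here, as is the use of N for abundance (the text writes the initial abundance as P1978, while later slides write N1978).

```latex
N_y = N_{1978}\, e^{\,r\,(y - 1978)}
```

Here N_y is the model abundance in year y, N_1978 is the abundance in 1978, and r is the rate of increase.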
6
An Example: The Bowhead Census Data - II
  • Answer: We select the values that minimize the
    sum of squared differences function, SSQ.
  • SSQ is a function of P1978 and r. We can find the
    values that minimize SSQ using several techniques
    (coming soon).
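
A minimal sketch of such a sum of squares function follows. It assumes the exponential model has the form N_y = N_1978 * exp(r * (y - 1978)), and it uses synthetic counts generated from that model rather than the bowhead census data; the function and variable names are illustrative.

```python
import math


def exponential_model(year, n1978, r):
    """Predicted abundance under the assumed form N_y = N_1978 * exp(r * (y - 1978))."""
    return n1978 * math.exp(r * (year - 1978))


def ssq(n1978, r, census):
    """Sum of squared differences between census counts and model predictions."""
    return sum((count - exponential_model(year, n1978, r)) ** 2
               for year, count in census)


# Synthetic counts generated from the model itself (NOT the bowhead census data).
census = [(year, exponential_model(year, 5000.0, 0.03)) for year in range(1978, 1989)]

print(ssq(5000.0, 0.03, census))  # evaluated at the generating values
print(ssq(4500.0, 0.03, census))  # evaluated with the initial abundance set too low
```

SSQ takes the two parameters as arguments, so any minimizer only needs to call it repeatedly with different values.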

7
An Example: The Bowhead Census Data - II
[Figure: the sum of squares surface, drawn as contours of equal values of SSQ. Labels: "N1979" and "Slope -r"; corner points (0,0) and (0.08,6000). The best fit values are N1978 = 4877, r = 0.0326.]
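
A sum of squares surface like this one can be tabulated by evaluating SSQ on a grid of parameter values. The sketch below assumes the exponential form N_y = N_1978 * exp(r * (y - 1978)), takes the grid limits (r from 0 to 0.08, initial abundance from 0 to 6000) from the corner labels, and uses synthetic counts instead of the census data.

```python
import math


def ssq(n1978, r, census):
    """Sum of squared differences for the assumed exponential model."""
    return sum((count - n1978 * math.exp(r * (year - 1978))) ** 2
               for year, count in census)


# Synthetic counts (not the bowhead data), generated with N_1978 = 5000, r = 0.03.
census = [(1978 + t, 5000.0 * math.exp(0.03 * t)) for t in range(11)]

r_values = [i * 0.005 for i in range(17)]   # 0 to 0.08 in steps of 0.005
n_values = [j * 250.0 for j in range(25)]   # 0 to 6000 in steps of 250

# The surface: one SSQ value per (r, N_1978) grid cell.
surface = {(r, n): ssq(n, r, census) for r in r_values for n in n_values}

# The lowest point on the grid approximates the best fit values.
best_r, best_n = min(surface, key=surface.get)
print(best_r, best_n)
```

Contours of equal SSQ are obtained by joining grid cells with similar values; the grid minimum is only as precise as the grid spacing.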
8
An Example: The Bowhead Census Data - III
  • We can define the differences after
    log-transformation (this weights relative
    differences equally).
  • The best fit values for N1978 and r are now 4717
    and 0.0359. These differ slightly from those
    obtained before: transformations of the data can
    impact the results.
  • Note that for this case, we actually did a linear
    regression.
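
To see why the log-transformed fit is a linear regression, assume the exponential model is N_y = N_1978 * exp(r * (y - 1978)). Taking logs gives ln N_y = ln N_1978 + r * (y - 1978), a straight line with intercept ln N_1978 and slope r. The sketch below fits that line by ordinary least squares to synthetic counts (not the bowhead data); `fit_log_linear` is an illustrative name.

```python
import math


def fit_log_linear(census):
    """Least-squares fit of ln(count) = ln(N_1978) + r * (year - 1978)."""
    xs = [year - 1978 for year, _ in census]
    ys = [math.log(count) for _, count in census]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    r = sxy / sxx                          # slope of the regression line
    n1978 = math.exp(y_bar - r * x_bar)    # back-transformed intercept
    return n1978, r


# Synthetic counts generated with N_1978 = 5000 and r = 0.03.
census = [(1978 + t, 5000.0 * math.exp(0.03 * t)) for t in range(11)]

n1978_hat, r_hat = fit_log_linear(census)
print(n1978_hat, r_hat)
```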

9
An Example: The Bowhead Census Data - IV
  • To fit the logistic growth model, we just replace
    the model used to calculate the predicted values
    with the logistic model.
  • Transformations:
      • None: equal weight to absolute differences.
      • Log: equal weight to relative differences (i.e. 1
        vs 2 is weighted the same as 100 vs 200).
  • We often assume a log-transformation because the
    scale of the data is usually arbitrary.
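
The effect of the two transformations can be checked directly on the pairs mentioned above. The helper `squared_difference` is a hypothetical name used only for this illustration.

```python
import math


def squared_difference(observed, predicted, log_transform=False):
    """Squared difference on the raw scale, or on the log scale if requested."""
    if log_transform:
        return (math.log(observed) - math.log(predicted)) ** 2
    return (observed - predicted) ** 2


pairs = [(1.0, 2.0), (100.0, 200.0)]

# No transformation: the second pair contributes far more to the SSQ.
absolute = [squared_difference(o, p) for o, p in pairs]

# Log transformation: both pairs differ by a factor of 2, so they contribute equally.
relative = [squared_difference(o, p, log_transform=True) for o, p in pairs]

print(absolute)
print(relative)
```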

10
Computational Methods
  • Direct search (using a sum of squares surface
    for 1-3 parameters).
  • Analytic methods (differentiate SSQ with respect
    to the parameters and solve the resultant
    equations; try this for a linear regression).
  • Non-linear optimization methods.
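
As a worked version of the analytic method for a linear regression, the derivation below uses the straight line y = a + b x. The symbols a, b, x_i and y_i are introduced here for illustration and do not appear in the slides.

```latex
\begin{aligned}
\mathrm{SSQ}(a, b) &= \sum_i \left(y_i - a - b\,x_i\right)^2 \\
\frac{\partial\,\mathrm{SSQ}}{\partial a} &= -2 \sum_i \left(y_i - a - b\,x_i\right) = 0 \\
\frac{\partial\,\mathrm{SSQ}}{\partial b} &= -2 \sum_i x_i \left(y_i - a - b\,x_i\right) = 0 \\
\hat{b} &= \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad \hat{a} = \bar{y} - \hat{b}\,\bar{x}
\end{aligned}
```

Setting both partial derivatives to zero gives two linear equations in a and b, which is why this case has a closed-form solution while most population models do not.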

11
Fitting the Dynamic Schaefer Model to Cape Hake
data
  • Model assumptions:
      • Dynamics are deterministic and governed by the
        (discrete) logistic equation. No immigration,
        emigration, etc.
      • The initial biomass (1917) equaled the carrying
        capacity B0 (K).
      • Catch-rate (CPUE) is linearly proportional to
        mid-year biomass.
      • The catch rates are log-transformed before being
        included in the SSQ.
  • Note: These assumptions formed the basis for the
    actual assessments for this stock in the early
    1980s!

12
The Equations for this Model
Which are the state variables, parameters,
forcing functions and data?
This example is already quite complicated. We
can't use a direct search method, so we used the
EXCEL Solver function, which implements a
non-linear optimization algorithm.
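
A minimal sketch of a model with these assumptions is given below. The equations are the standard Schaefer form, B[y+1] = B[y] + r * B[y] * (1 - B[y] / K) - C[y], with predicted CPUE equal to q times mid-year biomass; whether they match the slide's equations exactly is an assumption, as is approximating mid-year biomass by the mean of the start-of-year and end-of-year values. The catches and CPUE values are synthetic, not the Cape Hake data.

```python
import math


def schaefer_biomass(r, k, catches):
    """Project start-of-year biomass with the discrete logistic (Schaefer) model.

    The first year's biomass is set equal to the carrying capacity k.
    Returns one more biomass value than there are catches.
    """
    biomass = [k]
    for catch in catches:
        b = biomass[-1]
        next_b = b + r * b * (1.0 - b / k) - catch
        biomass.append(max(next_b, 1e-6))  # guard against a collapsed stock
    return biomass


def predicted_cpue(r, k, q, catches):
    """CPUE as q times mid-year biomass (mean of start- and end-of-year biomass)."""
    biomass = schaefer_biomass(r, k, catches)
    return [q * (biomass[i] + biomass[i + 1]) / 2.0 for i in range(len(catches))]


def ssq_log_cpue(r, k, q, catches, observed_cpue):
    """Sum of squared differences between log observed and log predicted CPUE."""
    predicted = predicted_cpue(r, k, q, catches)
    return sum((math.log(obs) - math.log(pred)) ** 2
               for obs, pred in zip(observed_cpue, predicted))


# Synthetic catches, with CPUE generated from r = 0.3, K = 1000, q = 0.01.
catches = [50.0, 80.0, 120.0, 150.0, 150.0, 130.0, 100.0, 90.0]
observed = predicted_cpue(0.3, 1000.0, 0.01, catches)

print(ssq_log_cpue(0.3, 1000.0, 0.01, catches, observed))
print(ssq_log_cpue(0.2, 1000.0, 0.01, catches, observed))
```

In this sketch r, K and q are the parameters, the biomass series is the state variable, the catches are the forcing function, and the observed CPUE values are the data. A non-linear optimizer would search over r, K and q to minimize `ssq_log_cpue`.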
14
The Fit to the Cape Hake CPUE Data
r = 0.316, K = 1652, q = 0.0121, B93/K = 0.44
15
Adding Auxiliary Information
  • There are survey data for Cape Hake; these data
    provide an alternative index of abundance. How
    should we deal with this information?
      • Run the assessment using each data source in
        turn.
      • Combine the two sources of data into one SSQ;
        the SSQ contributions have to be weighted.
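
One simple way to combine the two contributions is a weighted sum with a single weight w between 0 and 1. This particular form, w times the CPUE contribution plus (1 - w) times the survey contribution, is an assumption made for illustration, and the numbers below are made up.

```python
def combined_ssq(ssq_cpue, ssq_survey, weight):
    """Weighted sum of two SSQ contributions; weight must lie between 0 and 1."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * ssq_cpue + (1.0 - weight) * ssq_survey


# Made-up SSQ contributions for a single set of parameter values.
ssq_cpue, ssq_survey = 2.0, 4.0

print(combined_ssq(ssq_cpue, ssq_survey, 1.0))  # CPUE data only
print(combined_ssq(ssq_cpue, ssq_survey, 0.0))  # survey data only
print(combined_ssq(ssq_cpue, ssq_survey, 0.5))  # equal weight to both
```

With this form, the two extreme weights reproduce the fits to each data source in turn.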

16
Sensitivity to the Weight
17
An Interpretation
  • Adding new information should have improved our
    understanding of the situation; it didn't.
  • Clearly the two types of data are contradictory;
    they are giving different signals.
  • How to select the weights becomes of considerable
    importance. Often it is good to fit models for
    each data type in turn (w = 0 and w = 1 in this
    case).

18
Diagnostics
This is a case where the sum of squares surface is
very complicated and unhelpful.
  1. Check for patterns in the residuals.
  2. Look for influential data points / outliers.
  3. Examine sensitivity to changing the data.
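
The first two checks can be sketched with a couple of small helpers. Counting runs of same-sign residuals is one simple way to look for patterns, chosen here for illustration; the residual values are made up and the function names are hypothetical.

```python
import math


def log_residuals(observed, predicted):
    """Residuals on the log scale: ln(observed) - ln(predicted)."""
    return [math.log(o) - math.log(p) for o, p in zip(observed, predicted)]


def count_sign_runs(residuals):
    """Number of runs of consecutive same-sign residuals.

    Few runs relative to the number of points suggest a pattern over time.
    """
    signs = [r > 0 for r in residuals if r != 0]
    if not signs:
        return 0
    return 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def most_extreme(residuals):
    """Index of the residual with the largest absolute value."""
    return max(range(len(residuals)), key=lambda i: abs(residuals[i]))


# Made-up residuals for illustration only.
patterned = [0.2, 0.3, 0.1, -0.1, -0.4, -0.2]
alternating = [0.2, -0.3, 0.1, -0.1, 0.4, -0.2]

print(count_sign_runs(patterned), count_sign_runs(alternating))
print(most_extreme(patterned))
```

The third check, sensitivity to changing the data, amounts to refitting the model after dropping or altering data points and comparing the estimates.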

19
Review of Model Fitting - I
  • Identify the questions.
  • Identify the data and the alternative hypotheses.
  • Build a set of alternative models.
  • Select transformations and weightings and build
    the sum of squares function.
  • Fit the models to the data.
  • Check the diagnostics and reject unacceptable
    models.

20
Review of Model Fitting - II
  • Sum of squares allows us to estimate model
    parameters, BUT:
      • How to quantify uncertainty?
      • How to compare models that fit the data
        adequately?
  • We need Maximum Likelihood methods.

21
Readings
  • Hilborn and Mangel (1997), Chapter 7
  • Haddon (2001), Chapter 3