Title: Modelling data and curve fitting
1. Modelling data and curve fitting
- Least squares
- Maximum likelihood
- Chi squared
- Confidence limits
- General linear fits
(Chapter 15, Numerical Recipes, Press et al.)
2. Best-fit straight line
- Assume we measure a parameter y for a set of x values, giving a set of data points $(x_i, y_i)$
- We want to model the data using a linear relation
$$y(x_i) = a + b\,x_i$$
3. Best-fit straight line
- How do we find the coefficients a and b that give the best fit to the data?
- Given a pair of values for a and b, we need to define a measure of the goodness of fit.
- Then choose the a and b values that give the best fit.
4. Least-squares fit
- For each data point $x_i$, calculate the difference between the measured $y_i$ and the model prediction $a + b\,x_i$:
$$\Delta y_i = y_i - (a + b\,x_i)$$
- Note that $\Delta y_i$ can be positive or negative, so $\sum \Delta y_i$ can be zero even for a poor fit.
- Minimizing the sum of the squared residuals will give a good overall fit.
- Computationally, try a range of values for a and b, and for each pair calculate
$$S = \sum_i (\Delta y_i)^2$$
- The pair which gives the smallest S is the best fit (see the sketch below).
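A minimal sketch of the residual and sum-of-squares calculation. The data values below are invented for illustration, not taken from the lecture:

```python
import numpy as np

# Invented illustrative data, roughly following y = 2 + 3x with some scatter
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.9, 8.2, 10.8, 14.1])

def sum_of_squares(a, b, x, y):
    """S = sum of squared residuals for the model y = a + b*x."""
    dy = y - (a + b * x)       # residuals: can be positive or negative
    return np.sum(dy ** 2)

print(sum_of_squares(2.0, 3.0, x, y))  # small S: close to the best fit
print(sum_of_squares(0.0, 1.0, x, y))  # large S: a poor fit
```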
5. Maximum likelihood
- It can be shown that the parameters that minimize the sum of the squares are the most likely, given the measured data.
- Assume the x values are exact, and the measurement errors on the y values are Gaussian, with mean zero and standard deviation $\sigma$. So
$$y_i = y_{\mathrm{true}}(x_i) + e_i$$
- where $e_i$ is a random variable drawn from a Gaussian distribution.
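This measurement model is easy to simulate. A short sketch; the true parameters a0, b0 and sigma below are assumed values for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed true parameters, for illustration only
a0, b0, sigma = 2.0, 3.0, 0.5

x = np.linspace(0.0, 4.0, 20)                       # exact x values
e = rng.normal(loc=0.0, scale=sigma, size=x.size)   # Gaussian errors, mean zero
y = a0 + b0 * x + e                                 # y_i = y_true(x_i) + e_i
```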
6. Example: Gaussian distribution
7.
- If the true values of a and b are $a_0$ and $b_0$, then
$$y_i = a_0 + b_0 x_i + e_i$$
- So the probability of observing $y_i$ is
$$P(y_i) \propto \exp\!\left[-\frac{(y_i - a_0 - b_0 x_i)^2}{2\sigma^2}\right]$$
- (assuming $\sigma$ is the same for all measurements)
8.
- And the probability of observing the whole dataset $\{y_i\}$ is the product
$$P(\{y_i\}) \propto \prod_i \exp\!\left[-\frac{(y_i - a_0 - b_0 x_i)^2}{2\sigma^2}\right]$$
- We can use Bayes' theorem to relate this to how likely it is that the model parameters are a and b.
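In code it is more convenient to work with the logarithm of this probability. A sketch, dropping the constant normalization term (the function name is my own):

```python
import numpy as np

def log_likelihood(a, b, x, y, sigma):
    """Log of P({y_i} | a, b) for Gaussian errors with a common sigma,
    up to an additive constant from the normalization."""
    return -np.sum((y - (a + b * x)) ** 2) / (2.0 * sigma ** 2)
```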
9. Bayes' theorem
- Given two events A and B, the conditional probabilities are related by
$$P(A|B)\,P(B) = P(B|A)\,P(A)$$
- $P(A|B)$ is the probability of A happening, given that B has happened
- $P(A)$ is the probability of A happening, independent of B
10. Application of Bayes' theorem
- Consider a model M and some data D. Then Bayes' theorem tells you the probability that the model is right, given the data that you have observed:
$$P(M|D) = P(D|M)\,P(M)\,/\,P(D)$$
- So the probability of a particular model, given the data, depends on the probability of observing your data given the model.
- The most probable model is the one for which the observed data are most likely.
- Vary a and b to find the maximum of $P(M(a,b)|D)$, which (assuming a uniform prior $P(M)$) is equivalent to maximizing the dataset probability $P(\{y_i\})$ defined earlier.
11.
- Maximizing
$$P(\{y_i\}) \propto \prod_i \exp\!\left[-\frac{(y_i - a - b x_i)^2}{2\sigma^2}\right]$$
- means minimizing (take the negative log of the product)
$$\sum_i (y_i - a - b x_i)^2$$
- So for uniform Gaussian errors, maximum likelihood is the same as least squares.
12. Non-Gaussian errors
- Sometimes you know the errors are not Gaussian, so least squares may not be the best method.
- Minimizing the sum of the moduli of the residuals is very robust; it is equivalent to using the median instead of the mean (see the sketch below).
- In general, use M-estimates: maximum likelihood based on the actual non-Gaussian error distribution.
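A sketch of the absolute-deviation fit, using a general-purpose minimizer since the L1 cost is not smooth. The data values and the `l1_cost` helper are invented for illustration:

```python
import numpy as np
from scipy.optimize import minimize

def l1_cost(params, x, y):
    """Sum of absolute residuals; far less sensitive to outliers than least squares."""
    a, b = params
    return np.sum(np.abs(y - (a + b * x)))

# Invented data with one gross outlier at x = 3
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.9, 8.2, 30.0, 14.1])

# Nelder-Mead copes with the non-smooth cost function
res = minimize(l1_cost, x0=[0.0, 1.0], args=(x, y), method="Nelder-Mead")
print(res.x)   # fitted (a, b), largely unaffected by the outlier
```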
13. Chi squared ($\chi^2$)
- If the uncertainty $\sigma_i$ is different for each measurement, then define the quantity
$$\chi^2 = \sum_i \frac{(y_i - a - b\,x_i)^2}{\sigma_i^2}$$
- If the errors are Gaussian, then minimizing $\chi^2$ will give the maximum-likelihood values of the parameters.
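A direct translation into code (a sketch; the function name is my own):

```python
import numpy as np

def chi_squared(a, b, x, y, sigma):
    """Chi-squared for the straight-line model, with per-point uncertainties sigma_i."""
    return np.sum(((y - (a + b * x)) / sigma) ** 2)
```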
14. Example of a $\chi^2$ minimum
15. Finding the minimum of $\chi^2$ (numerically)
- Calculate $\chi^2$ (or $S = \sum (\Delta y_i)^2$) for a grid of a and b values and pick the point that gives the minimum (see the sketch below).
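A minimal grid-search sketch; the grid ranges are assumptions, and in practice you would refine the grid around the minimum:

```python
import numpy as np

def grid_search(x, y, sigma, a_vals, b_vals):
    """Evaluate chi-squared on a grid of (a, b) and return the best pair found."""
    best_a, best_b, best_chi2 = None, None, np.inf
    for a in a_vals:
        for b in b_vals:
            chi2 = np.sum(((y - (a + b * x)) / sigma) ** 2)
            if chi2 < best_chi2:
                best_a, best_b, best_chi2 = a, b, chi2
    return best_a, best_b, best_chi2

# Example usage (assumed ranges):
# a, b, chi2_min = grid_search(x, y, sigma, np.linspace(0, 4, 81), np.linspace(0, 6, 121))
```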
16. Finding the minimum of $\chi^2$ (analytically)
- Analytically differentiate $\chi^2$ with respect to a and b and set
$$\frac{\partial \chi^2}{\partial a} = 0 \quad \text{and} \quad \frac{\partial \chi^2}{\partial b} = 0$$
- Leads to (writing $S = \sum_i 1/\sigma_i^2$, $S_x = \sum_i x_i/\sigma_i^2$, $S_y = \sum_i y_i/\sigma_i^2$, $S_{xx} = \sum_i x_i^2/\sigma_i^2$, $S_{xy} = \sum_i x_i y_i/\sigma_i^2$):
$$a = \frac{S_{xx} S_y - S_x S_{xy}}{S\,S_{xx} - S_x^2}, \qquad b = \frac{S\,S_{xy} - S_x S_y}{S\,S_{xx} - S_x^2}$$
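A sketch of the closed-form solution; it follows the standard weighted least-squares formulas, with variable names mirroring the sums above:

```python
import numpy as np

def fit_line(x, y, sigma):
    """Closed-form weighted least-squares fit of y = a + b*x."""
    w = 1.0 / sigma ** 2
    S, Sx, Sy = np.sum(w), np.sum(w * x), np.sum(w * y)
    Sxx, Sxy = np.sum(w * x * x), np.sum(w * x * y)
    delta = S * Sxx - Sx ** 2    # determinant of the normal equations
    a = (Sxx * Sy - Sx * Sxy) / delta
    b = (S * Sxy - Sx * Sy) / delta
    return a, b
```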
17. Confidence interval
- The distribution of $\chi^2_{\min}$ has a chi-square distribution with N − M degrees of freedom.
- The distribution of $\Delta\chi^2 = \chi^2 - \chi^2_{\min}$ has a chi-square distribution with M degrees of freedom (for M parameters).
- The probability of a given value of a parameter being the true value is given by the probability of getting the observed $\Delta\chi^2$ for that value.
- When $\Delta\chi^2 = 1$ this corresponds to 68%, i.e. $1\sigma$ (for a single parameter).
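The 68% figure can be checked directly from the chi-square distribution with one degree of freedom:

```python
from scipy.stats import chi2

# Probability content enclosed by delta_chi2 <= 1 for one parameter (M = 1)
print(chi2.cdf(1.0, df=1))   # ~0.683, i.e. the 68% / 1-sigma confidence level
```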
18. (Figure only; no transcript)
19. The value of $\chi^2$
The value of $\chi^2_{\min}$ tells you more about the model and the data. If $\chi^2_{\min}$ is greater than the number of degrees of freedom, either the real errors are greater than the $\sigma_i$ that you used, or the model is not good. If $\chi^2_{\min}$ is less than the number of degrees of freedom, either the real errors are smaller than the $\sigma_i$ that you used, or the model has too many parameters.
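A quick way to quantify this is the probability of obtaining a $\chi^2$ at least as large as the observed one by chance (a sketch; the function name is my own):

```python
from scipy.stats import chi2

def goodness_of_fit(chi2_min, n_points, n_params):
    """Probability of a chi-squared this large or larger arising by chance.
    Very small values suggest a bad model or underestimated errors;
    values near 1 suggest overestimated errors or too many parameters."""
    dof = n_points - n_params   # N - M degrees of freedom
    return chi2.sf(chi2_min, df=dof)
```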
20. General linear models
- Express your model as a sum of basis functions with linear coefficients:
$$y(x) = \sum_{k=1}^{M} a_k X_k(x)$$
- The functions can be arbitrary, but are fixed (only the coefficients are fitted)
- A common example is a polynomial fit, where the basis functions $X_k(x)$ are powers of x (see the sketch below)
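A sketch of building the design matrix $A_{ik} = X_k(x_i)$ for the polynomial case:

```python
import numpy as np

def design_matrix(x, M):
    """Design matrix A[i, k] = X_k(x_i) for the polynomial basis X_k(x) = x**k."""
    return np.vander(x, N=M, increasing=True)   # columns: 1, x, x**2, ..., x**(M-1)
```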
21. Finding the minimum of $\chi^2$ (analytically, for a general model)
- Differentiate $\chi^2$ with respect to each parameter $a_k$ and set the derivatives to zero
- Define a matrix $\alpha$ and a vector $\beta$:
$$\alpha_{kj} = \sum_i \frac{X_k(x_i)\,X_j(x_i)}{\sigma_i^2}, \qquad \beta_k = \sum_i \frac{y_i\,X_k(x_i)}{\sigma_i^2}$$
- $\alpha$ is called the curvature matrix
22.
- Then the equations can be written in matrix form:
$$\alpha\,\mathbf{a} = \beta$$
- And the solutions are given by
$$\mathbf{a} = C\,\beta$$
- where C is the inverse of the curvature matrix, $C = \alpha^{-1}$
- C is also called the covariance matrix (see the sketch below)
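A sketch that assembles $\alpha$ and $\beta$ from a weighted design matrix and solves the normal equations. In production code an SVD-based solver such as `np.linalg.lstsq` is more stable than forming the inverse explicitly:

```python
import numpy as np

def general_linear_fit(x, y, sigma, basis_funcs):
    """Fit y(x) = sum_k a_k X_k(x) by solving alpha @ a = beta."""
    # Weighted design matrix: A[i, k] = X_k(x_i) / sigma_i
    A = np.column_stack([f(x) / sigma for f in basis_funcs])
    b = y / sigma
    alpha = A.T @ A              # curvature matrix
    beta = A.T @ b
    C = np.linalg.inv(alpha)     # covariance matrix
    a = C @ beta                 # best-fit parameters
    return a, C

# Example usage: quadratic fit with basis 1, x, x**2 (data assumed defined elsewhere)
# a, C = general_linear_fit(x, y, sigma, [np.ones_like, lambda t: t, lambda t: t**2])
```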
23. Non-linear fits
- The easiest approach is to make the model linear, for example by taking logs (see the sketch below)
- Otherwise, use a direct parameter search for the minimum $\chi^2$
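A sketch of the log-transform trick for an exponential model $y = A e^{kx}$. The data values are invented, and note that taking logs also reweights the errors, so this is only approximate for noisy data:

```python
import numpy as np

# Model: y = A * exp(k * x).  Taking logs gives ln(y) = ln(A) + k*x,
# a straight line in (x, ln y) that can be fitted with linear least squares.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 8.3, 19.7])   # invented illustrative data

lnA, k = np.polynomial.polynomial.polyfit(x, np.log(y), deg=1)
A = np.exp(lnA)
print(A, k)   # recovered model parameters
```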
24. Workshop
- Least-squares straight-line fit, and interpreting the resulting chi-squared
- Non-linear fit using a simple search for the minimum chi-squared