Title: Stochastic Methods 2
1. Stochastic Methods 2
Using random numbers to solve problems:
- Sample variance estimation
- Monte-Carlo integration: general idea, importance sampling, stratified sampling
- Optimization problems: simulated annealing, Metropolis algorithm
2. Sample variance
Estimating uncertainties on data:
- Use random numbers to mimic measurement errors; particularly useful when you know the error distribution.
- Create a large number of fake data-sets.
- Measure the parameters of interest for each of the mock data-sets.
- The scatter between the mock parameters gives your estimate of the uncertainties on the parameters derived from the true data.
E.g. estimating cosmological parameters.
3. Bootstrap methods
Assume you have a data-set of N measurements. Create random versions of the data-set by choosing points at random from the original data. Measure the parameter p of interest for each random version, and measure the variance between the different realizations.
Bootstrap: make each realization have N data-points, but draw from the true data with replacement, so that some points are duplicated. The uncertainty on the parameter derived from the original data is the variance between the mock parameters.
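A minimal Python sketch of this recipe (the name bootstrap_error and the choice of 1000 realizations are illustrative, not from the slides):

```python
import numpy as np

def bootstrap_error(data, estimator, n_boot=1000, rng=None):
    """Bootstrap uncertainty on estimator(data)."""
    rng = np.random.default_rng() if rng is None else rng
    n = len(data)
    # Each realization: N points drawn from the true data *with replacement*
    mocks = [estimator(rng.choice(data, size=n, replace=True))
             for _ in range(n_boot)]
    # The scatter between the mock parameters is the quoted uncertainty
    return np.std(mocks, ddof=1)

# Example: uncertainty on the mean of 100 Gaussian measurements
data = np.random.default_rng(42).normal(5.0, 2.0, size=100)
print(bootstrap_error(data, np.mean))   # close to 2/sqrt(100) = 0.2
```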
4. Jackknife
Reject each point in turn to give N smaller data sets, and measure the parameter p_i from each. The uncertainty on the parameter derived from the original data is
\sigma_p^2 = \frac{N-1}{N} \sum_{i=1}^{N} (p_i - p_0)^2
where p_0 is the value from the full data set.
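A short sketch of the leave-one-out recipe, assuming the variance formula above (jackknife_error is an illustrative name):

```python
import numpy as np

def jackknife_error(data, estimator):
    """Leave-one-out jackknife uncertainty on estimator(data)."""
    data = np.asarray(data)
    n = len(data)
    p0 = estimator(data)                       # value from the full data set
    # N estimates, each with one point rejected in turn
    p_i = np.array([estimator(np.delete(data, i)) for i in range(n)])
    # sigma_p^2 = (N-1)/N * sum_i (p_i - p0)^2
    return np.sqrt((n - 1) / n * np.sum((p_i - p0) ** 2))
```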
5. d-delete Jackknife
Reject d points to give \binom{N}{d} smaller data sets. The uncertainty on the parameter derived from the original data is
\sigma_p^2 = \frac{N-d}{d\,\binom{N}{d}} \sum_{s} (p_s - p_0)^2
In practice \binom{N}{d} is very large, so randomly sample n of the possible subsets.
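A sketch of the sampled version, assuming the standard delete-d variance written above (delete_d_jackknife_error and n_sub are illustrative names):

```python
import numpy as np

def delete_d_jackknife_error(data, estimator, d, n_sub=1000, rng=None):
    """Delete-d jackknife, randomly sampling n_sub of the C(N, d) subsets."""
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data)
    N = len(data)
    p0 = estimator(data)                       # value from the full data set
    p_s = np.empty(n_sub)
    for s in range(n_sub):
        keep = rng.choice(N, size=N - d, replace=False)   # reject d points
        p_s[s] = estimator(data[keep])
    # Mean over the sampled subsets stands in for the sum over all C(N, d)
    return np.sqrt((N - d) / d * np.mean((p_s - p0) ** 2))
```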
6. Monte Carlo integration
7. Crude M-C integration
The integral is the area under the curve. Put points down at random, and count the number under the curve (just like the rejection method for generating random numbers). Enclose the curve in a box of height H and width D; then
I \approx H D \times \frac{\text{points under the curve}}{\text{total points}}
Very inefficient: you need 2 random numbers per point. Might be useful if you don't have a simple expression for the curve.
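A minimal hit-or-miss sketch (function name and test integrand are illustrative):

```python
import numpy as np

def crude_mc_integral(f, a, b, H, n=100_000, rng=None):
    """Hit-or-miss integration: enclose f in a box of width D = b - a and
    height H, throw random points, and count the fraction under the curve."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.uniform(a, b, n)   # one random number for the abscissa...
    y = rng.uniform(0, H, n)   # ...and a second for the ordinate
    under = np.count_nonzero(y < f(x))
    return H * (b - a) * under / n

# Example: integral of x^2 on [0, 1] (exact value 1/3)
print(crude_mc_integral(lambda x: x**2, 0.0, 1.0, H=1.0))
```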
8. A simple example
To calculate a correlation function, you need to calculate the area of a circular annulus with complicated boundaries. The easiest way is to randomly distribute points over the area, and count which ones are in the annulus.
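A sketch of that counting, assuming the complicated boundary is supplied as a hypothetical mask function in_survey(x, y):

```python
import numpy as np

def annulus_area_mc(r_in, r_out, in_survey, n=100_000, rng=None):
    """Monte-Carlo area of a circular annulus clipped by a complicated
    boundary, described by the user-supplied mask in_survey(x, y)."""
    rng = np.random.default_rng() if rng is None else rng
    # Scatter points uniformly over the bounding square of the outer circle
    x = rng.uniform(-r_out, r_out, n)
    y = rng.uniform(-r_out, r_out, n)
    r2 = x**2 + y**2
    inside = (r2 > r_in**2) & (r2 < r_out**2) & in_survey(x, y)
    return (2 * r_out) ** 2 * np.count_nonzero(inside) / n

# Example: a boundary that cuts away the half-plane y < 0
print(annulus_area_mc(1.0, 2.0, lambda x, y: y > 0))  # ~ pi*(4-1)/2 = 4.71
```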
9. Sensible M-C method
If you know f(x), use it directly! In quadrature methods, f(x) is sampled at regular values with separation dx; then the integral is
I \approx \sum_i f(x_i)\, dx
Instead we can sample f(x) with a uniform random deviate over the range (a, b) of the integral. Then the mean spacing is dx = (b - a)/N, and the integral is
I \approx \frac{b-a}{N} \sum_{i=1}^{N} f(x_i) = (b - a)\,\langle f \rangle
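The mean-value estimator in a few lines (mc_integral is an illustrative name):

```python
import numpy as np

def mc_integral(f, a, b, n=100_000, rng=None):
    """Mean-value Monte-Carlo: I ~ (b - a) * <f> with uniform random samples."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.uniform(a, b, n)   # one random number per point, not two
    return (b - a) * f(x).mean()

print(mc_integral(np.sin, 0.0, np.pi))   # exact value 2
```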
10. Monte-Carlo integration
In general an integral can be represented as the mean value times the volume (or, in 2D, the area):
\int f\, dV \approx V\,\langle f \rangle
The range of the integral does not have to be completely covered by uniform steps: you can sample it randomly. Particularly useful for multidimensional integrals with complicated boundaries, or when you don't need very high precision.
11. Importance sampling
You can get better accuracy if you sample f(x) more densely where f(x) is large. Set the sampling rate by a function p(x), chosen so that f(x)/p(x) is roughly constant. You need to down-weight the parts that you over-sample, so
\int f(x)\, dx = \int \frac{f(x)}{p(x)}\, p(x)\, dx
12. If we make p(x) a probability density function, so that \int p(x)\, dx = 1, then, drawing the sample points x_i from p(x),
I \approx \frac{1}{N} \sum_{i=1}^{N} \frac{f(x_i)}{p(x_i)}
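A worked sketch for I = \int_0^1 e^x dx (exact value e - 1), using the illustrative choice p(x) = 2(1 + x)/3, which integrates to 1 and roughly tracks e^x; the inverse-transform step follows from its CDF F(x) = (2x + x^2)/3:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

f = np.exp                       # integrand on [0, 1]
p = lambda x: 2 * (1 + x) / 3    # pdf roughly proportional to f

# Draw from p by inverse transform: u = (2x + x^2)/3  =>  x = sqrt(1 + 3u) - 1
u = rng.uniform(0, 1, n)
x = np.sqrt(1 + 3 * u) - 1

# Down-weight the over-sampled regions: I ~ (1/N) sum_i f(x_i) / p(x_i)
importance = np.mean(f(x) / p(x))
plain = np.mean(f(rng.uniform(0, 1, n)))   # plain mean-value MC for comparison
print(importance, plain, np.e - 1)
```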
13. Stratified sampling
If f(x) is constant, it takes only one point to determine its integral. Choose to concentrate points where f(x) is changing rapidly: divide the volume V into subregions, and vary the sampling rate so that the number of points in each region is proportional to the rms of f(x) in that region, \sigma(f). Again, you need to down-weight the parts that you over-sample, so
I \approx \sum_j V_j\, \langle f \rangle_j
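A two-pass 1D sketch (the pilot pass of 100 points per subregion is an illustrative way to estimate sigma(f); the slides do not prescribe one):

```python
import numpy as np

def stratified_mc(f, a, b, n_regions=10, n_total=100_000, rng=None):
    """Stratified MC: allocate points to each subregion in proportion to
    the rms spread of f there, then sum V_j * <f>_j over the regions."""
    rng = np.random.default_rng() if rng is None else rng
    edges = np.linspace(a, b, n_regions + 1)
    # Pilot pass: estimate sigma(f) in each subregion with a few points
    sigma = np.array([f(rng.uniform(lo, hi, 100)).std()
                      for lo, hi in zip(edges[:-1], edges[1:])])
    n_j = np.maximum(1, (n_total * sigma / (sigma.sum() + 1e-12)).astype(int))
    # Main pass: I = sum_j V_j <f>_j down-weights the over-sampled regions
    total = 0.0
    for lo, hi, n in zip(edges[:-1], edges[1:], n_j):
        total += (hi - lo) * f(rng.uniform(lo, hi, n)).mean()
    return total

print(stratified_mc(np.sin, 0.0, np.pi))   # exact value 2
```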
14. Comparison to quadrature methods
For Gaussian quadrature in d dimensions, if you use N points, each range is split into N^{1/d} intervals, so the spacing will be \propto N^{-1/d} and the error will decrease roughly like N^{-2/d}. For the MC method the error will decrease roughly like N^{-1/2}. At d = 4 the two scalings match (N^{-2/4} = N^{-1/2}), so if d = 4 or greater, the MC method will give smaller errors for a given N.
15. Monte Carlo Optimization
16. Optimization
Minimization/maximization of a function. You could simply step through the allowed parameter range, but this can be time-consuming, particularly if the parameter set has many dimensions. For example, with 10 steps in each of four parameters you would need 10^4 = 10,000 function evaluations. As with Monte Carlo integration, you can instead sample at random (see the sketch below).
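A minimal random-search sketch of that idea (random_search and the quadratic test function are illustrative):

```python
import numpy as np

def random_search(f, bounds, n=10_000, rng=None):
    """Randomly sample the parameter box instead of stepping through a grid."""
    rng = np.random.default_rng() if rng is None else rng
    lo, hi = np.array(bounds).T                  # bounds: [(lo, hi), ...] per dim
    samples = rng.uniform(lo, hi, size=(n, len(bounds)))
    values = np.apply_along_axis(f, 1, samples)
    best = values.argmin()
    return samples[best], values[best]

# Example: minimum of a quadratic bowl in four parameters
f = lambda x: np.sum((x - 0.5) ** 2)
print(random_search(f, [(0, 1)] * 4))
```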
17. Simulated annealing
Good when the global extremum is hidden by poorer local extrema. The physical analogy is the slow cooling of liquids, which finds the lowest-energy state, i.e. the crystal. The behaviour depends on the Boltzmann distribution,
P(E) \propto e^{-E/kT}
which means that there is a probability of being in a high-E state, even at low T. This allows the system to get out of a local minimum and find a better, more global one.
18. Metropolis algorithm
Consider a system in a state with energy E_1, with a range of possible alternative states. Pick one of the states at random and calculate its energy E_2. Accept this as the new state with probability
p = e^{-(E_2 - E_1)/kT}
Note: if E_2 < E_1, set p = 1. Repeat for some chosen number of iterations, then set a lower value of T and repeat. Continue until your tolerance is reached.
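The acceptance rule as a one-function sketch (absorbing k into T is an assumption of this snippet):

```python
import numpy as np

def metropolis_accept(E1, E2, T, rng):
    """Accept a trial state of energy E2 from energy E1 with
    probability p = exp(-(E2 - E1)/T); p = 1 whenever E2 < E1."""
    if E2 < E1:
        return True
    return rng.uniform() < np.exp(-(E2 - E1) / T)
```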
19. General simulated annealing
Calculate the function f(x) at a point x. Move from x to x + \Delta x. Accept the move if f(x) decreases, and sometimes accept it if f(x) increases. Use a parameter T to control the acceptance of any step \Delta x. Slowly reduce T.
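A sketch putting the pieces together; the cooling factor, step size, moves per temperature, and test function are all illustrative choices, not prescribed by the slides:

```python
import numpy as np

def simulated_annealing(f, x0, T0=1.0, cooling=0.95, n_temps=200,
                        step=0.5, rng=None):
    """Minimize f by Metropolis moves at slowly decreasing temperature T,
    so the walker can climb out of local minima early on."""
    rng = np.random.default_rng() if rng is None else rng
    x, fx = x0, f(x0)
    best_x, best_f = x, fx
    T = T0
    for _ in range(n_temps):
        for _ in range(100):                    # fixed number of moves per T
            x_new = x + step * rng.normal()     # trial move x -> x + dx
            f_new = f(x_new)
            # Metropolis rule: always accept downhill, sometimes uphill
            if f_new < fx or rng.uniform() < np.exp(-(f_new - fx) / T):
                x, fx = x_new, f_new
                if fx < best_f:
                    best_x, best_f = x, fx
        T *= cooling                            # slowly reduce T
    return best_x, best_f

# Example: a function with several local minima hiding the global one
f = lambda x: x**2 + 10 * np.sin(2 * x)
print(simulated_annealing(f, x0=4.0))
```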
20. First set of assessed work
The assignments are designed to test both an understanding of physics and the ability to implement a suitable numerical method to solve the problem. Most marks are for the programs themselves, including:
- choice of algorithm
- efficient implementation
- well-structured programs
- sensible choices for variable names
- clear comments and layout to make the code readable
As well as writing the programs to answer the questions, you should write a brief report (no more than 3 pages) to present your answers, describe the programs and show their output. Email your programs and report to walter.kockenberger@nottingham.ac.uk by 4pm Monday. You can also hand in printouts to Lesley Martin (MR Centre reception).