Title: Fish 559; Lecture 14
Random Number Generators

Background
- Many statistical modeling techniques (bootstrapping, numerically sampling from a posterior distribution, simulation testing) involve Monte Carlo methods.
- These methods rely on being able to sample random numbers from pre-specified probability distributions.
- There is no way to use a computer to generate truly random numbers (all the algorithms are deterministic to some extent), but some ways of generating random numbers are more random than others.
Generating Random Numbers (An Overview of Methods)
- Generating uniform deviates
- Transformation methods
- Rejection methods
Generating Uniform Deviates I
- Most of the methods we will discuss in this lecture depend in some way on a random number generated from U[0,1].
- Most computer languages (including EXCEL and VB) have built-in routines for generating random numbers from U[0,1]. However, these are often quick and dirty and should generally not be used.
- The more quick and dirty an algorithm for generating uniform random numbers is, the shorter its period (the time it takes for the random numbers to cycle) is likely to be.
Generating Uniform Deviates II
- Random number generators can (and do) cycle. When generating a large number of random variates, you need to check that the random numbers are not cycling.
- I recommend that, whenever possible, portable random number generators (i.e. generators that are platform independent) be used. The routine I use most frequently is RAN3 (see page 199 of Numerical Recipes).
- In Splus / R, the function runif(x, min, max) provides a routine that is believed to be trustworthy. The generator is initialized by set.seed(i), where i lies between 0 and 1023. The period for this generator is claimed to be 4.6 x 10^18, sufficient for most of our needs! Certainly a lot better than Rand() in EXCEL!
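As a quick illustration of seeding and reproducibility (a Python sketch rather than Splus/R; here random.seed plays the role that set.seed plays in the lecture):

```python
import random

random.seed(123)                     # analogous to set.seed(i) in Splus/R
first = [random.uniform(0.0, 1.0) for _ in range(5)]

random.seed(123)                     # re-seeding restarts the sequence
second = [random.uniform(0.0, 1.0) for _ in range(5)]

print(first == second)               # the same seed reproduces the same deviates
```

Re-seeding with the same value restarts the generator at the same point in its cycle, which is what makes simulation experiments repeatable.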
Transformation Methods I
- Let us assume we wish to generate a random number, x, from some distribution f(x).
- Now, let us assume that we know the cumulative distribution for x, F(x).
- We can generate values of x as follows:
  - Generate a set of uniform variates, z, from U[0,1].
  - Compute x = F^-1(z).
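A minimal sketch of these two steps, using the exponential distribution f(x) = lam*exp(-lam*x) as the example (my choice; the lecture's own worked example did not survive extraction). Here F(x) = 1 - exp(-lam*x), so F^-1(z) = -ln(1 - z)/lam:

```python
import math
import random

def exp_inverse_cdf(z, lam):
    """F^-1(z) for the exponential distribution with rate lam."""
    return -math.log(1.0 - z) / lam

random.seed(1)
lam = 1.0
# step 1: uniform variates z from U[0,1); step 2: x = F^-1(z)
xs = [exp_inverse_cdf(random.random(), lam) for _ in range(100_000)]
```

With lam = 1 the sample mean should settle near 1/lam = 1, which is a handy sanity check on the inversion.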
Transformation Methods II
- To generate from the distribution
- The cumulative distribution in this case is easily shown to be
- The inverse function to F is therefore
Transformation Methods III
The true distribution and its cumulative function
Sample of 100,000 generated using the
transformation method
Transformation Methods IV
- This method is extremely powerful because it generates a value of x for each value of z.
- However, it relies on being able to compute and invert F(x), which is, in general, difficult (or impossible). This is particularly a problem for multivariate probability distributions.
Transformation Methods V
- An approximation to this method, when the range for x is bounded, is:
  - Tabulate f(x) for a large number, N, of values of x and form the cumulative distribution F(x_i) = F(x_i-1) + f(x_i).
  - Generate a value for x by selecting a value, z, from U[0, F(x_N)] and finding the value of x_i such that F(x_i) >= z > F(x_i-1).
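The tabulated approximation can be sketched as follows; I use f(x) = sin(x) on [0, pi] purely as a convenient bounded example (the same target as the Splus SIR code later in the lecture):

```python
import bisect
import math
import random

def make_tabulated_sampler(f, lo, hi, n_grid=100):
    """Approximate inverse-CDF sampling from f tabulated on [lo, hi]."""
    xs = [lo + (hi - lo) * (i + 1) / n_grid for i in range(n_grid)]
    cum = []
    total = 0.0
    for x in xs:
        total += f(x)                       # F(x_i) = F(x_i-1) + f(x_i)
        cum.append(total)

    def draw():
        z = random.uniform(0.0, cum[-1])    # z from U[0, F(x_N)]
        i = bisect.bisect_left(cum, z)      # smallest i with F(x_i) >= z
        return xs[min(i, n_grid - 1)]

    return draw

random.seed(2)
draw = make_tabulated_sampler(math.sin, 0.0, math.pi)
sample = [draw() for _ in range(100_000)]
```

Because sin(x) on [0, pi] is symmetric about pi/2, the sample mean should sit near pi/2; the coarseness of the 100-point grid is the price paid for never needing F in closed form.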
Transformation Methods VI
- Suppose the generated value of z was 0.23; we then solve for the corresponding value of x.
Approximation to F(x) based on 100 points
Rejection Methods I
- This is a very powerful technique. The only requirement is that f(x) be computable up to an arbitrary scaling constant (i.e.:
  - there is no requirement that F(x) be known, and
  - there is no requirement that the constant of integration be known).
- These methods can be applied straightforwardly to multivariate problems.
- However, they can be much more computationally intensive than transformation methods.
Rejection Methods II
- In order to generate a random variate x from f(x):
  1. Find a function g(x) such that g(x) >= f(x) for all x and that is easy to generate from.
  2. Generate x from g(x) and z from U[0,1].
  3. Compute f(x)/g(x) and accept x if f(x)/g(x) exceeds z; otherwise return to step 2.
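These steps can be sketched in Python with f(x) = sin(x) on [0, pi] and a flat envelope g(x) = 1, which satisfies g(x) >= f(x) there (the example distribution and envelope are my choices for illustration):

```python
import math
import random

def rejection_sample(f, g_max, lo, hi, n):
    """Draw n variates from (unnormalized) f on [lo, hi] using the flat
    envelope g(x) = g_max, which must satisfy g_max >= f(x) on [lo, hi]."""
    accepted = []
    tried = 0
    while len(accepted) < n:
        x = random.uniform(lo, hi)          # step 2: x from g (flat here)
        z = random.random()                 #         z from U[0,1)
        tried += 1
        if f(x) / g_max > z:                # step 3: accept if f(x)/g(x) > z
            accepted.append(x)
    return accepted, tried

random.seed(3)
samples, tried = rejection_sample(math.sin, 1.0, 0.0, math.pi, 10_000)
acceptance = len(samples) / tried           # worth monitoring in practice
```

For this flat envelope the acceptance probability is the ratio of the areas under f and g, i.e. 2/pi (about 0.64), so roughly 16,000 candidates are needed for 10,000 accepted values.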
Rejection Methods III
- Notes:
  - The rejection method is very similar to the Sample-Importance-Resample algorithm used to generate parameter vectors from a Bayesian posterior distribution.
  - The efficiency of these methods depends on how well g(x) approximates f(x). It is a good idea to keep track of the acceptance probability. If this is unacceptably low, it may be worth trying to find a better choice for g(x).
Rejection Methods IV
- We wish to generate from the distribution
- We define g(x) as
Rejection Methods V
The true distribution
Sample of 10,000 generated using the rejection
method
A total of 31,501 values for x were generated to
obtain these 10,000 values.
The Sample-Importance-Resample (SIR) Method
- The algorithm proceeds as follows to generate n random variates from f(x):
  1. Generate x from g(x) and compute p(x) = f(x)/g(x).
  2. Repeat step 1 many (m) times.
  3. Select n values of x (with replacement) from those generated, with the probability of selecting x_i proportional to p(x_i).
- The number of times (out of n) each value of x_i is selected should be monitored. If the maximum number of times any x_i is selected is too high, m should be increased or a better approximation to f(x) developed.
SIR II
- SIR is used for stock assessments, where g(x) is often taken to be a multivariate normal (or t) distribution centred at the MLE with variance-covariance matrix equal to that evaluated at the MLE.
- The code to implement SIR in Splus is remarkably simple:

Nsim <- 30000
xx <- runif(Nsim, 0, pi)
yy <- sin(xx)
val2 <- sample(xx, size=10000, prob=yy, replace=T)
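A line-by-line Python translation of the Splus snippet (my translation, not from the lecture); random.choices performs the weighted resampling with replacement that sample(..., prob=, replace=T) performs in Splus:

```python
import math
import random

random.seed(4)
n_sim = 30_000
# step 1-2: m = 30,000 draws from g(x) = U(0, pi), with weights p(x) = f(x)/g(x)
xx = [random.uniform(0.0, math.pi) for _ in range(n_sim)]
yy = [math.sin(x) for x in xx]
# step 3: resample 10,000 values with replacement, probability proportional to yy
val2 = random.choices(xx, weights=yy, k=10_000)
```

Because the weights are proportional to sin(x), the resampled values follow the sin target even though the candidates were drawn uniformly; the mean of val2 should therefore sit near pi/2.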
SIR III
The true distribution
Sample of 10,000 generated using the SIR algorithm with m = 30,000