Title: Models - Stochastic Models
1 Models - Stochastic Models
- Chapter 3: Simulation
- Shane Whelan
- L527
2 Introduction
- A random variate is a realisation of a random variable, generally denoted with lower-case letters.
- A random number or, more precisely, a uniform random number is a random variate from U(0,1). More pedantically, it is a pseudo- or quasi-random number if the means we use to get the numbers is not totally random.
- In the good old days: rotating discs, shuffling cards, dice, etc.
- In 1927, Tippett published the first table of random numbers.
- e.g. von Neumann's (1951) mid-square method: take the square of the previous number and keep its middle digits.
3 Pseudo-random numbers are more practical than real random numbers
- We need the numbers in the computer: a finite file generated from a truly random source could run out.
- They need to be reproducible, so that a study can be replicated.
- It is expensive to fit a device to the computer to generate truly random numbers and then store them on the machine (1 million long integers take about 4 MB of memory).
- A fully tested algorithm, using little memory, at no cost, and easily reproducible is ideal.
4 Criteria for a good random number generator
- the numbers generated appear uniformly distributed
- the numbers generated appear statistically independent
- the numbers generated are easily reproducible (so one can replicate a study)
- the sequence is fast to compute
- the algorithm requires minimal storage capacity
- the algorithm produces a long sequence before repeating (a large period)
5 Mid-square method of von Neumann (1951)
- Method
- Start with an integer n or n+1 digits long.
- Square it and take the middle n or n+1 digits of the result. Report these digits as the next random number.
- Go to the start.
- Example
- Starting with the number 456 and applying the mid-square algorithm, we get 0793, 2884, 1742, 7729, ...
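A minimal sketch of one mid-square step in Python, assuming (as in the first steps of the example) that the middle digits are taken from the unpadded square; later terms of the sequence depend on how odd-length squares are handled, so only the first steps are checked here:

```python
def mid_square(x, digits=4):
    """One step of von Neumann's mid-square method: square x and keep
    the middle `digits` digits of the (unpadded) result."""
    s = str(x * x)
    start = (len(s) - digits) // 2
    return int(s[start:start + digits])

# Starting from 456: 456**2 = 207936, whose middle four digits are 0793,
# and 793**2 = 628849, whose middle four digits are 2884.
x = 456
sequence = []
for _ in range(4):
    x = mid_square(x)
    sequence.append(x)
```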
6 Linear Congruential Generators (LCGs)
- Calculate a sequence of numbers
- x0, x1, ..., xn, ...
- x_{n+1} = (a·x_n + c) mod m, n ≥ 0, where
- x0 is the seed
- a is the multiplier
- c is the increment
- m is the modulus
- a, c, m, x0 are non-negative integers.
- If used to generate U(0,1) variates, then u_n = x_n / m.
7 Example of an LCG
- Calculate the first five values generated by the LCG
- x_{n+1} = (5·x_n + 7) mod 256
- given x0 = 0.
- Starting from x0 = 0, we get 7, 42, 217, 68, 91.
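The recursion and the worked example can be checked with a few lines of Python:

```python
def lcg(a, c, m, seed):
    """Generate the LCG sequence x_{n+1} = (a*x_n + c) mod m, starting at the seed."""
    x = seed
    while True:
        yield x
        x = (a * x + c) % m

gen = lcg(a=5, c=7, m=256, seed=0)
xs = [next(gen) for _ in range(6)]  # [0, 7, 42, 217, 68, 91]
us = [x / 256 for x in xs]          # the corresponding u_n = x_n / m
```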
8 LCGs
- Clearly deterministic, periodic sequences.
- The maximum period is clearly m (known as a full-period LCG in this case).
- If c ≠ 0, the LCG is called mixed.
- If c = 0, it is called multiplicative or pure.
- Much is known about LCGs and the choice of a, c and m that ensures the sequence appears random.
- Their simplicity, calculation speed and ease of use, coupled with our understanding of some of their properties, make them an obvious choice to generate random numbers.
9 But do they generate random numbers?
- We have to establish that the generated sequence of numbers ⟨u_n⟩ possesses two properties:
- the entries in the sequence appear to be distributed U(0,1)
- the entries in the sequence appear to be independent.
- (1) Some statistical tests of the distributional form of ⟨u_n⟩:
- Chi-Square Goodness-of-Fit Test
- Kolmogorov-Smirnov Goodness-of-Fit Test
- Cramér-von Mises Goodness-of-Fit Test
- (2) Some statistical tests of the independence of the entries in ⟨u_n⟩:
- Run Test
- Rank-sum Test
- Serial Test
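As an illustration of the first kind of test, a minimal chi-square goodness-of-fit statistic for uniformity can be computed as follows; the statistic would then be compared with a chi-square critical value on bins - 1 degrees of freedom:

```python
def chi_square_uniform(us, bins=10):
    """Chi-square goodness-of-fit statistic for H0: the u's are U(0,1).
    Counts observations per equal-width bin and compares each count with
    the expected count n/bins."""
    counts = [0] * bins
    for u in us:
        counts[min(int(u * bins), bins - 1)] += 1
    expected = len(us) / bins
    return sum((o - expected) ** 2 / expected for o in counts)
```

With 10 bins, the statistic would be compared against the 5% critical value of a chi-square distribution on 9 degrees of freedom (about 16.9).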
10 Plot of u_n against n from the LCG in the earlier example
11 Plot of u_{n+1} against u_n in the earlier example
12 Your Portable LC Random Number Generator
- The IBM System/360 Uniform Random Number Generator (first proposed in 1969).
- x_n = 7^5 · x_{n-1} (mod 2^31 - 1)
- i.e., x_n = 16,807 · x_{n-1} (mod 2,147,483,647)
- The period is 2^31 - 2.
- Performs satisfactorily in statistical tests.
- This LCG is probably the optimum one for 32-bit processors: one can ensure no rounding error on 32-bit processors.
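A sketch of this "minimal standard" generator in Python (Python's arbitrary-precision integers sidestep the 32-bit overflow tricks needed in C). Park and Miller published a check value for it: starting from seed 1, the 10,000th value generated is 1,043,618,065.

```python
def minimal_standard(seed=1):
    """The multiplicative LCG x_n = 16807 * x_{n-1} mod (2**31 - 1)."""
    m = 2**31 - 1      # 2,147,483,647, a Mersenne prime
    a = 7**5           # 16,807
    x = seed
    while True:
        x = (a * x) % m
        yield x        # divide by m for a U(0,1) variate

g = minimal_standard(1)
x = None
for _ in range(10000):
    x = next(g)        # x is now the 10,000th value from seed 1
```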
13 Other Random Number Generators
- Marsaglia's mother-of-all-pseudo-random-number generators at
- http://www.stat.fsu.edu/pub/diehard
- The Mersenne twister of Matsumoto and Nishimura, which boasts a period of 2^19937 - 1 together with very fast generation, at
- http://www.math.sci.hiroshima-u.ac.jp/m-mat/MT/emt.html
- A good general site is www.random.mat.sbg.ac.at.
14 Generation of random variates from a given distribution, given the u_i
- Inverse Transform Method
- Acceptance-Rejection Method
- Special algorithms,
- e.g., to generate normal variates:
- Box-Muller
- Polar
- Standard reference source:
- Numerical Recipes in C, Press, Teukolsky, Vetterling, Flannery, Cambridge University Press.
15 Inverse Transform Method
- Let F(x) be the distribution function from which we want the random variate.
- Say we have random variates from a standard uniform, from our random number generator.
- The Inverse Transform method relies on the insight that distribution functions are non-decreasing functions, so we can define an inverse function suitably for our purposes.
16 Cumulative Distribution Function, F(x): Continuous Case
17 Cumulative Distribution Function, F(x): Continuous Case
18 Cumulative Distribution Function, F(x): Discrete Case
19 Cumulative Distribution Function, F(x): Discrete Case
20 Defining F^{-1}(u)
- To ensure F^{-1}(u) is well-defined and single-valued (i.e., a function) we define it as
- F^{-1}(u) = inf{ x : F(x) ≥ u }.
21 Inverse Transform Method
- So,
- Get u, a random variate from U(0,1).
- Calculate F^{-1}(u), which gives a random variate from F(x), as required.
22 Important Special Case: Discrete RVs
- Let X be a discrete RV taking a finite number of values x1, x2, ..., xN.
- Order them into x(1), x(2), ..., x(N)
- s.t. x(i) < x(j) iff i < j.
- Let P[X = x(i)] = p_i.
- Then F(x) = P[X ≤ x] = Σ_{x(i) ≤ x} p_i.
- So,
- Generate u_i from U(0,1).
- Find j s.t. F(x(j-1)) < u_i ≤ F(x(j)).
- Then x(j) is the required random variate of X.
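The search for j can be done with a binary search over the cumulative probabilities; a minimal sketch:

```python
import bisect

def discrete_inverse_transform(values, probs, u):
    """Given ordered values x_(1) <= ... <= x_(N) with probabilities p_i,
    return x_(j) where F(x_(j-1)) < u <= F(x_(j))."""
    cdf, total = [], 0.0
    for p in probs:
        total += p
        cdf.append(total)                # cdf[j] = F(x_(j+1))
    j = bisect.bisect_left(cdf, u)       # smallest j with F(x_(j)) >= u
    return values[j]
```

For example, with values (10, 20, 30) and probabilities (0.25, 0.5, 0.25), u = 0.3 falls between F(10) = 0.25 and F(20) = 0.75, so 20 is returned.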
23 Examples
- Given u_i from U(0,1), simulate random variates from
- (a) the Exponential distribution,
- (b) the Weibull distribution.
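A possible worked answer by the inverse transform method; the Weibull parameterisation F(x) = 1 - exp(-c·x^γ) used here is an assumption, so adjust the formula to whichever form the notes use:

```python
import math

def exponential_variate(u, lam):
    # F(x) = 1 - exp(-lam*x)  =>  F^{-1}(u) = -ln(1 - u) / lam
    return -math.log(1.0 - u) / lam

def weibull_variate(u, c, gamma):
    # Assumed form F(x) = 1 - exp(-c * x**gamma)
    # =>  F^{-1}(u) = (-ln(1 - u) / c) ** (1 / gamma)
    return (-math.log(1.0 - u) / c) ** (1.0 / gamma)
```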
24 Theorem: Inverse Transform Method
- Theorem: Let U be a random variable with the standard uniform distribution and let X be a random variable with distribution function F_X(x). Then the random variable Y, defined as
- Y = F_X^{-1}(U), where F_X^{-1}(u) = inf{ x : F_X(x) ≥ u },
- has distribution function F_X(x).
- Proof: On board (and in notes).
25 Acceptance-Rejection Method
- This method is based on an identification of probability with area.
26 Estimating Areas
27 Estimating Areas
- One intuitive way of approaching the problem is to:
- embed the shape in another shape whose area we can easily compute, say a circle;
- estimate the ratio of the areas, Area of Irregular Shape / Area of Circle, and denote the ratio r;
- then an estimate of the Area of the Irregular Shape is r × Area of Circle.
- This is the idea behind the acceptance-rejection method.
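The embedding idea translates directly into code. As an illustration (the quarter disc and unit square are my choice of shapes, not from the slides), the quarter of the unit disc inside the unit square has true area π/4 ≈ 0.785:

```python
import random

def estimate_area(in_shape, box_area, n, rng):
    """Estimate the area of a shape embedded in a box of known area:
    area ~= (hits / n) * box_area, where hits counts random points
    of the box that land inside the shape."""
    hits = sum(in_shape(rng.random(), rng.random()) for _ in range(n))
    return (hits / n) * box_area

rng = random.Random(42)
quarter_disc = lambda x, y: x * x + y * y <= 1.0
area = estimate_area(quarter_disc, 1.0, 100_000, rng)   # close to pi/4
```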
28 Acceptance-Rejection Method
- Consider a density function from which we desire a random variate; denote it f(x).
- Find another density, h(x), which is close to f(x) for most x but simpler to generate random variates from (e.g., using the inverse transform method).
- Let C = max_x f(x)/h(x).
- Consider C·h(x), which lies above f(x) for all x. Generate random variates from h(x).
- Keep each value generated with probability g(x) = f(x)/(C·h(x)) ≤ 1.
- This gives us a random variate from f(x).
29 For best results
- C should be close to 1. In fact, 1/C is known as the efficiency of the algorithm.
30 Acceptance-Rejection Method
- Let f(x) be the density from which we want random variates.
- Write f(x) = C·h(x)·g(x)
- where C ≥ 1 is a constant,
- h(x) is a density function from which we can generate random variates,
- and g(x) is a function s.t. 0 ≤ g(x) ≤ 1.
- So,
- Generate a random variate u from U(0,1).
- Generate an independent random variate x from h(x).
- If u ≤ g(x) then return x; otherwise start again.
31 Example
- To generate a random variate from f(x) = 3x² on [0,1]:
- Let h(x) be the U(0,1) density.
- Then C = 3 and g(x) = x².
- So,
- Generate u and v from U(0,1).
- If u ≤ v², return v; otherwise go back to step 1.
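The two steps above, as a sketch in Python:

```python
import random

def sample_3x2(rng):
    """Acceptance-rejection for f(x) = 3x**2 on [0,1] using h = U(0,1),
    C = 3 and g(x) = x**2: accept v whenever u <= v**2."""
    while True:
        u, v = rng.random(), rng.random()
        if u <= v * v:
            return v

rng = random.Random(0)
xs = [sample_3x2(rng) for _ in range(20000)]
# sanity check: E[X] = 3/4 for the density 3x**2 on [0,1]
```

The efficiency here is 1/C = 1/3, so on average three (u, v) pairs are needed per accepted variate.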
32 Exercise
- Generate a random variate from
33 Acceptance-Rejection Method
- To prove:
- If random variable X has pdf f(x) s.t.
- f(x) = C·h(x)·g(x),
- where h(x) is the pdf of random variable Y,
- and if U is U(0,1), then f_{Y | U ≤ g(Y)}(x) = f_X(x).
34 Acceptance-Rejection Method
- Proof:
- By Bayes,
- f_{Y | U ≤ g(Y)}(x) = P(U ≤ g(Y) | Y = x)·h(x) / P(U ≤ g(Y))
- But
- P(U ≤ g(Y) | Y = x) = P(U ≤ g(x)) = g(x)
- And
- P(U ≤ g(Y)) = 1/C (try proving this)
- Hence f_{Y | U ≤ g(Y)}(x) = C·g(x)·h(x) = f(x). QED
35 Special Algorithms
- Two special algorithms to generate variates from the standard Normal:
- Box-Muller
- Polar
36 Box-Muller Algorithm
- Generate two independent U(0,1) variates, u1 and u2.
- Calculate
- z1 = (-2 ln u1)^{1/2} · cos(2π·u2)
- z2 = (-2 ln u1)^{1/2} · sin(2π·u2)
- Then z1 and z2 are independent random variates from a standard normal.
- Remark: This is relatively slow, as we need to evaluate the cos and sin functions.
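A sketch of the algorithm (using 1 - u1 so the argument of the logarithm is strictly positive, since Python's generator returns values in [0, 1)):

```python
import math, random

def box_muller(rng):
    """One Box-Muller step: two independent uniforms in, two independent
    standard normal variates out."""
    u1 = 1.0 - rng.random()            # in (0, 1], so log(u1) is defined
    u2 = rng.random()
    r = math.sqrt(-2.0 * math.log(u1))
    return r * math.cos(2.0 * math.pi * u2), r * math.sin(2.0 * math.pi * u2)

rng = random.Random(1)
zs = []
for _ in range(10000):
    z1, z2 = box_muller(rng)
    zs += [z1, z2]
# zs should have sample mean near 0 and sample variance near 1
```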
37 Polar Algorithm
- Generate two independent U(0,1) variates, u1 and u2.
- Let v1 = 2·u1 - 1, v2 = 2·u2 - 1, s = v1² + v2².
- If s > 1 then start again. Otherwise,
- z1 = v1·(-2 ln s / s)^{1/2}
- z2 = v2·(-2 ln s / s)^{1/2}
- Then z1 and z2 are independent random variates from N(0,1).
- Remark: This method is essentially the same as the Box-Muller method but, to avoid calculating cosine and sine, uses the acceptance-rejection method.
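The same sketch style for the polar method; the acceptance step keeps (v1, v2) uniform over the unit disc, so a fraction π/4 of candidate pairs is accepted:

```python
import math, random

def polar(rng):
    """Marsaglia's polar method: accept (v1, v2) uniform in the unit disc,
    then scale by sqrt(-2 ln s / s) where s = v1**2 + v2**2."""
    while True:
        v1 = 2.0 * rng.random() - 1.0
        v2 = 2.0 * rng.random() - 1.0
        s = v1 * v1 + v2 * v2
        if 0.0 < s <= 1.0:             # reject outside the disc (and s = 0)
            factor = math.sqrt(-2.0 * math.log(s) / s)
            return v1 * factor, v2 * factor

rng = random.Random(2)
zs = []
for _ in range(10000):
    z1, z2 = polar(rng)
    zs += [z1, z2]
```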
38 Re-using Random Number Streams
- Let X be a stochastic model. Suppose we're interested in θ = E[X] (as we often are). We simulate random variates of X, denoted x1, x2, ..., xn, and estimate θ as the sample mean. Now we want to:
- Do sensitivity analysis.
- Calculate the derivative of the mean.
- Do comparative/performance evaluation.
- In these cases one should use the same stream of random variates, so as not to introduce another (spurious) source of variability that hampers the interpretation of the results.
39 Sensitivity / Numerical Evaluation of Derivatives
- Let us say that θ is a function of some parameter, λ, which we must estimate. How sensitive is our estimate of θ to our estimate of λ?
- The best approach is clearly to calculate xk(λ + δ) - xk(λ) for each simulation k (i.e., reusing the random numbers) and then average over k.
- Similarly, to calculate the derivative of θ at λ, use [xk(λ + δ) - xk(λ)] / δ for sufficiently small δ and then average over k.
- We do not want the random error inherent in our realisation to affect our answer, so we reuse our random number stream to ensure any bias it creates is in both realisations and hence (largely) nullified.
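A small illustration under an assumed model (my choice, not from the slides): X ~ Exp(λ), so θ(λ) = E[X] = 1/λ and dθ/dλ = -1/λ². Reusing one stream of uniforms at both λ and λ + δ gives a low-variance finite-difference estimate:

```python
import math, random

def mean_exp(lam, us):
    """Sample mean of Exp(lam) variates built, by inverse transform,
    from a fixed stream of uniforms us (so the stream can be reused
    at different values of lam)."""
    return sum(-math.log(1.0 - u) / lam for u in us) / len(us)

rng = random.Random(7)
us = [rng.random() for _ in range(50000)]   # one stream, used twice below

lam, delta = 2.0, 0.01
# finite-difference derivative estimate with common random numbers
deriv = (mean_exp(lam + delta, us) - mean_exp(lam, us)) / delta
# true value: d(1/lam)/d(lam) = -1/lam**2 = -0.25
```

Had two independent streams been used instead, the Monte Carlo noise in each term would be divided by the small δ and swamp the estimate.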
40 How many simulations should we do?
- Set a tolerance level ε for the maximum error in estimating the output of interest, Q.
- We also need to set a confidence level of 1 - α (i.e., the probability that the error exceeds ε is less than α).
- Think about it.
41 How many simulations?
- The number of simulations, n, needed to achieve an absolute error less than ε, with probability greater than 1 - α, is approximately
- n ≈ (z_{1-α/2} · s_{n0} / ε)²,
- where s_{n0} is the sample standard deviation of an initial n0 realisations, and n0 is some reasonable number so that the estimate of the variance is reasonable (n0 > 30 in any case, for the Normality assumption).
42 So, for absolute error less than ε with prob. at least 1 - α
- Generate a decent number of realisations (at least 30): x1, x2, ..., xn.
- If z_{1-α/2} · s_n / √n ≤ ε then
- Stop: required accuracy achieved.
- Otherwise generate extra realisations (i.e., increase n) and go to step 2.
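The sequential stopping rule above, sketched in Python (z = 1.96 corresponds to α = 0.05; the toy target X ~ U(0,1) is my choice for the usage example):

```python
import math, random

def simulate_until(draw, eps, z=1.96, n0=30):
    """Generate realisations with draw() until the confidence-interval
    half-width z * s_n / sqrt(n) falls to eps or below.
    Returns the sample mean and the number of realisations used."""
    xs = [draw() for _ in range(n0)]
    while True:
        n = len(xs)
        mean = sum(xs) / n
        s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)   # sample variance
        if z * math.sqrt(s2 / n) <= eps:
            return mean, n
        xs.append(draw())

rng = random.Random(3)
mean, n = simulate_until(rng.random, eps=0.02)   # X ~ U(0,1), true mean 0.5
```

Recomputing the variance from scratch each pass keeps the sketch short; a practical version would update the running sums incrementally.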
43 For relative error less than ε with prob. at least 1 - α
- Generate a decent number of realisations (at least 30): x1, x2, ..., xn.
- If z_{1-α/2} · s_n / √n ≤ ε · |x̄_n| then
- Stop: required accuracy achieved.
- Otherwise generate extra realisations (i.e., increase n) and go to step 2.
44 Generation of sets of correlated normal random variates
- We want to generate a multivariate normal random variable, fully parameterised by
- μ_i = E[X_i], i = 1, ..., n
- σ_ij = Cov[X_i, X_j], a symmetric n×n matrix, called C.
45 Recall: Density function of the Multivariate Normal
46 Generation of sets of correlated normal random variates
- Generate independent standard normals z1, z2, ..., zn using the earlier methods.
- Then put
- X_i = μ_i + Σ_{k=1}^{i} l_ik · z_k,
- where the l_jk are chosen so that the covariance matrix of the X_i coincides with C.
- In fact, L = (l_ik) is a lower-triangular matrix s.t. C = L·L^T, a decomposition of C known as the Cholesky decomposition.
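A sketch of the Cholesky decomposition and the transformation in plain Python (no linear-algebra library is needed for small n):

```python
import math

def cholesky(C):
    """Lower-triangular L with C = L * L^T, for C symmetric positive definite."""
    n = len(C)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][i] = math.sqrt(C[i][i] - s)       # diagonal entry
            else:
                L[i][j] = (C[i][j] - s) / L[j][j]      # below-diagonal entry
    return L

def correlated_normals(mu, C, zs):
    """X_i = mu_i + sum_{k<=i} l_ik * z_k from independent N(0,1) z's."""
    L = cholesky(C)
    return [mu[i] + sum(L[i][k] * zs[k] for k in range(i + 1))
            for i in range(len(mu))]

C = [[4.0, 2.0],
     [2.0, 3.0]]
L = cholesky(C)                                   # [[2, 0], [1, sqrt(2)]]
x = correlated_normals([0.0, 0.0], C, [1.0, 1.0])
```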
47 Straightforward to solve for l_ij
- l_ij = (c_ij - Σ_{k=1}^{j-1} l_ik·l_jk) / l_jj, for i > j,
- and
- l_ii = (c_ii - Σ_{k=1}^{i-1} l_ik²)^{1/2},
- with the obvious convention that an empty sum (upper limit 0) is zero.
48 Concludes Chapter 3