Transcript and Presenter's Notes
1
Further Statistical Issues
Chapter 12
Last revision August 26, 2006
2
What We'll Do ...
  • Random-number generation
  • Generating random variates
  • Nonstationary Poisson processes
  • Variance reduction
  • Sequential sampling
  • Designing and executing simulation experiments
  • Backup material
  • Appendix C: A Refresher on Probability and
    Statistics
  • Appendix D: Arena's Probability Distributions

3
Random-Number Generators (RNGs)
  • Algorithm to generate independent, identically
    distributed draws from the continuous UNIF (0, 1)
    distribution
  • These are called "random numbers" in simulation
  • Basis for generating observations from all other
    distributions and random processes
  • Transform random numbers in a way that depends on
    the desired distribution or process (later in
    this chapter)
  • It's essential to have a good RNG
  • There are a lot of bad RNGs; this is very tricky
  • Methods and coding are both tricky

4
The Nature of RNGs
  • Recursive formula (algorithm)
  • Start with a seed (or seed vector)
  • Do something "weird" to the seed to get the next
    one
  • Repeating with the same seed generates the same
    sequence
  • Will eventually repeat (cycle); want long cycle
    length
    length
  • Not really random as in unpredictable
  • Does this matter? Philosophically? Practically?
  • Want to design RNGs with:
  • Long cycle length
  • Good statistical properties (uniform,
    independent) -- tests
  • Fast
  • Streams (subsegments): many and long (for
    variance reduction, later)
  • This is hard! Doing something "weird" isn't enough

5
Linear Congruential Generators (LCGs)
  • The most common of several different methods
  • But not the one in Arena (though it's related);
    more later
  • Generate a sequence of integers Z1, Z2, Z3, ... via
    the recursion
  • Zi = (a Zi-1 + c) mod m
  • a, c, and m are carefully chosen constants
  • Specify a seed Z0 to start off
  • "mod m" means take the remainder of dividing by m
    as the next Zi
  • All the Zi's are between 0 and m - 1
  • Return the ith random number as Ui = Zi / m

6
Example of a Toy LCG
  • Parameters: m = 63, a = 22, c = 4, Z0 = 19
  • Zi = (22 Zi-1 + 4) mod 63, seeded with Z0 = 19

      i    22 Zi-1 + 4    Zi      Ui
      0                   19
      1        422        44    0.6984
      2        972        27    0.4286
      3        598        31    0.4921
      4        686        56    0.8889
     ...
     61        158        32    0.5079
     62        708        15    0.2381
     63        334        19    0.3016
     64        422        44    0.6984
     65        972        27    0.4286
     66        598        31    0.4921

  • Cycling: the sequence will repeat forever
  • Cycle length here = m (could be << m, depending
    on the parameters)
  • Pick m BIG
  • But that might not be enough for good statistical
    properties
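
A minimal Python sketch (not part of the original slides) of the LCG recursion above, using the toy parameters m = 63, a = 22, c = 4, Z0 = 19; it reproduces the first rows of the table and confirms the cycle length of 63:

    # Toy LCG: Zi = (22 * Zi-1 + 4) mod 63, seeded with Z0 = 19.
    # Illustration only -- real generators need far larger parameters.
    def lcg(m=63, a=22, c=4, seed=19, n=4):
        z = seed
        for i in range(1, n + 1):
            z = (a * z + c) % m
            yield i, z, z / m               # (i, Zi, Ui)

    for i, z, u in lcg():
        print(i, z, round(u, 4))            # 1 44 0.6984, 2 27 0.4286, ...

    # Cycle length: count steps until the seed reappears.
    z, steps = 19, 0
    while True:
        z, steps = (22 * z + 4) % 63, steps + 1
        if z == 19:
            break
    print(steps)                            # 63 = m, a full cycle here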

7
Issues with LCGs
  • Cycle length ≤ m
  • Typically, m ≈ 2.1 billion (= 2^31 - 1) or more
  • Which used to be a lot; more later
  • Other parameters chosen so that cycle length = m
    or m - 1
  • Statistical properties
  • Uniformity, independence
  • There are many tests of RNGs
  • Empirical tests
  • Theoretical tests: lattice structure (next
    slide)
  • Speed, storage: both are usually fine
  • Must be carefully, cleverly coded (BIG integers)
  • Reproducibility: streams (long internal
    subsequences) with fixed seeds

8
Issues with LCGs (contd.)
  • Regularity of LCGs (and other kinds of RNGs):
    plots below are for the earlier toy LCG
  • Design RNGs so the lattice is dense in high
    dimensions
  • Other kinds of RNGs: longer memory in the
    recursion, combination of several RNGs

[Plots: Ui vs. i, and Ui+1 vs. Ui, showing the lattice structure]
"Random Numbers Fall Mainly in the Planes" (George Marsaglia)
9
Original (1983) Arena RNG
  • LCG with m = 2^31 - 1 = 2,147,483,647
  • a = 7^5 = 16,807
  • c = 0
  • Cycle length = m - 1 ≈ 2.1 × 10^9
  • Ten different automatic streams with fixed seeds
  • In its day, this was a good, well-tested
    generator in an efficient code
  • But current computer speeds make this cycle length
    inadequate
  • Could exhaust it in 10 minutes on a 2 GHz PC doing
    nothing but generating
  • Can get this generator (not recommended)
  • Place a Seeds module (Elements panel) in your
    model

10
Current Arena RNG
  • Uses some of the same ideas as LCG
  • Modulo division, recursive on earlier values
  • But is not an LCG
  • Combines two separate component generators
  • Recursion involves more than just the preceding
    value
  • Combined multiple recursive generator (CMRG)
  • An = (1403580 An-2 - 810728 An-3) mod 4294967087
  • Bn = (527612 Bn-1 - 1370589 Bn-3) mod 4294944443
  • Zn = (An - Bn) mod 4294967087
  • Seed = a six-vector of the first three An's and Bn's

The first two lines are two simultaneous recursions; combining
them gives Zn, which is scaled to produce the next random number.
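
A short Python sketch of the combined recursion above (the constants are those on the slide, i.e., L'Ecuyer's MRG32k3a; the seed values and the final scaling to (0, 1) are simplifying assumptions here, not Arena's exact code):

    # CMRG: two recursions run side by side, then combined.
    M1, M2 = 4294967087, 4294944443

    A = [12345, 12345, 12345]   # assumed seeds: [An-3, An-2, An-1]
    B = [12345, 12345, 12345]   # assumed seeds: [Bn-3, Bn-2, Bn-1]

    def next_u():
        a_n = (1403580 * A[1] - 810728 * A[0]) % M1    # uses An-2, An-3
        b_n = (527612 * B[2] - 1370589 * B[0]) % M2    # uses Bn-1, Bn-3
        A[0:3] = A[1], A[2], a_n                       # shift state
        B[0:3] = B[1], B[2], b_n
        z = (a_n - b_n) % M1                           # combine the two
        return z / (M1 + 1) if z > 0 else M1 / (M1 + 1)

    print([round(next_u(), 6) for _ in range(3)])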
11
The Current (2000) Arena RNG Properties
  • Extremely good statistical properties
  • Good uniformity in up to 45-dimensional hypercube
  • Cycle length ≈ 3.1 × 10^57
  • To cycle, all six seeds must match up
  • On a 2 GHz PC, it would take 2.8 × 10^40 millennia
    to exhaust the cycle
  • Under Moore's law, it will be 216 years until
    this generator can be exhausted in a year of
    nonstop computing
  • Only slightly slower than old LCG
  • And RNG is usually a minor part of overall
    computing time

12
The Current (2000) Arena RNG Streams and
Substreams
  • Automatic streams and substreams
  • 1.8 × 10^19 streams, each of length 1.7 × 10^38
  • Each stream further divided into 2.3 × 10^15
    substreams of length 7.6 × 10^22 each
  • A 2 GHz PC would take 669 million years to exhaust
    a substream
  • Default stream is 10 (for historical reasons)
  • Also used for Chance-type Decide module
  • To use a different stream, append its number
    after a distribution's parameters
  • For example, EXPO(6.7, 4) to use stream 4
  • When using multiple replications, Arena
    automatically advances to next substream in each
    stream for the next replication
  • Helps synchronize for variance reduction ... later

13
Generating Random Variates
  • Have: desired input distribution for model
    (fitted or specified in some way), and RNG (UNIF
    (0, 1))
  • Want: transform UNIF (0, 1) random numbers into
    draws from the desired input distribution
  • Method: mathematical transformations of random
    numbers to "deform" them to the desired
    distribution
  • Specific transform depends on the desired
    distribution
  • Details in online Help about methods for all
    distributions
  • Do discrete, continuous distributions separately

14
Generating from Discrete Distributions
  • Example: probability mass function with
    probabilities 0.1, 0.5, 0.4
  • Divide [0, 1] into subintervals of length 0.1,
    0.5, and 0.4
  • Generate U ~ UNIF (0, 1)
  • See which subinterval it's in
  • Return X = the value corresponding to that
    subinterval
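
A minimal Python sketch of this subinterval method, assuming the mass function puts probabilities 0.1, 0.5, and 0.4 on hypothetical values 1, 2, and 3:

    import random

    values = [1, 2, 3]           # assumed support of the mass function
    probs = [0.1, 0.5, 0.4]      # subinterval lengths from the slide

    def draw():
        u = random.random()      # U ~ UNIF(0, 1)
        cum = 0.0
        for x, p in zip(values, probs):
            cum += p             # right endpoint of x's subinterval
            if u < cum:
                return x         # U landed in x's subinterval
        return values[-1]        # guard against floating-point roundoff

    print([draw() for _ in range(10)])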

15
Discrete Generation Another View
  • Plot the cumulative distribution function; generate
    U and plot it on the vertical axis; read across and
    down
  • "Inverting" the CDF
  • Equivalent to the earlier method

16
Generating from Continuous Distributions
  • Example: EXPO (5) distribution
  • Density (PDF): f(x) = (1/5) e^(-x/5) for x ≥ 0
  • Distribution (CDF): F(x) = 1 - e^(-x/5) for x ≥ 0
  • General algorithm (can be rigorously justified):
  • 1. Generate a random number U ~ UNIF (0, 1)
  • 2. Set U = F(X) and solve for X = F^(-1)(U)
  • Solving analytically for X may or may not be
    simple (or possible)
  • Sometimes use numerical approximation to solve

17
Generating from Continuous Distributions (contd.)
  • Solution for the EXPO (5) case:
  • Set U = F(X) = 1 - e^(-X/5)
  • e^(-X/5) = 1 - U
  • -X/5 = ln(1 - U)
  • X = -5 ln(1 - U)
  • Picture (inverting the CDF, as in the discrete case)

Intuition (garden hose): more U's will hit F(x)
where it's steep; this is where the density f(x)
is tallest, and we want a denser distribution of
X's there
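
The EXPO (5) inversion above is only a couple of lines in Python:

    import math
    import random

    def expo(mean=5.0):
        u = random.random()                 # step 1: U ~ UNIF(0, 1)
        return -mean * math.log(1.0 - u)    # step 2: X = -5 ln(1 - U)

    print([round(expo(), 2) for _ in range(5)])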
18
Nonstationary Poisson Processes
  • Many systems have externally originating events
    affecting them e.g., arrivals of customers
  • If process is stationary over time, usually
    specify a fixed interevent-time distribution
  • But process could vary markedly in its rate
  • Fast-food lunch rush
  • Freeway rush hours
  • Ignoring nonstationarity can lead to serious
    model and output errors
  • Already seen this: enhanced call center, Models
    5-2 and 5-3

19
Nonstationary Poisson Processes Definition
  • Usual model: nonstationary Poisson process
  • Have a rate function λ(t)
  • Number of events in [t1, t2] is Poisson with mean
    ∫ λ(t) dt taken over [t1, t2]
  • Issues
  • How to estimate rate function?
  • Given an estimate, how to generate during
    simulation?

[Plot: rate function λ(t) vs. time t]
20
Nonstationary Poisson Processes Estimating the
Rate Function
  • Estimation of the rate function
  • Probably the most practical method is piecewise
    constant
  • Decide on a time interval within which rate is
    fixed
  • Estimate from data the (constant) rate during
    each interval
  • Be careful to get the units right
  • Other (more complicated) methods exist in the
    literature
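
For example, a piecewise-constant estimate in Python (hypothetical counts; the point is dividing by both the number of observed days and the interval length so the units come out right):

    # Arrival counts per 30-minute interval, totaled over 5 observed days
    # (made-up numbers, for illustration).
    counts = [12, 45, 80, 75, 30]
    days = 5
    interval_hours = 0.5

    # Estimated (constant) rate in each interval, in arrivals per HOUR:
    rates = [c / (days * interval_hours) for c in counts]
    print(rates)    # first interval: 12 / (5 * 0.5) = 4.8 per hour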

21
Nonstationary Poisson Processes Generation
  • Arena has a built-in method to generate, assuming
    a piecewise-constant rate function
  • Arrival Schedule in Create module: Models 5-2
    and 5-3
  • Method is to invert a rate-one stationary Poisson
    process against the cumulative rate function Λ
  • Similar to inverting the CDF for continuous
    random-variate generation
  • Exploits some speed-up possibilities
  • Details in Help topic Non-Stationary Exponential
    Distribution
  • Alternative method: "thinning" of a stationary
    Poisson process running at the peak rate (sketched
    below)
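
A Python sketch of the thinning method: generate a stationary Poisson process at the peak rate, then keep each event with probability λ(t)/λmax. The piecewise-constant rate function here is an assumed example, not from the slides:

    import math
    import random

    def lam(t):
        # Assumed piecewise-constant rate (arrivals/hour): lunch rush 2-4.
        return 10.0 if 2.0 <= t < 4.0 else 3.0

    LAM_MAX = 10.0    # peak rate

    def nspp_arrivals(horizon=8.0):
        t, events = 0.0, []
        while True:
            # Interevent time of a stationary rate-LAM_MAX Poisson process:
            t += -math.log(1.0 - random.random()) / LAM_MAX
            if t > horizon:
                return events
            if random.random() <= lam(t) / LAM_MAX:   # keep w.p. lam/LAM_MAX
                events.append(t)

    print(len(nspp_arrivals()))   # expected count = 10*2 + 3*6 = 38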

22
Variance Reduction
  • Random input ⇒ random output (RIRO)
  • In other words, output has variance
  • Higher output variance means less precise results
  • Would like to eliminate or reduce output variance
  • One (bad) way to eliminate it: replace all input
    random variables by constants (like their means)
  • Will get rid of random output, but will also
    invalidate model
  • Thus, best hope is to reduce output variance
  • Easy (brute-force) variance reduction: just
    simulate some more
  • Terminating: additional replications
  • Steady-state: additional replications or a
    longer run

23
Variance Reduction (contd.)
  • But sometimes can reduce variance without more
    runs: a free lunch (?)
  • Key: unlike physical experiments, we can control
    randomness in computer-simulation experiments by
    manipulating the RNG
  • Re-use the same random numbers either as they
    were, in some opposite sense, or for a similar
    but simpler model
  • Several different variance-reduction techniques
  • Classified into categories: common random
    numbers, antithetic variates, control variates,
    indirect estimation, ...
  • Usually requires thorough understanding of model,
    code
  • Will look only at common random numbers in detail

24
Common Random Numbers (CRN)
  • Applies when objective is to compare two (or
    more) alternative configurations or models
  • Interest is in difference(s) of performance
    measure(s) across alternatives
  • Model 7-2 (small mfg. system), total avg. WIP
    output; two alternatives:
  • A. Base case (as is)
  • B. 3.5% increase in business (interarrival-time
    mean falls from 13 to 12.56 minutes)
  • Same run conditions, but change model into Model
    12-1
  • Remove Output File Total WIP History.dat
  • Add entry to Statistic module to compute and save
    to a .dat file the total avg. WIP on each
    replication

25
The Natural Comparison
  • Run case A, make the change to get to case B and
    run it, then Compare Means via Output Analyzer
  • Difference is not statistically significant
  • Were the runs of A and B statistically
    independent?
  • Did we use the same random numbers running A and
    B?
  • Did we use the same random numbers intelligently?

26
CRN Intuition
  • Get a sharper comparison if you subject all
    alternatives to the same "conditions"
  • Then observed differences are due to model
    differences rather than random differences in the
    conditions
  • Small mfg. system: for both the A and B runs,
    cause parts to
  • Arrive at the same times
  • Be assigned the same attributes (job type)
  • Have the same process times at each step
  • Then observed differences will be attributable to
    system differences, not random bounce
  • There isn't any random bounce

27
Synchronization of Random Numbers in CRN
  • Generally, get CRN by using the same RNG, seed,
    stream(s) for all alternatives
  • Already are using the same stream, default
    stream 10
  • But its usage generally gets mixed up across
    alternatives
  • Must use the same random numbers for the same
    purposes across the alternatives:
    synchronization of random-number usage
  • Usually requires some work, understanding of
    model
  • Usually use different streams in the RNG
  • Usually different ways to do this in a given
    model
  • Sometimes can't synchronize completely for
    complex models; settle for partial
    synchronization

28
Synchronization of Random Numbers in CRN
(contd.)
  • Synchronize by source of randomness (we'll do)
  • Assign a stream to each point of variate generation
  • Separate random-number "faucets": extra
    parameter in r.v. calls
  • Model 12-1: 14 sources of randomness, separate
    stream for each (see book for details); modify
    into Model 12-2
  • Fairly simple, but might not ensure complete
    synchronization; still usually get some benefit
    from this
  • Synchronize by entity (won't do; see Exercises)
  • Pre-generate every possible random variate an
    entity might need when it arrives, assign to
    attributes, use downstream
  • Better synchronization insurance, but uses more
    memory
  • Across replications, the RNG automatically goes to
    the next substream within each stream
  • Maintains synchronization even if alternatives
    disagree on the number of random numbers used per
    stream per replication

29
Effect of CRN
[Results: natural comparison vs. synchronized CRN]
  • CRN required no more computational effort
  • Effect here is fairly dramatic, but it will not
    always be this strong
  • Depends on how similar A and B are

30
CRN Statistical Issues
  • In the Output Analyzer's Analyze > Compare Means
    option, have choice of Paired-t or Two-Sample-t
    for the c.i. test on the difference between means
  • Paired-t: subtracts results replication by
    replication; must use this if doing CRN
  • Two-Sample-t: treats the samples independently;
    can use this if doing independent sampling; often
    better than Paired-t in that case
  • Mathematical justification for CRN
  • Let X = output r.v. from alternative A, Y =
    output from B
  • Var(X - Y) = Var(X) + Var(Y) - 2 Cov(X, Y)
  • CRN induces Cov(X, Y) > 0, reducing Var(X - Y)
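
A toy numerical illustration of that identity (a contrived "model", not Models 12-1/12-2): with CRN both alternatives see the same random numbers, making Cov(X, Y) positive and shrinking the variance of the difference:

    import random
    import statistics

    def avg_service(mean, rng):
        # Stand-in "output": average of 100 exponential service times.
        return sum(rng.expovariate(1.0 / mean) for _ in range(100)) / 100

    d_crn, d_ind = [], []
    for rep in range(200):
        # CRN: both alternatives reuse the SAME seed (same random numbers).
        d_crn.append(avg_service(13.0, random.Random(rep))
                     - avg_service(12.56, random.Random(rep)))
        # Independent sampling: each alternative gets its own seed.
        d_ind.append(avg_service(13.0, random.Random(rep))
                     - avg_service(12.56, random.Random(10000 + rep)))

    print(statistics.variance(d_crn))   # far smaller ...
    print(statistics.variance(d_ind))   # ... than this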

31
Other Variance-Reduction Techniques
  • For single-system simulation, not comparisons
  • Antithetic variates: make pairs of runs
  • Use the U's on the first run of the pair, the
    1 - U's on the second run of the pair
  • Take the average of the two runs: they are
    negatively correlated, reducing the variance of
    the average
  • Like CRN, must take care to synchronize
  • Control variates
  • Use internal variate generation to adjust
    ("control") results up or down
  • Indirect estimation
  • Simulation estimates something other than what
    you want, but related to what you want by a fixed
    formula
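
For instance, a quick Python sketch of antithetic variates on a single quantity (estimating the EXPO (5) mean by inversion; pairing U with 1 - U makes the two draws in each pair negatively correlated):

    import math
    import random
    import statistics

    def pair_average(mean=5.0):
        u = min(max(random.random(), 1e-12), 1.0 - 1e-12)  # keep logs finite
        x1 = -mean * math.log(1.0 - u)   # "first run" of the pair uses U
        x2 = -mean * math.log(u)         # "second run" uses 1 - U
        return (x1 + x2) / 2.0

    plain = [-5.0 * math.log(1.0 - random.random()) for _ in range(2000)]
    anti = [pair_average() for _ in range(1000)]        # same 2000 draws total
    print(statistics.variance(plain) / 2000)   # est. variance of plain mean
    print(statistics.variance(anti) / 1000)    # smaller, via neg. correlation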

32
Sequential Sampling
  • Always try to quantify imprecision in results
  • If imprecision is small enough, you're done
  • If not, need to do something to increase
    precision
  • Just saw one way: variance-reduction techniques
  • Obvious way to increase precision: keep
    simulating, one more "step" at a time; quit when
    you achieve the desired precision
  • Terminating models: "step" = another replication
  • Cannot extend the length of replications; that's
    part of the model
  • Steady-state models:
  • "Step" = another replication if using truncated
    replications, or
  • "Step" = some extension of the run if using batch
    means

33
Sequential Sampling Terminating Models
  • Modify Model 12-2 (small mfg., base case, with
    random-number streams) into Model 12-3
  • Made 25 replications; get 95% c.i. on expected
    average Total WIP as 13.03 ± 1.28
  • Suppose the ± 1.28 is too big; want to reduce it
    to ± 0.5
  • Approximate formulas from Sec. 6.3: need 124 or
    164 total replications (depending on which
    formula) rather than 25
  • Instead, just make one more at a time, re-compute
    the c.i., and stop as soon as the half-width is
    less than 0.5
  • "Trick" Arena into making more replications
    until c.i. half-width < tolerance = 0.5 (a sketch
    of this stopping logic follows below)
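
The stopping logic, sketched outside Arena in Python (the replicate() function is a made-up stand-in for one replication's Avg Total WIP; the normal critical value 1.96 is an approximation where Arena would use the t distribution):

    import math
    import random
    import statistics

    def replicate(rng):
        return 13.0 + rng.gauss(0.0, 3.2)    # pretend replication output

    rng = random.Random(12345)
    data = [replicate(rng), replicate(rng)]  # need >= 2 reps for a c.i.
    while True:
        half = 1.96 * statistics.stdev(data) / math.sqrt(len(data))
        if half < 0.5:                       # desired precision reached
            break
        data.append(replicate(rng))         # one more rep; re-compute c.i.

    print(len(data), round(statistics.mean(data), 2), round(half, 4))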

34
Sequential Sampling Terminating Models
(contd.)
  • Recall: with > 1 replication, we automatically get
    cross-replication 95% c.i.'s for expected values
    in the Category Overview report
  • Related internal Arena variables:
  • ORUNHALF(Output ID) = half-width of the 95% c.i.
    using completed replications (here, Output ID =
    Avg Total WIP)
  • MREP = total number of replications asked for
    (initially, MREP = Number of Replications in Run
    > Setup > Replication Parameters)
  • NREP = replication number now in progress (= 1,
    2, 3, ...)
  • Use and manipulate these variables

35
Sequential Sampling Terminating Models
(contd.)
  • Initially set MREP to a huge value in Number of
    Replications in Run > Setup > Replication
    Parameters
  • Keep replicating until we cut off when half-width
    ≤ 0.5
  • Add logic (in a submodel) to sense when done
  • Create one control entity at the beginning of
    each replication
  • Control entity immediately checks whether
  • NREP ≤ 2 (this is the beginning of the 1st or 2nd
    replication), or
  • ORUNHALF(Avg Total WIP) > 0.5 (c.i. on completed
    replications is still too big)
  • In either case, keep going with this replication
    (and the next one too); the control entity is
    Disposed and takes no action
  • If both conditions are false, the control entity
    Assigns MREP = NREP to stop after this
    replication, and is Disposed

36
Sequential Sampling Terminating Models
(contd.)
  • Details:
  • This overshoots the required number of
    replications by one
  • In the Assign module setting MREP to NREP, have to
    select Type = Other, since MREP is a built-in
    Arena variable
  • Results: stopped with 232 total replications,
    yielding half-width = 0.49699 (barely less than
    0.5)
  • Different from the earlier number-of-replications
    approximations (they're just that)
  • Generalizations:
  • Precision demands on several outputs
  • Relative-width stopping: (half-width) / (pt.
    estimate) small

37
Sequential Sampling Steady-State Models
  • If doing truncated-replications approach to
    steady-state statistical analysis
  • Same strategy as above for terminating models
  • Warm-up Period specified in Run > Setup >
    Replication Parameters
  • Err on the side of too much warm-up
  • Point-estimator bias is especially dangerous in
    sequential sampling
  • Getting a tight c.i. centered in the wrong place
  • The tighter the c.i. demanded, the worse the
    coverage probability

38
Sequential Sampling Steady-State Models
(contd.)
  • Batch-means approach
  • Model 12-4: modification of Model 7-4 (small mfg.
    system); want half-width on E(average WIP) to be
    < 1
  • Keep extending the run to reduce the c.i.
    half-width
  • Use the automatic run-time batch-means 95% c.i.'s
  • Stopping criterion: Terminating Condition field
    of Run > Setup > Replication Parameters
  • Half-width variables are THALF(Tally ID) or
    DHALF(Dstat ID)
  • For us, the condition is DHALF(Total WIP) < 1
  • Remove all other stopping devices from the model
  • If "Insuf" or "Corr" would be returned because of
    too little data, the half-width variables are set
    to a huge value, so we keep going
  • Could demand multiple smallness criteria, or
    relative precision (TAVG, DAVG variables for the
    point-estimate denominator)
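
A Python sketch of the batch-means half-width computation itself (hypothetical autocorrelated within-run data; 1.96 again approximates the t critical value):

    import math
    import random
    import statistics

    # Hypothetical within-run WIP observations with autocorrelation (AR(1)).
    rng = random.Random(1)
    x, obs = 13.0, []
    for _ in range(10000):
        x = 0.9 * x + 0.1 * rng.gauss(13.0, 8.0)
        obs.append(x)

    BATCH = 500
    batch_means = [statistics.mean(obs[i:i + BATCH])
                   for i in range(0, len(obs), BATCH)]   # 20 batch means
    half = 1.96 * statistics.stdev(batch_means) / math.sqrt(len(batch_means))
    print(round(statistics.mean(batch_means), 2), round(half, 3))
    # If half >= 1, extend the run (more observations) and re-batch.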

39
Designing and Executing Simulation Experiments
  • Think of a simulation model as a convenient
    testbed or laboratory for experimentation
  • Look at different output responses
  • Look at effects, interaction of different input
    factors
  • Apply classical experimental-design techniques
  • Factorial experiments: main effects,
    interactions
  • Fractional-factorial experiments
  • Factor-screening designs
  • Response-surface methods, metamodels
  • CRN is "blocking" in experimental-design
    terminology
  • Process Analyzer (PAN) provides a convenient way
    to carry out a designed experiment
  • See Chapt. 6 for an example of using PAN for a
    factorial experiment