CPE 619 Introduction To Simulation

Provided by: Mil36

Learn more at: http://www.ece.uah.edu

1
CPE 619 Introduction To Simulation

Aleksandar Milenkovic
The LaCASA Laboratory
Electrical and Computer Engineering Department
The University of Alabama in Huntsville
http://www.ece.uah.edu/milenka
http://www.ece.uah.edu/lacasa

2
Overview

Simulation Key Questions
Introduction to Simulation
Common Mistakes in Simulation
Other Causes of Simulation Analysis Failure
Checklist for Simulations
Terminology
Types of Models

3
Simulation Key Questions

What are the common mistakes in simulation and
why do most simulations fail?
What language should be used for developing a
simulation model?
What are different types of simulations?
How to schedule events in a simulation?
How to verify and validate a model?
How to determine that the simulation has reached
a steady state?
How long to run a simulation?

4
Simulation Key Questions (contd)

How to generate uniform random numbers?
How to verify that a given random number
generator is good?
How to select seeds for random number generators?
How to generate random variables with a given
distribution?
What distributions should be used and when?

5
Introduction to Simulation

"The best advice to those about to embark on a
very large simulation is often the same as
Punch's famous advice to those about to marry:
Don't!"
- Bratley, Fox, and Schrage (1987)

6
Common Mistakes in Simulation

1. Inappropriate Level of Detail: More detail ⇒ More time ⇒ More bugs ⇒ More CPU ⇒ More parameters ≠ More accurate
2. Improper Language: General purpose ⇒ More portable, more efficient, more time
3. Unverified Models: Bugs
4. Invalid Models: Model vs. reality
5. Improperly Handled Initial Conditions
6. Too Short Simulations: Need confidence intervals
7. Poor Random Number Generators: Safer to use a well-known generator
8. Improper Selection of Seeds: Zero seeds, same seeds for all streams

7
Other Causes of Simulation Analysis Failure

1. Inadequate Time Estimate
2. No Achievable Goal
3. Incomplete Mix of Essential Skills
(a) Project Leadership
(b) Modeling and
(c) Programming
(d) Knowledge of the Modeled System
4. Inadequate Level of User Participation
5. Obsolete or Nonexistent Documentation
6. Inability to Manage the Development of a Large
Complex Computer Program: Need software
engineering tools
7. Mysterious Results

8
Checklist for Simulations

1. Checks before developing a simulation
(a) Is the goal of the simulation properly
specified?
(b) Is the level of detail in the model
appropriate for the goal?
(c) Does the simulation team include
personnel with project
leadership, modeling, programming, and
computer systems
backgrounds?
(d) Has sufficient time been planned for the
project?
2. Checks during development
(a) Has the random number generator used in
the simulation
been tested for uniformity and
independence?
(b) Is the model reviewed regularly with the
end user?
(c) Is the model documented?

9
Checklist for Simulations (contd)

3. Checks after the simulation is running
(a) Is the simulation length appropriate?
(b) Are the initial transients removed before
computation?
(c) Has the model been verified thoroughly?
(d) Has the model been validated before using
its results?
(e) If there are any surprising results, have
they been validated?
(f) Are all seeds such that the random number
streams will not overlap?

10
Terminology

Introduce terms using an example of simulating
CPU scheduling
Study various scheduling techniques given job
characteristics, ignoring disks and displays
State Variables: Define the state of the system
Can restart the simulation from the state variables
E.g., length of the job queue
Event: Change in the system state
E.g., arrival, beginning of a new execution,
departure

11
Terminology: Types of Models

Continuous Time Model
State is defined at all times
Discrete Time Models
State is defined only at some instants

12
Terminology: Types of Models (contd)

Continuous State Model
State variables are continuous
Discrete State Models
State variables are discrete

13
Terminology: Types of Models (contd)

Discrete state = Discrete event model
Continuous state = Continuous event model
Continuity of time ≠ Continuity of state
Four possible combinations
1. discrete state/discrete time
2. discrete state/continuous time
3. continuous state/discrete time
4. continuous state/continuous time

14
Terminology: Types of Models (contd)

Deterministic and Probabilistic Models
Deterministic - Output can be predicted with
certainty
Probabilistic - Output differs between
repetitions with the same inputs

15
Terminology: Types of Models (contd)

Static and Dynamic Models
Static - Time is not a variable
Dynamic - Model state changes with time
E.g., a CPU scheduler is dynamic, while the
matter-to-energy model $E = mc^2$ is static
Linear and nonlinear models
Linear - Output is a linear combination of the input
Nonlinear - Otherwise

16
Terminology: Types of Models (contd)

Open and closed models
Open - Input is external and independent
Closed - Model has no external inputs
Ex: if the same jobs leave and re-enter the queue, the model is
closed; if new jobs enter the system, it is open

17
Terminology: Types of Models (contd)

Stable and unstable
Stable - Model output settles down
Unstable - Model output always changes

18
Computer System Models

Continuous time
Discrete state
Probabilistic
Dynamic
Nonlinear
Open or closed
Stable or unstable

19
Selecting a Language for Simulation

Four choices
1. Simulation language
2. General purpose
3. Extension of a general purpose language
4. Simulation package

20
Selecting a Language for Simulation (contd)

Simulation language (SL): built-in facilities for
time steps, event scheduling, data collection,
reporting
General-purpose language (GPL): known to developer,
available on more systems, flexible
The major difference is the cost tradeoff (SL vs.
GPL)
SL (+): saves development time (if you know it), leaves
more time for system-specific issues, more readable
code
SL (-): requires startup time to learn
GPL (+): analyst's familiarity, availability, quick
startup
GPL (-): may require more time to add simulation
flexibility, portability
Recommendation: all analysts may want to learn
one simulation language so that they understand those
costs and can compare

21
Selecting a Language for Simulation

Extension of a general-purpose language: collection of
routines and tasks commonly used. Often, a base
language with extra libraries that can be called
Example: GASP (for FORTRAN)
Collection of routines to handle simulation tasks
Compromise for efficiency, flexibility, and
portability
Simulation packages: allow definition of the model
in an interactive fashion. Get results in one day
Examples: QNET4 and RESQ
Input dialog
Library of data structures, routines, and
algorithms
Big time savings
Inflexible ⇒ Simplification
Tradeoff is in flexibility: packages can
only do what the developer envisioned, but if that is
what is needed, then it is quicker to do so

22
Types of Simulation Languages

Continuous Simulation Languages
CSMP, DYNAMO
Differential equations
Used in chemical engineering
Discrete-event Simulation Languages
SIMULA and GPSS
Combined
SIMSCRIPT and GASP
Allow discrete, continuous, as well as combined
simulations.

23
Types of Simulations

1. Emulation: Using hardware or firmware
2. Monte Carlo Simulation
3. Trace-Driven Simulation
4. Discrete Event Simulation

24
Types of Simulations (contd)

Emulation
Simulation that runs on a computer to make it
appear to be something else
Examples: JVM, NIST Net

25
Types of Simulation (contd)

"Monte Carlo method [Origin: after Count Montgomery
de Carlo, Italian gambler and random-number
generator (1792-1838).] A method of jazzing up the
action in certain statistical and number-analytic
environments by setting up a book and inviting bets on
the outcome of a computation."
- The Devil's DP Dictionary, McGraw Hill (1981)

26
Monte Carlo Simulation

A static simulation has no time parameter
Runs until some equilibrium state is reached
Used to model physical phenomena, evaluate
probabilistic systems, numerically estimate
complex mathematical expressions
Driven with a random number generator
Hence the name Monte Carlo simulation (after the casinos)
Example: consider numerically determining the
value of $\pi$
Area of circle $= \pi r^2 = \pi$ for radius 1

27
Monte Carlo Simulation (contd)

Imagine throwing a dart at the unit square
Random x in (0,1)
Random y in (0,1)
Count as inside if $\sqrt{x^2 + y^2} < 1$
Compute ratio $R = \text{in} / (\text{in} + \text{out})$
Can repeat as many times as needed to get
arbitrary precision

Unit square has area 1
Ratio of area of quarter circle to area of square = R
$\pi \approx 4R$
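
The dart-throwing procedure above can be sketched in a few lines of Python. This is a minimal illustration, not code from the course; the function name `estimate_pi` and the fixed seed are assumptions made here so that the run is repeatable.

```python
import random

def estimate_pi(n_darts, seed=1):
    """Estimate pi by throwing n_darts random darts at the unit square."""
    rng = random.Random(seed)        # seeded generator, so runs are repeatable
    inside = 0
    for _ in range(n_darts):
        x = rng.random()             # random x in [0, 1)
        y = rng.random()             # random y in [0, 1)
        if x * x + y * y < 1.0:      # same test as sqrt(x^2 + y^2) < 1
            inside += 1
    ratio = inside / n_darts         # R = in / (in + out)
    return 4.0 * ratio               # quarter circle area is pi/4

print(estimate_pi(100_000))
```

The error shrinks roughly with the square root of the number of darts, so each extra digit of precision costs about 100 times more samples.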

28
Monte Carlo Simulation (contd)

Evaluate the following integral
1. Generate uniformly distributed x ~ Uniform(0,2)
2. Density function $f(x) = 1/2$ iff $0 \le x \le 2$
3. Compute

29
Monte Carlo Simulation (contd)

Expected value for y
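
The integral on these slides did not survive in the transcript, so the sketch below uses a stand-in integrand, $e^{-x^2}$ on (0, 2), purely as an assumption. It shows the general recipe: draw x from Uniform(0,2), compute $y = g(x)/f(x)$ with $f(x) = 1/2$, and average y. The names `mc_integral` and `integrand` are illustrative.

```python
import math
import random

def mc_integral(g, a, b, n, seed=1):
    """Estimate the integral of g over (a, b) as the mean of g(x)/f(x), x ~ Uniform(a, b)."""
    rng = random.Random(seed)
    width = b - a                    # 1/f(x) for the uniform density f(x) = 1/(b - a)
    total = 0.0
    for _ in range(n):
        x = rng.uniform(a, b)
        total += width * g(x)        # y = g(x) / f(x)
    return total / n                 # sample mean estimates E[y], which equals the integral

def integrand(x):
    # Hypothetical integrand chosen for illustration; not taken from the slides.
    return math.exp(-x * x)

estimate = mc_integral(integrand, 0.0, 2.0, 200_000)
exact = math.sqrt(math.pi) / 2 * math.erf(2.0)   # closed form for this stand-in integrand
print(estimate, exact)
```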

30
Trace-Driven Simulation

Uses a time-ordered record of events on a real
system as input
Example: to compare memory management schemes, use a trace
of page reference patterns as input to
model and simulate page replacement algorithms
Note: the trace needs to be independent of the system under study
Example: a trace of disk events could not
be used to study page replacement, since those events
depend on the current algorithm

31
Advantages of Trace-Driven Simulations

1. Credibility
2. Easy Validation: Compare simulation with
measured
3. Accurate Workload: Models correlation and
interference
4. Detailed Trade-Offs
Detailed workload ⇒ Can study small changes
in algorithms
5. Less Randomness
Trace ⇒ deterministic input ⇒ Fewer
repetitions
6. Fair Comparison: Better than random input
7. Similarity to the Actual Implementation
Trace-driven model is similar to the system
⇒ Can understand complexity of implementation

32
Disadvantages of Trace-Driven Simulations

1. Complexity: More detailed
2. Representativeness: Workload changes with
time, equipment
3. Finiteness: Few minutes fill up a disk
4. Single Point of Validation: One trace = one
point
5. Detail
6. Trade-Off: Difficult to change workload

33
Discrete Event Simulations

A simulation using a discrete-state model of the
system is a DISCRETE EVENT SIMULATION
Continuous-event simulations: the state of the
system takes continuous values
Typical components
Event scheduler
Simulation Clock and a Time Advancing Mechanism
System State Variables
Event Routines
Input Routines
Report Generator
Initialization Routines
Trace Routines
Dynamic Memory Management
Main Program

34
Components of Discrete Event Simulations

Event scheduler: linked list of events waiting
Schedule event X at time T
Hold event X for interval dt
Cancel previously scheduled event X
Hold event X indefinitely until scheduled by
another event
Schedule an indefinitely held event
Note: the event scheduler is executed often, so it has
significant impact on performance
Simulation Clock and a Time Advancing Mechanism
Global variable representing simulated time
(maintained by the scheduler)
Two approaches
Unit-time approach: increment time and check for
events
Event-driven approach: move to the next event in
the queue
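
A minimal sketch of an event-driven scheduler with a simulation clock is shown below. It assumes a heap as the event set rather than the linked list named on the slide, and the class and method names (`EventScheduler`, `schedule`, `hold`, `cancel`, `run`) are illustrative. Holding an event indefinitely is not covered.

```python
import heapq
import itertools

class EventScheduler:
    """Event-driven time advance: the clock jumps straight to the next event."""

    def __init__(self):
        self.clock = 0.0                    # simulated time, maintained by the scheduler
        self._queue = []                    # heap of (time, event_id, action)
        self._counter = itertools.count()   # unique ids; also break ties in FIFO order
        self._cancelled = set()

    def schedule(self, time, action):
        """Schedule action at absolute simulated time; return an id usable for cancel()."""
        if time < self.clock:
            raise ValueError("cannot schedule an event in the past")
        event_id = next(self._counter)
        heapq.heappush(self._queue, (time, event_id, action))
        return event_id

    def hold(self, dt, action):
        """Schedule action after an interval dt from the current clock."""
        return self.schedule(self.clock + dt, action)

    def cancel(self, event_id):
        """Mark a previously scheduled event so it is skipped when reached."""
        self._cancelled.add(event_id)

    def run(self):
        while self._queue:
            time, event_id, action = heapq.heappop(self._queue)
            if event_id in self._cancelled:
                self._cancelled.discard(event_id)
                continue
            self.clock = time               # advance the clock to the event time
            action(self)                    # event routine may schedule further events

log = []
sched = EventScheduler()
sched.schedule(2.0, lambda s: log.append(("departure", s.clock)))
sched.schedule(1.0, lambda s: log.append(("arrival", s.clock)))
doomed = sched.schedule(1.5, lambda s: log.append(("never runs", s.clock)))
sched.cancel(doomed)
sched.run()
print(log)   # [('arrival', 1.0), ('departure', 2.0)]
```

A unit-time scheduler would instead increment the clock by a fixed step and check for due events at every step, which wastes work when events are sparse.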

35
Components of Discrete Events Sims (contd)

System State Variables
Global variables describing the state of the
system (e.g., the number of jobs in a CPU
scheduling simulation)
Local variables (e.g., the CPU time required for a
job is placed in the data structure for that
particular job)
Event Routines: one per event; update state
variables and schedule other events
E.g., job arrivals, job scheduling, and job
departure
Input Routines
Get model parameters (e.g., mean CPU time per
job) from the user
Vary parameters in a range

36
Components of Discrete Events Sims (contd)

Report Generator
Output routines run at the end of the simulation
Initialization Routines
Set the initial state of the system state
variables. Initialize seeds.
Trace Routines
Print out intermediate variables as the
simulation proceeds
On/off feature
Dynamic Memory Management
New entities are created and old ones are
destroyed
Periodic garbage collection
Main Program
Tie everything together

37
Event-Set Algorithms

Event Set: Ordered linked list of future event
notices
Insert vs. Execute next
1. Ordered Linked List: SIMULA, GPSS, and GASP
IV
Search from left or from right

38
Event-Set Algorithms (contd)

2. Indexed Linear List
Array of indexes ⇒ No search to find the sub-list
Fixed or variable Δt. Only the first list is kept
sorted

39
Event-Set Algorithms (contd)

3. Tree Structures: Binary tree ⇒ $\log_2 n$
Special case: Heap. Event is a node in a binary
tree

40
Summary

Common Mistakes: Detail, Invalid, Short
Discrete Event, Continuous time, nonlinear models
Monte Carlo Simulation: Static models
Trace-driven simulation: Credibility, difficult
trade-offs
Event-Set Algorithms: Linked list, indexed linear
list, heaps

41
Analysis of Simulation Results
42
Overview

Analysis of Simulation Results
Model Verification Techniques
Model Validation Techniques
Transient Removal
Terminating Simulations
Stopping Criteria: Variance Estimation
Variance Reduction

43
Model Verification vs. Validation

The model output should be close to that of real
system
Make assumptions about behavior of real systems
1st step: test if assumptions are reasonable
Validation, or representativeness of assumptions
2nd step: test whether model implements
assumptions
Verification, or correctness
Four Possibilities
Unverified, Invalid
Unverified, Valid
Verified, Invalid
Verified, Valid

44
Model Verification Techniques

Top Down Modular Design
Anti-bugging
Structured Walk-Through
Deterministic Models
Run Simplified Cases
Trace
On-Line Graphic Displays
Continuity Test
Degeneracy Tests
Consistency Tests
Seed Independence

45
Top Down Modular Design

Divide and Conquer
Modules = Subroutines, Subprograms, Procedures
Modules have well-defined interfaces
Can be independently developed, debugged, and
maintained
Top-down design ⇒ Hierarchical structure ⇒
Modules and sub-modules

46
Top Down Modular Design (contd)
Computer Network Simulator for Congestion Control
studies
47
Top Down Modular Design (contd)
48
Verification Techniques

Anti-bugging: Include self-checks
Σ Probabilities = 1
Jobs left = Generated - Serviced
Structured Walk-Through
Explain the code to another person or group
Works even if the person is sleeping
Deterministic Models: Use constant values
Run Simplified Cases
Only one packet
Only one source
Only one intermediate node
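
The two anti-bugging self-checks named on this slide can be written as small assertion helpers that the simulator calls periodically. This is a sketch; the function names and the tolerance are assumptions made for illustration.

```python
import math

def check_probabilities(probs, tol=1e-9):
    """Self-check: probabilities must be non-negative and sum to 1."""
    if any(p < 0 for p in probs):
        raise AssertionError(f"negative probability in {probs}")
    total = math.fsum(probs)             # accurate floating-point summation
    if abs(total - 1.0) > tol:
        raise AssertionError(f"probabilities sum to {total}, expected 1")

def check_job_conservation(generated, serviced, in_system):
    """Self-check: jobs left = generated - serviced."""
    if in_system != generated - serviced:
        raise AssertionError(
            f"{in_system} jobs in system, expected {generated - serviced}"
        )

# Typical use inside a simulation loop or at the end of a run.
check_probabilities([0.2, 0.3, 0.5])
check_job_conservation(generated=10, serviced=7, in_system=3)
```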

49
Verification Techniques (contd)

Trace: Time-ordered list of events and variables
Several levels of detail
Events trace
Procedure trace
Variables trace
User selects the detail
Include on and off

50
Verification Techniques (contd)

On-Line Graphic Displays
Make simulation interesting
Help sell the results
More comprehensive than trace

51
Verification Techniques (contd)

Continuity Test
Run for different values of input parameters
Slight change in input ⇒ slight change in output
If not, investigate

[Figure: two panels labeled Before and After]

52
Verification Techniques (contd)

Degeneracy Tests: Try extreme configurations and
workloads
One CPU, zero disks
Consistency Tests
Similar results for inputs that have the same effect
Four users at 100 Mbps vs. two at 200 Mbps
Build a test library of continuity, degeneracy,
and consistency tests
Seed Independence: Similar results for different
seeds

53
Model Validation Techniques

Ensure assumptions used are reasonable
Final simulated system should be like the real
system
Unlike verification techniques, the techniques used to
validate a simulation may differ from one model to
another
Three key aspects to validate
Assumptions
Input parameter values and distributions
Output values and conclusions
Compare validity of each to one or more of
Expert intuition
Real system measurements
Theoretical results

⇒ 9 combinations
Not all are always possible, however

54
Expert Intuition

Most practical and common way
Experts = Involved in design, architecture,
implementation, analysis, marketing, or
maintenance of the system
Present assumption, input, output
Better to validate one at a time
See if the experts can distinguish simulation vs.
measurement

55
Real System Measurements

Most reliable and preferred
May be infeasible because the system does not exist
or is too expensive to measure
That could be why we are simulating in the first place!
But even one or two measurements add an enormous
amount to the validity of the simulation
Should compare input values, output values,
workload characterization
Use multiple traces for trace-driven simulations
Can use statistical techniques (confidence
intervals) to determine if simulated values
are different from measured values

56
Theoretical Results

Can be used to compare a simplified system with
simulated results
May not be useful for sole validation but can be
used to complement measurements or expert
intuition
E.g. measurement validates for one processor,
while analytic model validates for many
processors
Note, there is no such thing as a fully
validated model
Would require too many resources and may be
impossible
Can only show that a model is invalid
Instead, show validation in a few select cases,
to lend confidence to the overall model results

57
Transient Removal

Most simulations only want steady state
Remove initial transient state
Trouble is, not possible to define exactly what
constitutes end of transient state
Use heuristics
Long runs
Proper initialization
Truncation
Initial data deletion
Moving average of replications
Batch means

58
Long Runs

Use very long runs
Effects of transient state will be amortized
But wastes resources
And tough to choose how long is enough
Recommendation: don't use long runs alone

59
Proper Initialization

Start simulation in state close to expected state
Ex: CPU scheduler may start with some jobs in the
queue
Determine starting conditions by previous
simulations or simple analysis
May result in decreased run length, but still may
not provide confidence that the system is in a stable
condition

60
Truncation

Assume variability during steady state is less
than during transient state
Variability measured in terms of range
(min, max)
If the trajectory of the range stabilizes, then assume
that the system is in a stable state

Method
Given n observations $x_1, x_2, \ldots, x_n$
Ignore first l observations
Calculate (min, max) of remaining n - l
Repeat for l = 1, ..., n
Stop when the (l+1)th observation is neither min nor
max
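
The truncation method above translates almost directly into code. The sketch below is an illustration under the rule exactly as stated on the slide (start at l = 1 and stop at the first l for which the (l+1)th observation is strictly inside the range of the remaining ones); the function name `truncation_point` is an assumption.

```python
def truncation_point(xs):
    """Return the number l of initial observations to discard, or None if no l qualifies."""
    n = len(xs)
    for l in range(1, n - 1):
        rest = xs[l:]                      # observations l+1 .. n (1-based)
        candidate = xs[l]                  # the (l+1)th observation
        if min(rest) < candidate < max(rest):
            return l                       # neither the min nor the max of the rest
    return None

sequence = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 10, 9, 10, 11, 10, 9]
print(truncation_point(sequence))   # 9
```

Recomputing the min and max for every l costs quadratic time overall, which is acceptable for the short pilot runs this heuristic is usually applied to.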

61
Truncation Example

Sequence: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 10,
9, 10, 11, 10, 9
Ignore first (l = 1): range is (2, 11) and 2nd
observation (l+1) is the min
Ignore first two (l = 2): range is (3, 11) and 3rd
observation (l+1) is the min
Finally, l = 9: range is (9, 11) and 10th
observation is neither min nor max

So, discard first 9 observations

62
Truncation Example 2 (1 of 2)

Find duration of transient interval for: 11, 4,
2, 6, 5, 7, 10, 9, 10, 9, 10, 9, 10

63
Truncation Example 2 (2 of 2)

Find duration of transient interval for:
11, 4, 2, 6, 5, 7, 10, 9, 10, 9, 10, 9, 10
When l = 3, range is (5, 10) and 4th (6) is not min
or max

So, discard only 3 instead of 6

64
Initial Data Deletion (1 of 3)

Study average after some initial observations are
deleted from sample
If average does not change much, must be deleting
from steady state
However, since randomness can cause some
fluctuations during steady state, need multiple
runs (w/different seeds)
Given m replications of size n each, with $x_{ij}$ the jth
observation of the ith replication
Note: j varies along the time axis and i varies
across replications

65
Initial Data Deletion (2 of 3)

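Get mean trajectory
$\bar{x}_j = \frac{1}{m}\sum_{i=1}^{m} x_{ij}, \quad j = 1, 2, \ldots, n$
Get overall mean $\bar{\bar{x}} = \frac{1}{n}\sum_{j=1}^{n} \bar{x}_j$
Set l = 1. Assume transient state is l long, delete
first l and repeat for remaining n - l
$\bar{\bar{x}}_l = \frac{1}{n-l}\sum_{j=l+1}^{n} \bar{x}_j$
Compute relative change
$(\bar{\bar{x}}_l - \bar{\bar{x}}) / \bar{\bar{x}}$
Repeat with l from 1 to n-1. Plot. Relative
change graph will stabilize at knee. Choose l
there and delete observations 1 through l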
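
The computation behind the relative-change plot can be sketched as follows. The function names are illustrative, replications are assumed to be equal-length lists, and the overall mean is assumed to be nonzero. Locating the knee is left to visual inspection of the plotted values, as on the slide.

```python
def mean_trajectory(reps):
    """Average across replications: one mean per time index j."""
    m = len(reps)
    n = len(reps[0])
    return [sum(rep[j] for rep in reps) / m for j in range(n)]

def relative_changes(reps):
    """Relative change in the overall mean after deleting the first l points, l = 1..n-1."""
    xbar = mean_trajectory(reps)
    n = len(xbar)
    overall = sum(xbar) / n                      # overall mean with nothing deleted
    changes = []
    for l in range(1, n):
        mean_l = sum(xbar[l:]) / (n - l)         # mean of observations l+1 .. n
        changes.append((mean_l - overall) / overall)
    return changes                               # changes[l - 1] belongs to deletion length l

reps = [[0, 2, 4, 4], [2, 2, 4, 4]]              # two tiny replications, for illustration only
print(relative_changes(reps))
```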

66
Initial Data Deletion (3 of 3)
67
Moving Average of Independent Replications

Compute mean over moving time window
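Get mean trajectory
$\bar{x}_j = \frac{1}{m}\sum_{i=1}^{m} x_{ij}, \quad j = 1, 2, \ldots, n$
Set k = 1. Plot moving average of 2k+1 values
$\bar{\bar{x}}_j = \frac{1}{2k+1}\sum_{l=-k}^{k} \bar{x}_{j+l}$
With j = k+1, k+2, ..., n-k
Repeat for k = 2, 3, ... and plot until smooth
Find knee. Value of j at the knee is the length of the transient
phase.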
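
A short sketch of the centered moving average over the mean trajectory follows. The input is assumed to be the already-computed mean trajectory (one value per time index), and the name `moving_average` is illustrative.

```python
def moving_average(xbar, k):
    """Centered moving average over windows of 2k+1 points of the mean trajectory."""
    n = len(xbar)
    window = 2 * k + 1
    # 0-based j runs over k .. n-k-1, which is j = k+1 .. n-k in the slide's 1-based indexing
    return [sum(xbar[j - k : j + k + 1]) / window for j in range(k, n - k)]

trajectory = [1, 2, 3, 4, 5]
print(moving_average(trajectory, 1))   # [2.0, 3.0, 4.0]
print(moving_average(trajectory, 2))   # [3.0]
```

Larger k gives a smoother curve but fewer points, so k is increased only until the plot is smooth enough to show the knee.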

68
Batch Means

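Run for a long time
N observations
Divide up into batches
m batches of size n each, so $m = N/n$
Compute batch means $\bar{x}_i$
Compute variance of batch means as a function of batch
size ($\bar{\bar{x}}$ is the overall mean)
$\mathrm{Var}(\bar{x}) = \frac{1}{m-1}\sum_{i=1}^{m}(\bar{x}_i - \bar{\bar{x}})^2$
Plot variance versus batch size n
The batch size n at which the variance starts decreasing gives the length of the transient interval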
69
Terminating Simulations

For some simulations, the transient state is of
interest ⇒ no transient removal required
Sometimes upon termination you also get final
conditions that do not reflect steady state
Can apply transient removal techniques to the end of
the simulation
Take care when gathering statistics at the end of the simulation
E.g., mean service time should include only those
jobs that finish
Also, take care with values at event times
E.g., queue length needs to consider area under
the curve
Say two jobs arrive at t = 0, one leaves at t = 1, and the 2nd
leaves at t = 4
Queue lengths are $q_0 = 2$, $q_1 = 1$, $q_4 = 0$, but the average queue length is not
$(2 + 1 + 0)/3 = 1$
Instead, area is $2 + 1 + 1 + 1 = 5$, so the average queue length is
$5/4 = 1.25$
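
The area-under-the-curve calculation in the queue-length example can be sketched as a time-weighted average. The function name and the `(time, value)` list format are assumptions for illustration.

```python
def time_average(changes, end_time):
    """Time-weighted average of a piecewise-constant quantity.

    changes is a time-ordered list of (time, value) pairs; each value holds
    until the next change (or until end_time for the last one).
    """
    area = 0.0
    boundaries = changes[1:] + [(end_time, None)]
    for (t, value), (t_next, _) in zip(changes, boundaries):
        area += value * (t_next - t)         # rectangle under the curve
    return area / (end_time - changes[0][0])

# Two jobs arrive at t = 0, one leaves at t = 1, the second leaves at t = 4.
queue_lengths = [(0, 2), (1, 1), (4, 0)]
print(time_average(queue_lengths, end_time=4))   # 1.25
```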

70
Stopping Criteria: Variance Estimation

Run until confidence interval is narrow enough
For Independent observations
Independence not applicable to most simulations
Large waiting time for ith job ⇒ Large waiting
time for (i+1)th job
For correlated observations

71
Variance Estimation Methods

1. Independent Replications
2. Batch Means
3. Method of Regeneration

72
Independent Replications

Assumes that means of independent replications
are independent
Conduct m replications of size $n + n_0$ each, where $n_0$ is the length of the transient phase
1. Compute a mean for each replication
2. Compute an overall mean for all replications

73
Independent Replications (contd)

3. Calculate the variance of replicate means
4. Confidence interval for the mean response is
Keep replications large to avoid waste
Ten replications generally sufficient
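
The formulas for steps 1 to 4 are missing from this transcript, so the sketch below follows the usual form of the method: discard the first $n_0$ observations of each replication, average the rest, and build the interval from the variance of the replication means. Using a normal quantile is an assumption made here to stay within the standard library; with only about ten replications, a t quantile with m - 1 degrees of freedom is the more common choice. The function name is illustrative.

```python
from statistics import NormalDist, mean, variance

def replication_ci(reps, n0, alpha=0.10):
    """Confidence interval for the mean response from m independent replications."""
    rep_means = [mean(rep[n0:]) for rep in reps]   # step 1: mean of each replication
    m = len(rep_means)
    overall = mean(rep_means)                      # step 2: overall mean
    var = variance(rep_means)                      # step 3: sample variance (divides by m - 1)
    z = NormalDist().inv_cdf(1 - alpha / 2)        # assumed: normal quantile, not t
    half_width = z * (var / m) ** 0.5              # step 4: half-width of the interval
    return overall - half_width, overall + half_width

# Three toy replications; the first observation of each is treated as transient.
reps = [[100, 1, 3], [100, 2, 4], [100, 3, 5]]
low, high = replication_ci(reps, n0=1)
print(low, high)
```

The cost of the method is visible in the slicing: $m \cdot n_0$ observations are simulated and then thrown away.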

74
Batch Means

Also called method of sub-samples
Perform one long simulation run
Discard the initial transient interval, and divide
the remaining observations into several
batches or sub-samples.
1. Compute means for each batch
2. Compute an overall mean

75
Batch Means (contd)

3. Calculate the variance of batch means
4. Confidence interval for the mean response
is
Less waste than independent replications
Keep batches long to avoid correlation
Check: Compute the autocovariance of successive
batch means
Double n until autocovariance is small
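
A sketch of the batch-means interval together with the autocovariance check is given below. The formulas are not in the transcript, so two conventions are assumed and labeled in the code: a normal quantile for the interval, and a divisor of m - 2 for the lag-1 autocovariance of batch means. The function name is illustrative.

```python
from statistics import NormalDist, mean, variance

def batch_means_analysis(xs, n0, n, alpha=0.10):
    """Return (overall mean, CI half-width, variance, lag-1 autocovariance) of batch means."""
    steady = xs[n0:]                                   # discard the initial transient
    m = len(steady) // n                               # number of complete batches
    if m < 3:
        raise ValueError("need at least three batches")
    means = [mean(steady[i * n : (i + 1) * n]) for i in range(m)]
    overall = mean(means)
    var = variance(means)                              # divides by m - 1
    z = NormalDist().inv_cdf(1 - alpha / 2)            # assumed: normal quantile
    half_width = z * (var / m) ** 0.5
    # Assumed convention: lag-1 autocovariance with divisor m - 2.
    autocov = sum(
        (means[i] - overall) * (means[i + 1] - overall) for i in range(m - 1)
    ) / (m - 2)
    return overall, half_width, var, autocov

data = [99, 1, 3, 5, 7, 9, 11]                         # toy data; first value is "transient"
print(batch_means_analysis(data, n0=1, n=2))
```

A caller would double n and repeat until the autocovariance is small compared with the variance.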

76
Case Study 25.1 Interconnection Networks

Indirect binary n-cube networks: Used for
processor-memory interconnection
Two-stage network with full fan-out
At batch size 64, autocovariance < 1% of sample variance

77
Method of Regeneration
[Figure: queue length over time, with regeneration points marked]

Behavior after an idle period does not depend on
the past history ⇒ System takes a new birth ⇒
Regeneration point
Note: The regeneration points are the beginnings of
the idle intervals (not the ends, as shown in
the book).

78
Method of Regeneration (contd)

Regeneration cycle: Between two successive
regeneration points
Use means of regeneration cycles
Problems
Not all systems are regenerative
Different lengths ⇒ Computation complex
Overall mean ≠ Average of cycle means
Cycle means are given by

79
Method of Regeneration (contd)

Overall mean
1. Compute cycle sums
2. Compute overall mean
3. Calculate the difference between expected and
observed cycle sums

80
Method of Regeneration (contd)

4. Calculate the variance of the differences
5. Compute mean cycle length
6. Confidence interval for the mean response is
given by
7. No need to remove transient observations
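
The formulas for steps 1 to 6 are missing from the transcript. The sketch below follows the standard form of the regenerative method: cycle sums $y_i$, cycle lengths $n_i$, overall mean $\sum y_i / \sum n_i$, differences $w_i = y_i - n_i \bar{\bar{x}}$, and half-width $z \, s_w / (\bar{n}\sqrt{m})$. The normal quantile and all names are assumptions made for illustration.

```python
from statistics import NormalDist

def regeneration_ci(cycles, alpha=0.10):
    """Return (overall mean, CI half-width) from a list of regeneration cycles."""
    m = len(cycles)
    sums = [sum(c) for c in cycles]                    # step 1: cycle sums y_i
    lengths = [len(c) for c in cycles]                 # cycle lengths n_i
    overall = sum(sums) / sum(lengths)                 # step 2: overall mean
    diffs = [y - n * overall for y, n in zip(sums, lengths)]   # step 3: w_i
    var_w = sum(w * w for w in diffs) / (m - 1)        # step 4: variance of differences
    mean_len = sum(lengths) / m                        # step 5: mean cycle length
    z = NormalDist().inv_cdf(1 - alpha / 2)            # assumed: normal quantile
    half_width = z * var_w ** 0.5 / (mean_len * m ** 0.5)      # step 6
    return overall, half_width

cycles = [[1, 2, 3], [4], [2, 2]]                      # toy cycles of unequal length
overall, half_width = regeneration_ci(cycles)
print(overall, half_width)
```

With these toy cycles the overall mean is 14/6, while the average of the cycle means is (2 + 4 + 2)/3, which illustrates why the two are not interchangeable.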

81
Method of Regeneration Problems

1. The cycle lengths are unpredictable. Can't
plan the simulation time beforehand.
2. Finding the regeneration point may require a
lot of checking after every event.
3. Many of the variance reduction techniques
cannot be used due to the variable length of the cycles.
4. The mean and variance estimators are biased

82
Variance Reduction