Title: Selecting Input Probability Distribution
1Selecting Input Probability Distribution
2Simulation Machine
- A simulation can be considered as an engine with an input and an output, as follows:
Input -> Simulation Engine -> Output
3Realizing Simulation
- Input analysis is the analysis of the random variables involved in the model, such as:
  - The distribution of interarrival times (IAT)
  - The distribution of service times
- The simulation engine is the way of realizing the model. This includes:
  - Generating the random variables involved in the model
  - Evaluating the required formulas
- Output analysis is the study of the data produced by the simulation engine.
4Input Analysis
- Collect data from the field.
- Analyze these data.
- Two ways to analyze the data:
  - Build an empirical distribution and then sample from this distribution.
  - Fit the data to a theoretical distribution (such as normal or exponential). See Chapter 6 of the text for more distributions.
5How to select an Input Probability distribution
1. Hypothesize a family of distributions.
2. Estimate the parameters of the fitted distribution.
3. Determine how representative the fitted distribution is.
4. Repeat steps 1-3 until you get a fitted distribution for the collected data. Otherwise, go with an empirical distribution.
6Hypothesizing a Theoretical Distribution
- To fit a theoretical distribution:
  - You need a good background in the theoretical distributions (consult Section 6.2 of the text).
  - A histogram may not provide much insight into the nature of the distribution.
  - You need summary statistics.
7Summary Statistics
- Mean
- Median
- Variance $\sigma^2$
- Coefficient of variation ($cv = \sigma/\mu$) for continuous distributions
- Lexis ratio ($\tau = \sigma^2/\mu$) for discrete distributions
- Skewness index $\nu$
8Summary Stats. Cont.
- If the mean and the median are close to each other and the coefficient of variation is low, we would expect normally distributed data.
- If the median is less than the mean and $\sigma$ is very close to the mean ($cv$ close to 1), we expect an exponential distribution.
- If the skewness is very low ($\nu$ close to 0), the data are symmetric.
9Example
- Consider the following data
10Example Cont.
- Mean 5.654198
- Median 5.486928
- Standard Deviation 0.910188
- Skewness 0.173392
- Range 3.475434
- Minimum 4.132489
- Maximum 7.607923
11Example Continue
- We might take these data and construct a histogram.
- The given summary statistics and the histogram suggest a normal distribution.
12Empirical Distribution
13Disadvantages of Empirical distribution
- The empirical data may not adequately represent the true underlying population because of sampling error.
- The generated random variates are bounded (they cannot fall outside the range of the observed data).
- To overcome these two problems, we attempt to fit a theoretical distribution.
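The boundedness problem is easy to see in code. The sketch below assumes one common construction, a continuous piecewise-linear empirical distribution sampled by inverse transform. The function name `sample_empirical` and the observed values are hypothetical, and other empirical constructions exist.

```python
import random


def sample_empirical(observations, rng):
    """Draw one variate from a piecewise-linear empirical distribution."""
    xs = sorted(observations)
    n = len(xs)
    # Pick a position along the n - 1 gaps between sorted observations,
    # then interpolate linearly inside the chosen gap.
    position = rng.random() * (n - 1)
    i = int(position)
    return xs[i] + (position - i) * (xs[i + 1] - xs[i])


rng = random.Random(42)  # fixed seed so the run is repeatable
observed = [0.8, 1.3, 2.1, 2.4, 3.7, 5.0]  # hypothetical field data
draws = [sample_empirical(observed, rng) for _ in range(10_000)]

# Every generated value stays between the smallest and largest observation.
print(min(draws) >= min(observed), max(draws) <= max(observed))
```

However many variates are drawn, none can exceed the largest observation, which may matter when rare large values (for example, very long service times) drive the behavior of the model.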
14Estimation of Parameters of the fitted distributions
- Suppose we have hypothesized a distribution. Then:
  - Use the maximum likelihood estimator (MLE) to estimate the parameters of the hypothesized distribution.
  - Suppose that $\theta$ is the only parameter of the distribution (for example, the mean $1/\lambda$ in the exponential distribution). Then construct the likelihood function.
  - Let $L(\theta) = f_\theta(X_1) f_\theta(X_2) \cdots f_\theta(X_n)$.
  - Find the $\theta$ that maximizes $L(\theta)$; this is the required parameter estimate.
  - Example: the exponential distribution (done in class).
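The in-class exponential example is not reproduced in these slides. As a sketch of the standard derivation, assume the density $f_\lambda(x) = \lambda e^{-\lambda x}$ for $x \ge 0$ and parameterize by the rate $\lambda$ rather than by the mean $1/\lambda$:

```latex
\begin{align*}
L(\lambda) &= \prod_{i=1}^{n} \lambda e^{-\lambda X_i}
            = \lambda^{n} \exp\Bigl(-\lambda \sum_{i=1}^{n} X_i\Bigr) \\
\ln L(\lambda) &= n \ln \lambda - \lambda \sum_{i=1}^{n} X_i \\
\frac{d}{d\lambda} \ln L(\lambda) &= \frac{n}{\lambda} - \sum_{i=1}^{n} X_i = 0
\quad\Longrightarrow\quad
\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} X_i} = \frac{1}{\bar{X}}
\end{align*}
```

The second derivative, $-n/\lambda^2$, is negative, so this stationary point is a maximum. Equivalently, the MLE of the mean $1/\lambda$ is the sample mean $\bar{X}$.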
15Determine how representative the fitted distributions are
- Goodness of fit (chi-square method)
16Goodness of Fit (Chi Square method)
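- Divide the range of the fitted distribution into $k$ ($k < 30$) intervals $[a_0, a_1), [a_1, a_2), \ldots, [a_{k-1}, a_k)$. Let $N_j$ be the number of data points that belong to $[a_{j-1}, a_j)$.
- Compute the expected proportion of the data that fall in the $j$th interval using the fitted distribution; call it $p_j$.
- Compute the chi-square statistic $\chi^2 = \sum_{j=1}^{k} \frac{(N_j - n p_j)^2}{n p_j}$, where $n$ is the number of data points.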
17Chi-square cont.
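- Note that $n p_j$ represents the expected number of data points that would fall in the $j$th interval if the fitted distribution is correct.
- If the chi-square statistic satisfies $\chi^2 < \chi^2_{k-r-1,\,1-\alpha}$, where $r$ is the number of parameters in the distribution (for the exponential distribution $r = 1$, namely $\lambda$), then do not reject the distribution at confidence level $(1-\alpha)100\%$, that is, at significance level $\alpha$.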
18Example
- Consider the following data
- 0.01, 0.07, 0.03, 0.23, 0.04,
- 0.10, 0.31, 0.10, 0.31, 1.17,
- 1.50, 0.93, 1.54, 0.19, 0.17,
- 0.36, 0.27, 0.46, 0.51, 0.11,
- 0.56, 0.72, 0.39, 0.04, 0.78
- Suppose we hypothesize an exponential distribution. Use the chi-square test, dividing the range into 5 subintervals.
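19
- The estimate of $\lambda$ is taken as 2.5 (the MLE $1/\bar{X}$ for these data is about 2.29).
- Since $k = 5$, we have $p_j = 1/5 = 0.2$ for equiprobable intervals.
- For the exponential distribution, $F(x) = 1 - e^{-\lambda x}$, so the interval endpoints satisfy $F(a_j) = j/5$, that is, $a_j = -\ln(1 - j/5)/\lambda$.
- Therefore $a_0 = 0$, $a_1 \approx 0.089$, $a_2 \approx 0.204$, $a_3 \approx 0.367$, $a_4 \approx 0.644$, $a_5 = \infty$, and the observed counts are $N_j = 5, 5, 5, 4, 6$.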
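20
- Therefore $\chi^2 = 0.4$.
- From the chi-square table, the critical value for $k - r - 1 = 3$ degrees of freedom at $p = 0.05$ is 7.82.
- Since $0.4 < 7.82$, we do not reject the hypothesized exponential distribution at the 5% significance level.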
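The example can be checked numerically. The sketch below assumes equiprobable intervals (each with $p_j = 0.2$) and the standard Pearson statistic $\sum_j (N_j - n p_j)^2 / (n p_j)$. The function name `chi_square_exponential` is hypothetical, and the rate 2.5 is the value used in the slides.

```python
import math

data = [
    0.01, 0.07, 0.03, 0.23, 0.04,
    0.10, 0.31, 0.10, 0.31, 1.17,
    1.50, 0.93, 1.54, 0.19, 0.17,
    0.36, 0.27, 0.46, 0.51, 0.11,
    0.56, 0.72, 0.39, 0.04, 0.78,
]


def chi_square_exponential(data, rate, k):
    """Chi-square statistic for an exponential fit with k equiprobable intervals."""
    n = len(data)
    p = 1.0 / k
    # Interior endpoints a_1..a_{k-1} solve F(a_j) = j/k
    # for the exponential CDF F(x) = 1 - exp(-rate * x).
    edges = [-math.log(1.0 - j * p) / rate for j in range(1, k)]
    counts = [0] * k
    for x in data:
        # The number of endpoints at or below x is the interval index of x.
        j = sum(1 for a in edges if x >= a)
        counts[j] += 1
    expected = n * p  # expected count n * p_j, the same for every interval
    statistic = sum((c - expected) ** 2 / expected for c in counts)
    return edges, counts, statistic


edges, counts, statistic = chi_square_exponential(data, rate=2.5, k=5)
print([round(a, 3) for a in edges])  # [0.089, 0.204, 0.367, 0.644]
print(counts)                        # [5, 5, 5, 4, 6]
print(round(statistic, 3))           # 0.4
```

The statistic 0.4 is far below the tabulated value 7.82 (3 degrees of freedom, $p = 0.05$), so the exponential hypothesis is not rejected. Re-running with `rate=1 / (sum(data) / len(data))` shows how sensitive the interval counts are to the rounded estimate.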
21The Chi-square table
Degrees of Freedom   Probability, p
                     0.001    0.01     0.05     0.95     0.99
1                    10.83    6.64     3.84     0.004    0.000
2                    13.82    9.21     5.99     0.103    0.020
3                    16.27    11.35    7.82     0.352    0.115
4                    18.47    13.28    9.49     0.711    0.297
5                    20.52    15.09    11.07    1.145    0.554
6                    22.46    16.81    12.59    1.635    0.872
7                    24.32    18.48    14.07    2.167    1.239
8                    26.13    20.09    15.51    2.733    1.646