Title: Known Probability Distributions
1Known Probability Distributions
- Engineers frequently work with data that can be
modeled as one of several known probability
distributions.
- Being able to model the data allows us to
- model real systems
- design
- predict results
- Key discrete probability distributions include
- binomial / multinomial
- negative binomial
- hypergeometric
- Poisson
2Discrete Uniform Distribution
- Simplest of all discrete distributions
- All possible values of the random variable have
the same probability, i.e.,
- $f(x; k) = 1/k$, $x = x_1, x_2, x_3, \ldots, x_k$
- Expectations of the discrete uniform distribution
- $\mu = \frac{1}{k}\sum_{i=1}^{k} x_i$ and $\sigma^2 = \frac{1}{k}\sum_{i=1}^{k} (x_i - \mu)^2$
3Binomial & Multinomial Distributions
- Bernoulli Trials
- Inspect tires coming off the production line.
Classify each as defective or not defective.
Define success as defective. If historical data
shows that 95% of all tires are defect-free, then
P(success) = 0.05.
- Signals picked up at a communications site are
either incoming speech signals or noise. Define
success as the presence of speech. P(success) =
P(speech)
- Administer a test drug to a group of patients
with a specific condition. P(success) =
___________
- Bernoulli Process
- n repeated trials
- the outcome may be classified as success or
failure
- the probability of success (p) is constant from
trial to trial
- repeated trials are independent.
4Binomial Distribution
- Example
- Historical data indicates that 10% of all bits
transmitted through a digital transmission
channel are received in error. Let X = the number
of bits in error in the next 4 bits transmitted.
Assume that the transmission trials are
independent. What is the probability that
- Exactly 2 of the bits are in error?
- At most 2 of the 4 bits are in error?
- More than 2 of the 4 bits are in error?
- The number of successes, X, in n Bernoulli trials
is called a binomial random variable.
5Binomial Distribution
- The probability distribution is called the
binomial distribution.
- $b(x; n, p) = \binom{n}{x} p^x q^{n-x}$, $x = 0, 1, 2, \ldots, n$
- where p = _________________
- q = _________________
- For our example,
- b(x; n, p) = _________________
6For Our Example
- What is the probability that exactly 2 of the
bits are in error?
- At most 2 of the 4 bits are in error?
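The two questions above can be checked numerically. The following is a minimal sketch, assuming the standard binomial mass function and the example's values n = 4 and p = 0.10; the helper name `binom_pmf` is illustrative and not part of the course material.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """b(x; n, p): probability of exactly x successes in n Bernoulli trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 4, 0.10  # 4 bits, 10% chance each bit is received in error

p_exactly_2 = binom_pmf(2, n, p)
p_at_most_2 = sum(binom_pmf(x, n, p) for x in range(3))  # x = 0, 1, 2

print(f"P(X = 2)  = {p_exactly_2:.4f}")
print(f"P(X <= 2) = {p_at_most_2:.4f}")
```

Summing the mass function over x = 0, 1, 2 is the same cumulative sum that a binomial table tabulates.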
7Your turn
- What is the probability that more than 2 of the 4
bits are in error?
8Expectations of the Binomial Distribution
- The mean and variance of the binomial
distribution are given by
- µ = np
- σ² = npq
- Suppose, in our example, we check the next 20
bits. What is the expected number of bits in
error? What is the standard deviation?
- µ = ___________
- σ² = __________ , σ = __________
9Another example
- A worn machine tool produces 1% defective parts.
If we assume that parts produced are independent,
what is the mean number of defective parts that
would be expected if we inspect 25 parts?
- What is the expected variance of the 25 parts?
10Helpful Hints
- Sometimes it helps to draw a picture.
- Suppose we inspect the next 5 parts
- P(at least 3) = ?
- P(2 ≤ X ≤ 4) = ?
- P(less than 4) = ?
- Appendix Table A.1 (pp. 742-747) lists Binomial
Probability Sums, $\sum_{x=0}^{r} b(x; n, p)$
11Your turn
- Use Table A.1 to determine
- 1. b(x; 15, 0.4) , P(X ≤ 8) = ______________
- 2. b(x; 15, 0.4) , P(X < 8) = ______________
- 3. b(x; 12, 0.2) , P(2 ≤ X ≤ 5) = ___________
- 4. b(x; 4, 0.1) , P(X > 2) = ______________
12Multinomial Experiments
- What if there are more than 2 possible outcomes?
(e.g., acceptable, scrap, rework)
- That is, suppose we have
- n independent trials
- k outcomes that are
- mutually exclusive (e.g., ?, ?, ?, ?)
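- exhaustive (i.e., $\sum_{i=1}^{k} p_i = 1$)
- Then
- $f(x_1, x_2, \ldots, x_k; p_1, p_2, \ldots, p_k, n) = \binom{n}{x_1, x_2, \ldots, x_k} p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$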
13Example
- Look at problem 5.22, pg. 152
- f( __, __, __ ; ___, ___, ___, __) = _________________
- __________________________________
x1 = _______ p1 = _______
x2 = _______ p2 = _______ n = _____
x3 = _______ p3 = _______
14Hypergeometric Distribution
- Example
- Automobiles arrive in a dealership in lots of
10. Five out of each 10 are inspected. For one
lot, it is known that 2 out of 10 do not meet
prescribed safety standards.
- What is the probability that at least 1 out of the 5
tested from that lot will be found not meeting
safety standards?
- from Complete Business Statistics, 4th ed
(McGraw-Hill)
15- This example follows a hypergeometric
distribution
- A random sample of size n is selected without
replacement from N items.
- k of the N items may be classified as successes
and N - k are failures.
- The probability associated with getting x
successes in the sample (given k successes in the
lot) is
- $h(x; N, n, k) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$
- Where,
- k = number of successes = 2
- n = number in sample = 5
- N = the lot size = 10
- x = number found = 1 or 2
16Hypergeometric Distribution
- In our example,
- _____________________________
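The dealership example can be worked through the complement, P(X ≥ 1) = 1 - P(X = 0). The following is a minimal sketch, assuming the standard hypergeometric mass function with N = 10, n = 5, k = 2 from the example; the function name `hypergeom_pmf` is illustrative.

```python
from math import comb


def hypergeom_pmf(x: int, N: int, n: int, k: int) -> float:
    """h(x; N, n, k): x successes in a sample of n drawn without
    replacement from N items, k of which are successes."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


N, n, k = 10, 5, 2  # lot size, sample size, cars that fail in the lot

p_at_least_1 = 1 - hypergeom_pmf(0, N, n, k)
print(f"P(X >= 1) = {p_at_least_1:.4f}")
```

Here P(X = 0) = C(8, 5) / C(10, 5) = 56/252, so the result is 196/252 = 7/9.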
17Expectations of the Hypergeometric Distribution
- The mean and variance of the hypergeometric
distribution are given by
- $\mu = \frac{nk}{N}$ and $\sigma^2 = \frac{N-n}{N-1} \cdot n \cdot \frac{k}{N}\left(1 - \frac{k}{N}\right)$
- What is the expected number of cars that fail
inspection in our example? What is the standard
deviation?
- µ = ___________
- σ² = __________ , σ = __________
18Your turn
- A worn machine tool produced defective parts for
a period of time before the problem was
discovered. Normal sampling of each lot of 20
parts involves testing 6 parts and rejecting the
lot if 2 or more are defective. If a lot from the
worn tool contains 3 defective parts
- What is the expected number of defective parts in
a sample of six from the lot?
- What is the expected variance?
- What is the probability that the lot will be
rejected?
19Binomial Approximation
- Note, if N >> n, then we can approximate this
with the binomial distribution. For example
- Automobiles arrive in a dealership in lots of
100. 5 out of each 100 are inspected. 2/10
(p = 0.2) are indeed below safety standards.
- What is the probability that at least 1 out of 5
will be found not meeting safety standards?
- Recall P(X ≥ 1) = 1 - P(X < 1) = 1 - P(X = 0)
Hypergeometric distribution Binomial distribution
(Compare to example 5.15, pg. 155)
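The two columns can be compared side by side in code. The following is a minimal sketch, assuming that "2/10 below standard" in a lot of 100 means k = 20 nonconforming cars (an interpretation, not stated explicitly on the slide).

```python
from math import comb

N, n, k = 100, 5, 20  # lot size, sample size, assumed nonconforming cars
p = k / N             # binomial success probability, 0.2

# Exact: sampling without replacement (hypergeometric)
p_hyper = 1 - comb(N - k, n) / comb(N, n)

# Approximation: treat the 5 draws as independent (binomial)
p_binom = 1 - (1 - p) ** n

print(f"hypergeometric: {p_hyper:.4f}")
print(f"binomial:       {p_binom:.4f}")
```

The two results differ by less than 0.01 here, which is the sense in which the binomial approximates the hypergeometric when N is much larger than n.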
20Negative Binomial Distribution
- Example
- Historical data indicates that 30% of all bits
transmitted through a digital transmission
channel are received in error. An engineer is
running an experiment to try to classify these
errors, and will start by gathering data on the
first 10 errors encountered.
- What is the probability that the 10th error will
occur on the 25th trial?
21- This example follows a negative binomial
distribution
- Repeated independent trials.
- Probability of success = p and probability of
failure = q = 1 - p.
- Random variable, X, is the number of the trial on
which the kth success occurs.
- The probability associated with the kth success
occurring on trial x is given by,
- $b^*(x; k, p) = \binom{x-1}{k-1} p^k q^{x-k}$
- Where,
- k = success number = 10
- x = trial number on which k occurs = 25
- p = probability of success (error) = 0.3
- q = 1 - p = 0.7
22Negative Binomial Distribution
- In our example,
- _____________________________
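The example can be evaluated directly. The following is a minimal sketch, assuming the standard negative binomial mass function (kth success on trial x) with k = 10, x = 25, p = 0.3 from the example; the function name `neg_binom_pmf` is illustrative.

```python
from math import comb


def neg_binom_pmf(x: int, k: int, p: float) -> float:
    """Probability that the k-th success occurs on trial x.

    The first x - 1 trials must contain exactly k - 1 successes,
    and trial x itself must be a success.
    """
    return comb(x - 1, k - 1) * p**k * (1 - p) ** (x - k)


p_10th_error_on_25th = neg_binom_pmf(25, 10, 0.3)
print(f"P(10th error on trial 25) = {p_10th_error_on_25th:.4f}")
```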
23Geometric Distribution
- Example
- In our example, what is the probability that the
1st bit received in error will occur on the 5th
trial?
- This is an example of the geometric distribution,
which is a special case of the negative binomial
in which k = 1.
- The probability associated with the 1st success
occurring on trial x is given by
- __________________________________
24Your turn
- A worn machine tool produces 1% defective parts.
If we assume that parts produced are independent
- What is the probability that the 2nd defective
part will be the 6th one produced?
- What is the probability that the 1st defective
part will be seen before 3 are produced?
- How many parts can we expect to produce before we
see the 1st defective part? (Hint: see Theorem
5.4, pg. 161)
25Poisson Process
- The number of occurrences in a given interval or
region with the following properties
- memoryless
- P(occurrence) during a very short interval or
small region is proportional to the size of the
interval and doesn't depend on the number occurring
outside the region or interval.
- P(X > 1) in a very short interval is negligible
26Poisson Process
- Examples
- Number of bits transmitted per minute.
- Number of calls to customer service in an hour.
- Number of bacteria in a given sample.
- Number of hurricanes per year in a given region.
27Poisson Process
- Example
- An average of 2.7 service calls per minute are
received at a particular maintenance center. The
calls correspond to a Poisson process. To
determine personnel and equipment needs to
maintain a desired level of service, the plant
manager needs to be able to determine the
probabilities associated with numbers of service
calls.
- What is the probability that fewer than 2 calls
will be received in any given minute?
28Poisson Distribution
- The probability associated with the number of
occurrences in a given period of time is given
by,
- $p(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}$, $x = 0, 1, 2, \ldots$
- Where,
- λ = average number of outcomes per unit time or
region = 2.7
- t = time interval or region = 1 minute
29Our Example
- The probability that fewer than 2 calls will be
received in any given minute is
- P(X < 2) = P(X = 0) + P(X = 1)
- __________________________
- The mean and variance are both λt, so
- µ = _____________________
- Note: Table A.2, pp. 748-750, gives $\sum_{x=0}^{r} p(x; \mu)$
30Poisson Distribution
- If more than 6 calls are received in a 3-minute
period, an extra service technician will be
needed to maintain the desired level of service.
What is the probability of that happening?
- µ = λt = _____________________
- P(X > 6) = 1 - P(X ≤ 6)
- _____________________
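Both service-call questions can be checked without the table. The following is a minimal sketch, assuming the standard Poisson mass function with rate 2.7 calls per minute from the example; the helper names `poisson_pmf` and `poisson_cdf` are illustrative.

```python
from math import exp, factorial


def poisson_pmf(x: int, mu: float) -> float:
    """p(x; mu): probability of exactly x occurrences when the mean is mu."""
    return exp(-mu) * mu**x / factorial(x)


def poisson_cdf(r: int, mu: float) -> float:
    """Cumulative sum of p(x; mu) for x = 0..r (what a Poisson table lists)."""
    return sum(poisson_pmf(x, mu) for x in range(r + 1))


rate = 2.7  # calls per minute

p_fewer_than_2 = poisson_cdf(1, rate * 1)      # 1-minute window, X <= 1
p_more_than_6 = 1 - poisson_cdf(6, rate * 3)   # 3-minute window, X >= 7

print(f"P(X < 2 in 1 min) = {p_fewer_than_2:.4f}")
print(f"P(X > 6 in 3 min) = {p_more_than_6:.4f}")
```

Note that "more than 6" is the complement of "6 or fewer", so the cumulative sum runs through x = 6 inclusive.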
31Poisson Distribution
32Poisson Distribution
- The effect of λ on the Poisson distribution
33Continuous Probability Distributions
- Many continuous probability distributions,
including
- Uniform
- Normal
- Gamma
- Exponential
- Chi-Squared
- Lognormal
- Weibull
34Uniform Distribution
- Simplest; characterized by the interval
endpoints, A and B.
- $f(x; A, B) = \frac{1}{B - A}$, A ≤ x ≤ B
- = 0 elsewhere
- Mean and variance
- $\mu = \frac{A + B}{2}$ and $\sigma^2 = \frac{(B - A)^2}{12}$
35Example
- A circuit board failure causes a shutdown of a
computing system until a new board is delivered.
The delivery time X is uniformly distributed
between 1 and 5 days.
- What is the probability that it will take 2 or
more days for the circuit board to be delivered?
36Normal Distribution
- The bell-shaped curve
- Also called the Gaussian distribution
- The most widely used distribution in statistical
analysis
- forms the basis for most of the parametric tests
we'll perform later in this course.
- describes or approximates most phenomena in
nature, industry, or research
- Random variables (X) following this distribution
are called normal random variables.
- the parameters of the normal distribution are µ
and σ (sometimes µ and σ²).
37Normal Distribution
- The density function of the normal random
variable X, with mean µ and variance σ², is
- $n(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$, for all x.
38Standard Normal RV
- Note: the probability of X taking on any value
between x1 and x2 is given by
- $P(x_1 < X < x_2) = \int_{x_1}^{x_2} n(x; \mu, \sigma)\,dx$
- To ease calculations, we define a normal random
variable
- $Z = \frac{X - \mu}{\sigma}$
- where Z is normally distributed with µ = 0 and
σ² = 1
39Standard Normal Distribution
- Table A.3 Areas Under the Normal Curve
40Examples
- P(Z 1)
- P(Z -1)
- P(-0.45 ≤ Z ≤ 0.36)
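Table lookups of this kind can be cross-checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; it assumes "less than or equal" areas for the one-sided cases purely for illustration, and the complement 1 - Φ(z) gives the opposite tail.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_below_1 = Z.cdf(1.0)                    # area to the left of z = 1
p_below_minus_1 = Z.cdf(-1.0)             # area to the left of z = -1
p_between = Z.cdf(0.36) - Z.cdf(-0.45)    # area between -0.45 and 0.36

print(f"P(Z <= 1)             = {p_below_1:.4f}")
print(f"P(Z <= -1)            = {p_below_minus_1:.4f}")
print(f"P(-0.45 <= Z <= 0.36) = {p_between:.4f}")
```

An interval probability is always the difference of two left-tail areas, which is why drawing the picture first helps decide which table entries to subtract.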
41Your turn
- Use Table A.3 to determine (draw the picture!)
- 1. P(Z 0.8)
- 2. P(Z 1.96)
- 3. P(-0.25 ≤ Z ≤ 0.15)
- 4. P(Z ≤ -2.0 or Z ≥ 2.0)
42The Normal Distribution In Reverse
- Example
- Given a normal distribution with µ = 40 and σ =
6, find the value of X for which 45% of the area
under the normal curve is to the left of X.
- If P(Z < k) = 0.45,
- k = ___________
- Z = _______
- X = _________
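Working the normal distribution in reverse means inverting the cumulative distribution function. The following is a minimal sketch using the standard library's `NormalDist.inv_cdf`, with µ = 40 and σ = 6 from the example.

```python
from statistics import NormalDist

mu, sigma = 40, 6

# Step 1: find k such that P(Z < k) = 0.45 on the standard normal
k = NormalDist().inv_cdf(0.45)

# Step 2: undo the standardization Z = (X - mu) / sigma
x = mu + sigma * k

print(f"k = {k:.4f}")
print(f"X = {x:.3f}")
```

Because 45% is less than half of the area, k is negative and X falls slightly below the mean.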
43Normal Approximation to the Binomial
- If n is large and p is not close to 0 or 1,
- or
- if n is smaller but p is close to 0.5, then
- the binomial distribution can be approximated by
the normal distribution using the transformation
- $Z = \frac{X - np}{\sqrt{npq}}$
- NOTE: add or subtract 0.5 from X to be sure the
value of interest is included (draw a picture to
know which)
- Look at example 6.15, pg. 191
44Look at example 6.15, pg. 191
- p = 0.4 n = 100
- µ = ____________ σ = ______________
- if x = 30, then z = _____________________
- and, P(X < 30) = P(Z < _________) = _________
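The quality of the approximation can be seen by computing both the approximate and the exact value. The following is a minimal sketch, assuming p = 0.4 and n = 100 from the slide and a continuity correction of 0.5, so that "fewer than 30" is evaluated at 29.5.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.4
mu = n * p                       # binomial mean, np
sigma = sqrt(n * p * (1 - p))    # binomial standard deviation, sqrt(npq)

# X < 30 means X <= 29 for a discrete variable, so use 29.5
z = (29.5 - mu) / sigma
p_approx = NormalDist().cdf(z)

# Exact binomial sum for comparison
p_exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(30))

print(f"z = {z:.3f}")
print(f"normal approximation: {p_approx:.4f}")
print(f"exact binomial:       {p_exact:.4f}")
```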
45Your Turn
DRAW THE PICTURE!!
- Refer to the previous example,
- What is the probability that more than 50
survive?
- What is the probability that exactly 45 survive?
46Gamma & Exponential Distributions
- Recall the Poisson Process
- Number of occurrences in a given interval or
region
- Memoryless process
- Sometimes we're interested in the time or area
until a certain number of events occur.
- For example
- An average of 2.7 service calls per minute are
received at a particular maintenance center. The
calls correspond to a Poisson process.
- What is the probability that up to a minute will
elapse before 2 calls arrive?
- How long before the next call?
47Gamma Distribution
- The density function of the random variable X
with gamma distribution having parameters α
(number of occurrences) and β (time or region) is
- $f(x; \alpha, \beta) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)} x^{\alpha-1} e^{-x/\beta}$, x > 0.
- µ = αβ
- σ² = αβ²
48Exponential Distribution
- Special case of the gamma distribution with α =
1.
- $f(x; \beta) = \frac{1}{\beta} e^{-x/\beta}$, x > 0.
- Describes the time until or time between Poisson
events.
- µ = β
- σ² = β²
49Example
- An average of 2.7 service calls per minute are
received at a particular maintenance center. The
calls correspond to a Poisson process.
- What is the probability that up to a minute will
elapse before 2 calls arrive?
- β = ________ α = ________
- P(X ≤ 1) = _________________________________
50Example (cont.)
- What is the expected time before the next call
arrives?
- β = ________ α = ________
- µ = _________________________________
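Both waiting-time questions can be sketched in code. This assumes β = 1/2.7 minutes (the mean time between calls) and uses the closed-form gamma CDF that holds when α is a positive integer (the Erlang case); the function name `erlang_cdf` is illustrative.

```python
from math import exp, factorial


def erlang_cdf(x: float, alpha: int, beta: float) -> float:
    """Gamma CDF for integer alpha: probability that the alpha-th
    Poisson event has occurred by time x, with mean gap beta."""
    return 1 - sum(
        exp(-x / beta) * (x / beta) ** j / factorial(j) for j in range(alpha)
    )


beta = 1 / 2.7  # mean minutes between calls (rate is 2.7 calls per minute)

# Time until the 2nd call: gamma with alpha = 2
p_two_calls_within_1_min = erlang_cdf(1.0, 2, beta)

# Time until the next call: exponential (alpha = 1), mean = alpha * beta
mean_wait = 1 * beta

print(f"P(X <= 1) = {p_two_calls_within_1_min:.4f}")
print(f"expected wait = {mean_wait:.4f} min")
```

The first result is the complement of the Poisson probability of fewer than 2 calls in one minute, which ties the gamma distribution back to the Poisson process.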
51Your turn
- Look at problem 6.40, page 205.
52Chi-Squared Distribution
- Special case of the gamma distribution with α =
ν/2 and β = 2.
- $f(x; \nu) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)} x^{\nu/2-1} e^{-x/2}$, x > 0.
- where ν is a positive integer.
- single parameter, ν, is called the degrees of
freedom.
- µ = ν
- σ² = 2ν
EGR 252 Ch. 6
52
53Lognormal Distribution
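- When the random variable Y = ln(X) is normally
distributed with mean µ and standard deviation σ,
then X has a lognormal distribution with the
density function,
- $f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma x} e^{-\frac{[\ln(x) - \mu]^2}{2\sigma^2}}$ for x ≥ 0, and 0 for x < 0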
54Example
- Look at problem 6.72, pg. 207
- Since ln(X) has a normal distribution with µ = 5
and σ = 2, the probability that X > 50,000 is,
- P(X > 50,000) = __________________________
55Weibull Distribution
- Used for many of the same applications as the
gamma and exponential distributions, but
- does not require memoryless property of the
exponential
- $f(x; \alpha, \beta) = \alpha\beta x^{\beta-1} e^{-\alpha x^{\beta}}$, x > 0
56Example
- Designers of wind turbines for power generation
are interested in accurately describing
variations in wind speed, which in a certain
location can be described using the Weibull
distribution with α = 0.02 and β = 2. A
designer is interested in determining the
probability that the wind speed in that location
is between 3 and 7 mph.
- P(3 < X < 7) = ___________________________
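The wind-speed probability follows from the Weibull cumulative distribution function. The following is a minimal sketch, assuming the parameterization in which the density is αβx^(β-1)·exp(-αx^β), so that F(x) = 1 - exp(-αx^β); other references define the parameters differently, so this choice is an assumption.

```python
from math import exp


def weibull_cdf(x: float, alpha: float, beta: float) -> float:
    """F(x) = 1 - exp(-alpha * x**beta) for x > 0 (assumed parameterization)."""
    return 1 - exp(-alpha * x**beta)


alpha, beta = 0.02, 2

p_between_3_and_7 = weibull_cdf(7, alpha, beta) - weibull_cdf(3, alpha, beta)
print(f"P(3 < X < 7) = {p_between_3_and_7:.4f}")
```

With these values the result reduces to exp(-0.18) - exp(-0.98).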
57Populations and Samples
- Population: a group of individual persons,
objects, or items from which samples are taken
for statistical measurement
- Sample: a finite part of a statistical
population whose properties are studied to gain
information about the whole
(Merriam-Webster Online Dictionary,
http://www.m-w.com/, October 5, 2004)
58Examples
- Population
- Students pursuing undergraduate engineering
degrees
- Cars capable of speeds in excess of 160 mph.
- Potato chips produced at the Frito-Lay plant in
Kathleen
- Freshwater lakes and rivers
59Basic Statistics (review)
- 1. Sample Mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
- Example
- At the end of a team project, team members were
asked to give themselves and each other a grade
on their contribution to the group. The results
for two team members were as follows:
- ___________________
- ___________________
Q S
92 85
95 88
85 75
78 92
60Basic Statistics (review)
- 2. Sample Variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$
- For our example
- $S_Q^2$ = ___________________
- $S_S^2$ = ___________________
Q S
92 85
95 88
85 75
78 92
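The sample statistics for the two team members can be computed with Python's standard `statistics` module. This is a minimal sketch; `variance` and `stdev` use the n - 1 divisor, which is the usual sample-variance definition.

```python
from statistics import mean, stdev, variance

# Grades received by team members Q and S (from the table)
q = [92, 95, 85, 78]
s = [85, 88, 75, 92]

for name, grades in (("Q", q), ("S", s)):
    print(
        f"{name}: mean = {mean(grades):.2f}, "
        f"variance = {variance(grades):.3f}, "
        f"std dev = {stdev(grades):.3f}"
    )
```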
61Your Turn
- Work in groups of 4 or 5. Find the mean,
variance, and standard deviation for your group
of the (approximate) number of hours spent
working on homework each week.
62Sampling Distributions
- If we conduct the same experiment several times
with the same sample size, the probability
distribution of the resulting statistic is called
a sampling distribution
- Sampling distribution of the mean: if n
observations are taken from a normal population
with mean µ and variance σ², then
- $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n}$
63Central Limit Theorem
- Given
- $\bar{X}$ is the mean of a random sample of size n taken
from a population with mean µ and finite variance
σ²,
- Then,
- the limiting form of the distribution of
- $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
- is _________________________
64Central Limit Theorem
- If the population is known to be normal, the
sampling distribution of $\bar{X}$ will follow a normal
distribution.
- Even when the distribution of the population is
not normal, the sampling distribution of $\bar{X}$ is
approximately normal when n is large.
- NOTE: when n is not large, we cannot assume the
distribution of $\bar{X}$ is normal.
65Example
- The time to respond to a request for information
from a customer help line is uniformly
distributed between 0 and 2 minutes. In one month
48 requests are randomly sampled and the response
time is recorded.
- What is the probability that the average
response time is between 0.9 and 1.1 minutes?
- µ = ______________ σ² = ________________
- $\mu_{\bar{X}}$ = __________ $\sigma_{\bar{X}}^2$ = ________________
- Z1 = _____________ Z2 = _______________
- P(0.9 < $\bar{X}$ < 1.1) = _____________________________
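The steps of this example can be laid out in code. The following is a minimal sketch, assuming the uniform mean (A + B)/2 and variance (B - A)²/12, and relying on the Central Limit Theorem to treat the mean of 48 observations as approximately normal.

```python
from math import sqrt
from statistics import NormalDist

a, b, n = 0.0, 2.0, 48   # uniform endpoints (minutes) and sample size

mu = (a + b) / 2              # population mean
var = (b - a) ** 2 / 12       # population variance
se = sqrt(var / n)            # standard deviation of the sample mean

z1 = (0.9 - mu) / se
z2 = (1.1 - mu) / se

std_normal = NormalDist()
p_mean_in_range = std_normal.cdf(z2) - std_normal.cdf(z1)

print(f"z1 = {z1:.2f}, z2 = {z2:.2f}")
print(f"P(0.9 < mean < 1.1) = {p_mean_in_range:.4f}")
```

The population itself is far from normal, but with n = 48 the sampling distribution of the mean is close enough to normal for the Z table to apply.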
66Sampling Distribution of the Difference Between
two Averages
- Given
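- Two independent samples of size n1 and n2 are taken from two
populations with means µ1 and µ2 and variances
$\sigma_1^2$ and $\sigma_2^2$
- Then,
- $\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2$ and $\sigma_{\bar{X}_1 - \bar{X}_2}^2 = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$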
67Sampling Distribution of S²
- Given
- S² is the variance of a random sample of size
n taken from a normal population with mean µ and finite
variance σ²,
- Then,
- $\chi^2 = \frac{(n-1)S^2}{\sigma^2}$
- has a χ² distribution with ν = n - 1
68χ² Distribution
- $\chi^2_\alpha$ represents the χ² value above which we find
an area of α, that is, for which $P(\chi^2 > \chi^2_\alpha) = \alpha$.
69Example
- Look at example 8.10, pg. 256
- µ = 3 σ = 1 n = 5
- s² = ________________
- χ² = __________________
- If the χ² value fits within an interval that
covers 95% of the χ² values with 4 degrees of
freedom, then the estimate for σ is reasonable.
- (See Table A.5, pp. 755-756)
70Your turn
- If a sample of size 7 is taken from a normal
population (i.e., n = 7), what value of χ²
corresponds to $P(\chi^2 < \chi^2_\alpha) = 0.95$? (Hint: first
determine α.)
71t-Distribution
- Recall, by CLT
- $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is n(z; 0, 1)
- Assumption: _____________________
- (Generally, if an engineer is concerned with a
familiar process or system, this is reasonable,
but ...)
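72What if we don't know σ?
- New statistic
- $T = \frac{\bar{X} - \mu}{S/\sqrt{n}}$
- Where,
- $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $S = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2}$
- follows a t-distribution with ν = n - 1 degrees
of freedom.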
73Characteristics of the t-Distribution
- Look at fig. 8.13, pg. 259
- Note
- Shape _________________________
- Effect of ν: __________________________
- See table A.4, pp. 753-754
74Using the t-Distribution
- Testing assumptions about the value of µ
- Example problem 8.52, pg. 265
- What value of t corresponds to $P(t < t_\alpha) = 0.95$?
75Comparing Variances of 2 Samples
- Given two samples of size n1 and n2, with sample
means $\bar{X}_1$ and $\bar{X}_2$, and variances $s_1^2$ and $s_2^2$
- Are the differences we see in the means due to
the means or due to the variances (that is, are
the differences due to real differences between
the samples or variability within each sample)?
- See figure 8.16, pg. 262
76F-Distribution
- Given
- $S_1^2$ and $S_2^2$, the variances of independent random
samples of size n1 and n2 taken from normal
populations with variances $\sigma_1^2$ and $\sigma_2^2$,
respectively,
- Then,
- $F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$
- has an F-distribution with ν1 = n1 - 1 and ν2 =
n2 - 1 degrees of freedom.
- (See table A.6, pp. 757-760)
77Example
- Problem 8.55, pg. 266
- $S_1^2$ = ___________________
- $S_2^2$ = ___________________
- F = _____________ $f_{0.05}(4, 5)$ = _________
- NOTE