Title: Discrete Distributions
1Discrete Distributions
2Discrete Distributions
- What is the binomial distribution?
- The binomial distribution is a discrete
probability distribution. It governs the random
variable X, which is the number of successes
that occur in n trials.
- The binomial probability distribution gives us
the probability that a success will occur x times
in the n trials, for x = 0, 1, 2, ..., n.
- Each trial has only two possible outcomes. It is
conventional to apply the generic labels
"success" and "failure" to the two possible
outcomes.
3Discrete Distributions
- A success can be defined as anything! "The axle
failed" could be the definition of success in an
experiment testing the strength of truck axles.
- Examples
- 1. A coin flip can be either heads or tails
- 2. A product is either good or defective
- Binomial experiments of interest usually involve
several repetitions or trials of the same basic
experiments. These trials must satisfy the
conditions outlined below
4Discrete Distributions
- When can we use it?
- Condition for use
- Each repetition of the experiment (trial) can
result in only one of two possible outcomes, a
success or a failure. See example BD1.
- The probability of a success, p, and of a failure,
(1-p), is constant from trial to trial.
- All trials are statistically independent, i.e. no
trial outcome has any effect on any other trial
outcome.
- The number of trials, n, is a specified constant
(stated before the experiment begins).
5Discrete Distributions
- Example 1 binomial distribution
- A coin flip results in a heads or tails
- A product is defective or not
- A customer is male or female
- Example 4 binomial distribution
- Say we perform an experiment: flip a coin 10
times and observe the result. A successful flip
is designated as heads.
- Assuming the coin is fair, the probability of
success is .5 for each of the 10 trials, and the
outcome of one flip does not affect any other, so
the trials are independent.
- We want to know the number of successes (heads)
in 10 trials.
- The random variable that records the number of
successes is called the binomial random variable.
- Random variable X = the number of successes that
occur in the n = 10 trials.
6Binomial Random Variable
- Do the 4 conditions of use hold?
- We are not concerned with the order of outcomes in
the binomial. We could have several successes or
failures in a row. Since each trial is
independent, sequence is not important.
- The binomial random variable counts the number of
successes in n trials of the binomial experiment.
- By definition, this is a discrete random variable.
7Calculating the Binomial Probability
- Rather than working out the binomial
probabilities from scratch each time, we can use
a general formula.
- Say random variable X is the number of
successes that occur in n trials.
- Say p = probability of success in any trial
- Say q = probability of failure in any trial, where
q = (1 - p)
- In general, the binomial probability is
calculated by
$$P(X = x) = \frac{n!}{x!\,(n-x)!}\, p^x q^{n-x}$$
where x = 0, 1, 2, ..., n
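The general formula can be sketched in a few lines of Python. This is only an illustration: the function name `binomial_pmf` is hypothetical, and the 10-flip fair coin from the earlier example is used as input.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial random variable with parameters n and p."""
    q = 1.0 - p  # probability of failure on any single trial
    return comb(n, x) * p**x * q**(n - x)


# Fair coin flipped n = 10 times: probability of each possible number of heads.
probs = [binomial_pmf(x, 10, 0.5) for x in range(11)]
for x, prob in enumerate(probs):
    print(f"P(X = {x:2d}) = {prob:.4f}")
```

The probabilities over x = 0, 1, ..., n should sum to 1, which is a quick sanity check on any implementation.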
8Calculating the Binomial Probability
9Discrete Distributions
Each pair of values (n, p) determines a distinct
binomial distribution. The distribution has two
parameters, n and p, where a parameter is any
symbol defined in the function's basic
mathematical form whose value the user of that
function may specify.
10- Developing the Binomial Probability Distribution
[Figure: probability tree for three trials. At each trial the tree branches into S (success, probability p) or F (failure, probability 1-p), giving eight possible sequences.]
Since the outcome of each trial is independent
of the previous outcomes, we can replace the
conditional probabilities with the marginal
probabilities.
$P(SSS) = p^3$
$P(SSF) = p^2(1-p)$
$P(SFS) = p(1-p)p$
$P(SFF) = p(1-p)^2$
$P(FSS) = (1-p)p^2$
$P(FSF) = (1-p)p(1-p)$
$P(FFS) = (1-p)^2 p$
$P(FFF) = (1-p)^3$
11Let X be the number of successes in three
trials. Then,
X = 3 (SSS): $P(X = 3) = p^3$
X = 2 (SSF, SFS, FSS): $P(X = 2) = 3p^2(1-p)$
X = 1 (SFF, FSF, FFS): $P(X = 1) = 3p(1-p)^2$
X = 0 (FFF): $P(X = 0) = (1-p)^3$
The multiplier (here 3) counts the sequences that
contain x successes. This multiplier is calculated
in the following formula
$$\binom{n}{x} = \frac{n!}{x!\,(n-x)!}$$
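The three-trial development can be checked by brute force. The sketch below enumerates all eight S/F sequences, groups them by number of successes, and compares the totals with the formula. The value p = 0.3 is an arbitrary assumption chosen only for illustration.

```python
from itertools import product
from math import comb

p = 0.3  # arbitrary illustrative success probability (an assumption)

# Add up the probability of every S/F sequence, grouped by its number of successes.
dist = {x: 0.0 for x in range(4)}
for outcome in product("SF", repeat=3):
    successes = outcome.count("S")
    dist[successes] += p**successes * (1 - p) ** (3 - successes)

# Compare the enumerated totals with comb(3, x) * p^x * (1-p)^(3-x).
for x in range(4):
    formula = comb(3, x) * p**x * (1 - p) ** (3 - x)
    print(f"x = {x}: enumerated {dist[x]:.4f}, formula {formula:.4f}")
```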
12Binomial Example
- 5% of a catalytic converter production run is
defective.
- A sample of 3 converters is drawn. Find the
probability distribution of the number of
defectives.
- Solution
- A converter can be either defective or good.
- There is a fixed finite number of trials (n = 3)
- We assume the states of the converters are
independent of one another.
- The probability of a converter being defective
does not change from converter to converter
(p = .05).
The conditions required for the binomial
experiment are met.
13- Let X be the binomial random variable indicating
the number of defectives.
- Define a success as "a converter is found to be
defective."
x    P(X = x)
0    .8574
1    .1354
2    .0071
3    .0001
14Discrete Distributions
Example: The quality control department of a
manufacturer tested the most recent batch of 1000
catalytic converters produced and found 50
of them to be defective. Subsequently, an
employee unwittingly mixed the defective
converters in with the nondefective ones. If a
sample of 3 converters is randomly selected from
the mixed batch, what is the probability
distribution of the number of defective
converters in the sample?
Does this situation satisfy the requirements of a
binomial experiment? There are n = 3 trials with 2
possible outcomes (defective or nondefective).
Does the probability remain the same for each
trial? Why or why not?
The probability p of selecting a defective
converter does not remain constant for each trial
because the probability depends on the results of
the previous trials. Thus the trials are not
independent.
15Discrete Distributions
The probability of selecting a defective
converter on the first trial is 50/1000 = .05.
If a defective converter is selected on the
first trial, then the probability changes to
49/999 = .049. In practical terms, this violation
of the conditions of a binomial experiment is
often considered negligible. The difference would
be more noticeable if we considered 5 defectives
out of a batch of 100.
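A small sketch of this comparison, using exact fractions: it computes how far the probability of a defective moves on the second draw once a defective has been removed, for both batch sizes mentioned above. The helper name `second_draw_probability` is hypothetical.

```python
from fractions import Fraction


def second_draw_probability(defective: int, batch: int) -> Fraction:
    """P(second item is defective | first item drawn was defective), without replacement."""
    return Fraction(defective - 1, batch - 1)


shifts = {}
for defective, batch in [(50, 1000), (5, 100)]:
    first = Fraction(defective, batch)
    second = second_draw_probability(defective, batch)
    shifts[batch] = float(first - second)  # how much p drops after one defective is drawn
    print(f"batch of {batch}: first draw {float(first):.4f}, "
          f"second draw {float(second):.4f}, shift {shifts[batch]:.4f}")
```

Both batches start at p = .05, but the drop on the second draw is roughly ten times larger for the batch of 100.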
16Discrete Distributions
If we assume the conditions for a binomial
experiment hold, then consider p = .05 for each
trial. Let X be the binomial random variable
indicating the number of defective converters in
the sample of 3.
$P(X = 0) = p(0) = \frac{3!}{0!\,3!}(.05)^0(.95)^3 = .8574$
$P(X = 1) = p(1) = \frac{3!}{1!\,2!}(.05)^1(.95)^2 = .1354$
$P(X = 2) = p(2) = \frac{3!}{2!\,1!}(.05)^2(.95)^1 = .0071$
$P(X = 3) = p(3) = \frac{3!}{3!\,0!}(.05)^3(.95)^0 = .0001$
The resulting probability distribution of the
number of defective converters in the sample of 3
is as follows:
17Discrete Distributions
x    p(x)
0    .8574
1    .1354
2    .0071
3    .0001
18Cumulative Binomial Distribution
$$F(x) = \sum_{k=0}^{x} \binom{n}{k} p^k q^{n-k}$$
Another way to look at things: cumulative
probabilities. Say we have a binomial with n = 3
and p = .05:
x    p(x)
0    .8574
1    .1354
2    .0071
3    .0001
19Cumulative Binomial Distribution
This could be written in cumulative form, summing
from x = 0 to x = k:
k    P(X ≤ k)
0    .8574
1    .9928
2    .9999
3    1.000
20Cumulative Binomial Distribution
What is the advantage of the cumulative form? It
allows us to find the probability that X will
assume some value within a range of values.
Example 1 (cumulative): p(2) = P(X ≤ 2) - P(X ≤ 1)
= .9999 - .9928 = .0071
Example 2 (cumulative): Find the probability of at
most 3 successes in n = 5 trials of a binomial
experiment with p = .2. We locate the entry
corresponding to k = 3 and p = .2:
P(X ≤ 3) = Σ p(x) = p(0) + p(1) + p(2) + p(3) = .993
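Where a printed table is not at hand, the cumulative values can be computed directly. The following is a minimal sketch that reproduces the two examples above; the function names are hypothetical.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial random variable."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k): sum of the individual probabilities from 0 to k."""
    return sum(binomial_pmf(x, n, p) for x in range(k + 1))


# Example 1: recover p(2) from two cumulative values (n = 3, p = .05).
p2 = binomial_cdf(2, 3, 0.05) - binomial_cdf(1, 3, 0.05)

# Example 2: at most 3 successes in n = 5 trials with p = .2.
at_most_3 = binomial_cdf(3, 5, 0.2)

print(f"p(2) = {p2:.4f}")
print(f"P(X <= 3) = {at_most_3:.3f}")
```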
21Mean and Variance of Binomial Random Variable
- E(X) = µ = np
- V(X) = σ² = np(1-p)
- Example 6.10
- Records show that 30% of the customers in a shoe
store make their payments using a credit card.
- This morning 20 customers purchased shoes.
- Use the Cumulative Binomial Distribution Table
(A.1 of Appendix) to answer some questions stated
in the next slide.
22- What is the probability that at least 12
customers used a credit card?
- This is a binomial experiment with n = 20 and
p = .30.
[Table excerpt: cumulative binomial table for n = 20, column p = .30, row k = 11, entry .995]
P(at least 12 used a credit card)
= P(X ≥ 12) = 1 - P(X ≤ 11) = 1 - .995 = .005
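23- What is the probability that at least 3 but not
more than 6 customers used a credit card?
[Table excerpt: cumulative binomial table for n = 20, column p = .30, row k = 2, entry .035; row k = 6, entry .608]
P(3 ≤ X ≤ 6) = P(X = 3 or 4 or 5 or 6)
= P(X ≤ 6) - P(X ≤ 2) = .608 - .035 = .573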
24- What is the expected number of customers who used
a credit card?
- E(X) = np = 20(.30) = 6
- Find the probability that exactly 14 customers
did not use a credit card.
- Let Y be the number of customers who did not use
a credit card. P(Y = 14) = P(X = 6) = P(X ≤ 6) -
P(X ≤ 5) = .608 - .416 = .192
- Find the probability that at least 9 customers
did not use a credit card.
- Let Y be the number of customers who did not use
a credit card. P(Y ≥ 9) = P(X ≤ 11) = .995
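The shoe store answers can be reproduced without the table. The sketch below computes each quantity from the binomial formula; small differences from the slide values come from the three-decimal rounding of the table. The variable names are illustrative only.

```python
from math import comb

n, p = 20, 0.30  # 20 customers, each pays by credit card with probability .30


def pmf(x: int) -> float:
    """P(X = x), where X is the number of customers who used a credit card."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def cdf(k: int) -> float:
    """P(X <= k)."""
    return sum(pmf(x) for x in range(k + 1))


at_least_12 = 1 - cdf(11)          # P(X >= 12)
between_3_and_6 = cdf(6) - cdf(2)  # P(3 <= X <= 6)
expected = n * p                   # E(X) = np
variance = n * p * (1 - p)         # V(X) = np(1-p)
exactly_14_without = pmf(6)        # Y = 14 non-card customers means X = 6
at_least_9_without = cdf(11)       # Y >= 9 means X <= 11

print(f"P(X >= 12)      = {at_least_12:.3f}")
print(f"P(3 <= X <= 6)  = {between_3_and_6:.3f}")
print(f"E(X) = {expected:.1f}, V(X) = {variance:.1f}")
print(f"P(Y = 14)       = {exactly_14_without:.3f}")
print(f"P(Y >= 9)       = {at_least_9_without:.3f}")
```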
25Poisson Distribution
- What if we want to know the number of events
during a specific time interval or in a specified
region?
- Use the Poisson Distribution.
- Examples of Poisson
- Counting the number of phone calls received in a
specific period of time
- Counting the number of arrivals at a service
location in a specific period of time, e.g. how
many people arrive at a bank
- The number of errors a typist makes per page
- The number of customers entering a service
station per hour
26Poisson Distribution
- Conditions for use
- The number of successes that occur in any
interval is independent of the number of
successes that occur in any other interval.
- The probability that a success will occur in an
interval is the same for all intervals of equal
size and is proportional to the size of the
interval.
- The probability that two or more successes will
occur in an interval approaches zero as the
interval becomes smaller.
- Example
- The arrival of individual diners at a restaurant
would not fit the Poisson model because diners
usually arrive with companions, violating the
independence condition.
27Poisson Random Variable
- The Poisson variable indicates the number of
successes that occur during a given time interval
or in a specific region in a Poisson experiment.
- Probability Distribution of the Poisson Random
Variable
$$P(X = x) = \frac{e^{-\mu}\mu^x}{x!}, \quad x = 0, 1, 2, \ldots$$
- µ = average number of successes occurring in a
specific interval
- Must determine an estimate for µ from historical
data (or other source)
- No limit to the number of values a Poisson random
variable can assume
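28Poisson Probability Distribution With µ = 1
Note: the x axis in Excel starts with x = 1.
[Figure: bar chart of Poisson probabilities, x = 0 to 5]
29Poisson probability distribution with µ = 2
[Figure: bar chart of Poisson probabilities, x = 0 to 6]
Poisson probability distribution with µ = 5
[Figure: bar chart of Poisson probabilities, x = 0 to 10]
Poisson probability distribution with µ = 7
[Figure: bar chart of Poisson probabilities, x = 0 to 15]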
30Poisson Example
- Cars arrive at a tollbooth at a rate of 360 cars
per hour.
- What is the probability that only two cars will
arrive during a specified one-minute period? (Use
the formula)
- The probability distribution of arriving cars for
any one-minute period is Poisson with µ = 360/60
= 6 cars per minute. Let X denote the number of
arrivals during a one-minute period.
$$P(X = 2) = \frac{e^{-6}\,6^2}{2!} = .0446$$
31- What is the probability that only two cars will
arrive during a specified one-minute period? (Use
table 2, Appendix B.)
- P(X = 2) = P(X ≤ 2) - P(X ≤ 1) = .062 - .017 = .045
32- What is the probability that at least four cars
will arrive during a one-minute period? Use
Cumulative Poisson Table (Table A.2, Appendix)
- P(X ≥ 4) = 1 - P(X ≤ 3) = 1 - .151 = .849
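Both tollbooth answers can be computed from the Poisson formula instead of a table. This is a minimal sketch with hypothetical function names; µ = 6 cars per minute comes from the example.

```python
from math import exp, factorial


def poisson_pmf(x: int, mu: float) -> float:
    """P(X = x) for a Poisson random variable with mean mu."""
    return exp(-mu) * mu**x / factorial(x)


def poisson_cdf(k: int, mu: float) -> float:
    """P(X <= k)."""
    return sum(poisson_pmf(x, mu) for x in range(k + 1))


mu = 360 / 60  # 360 cars per hour = 6 cars per minute

exactly_two = poisson_pmf(2, mu)        # P(X = 2)
at_least_four = 1 - poisson_cdf(3, mu)  # P(X >= 4) = 1 - P(X <= 3)

print(f"P(X = 2)  = {exactly_two:.4f}")
print(f"P(X >= 4) = {at_least_four:.3f}")
```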
33Poisson Approximation of the Binomial
- When n is very large, a binomial probability table
may not be available.
- If p is very small (p ≤ .05), we can approximate
the binomial probabilities using the Poisson
distribution.
- Use µ = np and make the following approximation:
$$P(X_{\text{binomial}} = x) \approx P(X_{\text{Poisson}} = x)$$
where the binomial has parameters n and p and the
Poisson has µ = np.
34Example of Poisson
Example: Poisson Approximation of the Binomial
A warehouse engages in acceptance sampling to
determine if it will accept or reject incoming
lots of designer sunglasses, some of which
invariably are defective. Specifically, the
warehouse has a policy of examining a sample of
50 sunglasses from each lot and accepting the lot
only if the sample contains no more than 2
defective pairs. What is the probability of a
lot's being accepted if, in fact, 2% of the
sunglasses in the lot are defective?
This is a binomial experiment with n = 50 and
p = .02. Our binomial tables include n values only
up to 25, but since p ≤ .05 and the expected number
of defective sunglasses in the sample is np =
50(.02) = 1, the required probability can be
approximated by using the Poisson distribution
with µ = 1. From Table A.1, we find that the
probability that a sample contains at most 2
defective pairs of sunglasses is .920.
35Poisson Example
- What is the probability of a lot being accepted
if, in fact, 2% of the sunglasses are defective?
- Solution
- This is a binomial experiment with n = 50, p =
.02.
- Tables for n = 50 are not available and p ≤ .05;
thus, a Poisson approximation is appropriate: µ =
(50)(.02) = 1
- P(X_Poisson ≤ 2) = .920 (true binomial probability
= .922)
36Example of Poisson
So how well does the Poisson approximate the
Binomial? Consider the following table:
x    Binomial (n = 50, p = .02)    Poisson (µ = np = 1)
0    .364                          .368
1    .372                          .368
2    .186                          .184
3    .061                          .061
4    .014                          .015
5    .003                          .003
6    .000                          .001
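A comparison of this kind can be regenerated with a short script. The sketch below computes both columns for n = 50, p = .02 and the acceptance probability P(X ≤ 2) under each model; values may differ from the printed table in the last digit because of rounding.

```python
from math import comb, exp, factorial

n, p = 50, 0.02
mu = n * p  # Poisson mean used for the approximation

rows = []
for x in range(7):
    binomial = comb(n, x) * p**x * (1 - p) ** (n - x)
    poisson = exp(-mu) * mu**x / factorial(x)
    rows.append((x, binomial, poisson))
    print(f"{x}    {binomial:.3f}    {poisson:.3f}")

# Probability that the lot is accepted: at most 2 defective pairs in the sample.
accept_binomial = sum(b for _, b, _ in rows[:3])
accept_poisson = sum(q for _, _, q in rows[:3])
print(f"P(X <= 2): binomial {accept_binomial:.3f}, Poisson {accept_poisson:.3f}")
```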
37Example of Poisson
- A tollbooth operator has observed that cars
arrive randomly at an average rate of 360 cars
per hour.
- Using the formula, calculate the probability that
only 2 cars will arrive during a specified 1
minute period.
- Using Table A.2 on page 360, find the probability
that only 2 cars will arrive during a specified 1
minute period.
- Using Table A.2 on page 360, find the probability
that at least 4 cars will arrive during a
specified 1 minute period.
- $P(X = 2) = \frac{e^{-6}\,6^2}{2!} = \frac{(.00248)(36)}{2} = .0446$
- P(X = 2) = P(X ≤ 2) - P(X ≤ 1) = .062 - .017 = .045
- P(X ≥ 4) = 1 - P(X ≤ 3) = 1 - .151 = .849
38Example of Poisson
What if we wanted to know the probability of a
small number of occurrences in a large number of
trials with a very small probability of
success? We use the Poisson as a good approximation
of the answer. When trying to decide between the
binomial and the Poisson, use the following
guidelines: n ≥ 20 and p ≤ .05, or n ≥ 100 and
np ≤ 10.
39Hypergeometric Distribution
What about sampling without replacement? What is
likely to happen to the probability of
success? The probability of success is not constant
from trial to trial. We have a finite set of N
things, and "a" of them possess a property of
interest. Thus, there are "a" successes in N
things. Let X be the number of successes that
occur in a sample, drawn without replacement, of n
things from a total of N things. This is the
hypergeometric distribution:
$$P(x) = \frac{\binom{a}{x}\binom{N-a}{n-x}}{\binom{N}{n}}, \quad x = 0, 1, 2, \ldots, n$$
40Binomial Approximation of Hypergeometric
Distribution
If N is large, then the probability of success
will remain approximately constant from one trial
to another. When can we use the binomial
distribution as an approximation of the
hypergeometric distribution? When N/10 ≥ n.
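To see how close the two distributions are, the sketch below compares them for the catalytic converter batches discussed earlier (50 defectives in 1000, and 5 defectives in 100), each with a sample of n = 3. The function names and the use of the largest pointwise difference as a summary are assumptions made for illustration.

```python
from math import comb


def hypergeometric_pmf(x: int, N: int, a: int, n: int) -> float:
    """P(X = x) when n items are drawn without replacement from N items, a of them successes."""
    return comb(a, x) * comb(N - a, n - x) / comb(N, n)


def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for a binomial random variable."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def max_gap(N: int, a: int, n: int) -> float:
    """Largest difference between the hypergeometric and binomial probabilities."""
    p = a / N
    return max(abs(hypergeometric_pmf(x, N, a, n) - binomial_pmf(x, n, p))
               for x in range(n + 1))


gap_large = max_gap(1000, 50, 3)  # batch of 1000 with 50 defectives
gap_small = max_gap(100, 5, 3)    # batch of 100 with 5 defectives
print(f"N = 1000: largest difference {gap_large:.5f}")
print(f"N = 100:  largest difference {gap_small:.5f}")
```

Both cases satisfy N/10 ≥ n, but the approximation is visibly tighter for the larger batch.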
41Bernoulli Distribution
- What if we want to perform a single experiment
and there are only 2 possible outcomes?
- We use a special case of the binomial
distribution where n = 1
- $P(x) = \binom{1}{x} p^x (1-p)^{1-x} = p^x (1-p)^{1-x}$, x = 0, 1,
which yields p(0) = 1-p and p(1) = p
- In this form, the binomial is referred to as the
Bernoulli distribution.
42Geometric Distribution
- Now, instead of being concerned with the number
of successes in n Bernoulli trials, let's
consider the number of Bernoulli trial failures
that would have to be performed prior to
achieving the 1st success.
- In this case, we use the geometric distribution,
where X is the random variable representing the
number of failures before the 1st success.
- Mathematical form of geometric distribution
- $P(x) = p(1-p)^x$, x = 0, 1, 2, ...
43Negative Binomial Distribution
- What if we wanted to know the number of failures,
x, that occur before the rth success (r = 1, 2, ...)?
- In this case, we use the negative binomial
distribution.
- The number of statistically independent trials
that will be performed to reach the rth success
is X + r
- The previous r - 1 successes and the X failures
can occur in any order during the X + r - 1
trials.
- Negative binomial distribution mathematical
form
- $P(x) = \binom{r+x-1}{x} p^r (1-p)^x$, x = 0, 1, 2, ...
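The geometric and negative binomial forms can be sketched side by side. The function names and the values p = .2 and r = 3 are assumptions for illustration; the check that r = 1 reduces the negative binomial to the geometric follows directly from the two formulas.

```python
from math import comb


def geometric_pmf(x: int, p: float) -> float:
    """P(exactly x failures occur before the 1st success)."""
    return p * (1 - p) ** x


def negative_binomial_pmf(x: int, r: int, p: float) -> float:
    """P(exactly x failures occur before the r-th success)."""
    return comb(r + x - 1, x) * p**r * (1 - p) ** x


p = 0.2  # illustrative success probability (an assumption)
for x in range(5):
    print(f"x = {x}: geometric {geometric_pmf(x, p):.4f}, "
          f"negative binomial (r = 3) {negative_binomial_pmf(x, 3, p):.4f}")

# With r = 1 the negative binomial is the geometric distribution.
same_when_r_is_1 = all(
    abs(negative_binomial_pmf(x, 1, p) - geometric_pmf(x, p)) < 1e-12
    for x in range(20)
)
print("r = 1 matches geometric:", same_when_r_is_1)
```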
44Problem Solving
- A Suggestion for Solving Problems Involving
Discrete Random Variables
- An approach
- Understand the random variable under
consideration and determine if the random
variable fits the description and satisfies the
assumptions associated with any of the 6 random
variables presented in Table 3.3.
- If you find a match in Table 3.3, use the software
for that distribution.
- If none of the 6 random variables in Table 3.3
match the random variable associated with your
problem, use the sample space method.
45Discrete Bivariate Probability Distribution
Functions
- The first part of the chapter considered only
univariate probability distribution functions.
One variable is allowed to change.
- What if two or more variables change?
- When considering situations where two or more
variables change, the definitions of
- sample space
- numerically valued functions
- random variable
- still apply.
46Bivariate Distributions
- To consider the relationship between two random
variables, the bivariate (or joint) distribution
is needed.
- Bivariate probability distribution
- The probability that X assumes the value x, and Y
assumes the value y, is denoted
- p(x,y) = P(X = x, Y = y)
47Discrete Bivariate Probability Distribution
Functions
Example: Consider the following real estate
data. We want to know how the size of the house
varies with the cost.
48Discrete Bivariate Probability Distribution
Functions
49Discrete Bivariate Probability Distribution
Functions
What is the next step? Construct frequency
distributions.
Frequency distribution of house size
50Discrete Bivariate Probability Distribution
Functions
Frequency distribution of selling price
51Discrete Bivariate Probability Distribution
Functions
What next? Construct a bivariate frequency
distribution of X and Y
52Discrete Bivariate Probability Distribution
Functions
A bivariate (or joint) probability distribution
of X and Y is a table that gives the joint
probabilities p(x,y) for all pairs of values
(x,y).
The cumulative bivariate probability
distribution function is
$$F(x_1, x_2) = P(X_1 \le x_1 \cap X_2 \le x_2)$$
The marginal probability distribution
function of each variable is a univariate
probability distribution function:
$$p_{X_1}(x_1) = \sum_{x_2} p(x_1, x_2), \qquad p_{X_2}(x_2) = \sum_{x_1} p(x_1, x_2)$$
53Discrete Bivariate Probability Distribution
Functions
- Example Xavier and Yvette are two real estate
agents. Let X denote the number of houses that
Xavier will sell in a week, and let Y denote the
number of houses that Yvette will sell in a week.
Suppose that the joint probability distribution
of X and Y is as shown in the table below.
54Discrete Bivariate Probability Distribution
Functions
- Notice that a bivariate probability distribution
is similar to a bivariate frequency distribution,
with the probabilities having replaced the
frequencies.
- The nine probabilities in the interior of the
table are the joint probabilities p(x,y).
- p(0,0) = P(X = 0 and Y = 0) = .12
- p(0,1) = P(X = 0 and Y = 1) = .21
- p(0,2) = P(X = 0 and Y = 2) = .07
55Discrete Bivariate Probability Distribution
Functions
- Summing these three probabilities, we obtain the
marginal probability P(X = 0) = .40 (named
marginal because it appears in the margin of the
table)
- Summing the probabilities in each of the other
columns and rows, we obtain the other marginal
probabilities.
- Thus the marginal probability distributions of X
and Y are
56Example
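- The bivariate (joint) probability distribution
p(x,y)    x = 0    x = 1    x = 2    p(y)
y = 0     .12      .42      .06      .6
y = 1     .21      .06      .03      .3
y = 2     .07      .02      .01      .1
p(x)      .4       .5       .1
The bottom row holds the marginal probabilities
of X, e.g. P(X = 0) = p(0,0) + p(0,1) + p(0,2). The
right-hand column holds the marginal probabilities
of Y, e.g. P(Y = 1).
57
x    p(x)        y    p(y)
0    .4          0    .6
1    .5          1    .3
2    .1          2    .1
E(X) = .7        E(Y) = .5
V(X) = .41       V(Y) = .45
[Figure: three-dimensional bar chart of p(x,y) for x = 0, 1, 2 and y = 0, 1, 2]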
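The marginal distributions and their means and variances can be computed from the joint probabilities. In the sketch below, the placement of the nine values in the `joint` dictionary is inferred from the probabilities quoted on the surrounding slides (for example p(0,1) = .21 and P(X = 1 and Y = 0) = .42), so treat the cell layout as an assumption.

```python
from collections import defaultdict

# Joint probabilities p(x, y), keyed by (x, y). Cell placement is inferred (an assumption).
joint = {
    (0, 0): 0.12, (0, 1): 0.21, (0, 2): 0.07,
    (1, 0): 0.42, (1, 1): 0.06, (1, 2): 0.02,
    (2, 0): 0.06, (2, 1): 0.03, (2, 2): 0.01,
}

# Marginals: sum the joint probabilities over the other variable.
p_x, p_y = defaultdict(float), defaultdict(float)
for (x, y), prob in joint.items():
    p_x[x] += prob
    p_y[y] += prob


def mean(dist):
    """Expected value of a discrete distribution given as {value: probability}."""
    return sum(value * prob for value, prob in dist.items())


def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    m = mean(dist)
    return sum((value - m) ** 2 * prob for value, prob in dist.items())


print("p(x):", {x: round(pr, 2) for x, pr in sorted(p_x.items())})
print("p(y):", {y: round(pr, 2) for y, pr in sorted(p_y.items())})
print(f"E(X) = {mean(p_x):.2f}, V(X) = {variance(p_x):.2f}")
print(f"E(Y) = {mean(p_y):.2f}, V(Y) = {variance(p_y):.2f}")
```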
58Calculating Conditional Probability
Example - continued
59Conditions For Independence
- Two random variables are said to be independent
when
P(X = x | Y = y) = P(X = x) or P(Y = y | X = x) = P(Y = y).
- This leads to the following relationship for
independent variables:
P(X = x and Y = y) = P(X = x)P(Y = y)
- Example 6.7 - continued
- Since P(X = 0 | Y = 1) = .7 but P(X = 0) = .4, the
variables X and Y are not independent.
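The independence check can be sketched as follows for the Xavier and Yvette example. The `joint` dictionary uses the nine probabilities from the example, with the cell placement inferred from the values quoted in the slides (an assumption).

```python
from math import isclose

# Joint probabilities p(x, y), keyed by (x, y). Cell placement is inferred (an assumption).
joint = {
    (0, 0): 0.12, (0, 1): 0.21, (0, 2): 0.07,
    (1, 0): 0.42, (1, 1): 0.06, (1, 2): 0.02,
    (2, 0): 0.06, (2, 1): 0.03, (2, 2): 0.01,
}
xs = sorted({x for x, _ in joint})
ys = sorted({y for _, y in joint})
p_x = {x: sum(joint[(x, y)] for y in ys) for x in xs}
p_y = {y: sum(joint[(x, y)] for x in xs) for y in ys}

# Conditional probability: P(X = 0 | Y = 1) = p(0, 1) / P(Y = 1).
cond_x0_given_y1 = joint[(0, 1)] / p_y[1]

# Independence requires p(x, y) = p(x) p(y) for every pair (x, y).
independent = all(
    isclose(joint[(x, y)], p_x[x] * p_y[y], abs_tol=1e-9)
    for x in xs for y in ys
)

print(f"P(X = 0 | Y = 1) = {cond_x0_given_y1:.2f}, P(X = 0) = {p_x[0]:.2f}")
print("independent:", independent)
```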
60- Additional example
- The table below represents the joint probability
distribution of the variables X and Y. Are the
variables X and Y independent?
Find the marginal probabilities of X and Y. Then
apply the multiplication rule.
[Table: joint probability distribution of X and Y. Only the marginal probabilities survive: .7 and .3 for one variable, .40 and .60 for the other.]
Compare the other two pairs. Yes, the two
variables are independent.
61 The sum of Two Variables
- To calculate the probability distribution for a
sum of two variables X and Y, observe the example
below.
- Example 6.7 - continued
- Find the probability distribution of the total
number of houses sold per week by Xavier and
Yvette.
- Solution
- X+Y is the total number of houses sold. X+Y can
have the values 0, 1, 2, 3, 4.
- We find the distribution of X+Y as demonstrated
next.
62P(X+Y = 0) = P(X = 0 and Y = 0) = .12
P(X+Y = 1) = P(X = 0 and Y = 1) + P(X = 1 and Y = 0) = .21
+ .42 = .63
P(X+Y = 2) = P(X = 0 and Y = 2) + P(X = 1 and Y = 1) + P(X = 2
and Y = 0) = .07 + .06 + .06 = .19
The probabilities P(X+Y = 3) and P(X+Y = 4) are
calculated the same way. The distribution
follows.
63Expected value and variance of X+Y
- When the distribution of X+Y is known (see the
previous example) we can calculate E(X+Y) and
V(X+Y) directly using their definitions.
- An alternative is to use the relationships
- E(aX+bY) = aE(X) + bE(Y)
- V(aX+bY) = a²V(X) + b²V(Y) if X and Y are
independent.
- When X and Y are not independent (see the
previous example), we need to incorporate the
covariance in the calculation of the variance
V(aX+bY).
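Both routes can be sketched for the Xavier and Yvette example with a = b = 1. The cell placement in `joint` is inferred from the probabilities quoted in the slides (an assumption), and the covariance adjustment uses the standard identity V(X+Y) = V(X) + V(Y) + 2Cov(X,Y), which the slides mention but do not write out.

```python
from collections import defaultdict

# Joint probabilities p(x, y), keyed by (x, y). Cell placement is inferred (an assumption).
joint = {
    (0, 0): 0.12, (0, 1): 0.21, (0, 2): 0.07,
    (1, 0): 0.42, (1, 1): 0.06, (1, 2): 0.02,
    (2, 0): 0.06, (2, 1): 0.03, (2, 2): 0.01,
}

# Route 1: build the distribution of X+Y, then apply the definitions directly.
sum_dist = defaultdict(float)
for (x, y), prob in joint.items():
    sum_dist[x + y] += prob
e_sum = sum(s * prob for s, prob in sum_dist.items())
v_sum_direct = sum((s - e_sum) ** 2 * prob for s, prob in sum_dist.items())

# Route 2: combine E, V and the covariance of X and Y.
e_x = sum(x * prob for (x, _), prob in joint.items())
e_y = sum(y * prob for (_, y), prob in joint.items())
v_x = sum((x - e_x) ** 2 * prob for (x, _), prob in joint.items())
v_y = sum((y - e_y) ** 2 * prob for (_, y), prob in joint.items())
cov = sum((x - e_x) * (y - e_y) * prob for (x, y), prob in joint.items())
v_sum_formula = v_x + v_y + 2 * cov

print("distribution of X+Y:", {s: round(pr, 2) for s, pr in sorted(sum_dist.items())})
print(f"E(X+Y) = {e_sum:.2f} (E(X) + E(Y) = {e_x + e_y:.2f})")
print(f"V(X+Y) directly = {v_sum_direct:.2f}, via covariance = {v_sum_formula:.2f}")
print(f"V(X) + V(Y) alone = {v_x + v_y:.2f} (not valid here: X and Y are dependent)")
```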