Title: Chapter 6: Discrete Probability Distributions
1Chapter 6 Discrete Probability Distributions
6.1 Discrete Random Variables 6.2 The Binomial
Probability Distribution 6.3 The Poisson
Probability Distribution
November 12, 2008
2Discrete Random Variables
Consider the probability experiment of flipping a
coin two times. The possible outcomes of this
experiment, the sample space, are
S = {HH, HT, TH, TT}, where H is a head and T is a
tail. We call the outcome of such an
experiment a random event, since we are never sure
of its value.
Mathematicians and statisticians like to work
with numbers or things that can be represented by
numbers. For this reason, we try to assign the
random event a number or a set of numbers.
Section 6.1
3Random Variable
Definition A random variable is a numerical
measure or representation of a random event in a
probability experiment.
Remark Since a random variable comes from
something that is random (a random event), it may
take several numerical values. We will denote
the random variable by using capital letters,
e.g., X, its values by small letter variables,
e.g., x.
Example Flipping a coin twice.
S = {HH, HT, TH, TT}. Let X be the random variable
that represents the number of tails. Then the
values assumed by X are x = 0, 1, 2.
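As a minimal sketch of this example, the sample space for two flips can be enumerated and each outcome mapped to its number of tails. The names `sample_space` and `number_of_tails` are illustrative and do not come from the slides.

```python
from itertools import product

# Sample space for two coin flips: HH, HT, TH, TT.
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

def number_of_tails(outcome):
    """Random variable X: the number of tails in an outcome string."""
    return outcome.count("T")

# The distinct values x that X can take.
values_of_x = sorted({number_of_tails(outcome) for outcome in sample_space})

print(sample_space)  # ['HH', 'HT', 'TH', 'TT']
print(values_of_x)   # [0, 1, 2]
```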
4Examples
Example Consider the probability experiment of
measuring your systolic blood pressure. The
reading is a random variable X whose values range
from 50 to 230 mm Hg. For example, it is normal if
x = 120. We call X a continuous random variable
since it can take any value in the interval
[50,230].
Example Consider the probability experiment of
rolling a single die once. The random variable X
in this case is the number that comes up. The
possible values for this random variable are
x = 1, 2, 3, 4, 5 or 6. We call X a discrete
random variable since it can take only discrete
values.
5Random Variable Types
- A discrete random variable has a finite or
countable number of values. That is, the values
can be put into a 1-1 correspondence with the
integers or a subset of the integers.
- A continuous random variable has infinitely many
possible values, which are in 1-1 correspondence
with the numbers in an interval [a,b].
6Examples
- Let X be the random variable for the number of
students in Math 127A who will receive A's in
the course. Here, x = 0,1,2,...,132. This is a
discrete random variable.
- Let X be the random variable for the weight of
a student in Math 127A. Here, x > 0 and this is a
continuous random variable.
7Discrete Probability Distributions
Definition Let X be a discrete random variable.
The discrete probability distribution of X, which
we denote by P, is a function that maps the
values of the discrete random variable X into the
interval [0,1]. P is often represented by a graph
or table that gives all of the possible values of
X (i.e., x) and the corresponding probabilities
P(x). Sometimes it is given as a mathematical
formula.
x P(x)
--- ---
--- ---
--- ---
--- ---
8Example
Suppose that we perform a probability experiment
of flipping a coin 3 times in a row. Let X be
the random variable for the number of times that
a head appears. The possible values for X are
x = 0,1,2,3. Using a simulation of flipping the
coin 3 times, we find the following
x P(x)
0 0.10
1 0.01
2 0.51
3 0.38
- This is a discrete probability distribution for
X. Notice that
- 0 ≤ P(x) ≤ 1 for each x
- P(0) + P(1) + P(2) + P(3) = 1
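9Rules for Discrete Probability Distributions
Let P(x) denote the probability that the random
variable X equals x. Then
- the sum of P(x) over all values x is 1
- 0 ≤ P(x) ≤ 1 for each x
Remark If any of the above conditions is
violated, then P is not a discrete probability
distribution.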
10Examples
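Which of the following are discrete probability
distributions?
(a)
x P(x)
1 0.40
2 0.35
3 0.12
4 -0.07
5 0.20
Answer: No, since P(4) is negative.
(b)
x P(x)
1 0.40
2 0.35
3 0.12
4 0.01
5 0.20
Answer: No, since the probabilities sum to 1.08.
(c)
x P(x)
1 0.40
2 0.35
3 0.12
4 0.01
5 0.12
Answer: Yes.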
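The two rules can be checked mechanically. The sketch below applies them to the three tables in the order shown; the function name `is_discrete_distribution` and the tolerance are illustrative choices, and the tolerance is only there to absorb floating-point rounding.

```python
import math

def is_discrete_distribution(table, tol=1e-9):
    """Return True if every P(x) lies in [0, 1] and the P(x) sum to 1."""
    probs = list(table.values())
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

first = {1: 0.40, 2: 0.35, 3: 0.12, 4: -0.07, 5: 0.20}  # contains a negative entry
second = {1: 0.40, 2: 0.35, 3: 0.12, 4: 0.01, 5: 0.20}  # sums to 1.08
third = {1: 0.40, 2: 0.35, 3: 0.12, 4: 0.01, 5: 0.12}   # satisfies both rules

for table in (first, second, third):
    print(is_discrete_distribution(table))  # False, False, True
```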
11Example
Consider the following discrete probability
distribution for the number of home runs hit in a
single game by the Boston Red Sox. Let X be
the random variable for the number of HR hit
in a single game.
x (# of HR) P(x)
0 0.23
1 0.38
2 0.22
3 0.13
4 0.03
5 0.01
6 or more 0.00
12Question What is the probability that the team
will hit three or more home runs in a single
game?
Let A = the event that they will hit 3 HR in a game.
Let B = the event that they will hit 4 HR in a game.
Let C = the event that they will hit 5 HR in a game.
Let D = the event that they will hit 6 or more HR in a game.
Then P(A or B or C or D) = P(A) + P(B) + P(C) + P(D)
= 0.13 + 0.03 + 0.01 + 0.00 = 0.17 (17%).
Note A, B, C, D are disjoint events.
13Remark
14Graphical Representations of Discrete Probability
Distributions
If a discrete probability distribution is given
as a probability table
x P(x)
--- ---
--- ---
--- ---
--- ---
then we construct a histogram (relative
frequencies) from this table. This is called a
probability histogram.
15Example
In the 2004 baseball season, Ichiro Suzuki of the
Seattle Mariners set the record for the most hits
in a season with a total of 262 hits. Let X be
the discrete random variable for the number of
hits per game. The following table gives the
probabilities of the number of hits per game by
Suzuki. Here the number of hits per game is the
random variable and the sum of the probabilities
is 1. Construct the probability histogram.
x P(x)
0 0.1677
1 0.3354
2 0.2857
3 0.1491
4 0.0373
5 or more 0.0248
16Parameters of a Discrete Probability Distribution
- Recall A parameter of a population is a
numerical characteristic of the population (e.g.,
mean, standard deviation, etc.).
- Definition A parameter of a discrete probability
distribution is a number that summarizes a
characteristic of the distribution.
- Parameters
- Mean
- Standard Deviation
17The Mean of a Discrete Probability Distribution
x P(x)
0 0.10
1 0.01
2 0.51
3 0.38
18Mean - Mean?
Recall that we had defined the mean of a set of
numbers (population) in a different way.
Question Is the mean of a population and the
mean of a discrete probability distribution
really the same thing? We consider the following
example to show that they are the same.
19Example
Suppose that we consider a group of people and we
ask them how many hours of TV they have watched
during the past 24 hours. Suppose that their
responses are summarized in the list
{2,4,6,6,4,4,2,3,5,5}. The mean of this set
(population) is
(2+4+6+6+4+4+2+3+5+5)/10 = 41/10 = 4.1.
We can set up a probability table from this data.
Consider the random variable X to be the number of
hours of TV watching. Hence, x = 2,3,4,5,6.
x P(x)
2 2/10
3 1/10
4 3/10
5 2/10
6 2/10
The mean of this discrete probability
distribution is
μ_X = 2(0.2) + 3(0.1) + 4(0.3) + 5(0.2) + 6(0.2) = 4.1
and hence, we compute the same value! This is not
by chance.
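The agreement between the two means can be reproduced with exact fractions. This is a small sketch using the ten TV responses; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

hours = [2, 4, 6, 6, 4, 4, 2, 3, 5, 5]

# Mean of the population, computed directly from the list.
population_mean = Fraction(sum(hours), len(hours))

# Probability table: P(x) = (number of responses equal to x) / (number of responses).
counts = Counter(hours)
distribution = {x: Fraction(count, len(hours)) for x, count in counts.items()}

# Mean of the distribution: the sum of x * P(x) over all values x.
distribution_mean = sum(x * p for x, p in distribution.items())

print(float(population_mean), float(distribution_mean))  # 4.1 4.1
```

Grouping equal responses before averaging only reorders the same sum, which is why the two results must agree.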
20Expected Value of X
The mean of a discrete probability distribution
for X is a weighted average of its values xi,
where the probabilities P(xi) are the weights. In
a way, we can think of it as the value we expect X
to take. This mean is the average of all of the
possible outcomes of X. For this reason it is also
called the expected value of X. We can think of
μ_X as the mean outcome over all of the values of
X. That is, if we were to repeat the probability
experiment many, many times, then the average of
the outcomes would approach the expected value of X.
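The long-run interpretation can be illustrated with a small simulation. The sketch below draws from the TV-hours distribution (values 2 to 6 with probabilities 0.2, 0.1, 0.3, 0.2, 0.2) and compares the sample average with the expected value 4.1; the helper `sample_mean`, the seed, and the sample sizes are arbitrary illustrative choices.

```python
import random

values = [2, 3, 4, 5, 6]
probabilities = [0.2, 0.1, 0.3, 0.2, 0.2]

# Expected value: the sum of x * P(x).
expected_value = sum(x * p for x, p in zip(values, probabilities))

def sample_mean(n, rng):
    """Average of n independent draws from the distribution."""
    draws = rng.choices(values, weights=probabilities, k=n)
    return sum(draws) / n

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    print(n, round(sample_mean(n, rng), 3), "expected:", round(expected_value, 3))
```

The averages for small n can be far from 4.1, while the averages for large n should sit close to it.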
21Example
x (# of HR) P(x)
0 0.23
1 0.38
2 0.22
3 0.13
4 0.03
5 0.01
6 or more 0.00
Find the expected value of X where X is the
random variable for the number of homeruns in a
single game by the Boston Red Sox baseball team.
μ_X = 0(0.23) + 1(0.38) + 2(0.22) + 3(0.13)
+ 4(0.03) + 5(0.01) + 6(0.00) = 1.38
22Example
Suppose that you want to invest $100 in the stock
market. Let X be the random variable for the
result of your $100 investment. For simplicity,
suppose that there are two possible outcomes,
x = $0 or x = $1000, such that P(0) = 0.50 and
P(1000) = 0.50. What is the expected value of your
investment?
μ_X = 0(0.5) + 1000(0.5) = 500,
i.e., we have an expected return of $500.
23Standard Deviation of a Discrete Probability
Distribution
The standard deviation of a probability
distribution is a measure of its spread. In other
words, it is the variation in xi from its mean,
weighted by the probabilities.
24Example
Find the mean and standard deviation for the
following discrete probability distribution.
x P(x)
0 0.25
1 0.25
2 0.10
4 0.40
μ_X = 0(0.25) + 1(0.25) + 2(0.10) + 4(0.40) = 2.05
σ_X = [0^2(0.25) + 1^2(0.25) + 2^2(0.10) + 4^2(0.40) - (2.05)^2]^(1/2) = 1.69
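The same calculation can be written as a short function. This sketch uses the computational form of the standard deviation, the square root of (sum of x^2 P(x)) minus the squared mean; the function name `mean_and_std` is illustrative.

```python
import math

def mean_and_std(table):
    """Mean and standard deviation of a discrete distribution given as {x: P(x)}."""
    mean = sum(x * p for x, p in table.items())
    variance = sum(x**2 * p for x, p in table.items()) - mean**2
    return mean, math.sqrt(variance)

table = {0: 0.25, 1: 0.25, 2: 0.10, 4: 0.40}
mu, sigma = mean_and_std(table)
print(round(mu, 2), round(sigma, 2))  # 2.05 1.69
```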
25Comparing Discrete Probability Distributions
μ = 2.18, σ = 0.64
μ = 1.50, σ = 1.12
26The Binomial Probability Distribution
The binomial probability distribution is a
discrete probability distribution that is used to
determine probabilities of events in a
probability experiment in which only two
mutually exclusive (disjoint) outcomes are
possible. For example, the probability
experiment might be flipping a coin. The two
mutually exclusive outcomes are heads or
tails. As another example, the probability
experiment might be asking an individual
if he or she has a driver's license. He or she
does or doesn't.
Section 6.2
27Binomial Probability Experiment
- Binomial Probability Experiment
- The experiment is performed n times, with each
repetition of the experiment being a trial.
- The trials are independent.
- Each trial results in one of two mutually
exclusive outcomes (success or failure).
- The probability of a success is p and the
probability of failure is 1 - p, e.g., p = 0.5.
- A random variable for this type of probability
experiment is the number of successes in the n
trials.
28The Binomial Random Variable
- We consider a binomial probability experiment.
- For each of the n trials of a binomial
probability experiment, the probability of
success is p.
- For each of the n trials of a binomial
probability experiment, the probability of
failure is 1 - p.
- The n trials are independent, i.e., the
outcome of one trial does not depend on any
other trial.
- Let X be the number of successes in the n trials.
We call X a binomial random variable.
29Binomial Probability Distribution Function
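The standard binomial probability function is P(x) = C(n, x) p^x (1 - p)^(n - x), where C(n, x) is the number of ways to choose which x of the n trials are successes. A minimal sketch follows; the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(n, p, x):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# The probabilities over x = 0, 1, ..., n should sum to 1.
n, p = 10, 0.5
total = sum(binomial_pmf(n, p, x) for x in range(n + 1))
print(binomial_pmf(n, p, 5), total)  # 0.24609375 1.0
```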
30Example
- According to the Uniform Crime Report, 2003,
66.9% of murders in the U.S. were committed
with a firearm.
- (a) If 100 murders are randomly selected, how many
would you expect to be committed with a firearm?
- (b) What is the probability that you would
observe 75 murders committed with a firearm out of
the 100 randomly selected murders?
31Example
- Probability Experiment Does a person have ESP?
- A person in one room picks one of the integers
(1-5) at random and thinks about this particular
number for 1 minute.
- In another room, the person who claims to have
ESP tries to identify the number that was chosen
by the person in the first room.
- This is done 3 times (3 trials). We assume that
each trial is independent.
- The ESP claimant has the correct answer twice,
i.e., he or she is correct in two out of the three
trials.
- Question Does the person have ESP?
- Answer If the person has ESP, then their success
rate would be greater than that of guessing the
number on each trial. Hence, we must compute the
probability that one could guess 2 out of the 3
answers correctly.
32Possible Outcomes of Guessing Three Times (a
compound event)
S = success and F = failure to guess correctly. The
probability of guessing the correct number is
1/5 = 0.20. Hence, P(S) = 0.2 and
P(F) = 1 - P(S) = 0.8.
Different Possible Outcomes
The probability of each outcome with exactly two
successes is 0.032. If A = SSF, B = SFS, C = FSS,
then P(A or B or C) = P(A) + P(B) + P(C)
= 3(0.032) = 0.096, since A, B and C are disjoint
events. Hence, this will happen approximately 10%
of the time by guessing alone. Hence, the ESP
claimant has a success rate that is much higher
than one would expect from guessing.
Outcome Probability
SSS (0.2)(0.2)(0.2) = 0.008
SSF (0.2)(0.2)(0.8) = 0.032
SFS (0.2)(0.8)(0.2) = 0.032
FSS (0.8)(0.2)(0.2) = 0.032
SFF (0.2)(0.8)(0.8) = 0.128
FSF (0.8)(0.2)(0.8) = 0.128
FFS (0.8)(0.8)(0.2) = 0.128
FFF (0.8)(0.8)(0.8) = 0.512
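The outcome table can be regenerated by enumerating all eight S/F sequences and multiplying the per-trial probabilities, which relies on the trials being independent. A minimal sketch, with illustrative variable names:

```python
from itertools import product

p_success = 0.2
prob = {"S": p_success, "F": 1 - p_success}

# Probability of each of the 8 possible sequences of three guesses.
outcomes = {}
for trial in product("SF", repeat=3):
    outcome = "".join(trial)
    outcomes[outcome] = prob[trial[0]] * prob[trial[1]] * prob[trial[2]]

# Add up the disjoint outcomes with exactly two, and with two or more, successes.
p_exactly_two = sum(p for o, p in outcomes.items() if o.count("S") == 2)
p_at_least_two = sum(p for o, p in outcomes.items() if o.count("S") >= 2)

print(round(p_exactly_two, 3), round(p_at_least_two, 3))  # 0.096 0.104
```

The second figure adds the SSS outcome, giving the chance of doing at least as well as the claimant by guessing alone.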
33Calculation using Binomial Distribution Function
34Binomial Distributions for n 10 and Different p
35Mean and Standard Deviation of Binomial
Distributions
36Example
Example Suppose that a random variable X has
a binomial distribution with n = 12 and
p = 0.65. Calculate the probability that
x = 10. Find the mean and standard deviation.
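One way to work this example, assuming the standard binomial results P(x) = C(n, x) p^x (1 - p)^(n - x), mean np, and standard deviation sqrt(np(1 - p)):

```python
from math import comb, sqrt

n, p = 12, 0.65

# P(X = 10): choose which 10 of the 12 trials are successes.
p_10 = comb(n, 10) * p**10 * (1 - p)**(n - 10)

mean = n * p                     # expected number of successes
std_dev = sqrt(n * p * (1 - p))  # spread of the number of successes

print(round(p_10, 4), round(mean, 2), round(std_dev, 3))  # 0.1088 7.8 1.652
```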
37Calculation on TI-83
- 2nd VARS (DISTR) key
- Select binompdf( ENTER
- Complete entry e.g., binompdf(n,p,x) ENTER
- Remark
- In binompdf, if x is omitted, then it calculates
the Binomial Distribution for all of the possible
values of x.
- The CDF of the Binomial Distribution can be
calculated with binomcdf(n,p,xk,xk1,,xm)
38Example
- Question Are women fairly treated in the
selection for managerial training?
- Situation
- A pool of 1,000 employees from which 10 will be
selected.
- Of the 1,000 employees, 50% are women and 50%
are men.
- Result None of the 10 employees selected for
management training were women.
- Question Does this show bias against women?
- Analysis Assuming that the 1,000 employees are
equally qualified for training and there is an
equal chance of selecting a woman or a man, what is
the probability of selecting 10 male trainees?
- In the long run, we would expect that for a
sample of 10, 5 would be women and 5 would be
men. We will assume that the number of women among
the 10 selected employees is distributed according
to a Binomial Distribution (n = 10, p = 0.5). What
does the probability distribution look like? Let
success be the outcome of picking a female employee.
39x P(x)
0 0.00097
1 0.00976
2 0.04394
3 0.11718
4 0.20507
5 0.24609
6 0.20507
7 0.11718
8 0.04394
9 0.00976
10 0.00097
P(0) = 0.00097, which is very small; hence,
picking no women is very unlikely. Note that the
highest probability is at x = 5, i.e., P(5) =
0.24609. The expected value is (n)(p) = (10)(0.5)
= 5.
40Example
- Question Can we check racial profiling by the
police?
- Situation
- Statistics from Philadelphia Police Department
in 1997
- 262 car stops
- Result 207 of 262 involve African Americans,
i.e., 79% of the total number of stops.
- Analysis Assume that the car stops are
binomially distributed with n = 262. In
Philadelphia, African Americans (AA) make up
42.2% of the population. Hence, we would expect
that approximately 4 out of 10 car stops would
involve AA and we choose p = 0.422. Let X be
the binomial random variable for the number of
stops involving an AA. For the binary outcome,
success is stopping a car driven by an AA.
- Mean μ_X = (n)(p) = (262)(0.422) = 110.564
- Standard Deviation σ_X = [(n)(p)(1-p)]^(1/2)
= [(110.564)(0.578)]^(1/2) = 7.994
41P(207) = 3.9 x 10^-34. This is the probability of
stopping exactly 207 AA in the 262 stops. We would
expect approximately 111 stops using the expected
value. Hence, 207 out of 262 is very, very
unlikely.
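The figures in this example can be recomputed directly, assuming the binomial model with n = 262 and p = 0.422 described above. This is a sketch of the arithmetic only; it says nothing about whether the binomial model itself is appropriate for car stops.

```python
from math import comb, sqrt

n, p = 262, 0.422

mean = n * p                     # expected number of stops involving AA drivers
std_dev = sqrt(n * p * (1 - p))  # standard deviation of that count

# Probability of exactly 207 such stops out of 262.
p_207 = comb(n, 207) * p**207 * (1 - p)**(n - 207)

print(round(mean, 3), round(std_dev, 3), p_207)
```

The observed count of 207 lies roughly twelve standard deviations above the mean, which is why P(207) is so small.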
42Bell-shaped Binomial Distributions
Remark If a probability distribution is
bell-shaped, then the Empirical Rule applies.
Observation If a binomial distribution is
bell-shaped, then the Empirical Rule applies.
43When is a Binomial Distribution Bell-shaped?
44Example
- According to the Higher Education Research
Institute, 55% of college freshmen in 4-year
colleges and universities during the 2003-2004
academic year were female. Suppose 12 freshmen
are randomly selected and the number of females is
recorded. Find the following probabilities
- (a) Exactly 7 of the 12 are female.
- (b) Five or more are female.
- (c) Fewer than five are female.
- (d) Between 7 and 10, inclusively, are female.
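The four parts can be sketched by summing binomial probabilities with n = 12 and p = 0.55. The helper `binomial_pmf` and the variable names are illustrative; part (c) uses the complement of part (b).

```python
from math import comb

def binomial_pmf(n, p, x):
    """P(X = x) for a binomial random variable."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 12, 0.55

p_exactly_7 = binomial_pmf(n, p, 7)                                   # (a)
p_five_or_more = sum(binomial_pmf(n, p, x) for x in range(5, n + 1))  # (b)
p_fewer_than_5 = 1 - p_five_or_more                                   # (c)
p_7_to_10 = sum(binomial_pmf(n, p, x) for x in range(7, 11))          # (d)

for label, value in [("a", p_exactly_7), ("b", p_five_or_more),
                     ("c", p_fewer_than_5), ("d", p_7_to_10)]:
    print(label, round(value, 4))
```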
46The Poisson Probability Distribution
- We now introduce a new discrete random variable
that counts the number of times that a particular
event occurs in a particular set (time or space).
For example,
- the number of bacteria per unit volume of a
fluid
- the number of customers at McDonald's that
order Big Macs between 4:00-5:00 PM
- the number of machines in a factory that break
down per day.
- Definition Let X be a random variable and
suppose that the possible values of X are x =
0,1,2,3,... and these values occur in a fixed
interval I (e.g., an interval of time). We say
that the random variable follows a Poisson
process if all of the following conditions hold
- The probability of 2 or more occurrences of the
event on any sufficiently small subinterval of I
is zero.
- If we look at the random variable on two
subintervals, I1 and I2, the probability of an
event will be the same, provided the two
subintervals have the same length.
- For any two non-overlapping subintervals, I1 and
I2, the number of occurrences of the event in I1 is
independent of the number of occurrences in I2.
Section 6.3
47Poisson Probability Distribution Function
48Graphs of the Poisson Distribution Function
49Example
- The phone calls to a computer software help desk
occur at the rate of 2.1 per minute between 11:00
AM and noon. Find the following probabilities
for calls between 11:15-11:20 AM
- There will be exactly 8 calls.
- There will be fewer than 8 calls.
- There will be at least 8 calls.
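A sketch of this example, assuming the standard Poisson probability function P(x) = m^x e^(-m) / x!, where m is the mean number of occurrences in the interval (here the rate of 2.1 calls per minute times 5 minutes). The function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(mean, x):
    """P(X = x) for a Poisson random variable with the given mean."""
    return mean**x * exp(-mean) / factorial(x)

rate_per_minute = 2.1
minutes = 5                             # 11:15 to 11:20 AM
mean_calls = rate_per_minute * minutes  # 10.5 calls expected in the interval

p_exactly_8 = poisson_pmf(mean_calls, 8)
p_fewer_than_8 = sum(poisson_pmf(mean_calls, x) for x in range(8))  # x = 0..7
p_at_least_8 = 1 - p_fewer_than_8                                   # complement

print(round(p_exactly_8, 4), round(p_fewer_than_8, 4), round(p_at_least_8, 4))
```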
50Example
From 1900 to 2003 (104 years), the state of
Florida suffered 24 major hurricanes (category 3
to 5). What is the probability that in the year
2009, the state will see 3 major hurricanes?
What is the probability that it will see at most
3 major hurricanes?
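One possible approach, under the assumption that major hurricanes follow a Poisson process whose yearly rate is the historical average of 24/104 per year. That modeling assumption is made here for illustration and is not stated in the example.

```python
from math import exp, factorial

def poisson_pmf(mean, x):
    """P(X = x) for a Poisson random variable with the given mean."""
    return mean**x * exp(-mean) / factorial(x)

# Assumed rate: 24 major hurricanes in 104 years, applied to a single year.
mean_per_year = 24 / 104

p_exactly_3 = poisson_pmf(mean_per_year, 3)
p_at_most_3 = sum(poisson_pmf(mean_per_year, x) for x in range(4))  # x = 0..3

print(round(p_exactly_3, 5), round(p_at_most_3, 5))
```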
51Example
The Vanderbilt Printing division duplicates
documents for distribution in the university.
For every 500 documents printed, one document is
not acceptable for distribution. Documents are
delivered in boxes with each box containing 100
documents. Assuming that the number of defective
documents in a box is distributed according to a
Poisson Distribution and we reject any box with
two or more defective documents, what percent of
the boxes are rejected?
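A sketch of the calculation, taking the mean number of defective documents per box to be 100 × (1/500) = 0.2 and using the Poisson distribution as the example specifies. A box is rejected when it holds two or more defectives, so the rejection probability is the complement of P(0) + P(1).

```python
from math import exp, factorial

def poisson_pmf(mean, x):
    """P(X = x) for a Poisson random variable with the given mean."""
    return mean**x * exp(-mean) / factorial(x)

defect_rate = 1 / 500                         # one unacceptable document per 500 printed
documents_per_box = 100
mean_defects = defect_rate * documents_per_box  # 0.2 defectives expected per box

# Reject when X >= 2, i.e. 1 - P(0) - P(1).
p_reject = 1 - poisson_pmf(mean_defects, 0) - poisson_pmf(mean_defects, 1)

print(f"{100 * p_reject:.2f}% of boxes rejected")  # 1.75% of boxes rejected
```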