Title: Ch 17: Probability Models
1. Ch 17: Probability Models
- In this chapter we will introduce and work with 4
different probability models.
2. Bernoulli Trials
- The basis for all 4 probability models we examine
in this chapter is the Bernoulli trial.
- We have Bernoulli trials if
  - there are only two possible outcomes (success and
    failure).
  - the probability of success, p, is constant over
    all trials.
  - the trials are independent.
3. The Geometric Model
- The Geometric probability model tells us the
probability for a random variable that counts the
number of Bernoulli trials until the first
success.
- Geometric models, Geom(p), are completely
specified by one parameter, p, the probability of
success.
  - p = probability of success
  - q = 1 - p = probability of failure
  - X = # of trials until the first success occurs
  - $P(X = x) = q^{x-1} p$
- We can also calculate the mean (expected # of
trials until success) and standard deviation.
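For reference, the standard textbook results for the mean and standard deviation of a Geom(p) model are given below. The slide text itself does not spell them out, so treat this as a supplementary note using the same symbols p and q as above.

```latex
E(X) = \mu = \frac{1}{p}, \qquad SD(X) = \sigma = \sqrt{\frac{q}{p^{2}}} = \frac{\sqrt{q}}{p}
```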
4. Example: Nuts over Nuts
- You get a job at the local chocolate factory
packing boxes of mixed chocolates. Chocolates
are stored in a huge bin (with thousands of
chocolates) and look identical, but there are 5
different flavors. 30% of the candies are solid,
15% are filled with caramel, 25% have a
butter-cream filling, 10% have a fruit filling
and the rest are filled with nuts. One of the
perks of your job is that you can sample their
products on the job.
  - If you pick one from the bin, what's the chance
    it is nut-filled?
  - What's the chance you'd have to pick 4 chocolates
    before finding one that is nut-filled?
  - How many chocolates would you expect to have to
    pick before finding a nut-filled one?
  - What is the standard deviation for this picking
    example?
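One possible way to check the answers numerically is sketched below. It assumes the flavor figures are percentages (so the nut-filled share is the remaining 20%) and reads the second question as "the first nut-filled chocolate is the 4th pick". The function name `geom_pmf` is illustrative, not from the textbook.

```python
import math

# Assumption: flavor shares are percentages; nut-filled is whatever is left.
p = (100 - (30 + 15 + 25 + 10)) / 100   # probability a chocolate is nut-filled
q = 1 - p                               # probability it is not

def geom_pmf(x, p):
    """P(X = x) for Geom(p): x - 1 failures followed by the first success."""
    return (1 - p) ** (x - 1) * p

p_fourth = geom_pmf(4, p)   # 3 non-nut chocolates, then a nut-filled one
mean = 1 / p                # expected number of picks until the first success
sd = math.sqrt(q) / p       # standard deviation of the number of picks

print(f"P(nut-filled) = {p:.2f}")
print(f"P(X = 4)      = {p_fourth:.4f}")
print(f"mean = {mean:.1f}, sd = {sd:.3f}")
```

If you read the question as "4 misses and then a hit", use `geom_pmf(5, p)` instead.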
5. Another Geometric Example
- Remember the kid example from the last chapter
(Ch 16, #5)? We will modify it slightly. A
couple plans to have children until they get a
girl, no matter how many children it takes.
Again, we assume they are fertile and don't have
twins.
  - Create a probability model for the number of
    children that they will have. (It might also
    help to draw a tree diagram.)
  - Find the expected # of children.
  - Does this differ from the expected # of children
    when we had a cap of 3 kids?
  - What's the chance the couple will have to have
    more than 3 kids to get that daughter? More
    than 5 kids?
- Note: geometric models are usually easier to work
with if you leave the probabilities in fractional form.
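The advice about fractional form can be followed in code as well. The sketch below uses Python's `fractions.Fraction` and assumes each child is a girl with probability 1/2, independently of the others. That value is an assumption for illustration, since the slide does not state it.

```python
from fractions import Fraction

# Assumption (not stated on the slide): P(girl) = 1/2 for every birth.
p = Fraction(1, 2)
q = 1 - p

def geom_pmf(x):
    """P(X = x): the first girl is child number x."""
    return q ** (x - 1) * p

def prob_more_than(k):
    """P(X > k): the first k children are all boys."""
    return q ** k

model = {x: geom_pmf(x) for x in range(1, 6)}   # first five rows of the model
expected_children = 1 / p                        # mean of Geom(p)

for x, prob in model.items():
    print(x, prob)
print("expected # of children:", expected_children)
print("P(X > 3) =", prob_more_than(3), " P(X > 5) =", prob_more_than(5))
```

Keeping everything as exact fractions makes the pattern 1/2, 1/4, 1/8, ... easy to see, which is the point of the note above.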
6. Independence
- One of the important requirements for Bernoulli
trials is that the trials must be independent.
- When we don't have an infinite population, the
trials are not actually independent. But there is
a rule that allows us to pretend we have
independent trials.
- The 10% condition: Bernoulli trials must be
independent. If that assumption is violated
because of finite population considerations, it
is still okay to proceed as long as the sample is
smaller than 10% of the population.
- In the chocolate example, we could use the
geometric because we were picking from a large
bin. But what if we were picking from a box of
10 chocolates where we knew 20% of the chocolates
were nut-filled? Could we still use the
geometric to calculate the probability of
finding a nut-filled chocolate if we
grab 4 chocolates from the box?
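To see why the 10% condition matters, the sketch below compares the exact chance of getting at least one nut-filled chocolate when grabbing 4 from the 10-piece box (no replacement) with the figure obtained by pretending the picks are independent. Reading the question as "at least one nut-filled among the 4" is an assumption.

```python
from math import comb

box_size, nut_filled, grabbed = 10, 2, 4   # 20% of a 10-piece box is nut-filled

# Exact answer without replacement: 1 - P(none of the 4 grabbed is nut-filled).
p_exact = 1 - comb(box_size - nut_filled, grabbed) / comb(box_size, grabbed)

# What we would get by treating the picks as independent Bernoulli trials.
p = nut_filled / box_size
p_independent = 1 - (1 - p) ** grabbed

# 10% condition: the sample should be smaller than 10% of the population.
condition_ok = grabbed < 0.10 * box_size

print(f"exact (no replacement): {p_exact:.4f}")
print(f"independent trials:     {p_independent:.4f}")
print("10% condition satisfied:", condition_ok)
```

Here the sample is 40% of the box, so the condition fails and the two answers differ noticeably.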
7. Limitations of the Geometric Model
- The geometric model is only useful for situations
where we experience failure until we find
success.
- What if we want to figure out something more
general, like:
  - What's the chance that exactly 1 chocolate in a
    group of 5 is nut-filled?
  - Why can't we use the geometric for this?
  - How many different ways could we select
    chocolates to meet this condition?
  - How many different ways could we meet the
    condition that 2 chocolates in a group of 5 are
    nut-filled?
8. Examples of Combinations
- To have 1 nut-filled chocolate in a group of 5
- Nxxxx, xNxxx, xxNxx, xxxNx, xxxxN
- 5 different ways!
- To have 2 nut-filled chocolates in a group of 5
- NNxxx, xNNxx, xxNNx, xxxNN,
- NxNxx, xNxNx, xxNxN,
- NxxNx, xNxxN,
- NxxxN,
- 10 different ways!
- Luckily, there's an easier way to keep track of
this.
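The hand-listed patterns can be reproduced mechanically. The sketch below enumerates every placement of k nut-filled chocolates (N) among n picks using `itertools.combinations`. The helper name `arrangements` is made up for this illustration.

```python
from itertools import combinations
from math import comb

def arrangements(n, k):
    """List every way to place k nut-filled (N) chocolates among n picks."""
    patterns = []
    for positions in combinations(range(n), k):
        patterns.append("".join("N" if i in positions else "x" for i in range(n)))
    return patterns

one_nut = arrangements(5, 1)
two_nuts = arrangements(5, 2)

print(one_nut)
print(two_nuts)
print(len(one_nut), len(two_nuts))   # counts agree with comb(5, 1) and comb(5, 2)
```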
9. The Binomial Model
- A Binomial model tells us the probability for a
random variable that counts the number of
successes in a fixed number of Bernoulli trials.
- Two parameters define the Binomial model: n, the
number of trials, and p, the probability of
success. We denote this Binom(n, p).
10. The Binomial Model: Combinations
- In n trials, there are
  $_nC_k = \frac{n!}{k!(n-k)!}$
ways to have k successes.
- Read nCk as "n choose k."
- $n! = n \times (n-1) \times \cdots \times 2 \times 1$, and n! is read as "n
factorial."
- 0! = 1
- In Excel, you can use COMBIN(N,K). You can also
use Excel to compute the binomial, as shown in the
textbook.
- The combination is how we account for the fact
that there are multiple ways to get k successes.
- This calculation gets tough with large N.
- There will not be large combinations to calculate
on my tests.
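As an alternative to Excel's COMBIN, the count n! / (k! (n-k)!) can be computed in Python. The sketch below builds it from the factorial definition and compares it with the built-in `math.comb`. The function name `n_choose_k` is illustrative only.

```python
from math import comb, factorial

def n_choose_k(n, k):
    """Number of ways to get k successes in n trials: n! / (k! (n-k)!)."""
    return factorial(n) // (factorial(k) * factorial(n - k))

print(n_choose_k(5, 1), n_choose_k(5, 2))
# Python integers do not overflow, so a large N is tedious by hand but easy here.
print(n_choose_k(100, 25) == comb(100, 25))
```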
11. The Binomial Model (cont.)
- Binomial probability model for Bernoulli trials:
Binom(n, p)
  - n = number of trials
  - p = probability of success
  - q = 1 - p = probability of failure
  - X = # of successes in n trials
  - $P(X = x) = {}_nC_x \, p^x q^{n-x}$
12. Example: More Nuts over Nuts
- Given the same setup from before, you fill a mini
box with 6 chocolates. What's the chance that 2
of the chocolates are nut-filled?
  - What distribution do we need to use, binomial or
    geometric?
  - Compute the probability of finding exactly 2
    nut-filled chocolates in a box of 6 chocolates.
  - Compute the expected value and standard deviation
    for the number of nut-filled chocolates in a box
    of 6.
  - What do we do with fractional answers? What do
    they mean?
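A minimal sketch of the mini-box calculation follows, assuming p = 0.20 for a nut-filled chocolate as in the earlier example. It uses the standard Binomial results mean = np and SD = sqrt(npq). The helper name `binom_pmf` is illustrative.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x) for Binom(n, p): nCx * p^x * q^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 6, 0.20                   # 6 chocolates, 20% nut-filled (assumed from before)
p_two = binom_pmf(2, n, p)       # exactly 2 nut-filled chocolates
mean = n * p                     # expected number of nut-filled chocolates
sd = sqrt(n * p * (1 - p))       # standard deviation

print(f"P(X = 2) = {p_two:.5f}")
print(f"mean = {mean:.2f}, sd = {sd:.4f}")
```

The mean comes out fractional because it is a long-run average over many boxes, not a count you would see in any single box.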
13. The Normal Model to the Rescue!
- When dealing with a large number of trials in a
Binomial situation, making direct calculations of
the probabilities becomes tedious (or outright
impossible).
- Fortunately, as long as the Success/Failure
Condition holds, we can use the Normal model to
approximate Binomial probabilities.
- The Normal uses the same mean and standard
deviation as the Binomial: $\mu = np$ and $\sigma = \sqrt{npq}$.
- Be sure to check the Success/Failure Condition: a
Binomial model can be considered approximately
Normal if we expect at least 10 successes and 10
failures in our trials:
  - $np \ge 10$ and $nq \ge 10$
14. Continuous Random Variables
- When we use the Normal model to approximate the
Binomial model, we are using a continuous random
variable to approximate a discrete random
variable.
- Warning: With continuous variables we need to
work with intervals, as the chance that a number
matches exactly is 0.
  - e.g., saying that someone is exactly 64" tall
    on a continuous scale is the same as saying
    they are 64.0000" tall, not 63.9999" or 64.0001"
    tall. We might instead mean: greater than
    63.5", less than 64.5".
- So, when we use the Normal model, we no longer
calculate the probability that the random
variable equals a particular value, but only that
it lies between two values (one of those values
may be zero or infinity).
15. Example: Yet more Nuts
- Given the same setup from before, you now have to
fill a mega-box with 100 chocolates. What's the
chance that fewer than 25 of the chocolates are
nut-filled?
  - Why would it be difficult to compute this as a
    binomial?
  - What probability model can we use in place of the
    binomial? Justify why!
  - What's the expected number of nut-filled
    chocolates (and standard deviation) in the box?
  - What's the chance the box has fewer than 25
    nut-filled chocolates?
  - ...and the chance the box has at least 25
    nut-filled chocolates?
- Aside: one of the problems with the normal
approximation is whether to interpret this as
P(X < 25) or P(X ≤ 25). I will accept either
interpretation.
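The mega-box question can be sketched in Python as follows, assuming p = 0.20 as before. It checks the Success/Failure Condition, builds the Normal model with mean np and standard deviation sqrt(npq) via `statistics.NormalDist`, and computes the exact Binomial sum for comparison. Variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.20                 # mega-box size and assumed nut-filled share
q = 1 - p

# Success/Failure Condition: expect at least 10 successes and 10 failures.
success_failure_ok = n * p >= 10 and n * q >= 10

mu, sigma = n * p, sqrt(n * p * q)
approx = NormalDist(mu, sigma)

p_fewer_than_25 = approx.cdf(25)        # Normal estimate, cut-off taken at 25
p_at_least_25 = 1 - p_fewer_than_25

# Exact Binomial answer for "fewer than 25", i.e. X = 0, 1, ..., 24.
p_exact = sum(comb(n, k) * p ** k * q ** (n - k) for k in range(25))

print("condition holds:", success_failure_ok)
print(f"mu = {mu:.1f}, sigma = {sigma:.1f}")
print(f"Normal: P(fewer than 25) = {p_fewer_than_25:.4f}, P(at least 25) = {p_at_least_25:.4f}")
print(f"Exact Binomial: {p_exact:.4f}")
```

Because the Normal model is continuous, `cdf(25)` is the same whether the boundary is read as strict or not; the gap to the exact sum reflects the approximation.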
16. One Last Caveat About Using a Normal Model to
Approximate a Binomial...
- Substituting the normal for the binomial will not
give the exact same answers. (There's often
around a 2-3% difference between them.)
- Problem 25 in the assigned problems works out
the probabilities for both the Binomial model and
its normal approximation.
- Technically the binomial is more accurate (the
normal is only an approximation).
- But the normal model is easier to compute, and
when conditions are satisfied, you can use it,
even with this inaccuracy!
- And what can we do if np < 10, but n is large?
Does this mean we have to use the binomial?
17. The Poisson Model
- The Poisson probability model was originally
derived to approximate the Binomial model when
the probability of success, p, is very small and
the number of trials, n, is very large.
- Rule of thumb 1 (not in book): You should have
more than 20 trials and no more than a 5% chance
of success for the Poisson.
- Rule of thumb 2 (also not in the book): In
general, it's a bad idea to use the binomial when
n > 100, because rounding errors will make most
binomial computations unstable.
- The parameter for the Poisson model is λ. To
approximate a Binomial model with a Poisson
model, just make their means match: λ = np.
- The Poisson is useful for looking at very rare
events that have major consequences, for example
accidents, terrorist incidents, lottery
winnings...
- It requires only that the events be independent
and that the mean number of occurrences stays
constant over time.
18. The Poisson Model (cont.)
- Poisson probability model for successes:
Poisson(λ)
  - λ = mean number of successes = np
  - X = # of successes
  - $P(X = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$
  - e is an important mathematical constant
    (approximately 2.71828)
19. Example: Nuts over Prizes
- The chocolate company is running a promotion,
where 0.1% of their chocolates are actually
chocolate-covered rubber balls that can be
redeemed for a cash prize (the company hopes that
consumers won't swallow their candies whole and
choke!). Chris buys and eats 5 mega-boxes of
chocolates. Assuming the prize chocolates are
distributed randomly...
  - What model are we likely to use? Why can't we
    use a Normal? Why don't we want to use a
    Binomial?
  - What is the expected number of prizes (and SD)?
  - Compute the probability that Chris wins exactly 1
    prize.
  - Compute the probability that Chris wins nothing.
  - Compute the probability that Chris wins more than
    one prize. (Hint: there's an easy way to do
    this!)
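A sketch of the prize calculation with a Poisson model is given below. It assumes a mega-box holds 100 chocolates (as in the earlier example) and that the prize rate is 0.1%, i.e. p = 0.001. It also uses the standard Poisson results mean = λ and SD = sqrt(λ). The helper name `poisson_pmf` is illustrative.

```python
from math import exp, factorial, sqrt

def poisson_pmf(x, lam):
    """P(X = x) for Poisson(lam): e^(-lam) * lam^x / x!."""
    return exp(-lam) * lam ** x / factorial(x)

chocolates = 5 * 100      # assumption: 5 mega-boxes of 100 chocolates each
p = 0.001                 # assumption: 0.1% of chocolates are prizes
lam = chocolates * p      # match the Binomial mean: lambda = np
sd = sqrt(lam)

p_none = poisson_pmf(0, lam)
p_one = poisson_pmf(1, lam)
p_more_than_one = 1 - p_none - p_one   # the "easy way": use the complement

print(f"lambda = {lam:.2f}, sd = {sd:.4f}")
print(f"P(0) = {p_none:.4f}, P(1) = {p_one:.4f}, P(more than 1) = {p_more_than_one:.4f}")
```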
20. What Can Go Wrong?
- Be sure you have Bernoulli trials.
  - You need two outcomes per trial, a constant
    probability of success, and independence.
  - Remember that the 10% Condition provides a
    reasonable substitute for independence.
- Don't confuse Geometric and Binomial models.
- Don't use either the Normal approximation or the
Poisson with small n.
  - You need an expectation of at least 10 successes
    and 10 failures to use the Normal approximation.
- Conversely, avoid using the binomial model for
large n.
21. What have we learned?
- Bernoulli trials show up in a lot of places, and
depending on the situation, can be represented
with one of 4 models...
- Geometric model
  - When we're interested in the number of Bernoulli
    trials until the next success.
- Binomial model
  - When we're interested in the number of successes
    in a certain number of Bernoulli trials.
- Normal model
  - To approximate a Binomial model when we expect at
    least 10 successes and 10 failures.
- Poisson model
  - To approximate a Binomial model when there are a
    very large number of trials (certainly more than
    20!) and the probability of success (or failure)
    is very small.
22. When to Use Which Model: A Guide
- All of these are based on Bernoulli "Success" or
"No Success" events.
- Geometric: examines the question of "how many
times do we have to try until we succeed?"
- Binomial: examines multiple successes in a
series of trials.
- Normal: allows us to avoid the hassle of using
the binomial for lots of trials if we expect to
have at least several successes and several
failures.
- Poisson: allows us to avoid the hassle of using
the binomial for lots of trials if we expect not
to have many successes.
23. Example: Bernoulli or Not?
- #1: Can we use probability models based on
Bernoulli trials to investigate the following
situations? Why or why not? What further
assumptions may be necessary?
  - We roll 50 dice to find the distribution of the
    # of spots on the faces.
  - How likely is it that in a group of 120, the
    majority have type A blood, given that it is
    found in 43% of the population?
  - We deal 5 cards from a standard deck and get all
    hearts. What's the likelihood of that?
  - We poll 500 out of the 3000 potential voters to
    see if they favor the budget.
  - A company realizes that 10% of its packages don't
    seal properly. What's the chance that more than
    3 are defective in a pack of 24?
24. Problems
- #32 (with some additional questions). Suppose
the probability of a major Bay Area earthquake on
any given day is 1 out of 10,000.
  - What distribution are we likely to use?
  - What's the expected number of major earthquakes
    in the next 1000 days?
  - What is the probability there will be at least 1
    major earthquake in the next 1000 days?
  - If the conditions for our probability model truly
    hold in reality, does the chance of a major
    earthquake in the next 1000 days change if we
    just had one today?
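The earthquake numbers can be sketched as follows: λ = np for 1000 days at a daily probability of 1/10,000, with the "at least 1" probability found from the complement of "none". The exact Binomial value is included only to show how close the Poisson approximation is. Variable names are illustrative.

```python
from math import exp

p_daily = 1 / 10_000      # chance of a major earthquake on any given day
days = 1000

lam = days * p_daily      # expected number of major earthquakes (lambda = np)

# P(at least 1) = 1 - P(none)
p_at_least_one_poisson = 1 - exp(-lam)
p_at_least_one_binomial = 1 - (1 - p_daily) ** days

print(f"expected earthquakes: {lam:.2f}")
print(f"Poisson:  {p_at_least_one_poisson:.5f}")
print(f"Binomial: {p_at_least_one_binomial:.5f}")
```

Since the model treats days as independent with a constant p, an earthquake today does not enter the calculation for the next 1000 days.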
25. Example: Old Exam Problem
- Luigi likes to ask ladies out. Every day he
makes it a habit to ask an unknown attractive
woman for her business card so he can call her
(he has already figured out they will give him a
fake number if he doesn't get proof!). Alas, most
(90%) of the time he is rejected. Still, he
realizes if he asks enough, he will end up with a
few cards.
  - Find the probability he collects at least 1 card
    in 4 days.
- He plans to continue this practice for the next
250 days to see if he can get at least 28 cards.
  - Can we use a normal to approximate this?
  - Show why or why not (2 with calculations)
  - Now find the probability
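One way to work this problem in code is sketched below, taking the success probability as p = 0.10 (the complement of the 90% rejection rate). The Normal step uses mean np and standard deviation sqrt(npq) with no continuity correction, which is an assumption about the intended method.

```python
from math import sqrt
from statistics import NormalDist

p = 0.10                  # chance of getting a card on a given day
q = 1 - p

# At least 1 card in 4 days = 1 - P(rejected on all 4 days).
p_at_least_one_in_4 = 1 - q ** 4

# 250 days: check the Success/Failure Condition before using the Normal.
n = 250
expected_successes, expected_failures = n * p, n * q
normal_ok = expected_successes >= 10 and expected_failures >= 10

mu, sigma = n * p, sqrt(n * p * q)
p_at_least_28 = 1 - NormalDist(mu, sigma).cdf(28)

print(f"P(at least 1 card in 4 days) = {p_at_least_one_in_4:.4f}")
print(f"np = {expected_successes:.0f}, nq = {expected_failures:.0f}, Normal OK: {normal_ok}")
print(f"mu = {mu:.1f}, sigma = {sigma:.3f}, P(at least 28 cards) = {p_at_least_28:.4f}")
```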
26. Example: Another Exam Question
- Your professor is absentminded. For instance,
she has a 20% chance of forgetting to close the
garage door when she drives to work. She wants
to determine the probability that she will
forget to close the garage door exactly twice
during the 5 days that she drives to work next
week. Assume that whether she forgets on one day
will not affect other days. Calculate the
probability that she forgets to close the garage
door exactly twice in 5 days.
- Her absentmindedness leads her to forget where
she left her USB memory key, and she often has to
search for it all over her office. Every place
she looks, she has a 15% chance of finding it,
and she will, of course, stop tearing her office
apart once she finds it. Calculate the
probability that she needs to look in exactly 3
places to find her key.
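The two parts use different models: a fixed number of days with a count of successes is Binomial, while searching until the first success is Geometric. A short sketch under that reading, with illustrative function names:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for Binom(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def geom_pmf(x, p):
    """P(X = x) for Geom(p): first success on trial x."""
    return (1 - p) ** (x - 1) * p

# Garage door: forgets exactly twice in 5 days, p = 0.20 per day.
p_forget_twice = binom_pmf(2, 5, 0.20)

# USB key: found in exactly the 3rd place searched, p = 0.15 per place.
p_third_place = geom_pmf(3, 0.15)

print(f"P(forgets exactly twice in 5 days) = {p_forget_twice:.4f}")
print(f"P(key found in exactly the 3rd place) = {p_third_place:.6f}")
```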
27. Another Exam Question (Cont.)
- The absentminded professor is often distracted
from paying attention to the road. Thus she has
a 0.3% chance of being involved in a minor
accident every day she commutes to school. If she
has an accident one day, it will not affect her
chances of having an accident on other days.
  - A) Calculate the mean number of accidents per
    year given she commutes to work 200 days a year.
  - B) Could we use a Normal distribution to
    approximate this? Why or why not?
  - C) Calculate the probability
    that she has exactly 1 accident that year.
  - D) Calculate the chance she has more than 1
    accident.
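A possible worked sketch for parts A through D follows. It assumes the daily accident chance is 0.3% (p = 0.003); if the intended figure were 0.3 (30%), the Success/Failure check below would pass and a Normal model would apply instead. The helper name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for Poisson(lam)."""
    return exp(-lam) * lam ** x / factorial(x)

p = 0.003                 # assumption: 0.3% chance of an accident per commute day
n = 200                   # commuting days per year

# A) Mean number of accidents per year.
lam = n * p

# B) Success/Failure Condition for the Normal approximation.
normal_ok = n * p >= 10 and n * (1 - p) >= 10

# C) Exactly 1 accident; D) more than 1 accident (via the complement).
p_one = poisson_pmf(1, lam)
p_more_than_one = 1 - poisson_pmf(0, lam) - p_one

print(f"A) mean = {lam:.2f}")
print(f"B) Normal approximation appropriate: {normal_ok}")
print(f"C) P(exactly 1) = {p_one:.4f}")
print(f"D) P(more than 1) = {p_more_than_one:.4f}")
```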