Title: STAT 221
1 STAT 221
- Chapter 5 Part A
- Discrete Probability Distributions
2 The Random Variable - X
- In statistics, X is a symbol for the unknown
numerical value that expresses one possible
outcome of an experiment/event.
- X is called a random variable because, in any one
run of the experiment, it can turn out to be (or
assume) any value in the set of all possible
outcomes (and that set is called the sample space).
3 Depending on the experimental situation, X may be
Discrete or Continuous
- A random variable can be classified as being
either discrete or continuous depending on the
numerical values it can assume.
- A discrete random variable is usually an integer
value and may assume either a finite number of
values or an infinite sequence of values.
- A continuous random variable can be an integer or
non-integer value and may assume any numerical
value in an interval or collection of intervals.
4 Examples of experimental outcome values
- Examples of discrete random variables
- Number of babies born on a particular day
- Number of customers arriving at a drive-thru in
any particular hour
- Number of questions answered correctly on a test
- Examples of continuous random variables
- Waiting time between customer arrivals
- Distance that a person can run
- Weight of a person
5 This chapter discusses experiments whose outcomes
would be discrete random variables
- A discrete random variable can have a finite
number of values
- Let x = number of questions answered correctly
on a test. If the test has 100 questions, x can
be any of 101 possible values (0, 1, 2, 3, 4, ...,
100). Note the finite upper limit of 100.
- Or a discrete random variable can have an
infinite sequence of values
- Let x = number of customers arriving in one day,
where x can take on the values 0, 1, 2, . . .
Note that there is no upper limit.
6 Probability distributions of random variables
- All possible values for x and their probabilities
of occurrence can be plotted on a chart and we
call that a probability distribution.
- A probability distribution can actually be a
chart, table, or function that expresses the
probability of each possible outcome in an
experiment.
- Here is a probability distribution presented as a
function f(x), where x = number of girls out of
3 children
    x    f(x)
    0    .125 (1/8)
    1    .375 (3/8)
    2    .375 (3/8)
    3    .125 (1/8)
- As you can see in this example, some x-values are
more likely to occur than others.
7 A probability distribution
[Chart: f(x) plotted for x = 0, 1, 2, 3, where X = number of girls out of 3 children]
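The girls-out-of-3-children distribution can be rebuilt by listing every possible birth sequence and counting girls. The sketch below is only an illustration: it assumes boys and girls are equally likely and births are independent (the slides do not state this, but it is what the 1/8 and 3/8 values imply), and the variable names are made up for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: each child is "B" or "G" with equal probability,
# so all 8 birth sequences are equally likely.
outcomes = list(product("BG", repeat=3))

# x = number of girls in each sequence
counts = Counter(seq.count("G") for seq in outcomes)

# f(x) = (sequences with x girls) / (total number of sequences)
f = {x: Fraction(counts[x], len(outcomes)) for x in sorted(counts)}

for x, prob in f.items():
    print(x, prob, float(prob))
```

The printed values are 1/8, 3/8, 3/8, 1/8, matching the table of f(x).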
8 Developing Probability Distributions
- Probabilities of each possible value of random
variable x can be developed using any of these
three methods
- Relative frequency approach: look at past
history of occurrences to project expected
probabilities.
- Classical approach: use reasoning to
deduce the expected probability (Ex: a die has 6
sides, therefore there is a 1-in-6 chance of
getting each outcome).
- Subjective approach: make an educated
guess.
9 Here is a probability distribution for rolling a
die developed using the classical approach
This is a uniform distribution: the probability
of occurrence of each possible value of x (1, 2,
3, 4, 5, or 6) is the same (uniform), namely 1/6.
10 Here is a probability distribution developed
using the relative frequency approach
- In this example, DiCarlo Motors used past history
to project future probabilities of daily car
sales.
- They know that over the past 300 days, they sold
no cars on 54 of those days, 1 car on 117 days,
2 cars on 72 days, 3 cars on 42 days, 4 cars on
12 days, and 5 cars on 3 days.
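As a rough sketch of the relative frequency approach, each day count can be divided by the 300 observed days to get f(x). The counts come from the DiCarlo Motors example; the variable names are made up for illustration.

```python
# Days (out of the past 300) on which x cars were sold, from the example.
days_by_cars_sold = {0: 54, 1: 117, 2: 72, 3: 42, 4: 12, 5: 3}

total_days = sum(days_by_cars_sold.values())

# Relative frequency: f(x) = (days with x sales) / (total days)
f = {x: days / total_days for x, days in days_by_cars_sold.items()}

for x, prob in f.items():
    print(f"x = {x}: f(x) = {prob:.2f}")
```

This gives f(0) = .18, f(1) = .39, f(2) = .24, f(3) = .14, f(4) = .04, and f(5) = .01.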
11 Basic requirements of a probability distribution
$\sum f(x) = 1$, where x assumes all possible
values
$0 \le f(x) \le 1$ for every individual value of x
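The two requirements are easy to check mechanically. The helper below is a hypothetical function written for this illustration (it is not part of any library), and the small tolerance is an assumption to allow for floating-point rounding.

```python
import math

def is_valid_distribution(f, tol=1e-9):
    """Return True if f (a dict of x -> f(x)) meets both requirements."""
    probs = list(f.values())
    # Requirement: 0 <= f(x) <= 1 for every individual value of x
    in_range = all(0 <= p <= 1 for p in probs)
    # Requirement: the f(x) values sum to 1 (within a rounding tolerance)
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

girls = {0: 0.125, 1: 0.375, 2: 0.375, 3: 0.125}
not_a_distribution = {0: 0.5, 1: 0.6}  # sums to 1.1

print(is_valid_distribution(girls))               # True
print(is_valid_distribution(not_a_distribution))  # False
```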
12 Descriptive statistics for probability distributions
- It is possible to calculate a mean, a variance,
and a standard deviation from a probability
distribution.
- The mean is called the expected value and it is a
single measure of central tendency.
- The variance and standard deviation are measures
of dispersion.
13 Expected Value: mean of a random variable
- The expected value of a random variable is the
same as the mean: a single value that represents
the average of all possible outcomes.
- In chapter 3, we calculated a mean value for x by
adding up all the actual outcomes (x-values) and
dividing by the number of outcomes. That approach
was appropriate when you have the actual observed
values for x.
- When you do not have actual (observed) outcomes
but you only have a probability distribution for
expected outcomes, you must calculate the mean
value of x using this formula
- $E(x) = \mu = \sum x f(x)$
14 Example: Rolling a die - what is the Expected
Value?
- We don't have to roll a die hundreds of times and
actually observe the number of times we get a 1,
2, 3, etc. to know that the probability of rolling
each possible outcome (x) is 1-out-of-6.
- But using the formula on the previous slide, we
can calculate the average outcome / expected
value
- $E(x) = \mu = \sum x f(x)$
- $= 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6)$
- $= 21/6 = 3.5$
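A quick way to confirm the arithmetic is to apply E(x) = Σ x f(x) to a fair die using exact fractions. This is a minimal sketch; the variable names are illustrative only.

```python
from fractions import Fraction

# Uniform distribution for a fair die: f(x) = 1/6 for x = 1..6
die = {x: Fraction(1, 6) for x in range(1, 7)}

# E(x) = sum of x * f(x) over all possible values of x
expected_value = sum(x * p for x, p in die.items())

print(expected_value, float(expected_value))  # 7/2 3.5
```

The sum is 21/6, so the expected value of a single roll is 3.5, even though 3.5 is not itself a possible outcome.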
15 Calculating a variance and standard deviation
- Recall that variance and standard deviation are
measures of dispersion (spread-out-ness).
- Again, in Chapter 3 we learned formulas for
calculating the variance and standard deviation
based on actual/observed values for x. When all
we have is a probability distribution for expected
x, we can still calculate a variance and standard
deviation for x but we must use this formula
- $Var(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$
- The standard deviation is the square root of the
variance: $\sigma = \sqrt{\sigma^2}$
16 Example: JSL Appliances - the expected value or
mean
Here is the probability distribution for the
number of TVs they expect to sell each day.
    x    f(x)   xf(x)
    0    .40    .00
    1    .25    .25
    2    .20    .40
    3    .05    .15
    4    .10    .40
         E(x) = 1.20
The mean or the expected number of TV sets they
expect to sell in a day is 1.2.
17 Example: JSL Appliances - the variance and std
deviation
    x    x - μ    (x - μ)^2    f(x)    (x - μ)^2 f(x)
    0    -1.2     1.44         .40     .576
    1    -0.2     0.04         .25     .010
    2     0.8     0.64         .20     .128
    3     1.8     3.24         .05     .162
    4     2.8     7.84         .10     .784
          Variance = σ^2 = 1.660
- The variance of daily sales is 1.66 TV sets
squared.
- The standard deviation of sales is 1.2884 TV
sets.
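The JSL Appliances figures can be reproduced with a few lines of code. This is a sketch of the same calculation, not a prescribed method; the dictionary simply restates the x and f(x) columns from the slides.

```python
import math

# Probability distribution for daily TV sales at JSL Appliances
jsl = {0: 0.40, 1: 0.25, 2: 0.20, 3: 0.05, 4: 0.10}

# Expected value: E(x) = sum of x * f(x)
mean = sum(x * p for x, p in jsl.items())

# Variance: Var(x) = sum of (x - mean)^2 * f(x)
variance = sum((x - mean) ** 2 * p for x, p in jsl.items())

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 4))  # 1.2 1.66 1.2884
```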
18 Practice question: Calculate the mean, variance,
and SD (Excel worksheet: Computers)
- According to a survey, 95% of subscribers to the
Wall Street Journal Interactive Edition have a
computer at home. For those households, the
probability distribution for the number of
computers is given (see spreadsheet).
- A. What is the expected value of the number of
computers per household?
- B. What is the variance of the number of
computers per household?
- C. Make three intelligent statements about the
number of computers owned by the Journal's
subscribers.
19 Open the file DataSetsForCh5 and select the
worksheet Computers
20 To calculate the expected value of the number of
computers, first multiply the number of computers (x)
by the probability f(x) for the first row: C4: =A4*B4
21 Copy the formula in C4 down to C7 to multiply the
number of computers (x) by the probability f(x) for
the other possible values of x.
22 Sum the column: C8: =SUM(C4:C7)
23 Copy the sum into cell C10 because that is the
expected value: C10: =C8
24 To calculate the variance, start by squaring the
first value of x: D4: =A4*A4
25 Copy the formula in D4 down to D7 to square the
remaining x-values
26 Multiply squared-x by the probability of x for the
first x-value: E4: =D4*B4
27 Copy the formula in E4 down to E7 to multiply the
other squared-x's by their probabilities.
28 Sum the values in columns D and E: D8:
=SUM(D4:D7), E8: =SUM(E4:E7)
29 To calculate the variance, subtract the square of
the mean from the sum of column E: C11: =E8-(C10*C10)
30 To calculate the standard deviation, take the
square root of the variance: C12: =SQRT(C11)
Resave the file.
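The spreadsheet steps use a shortcut form of the variance, Var(x) = Σ x²f(x) - μ², rather than Σ (x - μ)²f(x). The sketch below mirrors the worksheet columns in code and checks that both forms agree. Because the Computers worksheet values are not reproduced in these slides, it uses the JSL Appliances distribution as stand-in data; that substitution and the variable names are assumptions made for illustration.

```python
import math

# Stand-in data (JSL Appliances); the Computers worksheet is not shown here.
x_values = [0, 1, 2, 3, 4]            # column A: x
f_values = [0.40, 0.25, 0.20, 0.05, 0.10]  # column B: f(x)

col_c = [x * p for x, p in zip(x_values, f_values)]   # column C: x * f(x)
col_d = [x * x for x in x_values]                     # column D: x squared
col_e = [x2 * p for x2, p in zip(col_d, f_values)]    # column E: x^2 * f(x)

mean = sum(col_c)                                # like C10
variance_shortcut = sum(col_e) - mean * mean     # like C11
std_dev = math.sqrt(variance_shortcut)           # like C12

# Cross-check against the definitional formula: sum of (x - mean)^2 * f(x)
variance_definition = sum((x - mean) ** 2 * p for x, p in zip(x_values, f_values))

print(round(variance_shortcut, 4), round(variance_definition, 4))  # 1.66 1.66
```

Both forms give 1.66 for this data, so the worksheet's shortcut is a matter of convenience, not a different statistic.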
31 Draw some conclusions about the number of computers
owned by the Journal's subscribers.
- The average Wall Street Journal reader has 1.42
computers in their household.
- The majority of households have 1 computer.
- If we added and subtracted 2 standard deviations
(.75 each) from the mean, we'd find that about 95% of
households have roughly 0 to 3 computers.
32 Developing probability distributions for discrete
random variables
- Experiments are conducted when performing an
experimental study.
- Experiments yield results: a dataset of x-values,
each one considered to be a random value.
- Some experiments have certain common
characteristics that enable them to be classified as
binomial experiments.
33 The Binomial Experiment
- Here are some examples of binomial experiments
- Tossing a coin n times and seeing how many of
those outcomes are heads.
- A salesman calling on potential clients and
seeing how many of those clients purchase his
product.
- Gathering a sample of newborn babies and seeing
how many of them have a certain characteristic or
gender.
34 Characteristics of a binomial experiment
- The experiment must consist of a sequence
repeating the same step n times (the step could
be flipping a coin, making a sales call,
determining if a baby has a certain
characteristic or gender, etc.)
- For each step, there are two possible outcomes
referred to as success or failure.
- The probability of each outcome is the same at
each step (so any sampling must be done with
replacement).
- The steps are independent (the outcome at each
step is not conditional on previous steps'
outcomes).
35 Practice: Determine whether each of the given
experiments can be classified as a binomial
experiment
- Surveying 1012 people and recording whether there
is a "should not" response to this question: "Do
you think the cloning of humans should or should
not be allowed?"
- Rolling a loaded die 50 times and finding the
number of times that 5 occurs.
- Determining whether each of 3000 heart pacemakers
is acceptable or defective.
- Spinning a roulette wheel 12 times and finding
the number of times that the outcome is an odd
number.
36 Answers
37 The random variable (x) in a binomial experiment
- A binomial experiment can produce results (a data
set of values for x) where x is the number of
successes after n trials.
- So if the experiment is "tossing a coin 10
times", and on one run of the experiment we get
this outcome: HHHTTHTHTH, then x = 6 because 6
times we got heads (a success).
- We could just as easily have assigned tails to
be the success.
38 Binomial experiments yield binomial distributions
- A binomial experiment can produce results (a data
set of values for x) that can be plotted into a
frequency distribution that we refer to as a
binomial frequency distribution.
39 The binomial frequency distribution
- The binomial frequency distribution (using table
format) would list all the possible outcomes of x
and their probability of occurrence.
- So, if our experiment is to toss a coin 10 times,
our frequency distribution might look like this
    # of times Heads    Probability
    0                   .003
    1                   .007
    2                   .024
    3                   .078
    Etc.
    10                  .003
40 But we don't have to actually perform n number of
trials to get an actual data set
- In a binomial experiment, we don't actually have
to have observed frequencies (that would be
obtained if we engaged in n number of trials or
had a sample size of n).
- All we need to know is the probability of 1
success. In this case, all we need to know is the
probability of getting heads on one trial, which
would be 50%.
- We can use the "binomial formula" to calculate
the probability of getting 0 heads, 1, 2, 3, etc.
on 10 trials.
41 The binomial formula
$f(x) = \frac{n!}{x!\,(n-x)!} \, p^x q^{n-x}$
where n = number of trials, x = number of
successes among n trials, p = probability of
success in any one trial, q = probability of
failure in any one trial (q = 1 - p)
42 Let's start with: what is the probability of
getting 3 heads (out of 10 tosses)?
$f(3) = \frac{10 \cdot 9 \cdot 8}{3 \cdot 2 \cdot 1} (.5)^3 (.5)^{10-3}$
$= 120 \times .125 \times .007813$
$= .1172$, or 11.72%
43 A closer examination of this formula
- Notice the combinations formula within the
binomial probability formula?
- That's because this part: $(.5)^3 (.5)^{10-3}$
- is the probability of getting one specific
sequence of 3 heads and 7 tails (e.g.,
HHHTTTTTTT)
- This part: $(10 \cdot 9 \cdot 8) / (3 \cdot 2 \cdot 1)$
- calculates the number of different combinations
that result in 3 heads and 7 tails
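The two parts of the formula can be computed separately and then multiplied. The sketch below uses Python's standard `math.comb` for the combinations part; the function name `binomial_probability` is a made-up helper for this illustration.

```python
from math import comb

def binomial_probability(x, n, p):
    """Probability of exactly x successes in n trials with success probability p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Part 1: number of different sequences with 3 heads and 7 tails
ways = comb(10, 3)

# Part 2: probability of one specific sequence (e.g., HHHTTTTTTT)
one_sequence = 0.5**3 * 0.5**(10 - 3)

print(ways)                               # 120
print(one_sequence)                       # 0.0009765625
print(binomial_probability(3, 10, 0.5))   # 0.1171875
```

Multiplying 120 by .0009765625 gives .1171875, which rounds to the 11.72% shown on the slides.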
44 What we have so far
The probability distribution for getting x heads
out of 10 trials.
    x     P(x)
    0     ?
    1     ?
    2     ?
    3     11.72%
    4     ?
    5     ?
    6     ?
    7     ?
    8     ?
    9     ?
    10    ?
45 Using Excel
- The binomial probability formula is one of
Excel's built-in formulas.
- To calculate the probabilities of the other 10
possible outcomes, let's use Excel.
- Open the worksheet binomial and follow the steps
on the next slides.
46 Open the file DataSetsForCH5 and click the
worksheet Coin Toss
47 1. Enter the n, p, and q values into cells: C3:
10, C4: .5, C5: =1-C4
48 2. Position the cursor in B8. 3. To use Excel's
BINOMDIST( ) formula, click the fx button on the
formula bar to start the formula wizard.
49 4. In the category box, select Statistical.
5. In the "select a function" box, select
BINOMDIST and click OK.
50 6. Enter the indicated values as function
arguments. Use absolute addressing where
appropriate. 7. Click OK.
51 8. See the formula result appear in cell B8. That
is the probability of getting 0 heads out of 10
trials.
52 9. Copy the formula in B8 down to B18 to
calculate the probabilities for the other 10
possible values for x.
53 10. Use the % button on the formatting toolbar
to format the values as percentages and use the
increase decimal places button to specify 2
decimal places. This table shows the probability
of each possible outcome of the experiment.
54 11. Sum the probabilities to make sure they sum
to 100%: B19: =SUM(B8:B18)
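For readers without the workbook, the same table can be sketched in code. This is an illustrative stand-in for the BINOMDIST steps, not the worksheet itself; it assumes n = 10 and p = .5 as entered in step 1.

```python
from math import comb

n, p = 10, 0.5
q = 1 - p

# Probability of exactly x heads for every possible x (0 through 10)
table = {x: comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}

# Print as percentages with 2 decimal places, like the formatted worksheet
for x, prob in table.items():
    print(f"{x:2d}  {prob:.2%}")

# The probabilities should sum to 100%
total = sum(table.values())
print(f"Total: {total:.2%}")
```

The row for x = 3 shows 11.72%, matching the hand calculation, and the total is 100%.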
55 Obtaining a mean, variance, and standard
deviation from a binomial distribution
- Recall that we could derive a mean, variance,
and std. deviation from a (generic) probability
distribution.
- When the probability distribution is
binomial, the formulas can be simplified to the
following
- Expected Value: $E(x) = \mu = np$
- Variance: $Var(x) = \sigma^2 = np(1 - p)$
- Standard Deviation: $\sigma = \sqrt{np(1 - p)}$
56 Calculate the expected value or mean: H3: =C3*C4
57 Calculate the variance: H4: =C3*C4*C5
58 Calculate the standard deviation: H5: =SQRT(H4)
Resave the file.
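As a check on the simplified formulas, the sketch below computes np and np(1 - p) for the coin-toss example (n = 10, p = .5) and compares them with the general Σ x f(x) and Σ (x - μ)² f(x) formulas applied to the full binomial table. The variable names are illustrative.

```python
import math
from math import comb

n, p = 10, 0.5
q = 1 - p

# Simplified binomial formulas
mean = n * p                   # E(x) = np
variance = n * p * q           # Var(x) = np(1 - p)
std_dev = math.sqrt(variance)

# General formulas applied to the full binomial distribution
table = {x: comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}
mean_general = sum(x * f for x, f in table.items())
variance_general = sum((x - mean_general) ** 2 * f for x, f in table.items())

print(mean, variance, round(std_dev, 4))   # 5.0 2.5 1.5811
print(mean_general, variance_general)
```

Both routes give a mean of 5 heads and a variance of 2.5 for 10 tosses of a fair coin.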
59 Here is the frequency distribution. Notice the
symmetric shape of this binomial distribution
(p = .5). But as p gets larger (making q
smaller), the distribution becomes skewed to the
left.
60 Binomial probability practice 1
- A multiple-choice test has six questions and each
has 4 possible answers, one of which is correct.
- A. If you completely guess at the answer to each
question, what is the probability of getting
(exactly) the first two wrong and the last four
right? (Hint: use the multiplication rule.)
- B. Begin with WWCCCC and make a complete list of
all the different possible arrangements of two
wrong and four right. What is the probability of
getting each of these arrangements?
- C. What is the probability of getting exactly two
wrong and four right? (In other words, what is
the probability of getting any of these
arrangements?)
61 a. If you completely guess at the answer to each
question, what is the probability of getting the
first two wrong and the last four right? (Hint:
use the multiplication rule.)
- P(C) = .25
- P(W) = .75
- P(WWCCCC) = .75 × .75 × .25 × .25 × .25 × .25
- P(WWCCCC) = .00220
62 b. Begin with WWCCCC and make a complete list of
all the different possible arrangements of two
wrong and four right. Find the probability of
each arrangement.
WWCCCC  WCWCCC  WCCWCC  WCCCWC  WCCCCW
CWWCCC  CWCWCC  CWCCWC  CWCCCW
CCWWCC  CCWCWC  CCWCCW
CCCWWC  CCCWCW
CCCCWW
There are 15 ways to get 2 wrong and 4 right.
63 b. What is the probability of each of these
arrangements?
- In each case, the probability is $(.75)^2 (.25)^4$
- Or .00220
64 c. What is the probability of getting any of
these arrangements?
n is 6, p is .75 (the probability of
guessing wrong), x is 2, q is .25 (the
probability of guessing right)
$f(2) = \frac{6!}{2!\,4!} (.75)^2 (.25)^4 = 15 \times .00220 = .0330$
- $(.75)^2 (.25)^4$ is the probability of getting any
specific sequence of 2 Ws and 4 Cs
- $\frac{6!}{2!\,4!}$ is the same as 15 (the number of ways
of getting 2 Ws and 4 Cs)
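Parts a through c can be checked by listing the arrangements directly. The sketch below builds every W/C string with exactly two wrong answers, then multiplies the count by the probability of one arrangement. The variable names are illustrative, and "success" is taken to be a wrong guess (p = .75), as on the slide.

```python
from itertools import combinations

n_questions, n_wrong = 6, 2
p_wrong, p_right = 0.75, 0.25

# Part b: every arrangement of 2 wrong (W) and 4 correct (C) answers
arrangements = []
for wrong_positions in combinations(range(n_questions), n_wrong):
    arrangement = "".join(
        "W" if i in wrong_positions else "C" for i in range(n_questions)
    )
    arrangements.append(arrangement)

# Parts a and b: probability of any one specific arrangement
p_each = p_wrong**n_wrong * p_right**(n_questions - n_wrong)

# Part c: probability of getting any of the arrangements
p_total = len(arrangements) * p_each

print(len(arrangements))   # 15
print(round(p_each, 5))    # 0.0022
print(round(p_total, 4))   # 0.033
```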
65 Binomial probability practice 2
- If you run a red traffic light at an intersection
equipped with a camera monitor, there is a .1
probability that you will be given a ticket. If
you run a red traffic light at this intersection
five different times, what is the probability of
getting at least one ticket?
66 The possible outcomes
- The (6) possible outcomes are
- 0 tickets (None)
- 1 ticket
- 2 tickets
- 3 tickets
- 4 tickets
- 5 tickets
The outcomes from 1 ticket through 5 tickets
together make up "At least 1".
67 Remember how we solved it by finding the
probability of the complement?
- If A is (getting at least one ticket)
- Then not A is (getting 0 tickets)
- Rather than find the P(A), find the P(not A) and
subtract it from 1
- P(0 tickets) = (.9)(.9)(.9)(.9)(.9) = $(.9)^5$ = .59
- 1 - .59 = .41
- Or 41%
68 Now we know how to solve it this way
- P(at least one)
- = P(1) + P(2) + P(3) + P(4) + P(5)
- Where p = .1 and q = .9
- Refer to worksheet NumberOfTickets
- Use Excel's BINOMDIST( ) to find the
probabilities and sum them.
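Both routes to the answer can be compared side by side. The sketch below is an illustrative stand-in for the NumberOfTickets worksheet (which is not reproduced here); the helper name `prob` is made up for the example.

```python
from math import comb

n, p = 5, 0.1   # 5 red lights run, .1 chance of a ticket each time
q = 1 - p

def prob(x):
    """Probability of exactly x tickets in n attempts."""
    return comb(n, x) * p**x * q**(n - x)

# Route 1: complement rule, 1 - P(0 tickets)
via_complement = 1 - prob(0)

# Route 2: add up P(1) + P(2) + P(3) + P(4) + P(5)
via_sum = sum(prob(x) for x in range(1, n + 1))

print(round(via_complement, 4), round(via_sum, 4))  # 0.4095 0.4095
```

Both routes give about .41, or 41%, in agreement with the complement calculation.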
70 Binomial probability practice 3
- Assume several samples of 6 AA flights were
selected at random and the average on-time
arrival rate of these flights was .723 (72.3%)
- Based on this success rate of 72.3%, create a
frequency distribution.
- Find the probability that at most two American
Airlines flights arrive on time.
- Find the probability that at least one AA flight
arrives on time.
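One possible way to set up this practice problem in code is sketched below. It assumes the flights behave as a binomial experiment with n = 6 and p = .723 (independent flights, same on-time probability for each), which is the assumption the question invites; the variable names are illustrative.

```python
from math import comb

n, p = 6, 0.723   # 6 flights, 72.3% on-time rate
q = 1 - p

# Distribution of x = number of on-time arrivals among the 6 flights
table = {x: comb(n, x) * p**x * q**(n - x) for x in range(n + 1)}

for x, prob in table.items():
    print(f"{x}  {prob:.4f}")

# "At most two" means x = 0, 1, or 2
p_at_most_two = table[0] + table[1] + table[2]

# "At least one" is the complement of x = 0
p_at_least_one = 1 - table[0]

print(f"P(at most two on time)  = {p_at_most_two:.4f}")
print(f"P(at least one on time) = {p_at_least_one:.4f}")
```

With such a high on-time rate, seeing two or fewer on-time arrivals out of six is unlikely, while at least one on-time arrival is nearly certain.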