STAT 221

Provided by: margaret1


1
STAT 221

Chapter 5 Part A
Discrete Probability Distributions

2
The Random Variable - X

In statistics, X is a symbol for the unknown
numerical value that expresses one possible
outcome of an experiment/event.
In an experiment, any one outcome value that is
assigned to X can be considered a random variable
because X can turn out to be (or assume) any
value in the set of all possible outcomes (and
that set is called the sample space).

3
Depending on the experimental situation, X may be
Discrete or Continuous

A random variable can be classified as being
either discrete or continuous depending on the
numerical values it can assume.
A discrete random variable is usually an integer
value and may assume either a finite number of
values or an infinite sequence of values.
A continuous random variable can be an integer or
non-integer value and may assume any numerical
value in an interval or collection of intervals.

4
Examples of experimental outcome values

Examples of discrete random variables
Number of babies born on a particular day
Number of customers arriving at a drive-thru in
any particular hour
Number of questions answered correctly on a test.
Examples of continuous random variables
Waiting time between customer arrivals
Distance that a person can run
Weight of a person

5
This chapter discusses experiments whose outcomes
would be discrete random variables

A discrete random variable can have a finite
number of values
Let x = number of questions answered correctly
on a test. If the test has 100 questions, x can
be any of 101 possible values (0, 1, 2, 3, 4, ...,
100); note the finite upper limit of 100.
Or a discrete random variable can have an
infinite sequence of values
Let x = number of customers arriving in one day,
where x can take on the values 0, 1, 2, . . . ;
note that there is no upper limit.

6
Probability distributions of random variables

x    f(x)
0    .125 (1/8)
1    .375 (3/8)
2    .375 (3/8)
3    .125 (1/8)

All possible values for x and their probabilities
of occurrence can be plotted on a chart and we
call that a probability distribution.
A probability distribution can actually be a
chart, table, or function that expresses the
probability of each possible outcome in an
experiment.
Here is a probability distribution presented as a
function f(x)
As you can see in this example, some x-values are
more likely to occur than others.

7
A probability distribution
[Chart: probability of each value x = 0, 1, 2, 3, where X = number of girls out of 3 children]
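
The 1/8, 3/8, 3/8, 1/8 pattern for the number of girls out of 3 children can be checked by listing every possible birth sequence. This is only a sketch: it assumes each child is equally likely to be a boy or a girl and that births are independent, which the slides do not state explicitly.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumption: each child is equally likely to be a boy (B) or a girl (G),
# and the three births are independent.
outcomes = list(product("BG", repeat=3))  # 8 equally likely sequences

# Count how many sequences contain 0, 1, 2, or 3 girls.
girl_counts = Counter(seq.count("G") for seq in outcomes)

# f(x) = (sequences with x girls) / (total number of sequences)
f = {x: Fraction(girl_counts[x], len(outcomes)) for x in sorted(girl_counts)}

for x, prob in f.items():
    print(x, prob, float(prob))
```

Using Fraction keeps the probabilities exact, so the check that they sum to 1 does not depend on rounding.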
8
Developing Probability Distributions

Probabilities of each possible value of random
variable x can be developed using any of these
three methods
Relative frequency approach: look at past
history of occurrences to project expected
probabilities.
Classical frequency approach: use reasoning to
deduce the expected probability (e.g., a die has 6
sides, therefore there is a 1-in-6 chance of
getting each outcome).
Subjective frequency approach: make an educated
guess.

9
Here is a probability distribution for rolling a
die, developed using the classical approach
This is a uniform distribution: the probability
of occurrence of each possible value of x (1, 2,
3, 4, 5, or 6) is the same (uniform).
10
Here is a probability distribution developed
using the relative frequency approach

In this example, DiCarlo Motors used past history
to project future probabilities of daily car
sales.
They know that over the past 300 days, they sold
no cars on 54 of those days, 117 days they sold 1
car, 72 days they sold 2 cars, 42 days they sold
3 cars, 12 days they sold 4, and 3 days they sold
5.
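
As a rough sketch of the relative frequency approach, the day counts quoted above can be turned into probabilities by dividing each count by the 300 days observed. The variable names are illustrative only.

```python
# Days (out of the past 300) on which DiCarlo Motors sold x cars.
days_by_cars_sold = {0: 54, 1: 117, 2: 72, 3: 42, 4: 12, 5: 3}

total_days = sum(days_by_cars_sold.values())

# Relative frequency: f(x) = (days with x sales) / (total days observed)
f = {x: days / total_days for x, days in days_by_cars_sold.items()}

for x, prob in f.items():
    print(f"{x}  {prob:.2f}")
```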

11
Basic requirements of a probability distribution
$\sum f(x) = 1$, where x assumes all possible
values
$0 \le f(x) \le 1$ for every individual value of x
12
Descriptive statistics for frequency distributions

It is possible to calculate a mean, a variance
and standard deviation from a probability
distribution.
The mean is called the expected value and it is a
single measure of central tendency.
The variance and standard deviations are measures
of dispersion.

13
Expected Value: mean of a random variable

The expected value of a random variable is the
same as the mean: a single value that represents
the average of all possible outcomes.
In chapter 3, we calculated a mean value for x by
adding up all the actual outcomes (x-values) and
dividing by the number of outcomes. That approach
was appropriate when you have the actual observed
values for x.
When you do not have actual (observed) outcomes
but you only have a probability distribution for
expected outcomes, you must calculate the mean
value of x using this formula
$E(x) = \mu = \sum x f(x)$

14
Example: Rolling a die. What is the Expected
Value?

We don't have to roll a die hundreds of times and
actually observe the number of times we get a 1,
2, 3, etc. to know that the probability of rolling
each possible outcome (x) is 1-out-of-6.
But using the formula on the previous slide, we
can calculate the average outcome / expected
value
$E(x) = \mu = \sum x f(x)$
$= 1(1/6) + 2(1/6) + 3(1/6) + 4(1/6) + 5(1/6) + 6(1/6)$
$= 21/6 = 3.5$
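
One way to check the die calculation is to apply the sum of x times f(x) directly. This sketch assumes a fair six-sided die, as in the classical approach; it gives 21/6, which is 3.5.

```python
from fractions import Fraction

# Classical approach: each face of a fair six-sided die has probability 1/6.
f = {x: Fraction(1, 6) for x in range(1, 7)}

# E(x) = sum of x * f(x) over all possible values of x
expected_value = sum(x * p for x, p in f.items())

print(expected_value, float(expected_value))
```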

15
Calculating a variance and standard deviation

Recall that variance and standard deviation are
measures of dispersion (spread-out-ness).
Again, in Chapter 3 we learned formulas for
calculating the variance and standard deviation
based on actual/observed values for x. When all
we have is a probability distribution for expected
x, we can still calculate a variance and standard
deviation for x but we must use this formula
$Var(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$

16
Example: JSL Appliances, the expected value or
mean

x    f(x)    xf(x)
0    .40     .00
1    .25     .25
2    .20     .40
3    .05     .15
4    .10     .40
E(x) = 1.20

Here is the probability distribution for the
number of TVs they expect to sell each day.
The mean, or the expected number of TV sets they
expect to sell in a day, is 1.2.
17
Example: JSL Appliances, the variance and std
deviation

x    $x - \mu$    $(x - \mu)^2$    f(x)    $(x - \mu)^2 f(x)$
0    -1.2    1.44    .40    .576
1    -0.2    0.04    .25    .010
2    0.8     0.64    .20    .128
3    1.8     3.24    .05    .162
4    2.8     7.84    .10    .784
$\sigma^2 = 1.660$
The variance of daily sales is 1.66 TV sets
squared.
The standard deviation of sales is 1.2884 TV
sets.
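
The JSL Appliances tables can be reproduced in a few lines. This is a sketch of the same arithmetic, not the course's own tool: it computes the expected value first, then the probability-weighted squared deviations.

```python
import math

# Probability distribution for daily TV sales at JSL Appliances.
f = {0: 0.40, 1: 0.25, 2: 0.20, 3: 0.05, 4: 0.10}

# Expected value: sum of x * f(x)
mean = sum(x * p for x, p in f.items())

# Variance: sum of (x - mean)^2 * f(x)
variance = sum((x - mean) ** 2 * p for x, p in f.items())

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

print(round(mean, 2), round(variance, 2), round(std_dev, 4))
```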

18
Practice question: Calculate the mean, variance,
and SD (Excel worksheet Computers)

According to a survey, 95% of subscribers to the
Wall Street Journal Interactive Edition have a
computer at home. For those households, the
probability distribution for the number of
computers is given (see spreadsheet).
A. What is the expected value of the number of
computers per household?
B. What is the variance of the number of
computers per household?
C. Make three intelligent statements about the
number of computers owned by the Journal's
subscribers.

19
Open the file DataSetsForCh5 and select the
worksheet Computers
20
To calculate the expected value for laptops,
first multiply the number of computers (x) by the
frequency f(x) for the first row: C4 = A4 * B4
21
Copy the formula in C4 down to C7 to multiply the
number of computers (x) by the frequency f(x) for
the other possible values of x.
22
Sum the column: C8 = SUM(C4:C7)
23
Copy the sum into cell C10 because that is the
expected value: C10 = C8
24
To calculate the variance, start by squaring the
first value of x: D4 = A4 * A4
25
Copy the formula in D4 down to D7 to square the
remaining x-values
26
Multiply squared-x by the frequency of x for the
first x-value: E4 = D4 * B4
27
Copy the formula in E4 down to E7 to multiply the
other squared x's by their frequencies.
28
Sum the values in columns D and E: D8 =
SUM(D4:D7), E8 = SUM(E4:E7)
29
To calculate the variance, subtract the square of
the mean from the sum of column E: C11 = E8 -
(C10 * C10)
30
To calculate the standard deviation, take the
square root of the variance: C12 = SQRT(C11)
Resave the file.
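
The worksheet steps use a shortcut: the variance equals the sum of x squared times f(x), minus the square of the mean. The sketch below mirrors the spreadsheet columns. The Computers worksheet values are not reproduced in this transcript, so as an assumption it reuses the JSL Appliances distribution from the earlier example as stand-in data.

```python
import math

# Assumption: stand-in data (JSL Appliances), since the Computers worksheet
# values are not shown in the transcript.
x_values = [0, 1, 2, 3, 4]                 # column A: x
f_values = [0.40, 0.25, 0.20, 0.05, 0.10]  # column B: f(x)

col_c = [x * p for x, p in zip(x_values, f_values)]  # column C: x * f(x)
col_d = [x * x for x in x_values]                    # column D: x squared
col_e = [d * p for d, p in zip(col_d, f_values)]     # column E: x squared * f(x)

expected_value = sum(col_c)                   # plays the role of C10
variance = sum(col_e) - expected_value ** 2   # plays the role of C11
std_dev = math.sqrt(variance)                 # plays the role of C12

print(round(expected_value, 2), round(variance, 2), round(std_dev, 4))
```

On this data the shortcut gives the same variance (1.66) as summing the squared deviations from the mean.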
31
Draw some conclusions about number of computers
owned by the Journal's subscribers.

The average Wall Street Journal reader has 1.42
computers in their household.
The majority of households have 1 computer.
If we added and subtracted 2 standard deviations
(.75) from the mean, we'd find that about 95% of
households have roughly 0 to 3 computers.

32
Developing probability distributions for discrete
random variables

Experiments are conducted when performing an
experimental study.
Experiments yield results - a dataset of x-values
each one considered to be a random value.
Some experiments have certain common
characteristics that enable them to be classified
as a binomial experiment.

33
The Binomial Experiment

Here are some examples of binomial experiments
Tossing a coin n times and seeing how many of
those outcomes are heads.
A salesman calling on potential clients and
seeing how many of those clients purchase his
product.
Gathering a sample of newborn babies and seeing
how many of them have a certain characteristic or
gender.

34
Characteristics of a binomial experiment

The experiment must consist of a sequence
repeating the same step n times (the step could
be flipping a coin, making a sales call,
determining if a baby has a certain
characteristic or gender, etc.)
For each step, there are two possible outcomes
referred to as success or failure.
The probability of each outcome is the same at
each step (so sampling must be done with
replacement).
The steps are independent (the outcome at each
step is not conditional on previous steps'
outcomes).

35
Practice: Determine whether each of the given
experiments can be classified as a binomial
experiment

Surveying 1012 people and recording whether there
is a "should not" response to this question: "Do
you think the cloning of humans should or should
not be allowed?"
Rolling a loaded die 50 times and finding the
number of times that 5 occurs.
Determining whether each of 3000 heart pacemakers
is acceptable or defective.
Spinning a roulette wheel 12 times and finding
the number of times that the outcome is an odd
number.

36
Answers

37
The random variable (x) in a binomial experiment

A binomial experiment can produce results (a data
set of values for x) where x is the number of
successes after n trials.
So if the experiment is tossing a coin 10
times, and on one run we get this outcome
HHHTTHTHTH, then x = 6 because we got heads 6
times (a success).
We could just as easily have assigned tails to
be the success.

38
Binomial experiments yield binomial distributions

A binomial experiment can produce results (a data
set of values for x) that can be plotted into a
frequency distribution that we refer to as a
binomial frequency distribution.

39
The binomial frequency distribution

The binomial frequency distribution (using table
format) would list all the possible outcomes of x
and their probability of occurrence.
So, if our experiment is to toss a coin 10 times,
our frequency distribution might look like this
# of times Heads    Probability
0 .003
1 .007
2 .024
3 .078
Etc.
10 .003

40
But we don't have to actually perform n number of
trials to get an actual data set

In a binomial experiment, we don't actually have
to have observed frequencies (that would be
obtained if we engaged in n number of trials or
had a sample size of n).
All we need to know is the probability of 1
success. In this case, all we need to know is the
probability of getting heads on one trial, which
would be 50%.
We can use the binomial formula to calculate
the probability of getting 0 heads, 1, 2, 3, etc.
on 10 trials.

41
The binomial formula
$P(x) = \frac{n!}{x!\,(n-x)!} p^x q^{n-x}$
where n = number of trials, x = number of
successes among n trials, p = probability of
success in any one trial, q = probability of
failure in any one trial (q = 1 - p)
42
Let's start with: what is the probability of
getting 3 heads (out of 10 tosses)?
$P(3) = \frac{10!}{3!\,7!} (.5)^3 (.5)^{10-3}$
$= \frac{10 \cdot 9 \cdot 8}{3 \cdot 2 \cdot 1} (.125)(.007813)$
$= .1172$ or 11.72%
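
The same calculation can be sketched in code. The function name binomial_probability is made up for this illustration; math.comb supplies the number of combinations.

```python
from math import comb


def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure in any one trial
    return comb(n, x) * p**x * q**(n - x)


# Probability of exactly 3 heads in 10 tosses of a fair coin
p_three_heads = binomial_probability(3, 10, 0.5)
print(round(p_three_heads, 4))
```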
43
A closer examination of this formula

Notice the combinations formula within the
binomial probability formula?
That's because this part: $(.5)^3 (.5)^{10-3}$
is the probability of getting one specific
sequence of 3 heads and 7 tails (e.g.,
HHHTTTTTTT)
This part: $\frac{10 \cdot 9 \cdot 8}{3 \cdot 2 \cdot 1}$
calculates the number of different combinations
that result in 3 heads and 7 tails

44
What we have so far

X P(x)
0 ?
1 ?
2 ?
3 11.72%
4 ?
5 ?
6 ?
7 ?
8 ?
9 ?
10 ?

The probability distribution for getting x heads
out of 10 trials.
45
Using Excel

The binomial probability formula is one of
Excel's built-in formulas.
To calculate the probabilities of the other 10
possible outcomes, let's use Excel.
Open the worksheet binomial and follow the steps
on the next slides.

46
Open the file DataSetsForCH5 and click the
worksheet Coin Toss
47
1. Enter the n, p, and q values into cells: C3 =
10, C4 = .5, C5 = 1 - C4
48
2. Position the cursor in B8. 3. To use Excel's
binomdist( ) formula, click the fx button on the
formula bar to start the formula wizard.
49
4. In the category box, select Statistical.
5. In the select a function box, select
binomdist and click ok.
50
6. Enter the indicated values as function
arguments. Use absolute addressing where
appropriate. 7. Click ok.
51
8. See the formula result appear in cell B8. That
is the probability of getting 0 heads out of 10
trials.
52
9. Copy the formula in B8 down to B18 to
calculate the probabilities for the other 10
possible values for x.
53
10. Use the % button on the formatting toolbar
to format the values to percentage and use the
increase decimal places button to specify 2
decimal places. This table shows the probability
of each possible outcome of the experiment.
54
11. Sum the probabilities to make sure they sum
to 100%: B19 = SUM(B8:B18)
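
For readers without the workbook, here is a rough Python equivalent of the worksheet steps: build the probability of each x from 0 to 10 and confirm they sum to 100%. It is a sketch that uses the binomial formula directly rather than Excel's BINOMDIST, and the helper name is illustrative.

```python
from math import comb


def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


n = 10   # number of tosses (cell C3 in the worksheet)
p = 0.5  # probability of heads on one toss (cell C4)

# One probability per possible number of heads, 0 through 10
distribution = {x: binomial_probability(x, n, p) for x in range(n + 1)}

for x, prob in distribution.items():
    print(f"{x:2d}  {prob:.2%}")

total = sum(distribution.values())
print(f"Total: {total:.2%}")
```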
55
Obtaining a mean, variance, and standard
deviation from a binomial distribution

Recall that we could derive a mean, variance,
and std. deviation from a (generic) probability
distribution.
When the probability distribution is
binomial, the formulas can be simplified to the
following
Expected Value: $E(x) = \mu = np$
Variance: $Var(x) = \sigma^2 = np(1 - p)$
Standard Deviation: $\sigma = \sqrt{np(1 - p)}$

56
Calculate the expected value or mean: H3 = C3 * C4
57
Calculate the variance: H4 = C3 * C4 * C5
58
Calculate the standard deviation: H5 = SQRT(H4)
Resave the file.
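
As a check on the simplified formulas, this sketch computes the mean and variance for the 10-toss coin example two ways: with the binomial shortcuts, and with the generic sums over the full distribution. The two should agree up to rounding error. The helper name is illustrative.

```python
import math
from math import comb


def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


n, p = 10, 0.5

# Binomial shortcuts
mean_shortcut = n * p
variance_shortcut = n * p * (1 - p)
std_dev_shortcut = math.sqrt(variance_shortcut)

# Generic formulas applied to the full distribution
distribution = {x: binomial_probability(x, n, p) for x in range(n + 1)}
mean_generic = sum(x * prob for x, prob in distribution.items())
variance_generic = sum(
    (x - mean_generic) ** 2 * prob for x, prob in distribution.items()
)

print(mean_shortcut, variance_shortcut, round(std_dev_shortcut, 4))
print(round(mean_generic, 4), round(variance_generic, 4))
```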
59
Here is the frequency distribution. Notice the
symmetric shape of this binomial distribution
(p = .5). But as p gets larger (making q
smaller), the distribution becomes skewed to the
left.
60
Binomial probability practice 1

A multiple-choice test has six questions and each
has 4 possible answers, one of which is correct.
A. If you completely guess at the answer to each
question, what is the probability of getting
(exactly) the first two wrong and the last four
right (hint: use the multiplication rule)?
B. Begin with WWCCCC and make a complete list of
all the different possible arrangements of two
wrong and four right. What is the probability of
getting each of these arrangements?
C. What is the probability of getting exactly two
wrong and four right? (In other words, what is
the probability of getting any of these
arrangements?)

61
a. If you completely guess at the answer to each
question, what is the probability of getting the
first two wrong and the last four right (hint:
use the multiplication rule)?

P(C) = .25
P(W) = .75
P(WWCCCC) = .75 × .75 × .25 × .25 × .25 × .25
P(WWCCCC) = .00220

62
b. Begin with WWCCCC and make a complete list of
all the different possible arrangements of two
wrong and four right. Find the probability of
each arrangement.
WWCCCC  WCWCCC  WCCWCC  WCCCWC  WCCCCW
CWWCCC  CWCWCC  CWCCWC  CWCCCW  CCWWCC
CCWCWC  CCWCCW  CCCWWC  CCCWCW  CCCCWW
There are 15 ways to get 2 wrong and 4 right.
63
b. What is the probability of each of these
arrangements?

In each case, the probability is $(.75)^2 (.25)^4$
Or .00220

64
c. What is the probability of getting any of
these arrangements?
n is 6, p is .75 (the probability of
guessing wrong), x is 2, q is .25 (the
probability of guessing right)
$P(2) = \frac{6!}{2!\,4!} (.75)^2 (.25)^4 = 15 \times .00220 = .0330$
$(.75)^2 (.25)^4$: the probability of getting any
specific sequence of 2 Ws and 4 Cs
$\frac{6!}{2!\,4!}$: same as 15 (the number of ways
of getting 2 Ws and 4 Cs)
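
Parts b and c can be checked by brute force. The sketch below lists every arrangement of 2 wrong and 4 right answers, applies the multiplication rule to each one, and compares the total with the combinations count. The variable names are illustrative.

```python
from itertools import combinations
from math import comb

n_questions = 6
n_wrong = 2
p_wrong, p_correct = 0.75, 0.25  # guessing among 4 answers, 1 correct

# Build every arrangement by choosing which 2 positions are wrong (W).
arrangements = []
for wrong_positions in combinations(range(n_questions), n_wrong):
    answers = ["C"] * n_questions
    for i in wrong_positions:
        answers[i] = "W"
    arrangements.append("".join(answers))

# Multiplication rule: every arrangement has the same probability.
p_each = p_wrong**n_wrong * p_correct**(n_questions - n_wrong)

# Probability of exactly 2 wrong and 4 right, in any order
p_total = len(arrangements) * p_each

print(len(arrangements), round(p_each, 5), round(p_total, 4))
```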
65
Binomial probability practice 2

If you run a red traffic light at an intersection
equipped with a camera monitor, there is a .1
probability that you will be given a ticket. If
you run a red traffic light at this intersection
five different times, what is the probability of
getting at least one ticket?

66
The possible outcomes

The (6) possible outcomes are
0 tickets
1 ticket
2 tickets
3 tickets
4 tickets
5 tickets

None: 0 tickets
At least 1: 1 to 5 tickets
67
Remember how we solved it by finding the
probability of the complement?

If A is (getting at least one ticket)
Then not A is (getting 0 tickets)
Rather than find the P(A), find the P(not A) and
subtract it from 1
P(0 tickets) = $(.9)^5$ = (.9)(.9)(.9)(.9)(.9) = .59
1 - .59 = .41
Or 41%

68
Now we know how to solve it this way

P(at least one)
= P(1) + P(2) + P(3) + P(4) + P(5)
Where p = .1 and q = .9
Refer to worksheet NumberOfTickets
Use Excel's BINOMDIST( ) to find the
probabilities and sum them.
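
Both routes to the answer can be compared in a short sketch: the complement rule, and the sum of the binomial probabilities for 1 through 5 tickets. It uses the binomial formula directly rather than the worksheet, and the helper name is illustrative.

```python
from math import comb


def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


n = 5    # times the red light is run
p = 0.1  # probability of a ticket each time

# Route 1: complement rule, 1 - P(0 tickets)
p_at_least_one_complement = 1 - binomial_probability(0, n, p)

# Route 2: P(1) + P(2) + P(3) + P(4) + P(5)
p_at_least_one_summed = sum(binomial_probability(x, n, p) for x in range(1, n + 1))

print(round(p_at_least_one_complement, 4), round(p_at_least_one_summed, 4))
```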

69
(No Transcript)
70
Binomial probability practice 3

Assume several samples of 6 AA flights were
selected at random and the average on-time
arrival rate of these flights was .723 (72.3%).
Based on this success rate of 72.3%, create a
frequency distribution.
Find the probability that at most two American
Airlines flights arrive on time.
Find the probability that at least one AA flight
arrives on time.
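
One possible way to set up this practice question is sketched below. It assumes each sample of 6 flights behaves as a binomial experiment with n = 6 and p = .723, meaning on-time arrivals are treated as independent with the same probability. The helper name is illustrative.

```python
from math import comb


def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


n = 6      # flights per sample
p = 0.723  # assumed on-time probability for each flight

# Probability of each possible number of on-time arrivals, 0 through 6
distribution = {x: binomial_probability(x, n, p) for x in range(n + 1)}

# At most two on time: P(0) + P(1) + P(2)
p_at_most_two = sum(distribution[x] for x in range(0, 3))

# At least one on time: 1 - P(0)
p_at_least_one = 1 - distribution[0]

for x, prob in distribution.items():
    print(f"{x}  {prob:.4f}")
print(f"P(at most 2) = {p_at_most_two:.4f}")
print(f"P(at least 1) = {p_at_least_one:.4f}")
```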

71
(No Transcript)
