Title: Chapter 7 Probability Distributions, Information about the Future
2 3Random Variable
- A random variable is the numerical outcome of a
random (non-deterministic) process.
- Intuitively, any numerically measured variable
that possesses an uncertain outcome is a random
variable.
4Probability Distribution
- A probability distribution is a model that
describes a specific kind of random process.
- Specifically, a probability distribution assigns
a probability to each value the random variable
can assume.
5Probability Models
- Probability models are excellent descriptors of
random processes.
- The following is a probability distribution (a
model) for the outcome of a coin toss.
6- Types of Random Variables
7Quantitative Random Variables
- Quantitative random variables are divided into
two classes.
- 1) Discrete
- 2) Continuous
8Discrete Random Variables
- A discrete random variable is a random variable
that has a countable number of possible
outcomes.
- The values that many discrete random variables
assume are the counting numbers from 0 to N,
where N depends upon the nature of the variable.
- Example: The number of pages in a standard math
textbook is a discrete random variable.
9Continuous Random Variables
- A continuous random variable is a random variable
that can assume any value on a continuous
segment(s) of the real number line.
- Heights, weights, volumes, and time measurements
are usually measured on a continuous scale.
- These measurements can take on any value in some
interval.
10Discrete or Continuous?
- Classify the following as either a discrete
random variable or a continuous random variable.
- 1. the speed of a train
- 2. the possible scores on the SAT exam
- 3. the number of pizzas eaten on a college
campus each day
- 4. the daily takeoffs at Chicago's O'Hare
Airport
- 5. the highest temperatures in Maine and Florida
tomorrow
11Answers
- 1. the speed of a train
- continuous random variable
- 2. the possible scores on the SAT exam
- discrete random variable
- 3. the number of pizzas eaten on a college
campus each day
- discrete random variable
- 4. the daily takeoffs at Chicago's O'Hare
Airport
- discrete random variable
- 5. the highest temperatures in Maine and Florida
tomorrow
- continuous random variable
12Naming Convention
- Capital letters, such as X, will be used to refer
to the random variable.
- Example: X = number of cows in Texas
- Small letters, such as x, will refer to a
specific value of the random variable.
- Example: x = 1,498,000 cows in Texas
- Often the specific values will be subscripted: x_1,
x_2, ..., x_n.
13Describing a Discrete Random Variable
- State (Describe) the variable.
- List all of the possible values of the variable.
- Determine the probabilities of these values.
14Example 1 (die tossing)
- Random Phenomenon: Toss a die and observe the
outcome of the toss.
- X = ?
- What are the possible values of X?
- What are the probabilities of each value?
15Example 1 - Solution
- Identify the Random Variable: X =
outcome of the toss of a die.
- All Possible Values: the integers from 1 to 6.
In this instance x_1 = 1, x_2 = 2, ..., x_6 = 6.
- Probability Distribution: The outcomes of the
toss of a die and their probabilities are given
in the table. The probabilities are
deduced using the classical method and the
assumption of a fair die, so P(X = x) = 1/6 for each x.
16Example 2
- Random Phenomenon: The head nurse of the
pediatric division of the Sisters of Mercy
Hospital is trying to determine the capacity
requirement for the nursery. She realizes that
the number of babies born at the hospital each
day is a random variable, and she will have to
develop a description of the randomness in order
to develop her plan.
- X = ?
- What are the possible values of X?
- What are the probabilities of each value?
- Not all discrete random variables have easily
definable probability distributions.
17Example 2 - Solution
- Identify the Random Variable: X =
number of babies born at the hospital each day.
- Range of All Possible Values: the integers from 0
to some large positive number.
- Probability Distribution: Unknown, but could be
estimated using the relative frequency idea in
conjunction with historical data on hospital
births.
18- Discrete Probability Distributions
19Discrete Probability Distributions
- The random variable concept is so general that
it is not very useful by itself.
- What would be useful is to determine what
numerical values the random variable could assume
and assess the probabilities of each of these
values.
20Discrete Probability Distributions
- A discrete probability distribution consists of
(a list of) all possible values of the random
variable with their associated probabilities.
- The association of the possible values of a
random variable with their respective
probabilities can be expressed in three different
forms: in a table, in a graph, and in an
equation.
21Characteristics
- Discrete probability distributions always have
two characteristics:
- 1. The sum of all of the probabilities must
equal 1.
- 2. The probability of any value must be between 0
and 1, inclusive.
- Relative frequencies also share these properties.
22Example 3 (Daily Sales)
- K. J. Johnson is a computer salesperson. During
the last year he has kept records on his computer
sales. He recognizes that his daily sales
constitute a random process and wishes to
determine the probability distribution for daily
sales.
- The random variable is X = number of computers
sold each day.
23Example 3 - Solution
- The probabilities for this random variable are
computed in the table based upon 200 days of
sales data obtained from Mr. Johnson's records
using the relative frequency concept.
- The probability that Mr. Johnson will sell at
least 2 computers on a given day is calculated as
follows: P(X ≥ 2) = P(X = 2) + P(X = 3) + P(X = 4)
= .3 + .2 + .2 = .7.
- The probability that Mr.
Johnson will sell at most 2 computers on a given day is
calculated as follows: P(X ≤ 2) = P(X = 0)
+ P(X = 1) + P(X = 2) = .2 + .1 + .3 = .6.
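The two sums above can be checked with a few lines of Python. This is a minimal sketch: the probabilities are the ones implied by the sums in the example (the table itself is not reproduced in the text), and the name `daily_sales` is purely illustrative.

```python
# X = number of computers sold per day. Probabilities are those implied
# by the sums in Example 3: P(0)=.2, P(1)=.1, P(2)=.3, P(3)=.2, P(4)=.2.
daily_sales = {0: 0.2, 1: 0.1, 2: 0.3, 3: 0.2, 4: 0.2}

# P(X >= 2): add the probabilities of every value that is at least 2.
p_at_least_2 = sum(p for x, p in daily_sales.items() if x >= 2)

# P(X <= 2): add the probabilities of every value that is at most 2.
p_at_most_2 = sum(p for x, p in daily_sales.items() if x <= 2)

print(round(p_at_least_2, 4), round(p_at_most_2, 4))
```

Note that the two events overlap at X = 2, which is why the two probabilities add to more than 1.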
24Example 4, Is this a prob. Distn?
- Tell whether or not the following distribution is
a probability distribution.
- If the distribution is not a probability
distribution, give the characteristic which is
not satisfied by the distribution.
25Example 4 - Solution
- Yes. All probabilities are between 0 and 1, and
the sum of the probabilities is 1.
26Example 5 , Is this a prob. Distn?
- Tell whether or not the following distribution is
a probability distribution.
- If the distribution is not a probability
distribution, give the characteristic which is
not satisfied by the distribution.
27Example 5 - Solution
- No. The sum of the probabilities is greater than
one.
28Example 6, Is this a prob. Distn?
- Tell whether or not the following distribution is
a probability distribution.
- If the distribution is not a probability
distribution, give the characteristic which is
not satisfied by the distribution.
29Example 6 - Solution
- No. You can't have negative probabilities.
30Example 7, Is this a prob. Distn?
- Tell whether or not the following distribution is
a probability distribution.
- P(X = x) = x/16, for x = 1, 2, 3, 4, 5
- If the distribution is not a probability
distribution, give the characteristic which is
not satisfied by the distribution.
31Example 7 - Solution
- No. See table. The sum of the probabilities is
15/16, which is less than one.
- P(X = x) = x/16 for x = 1 to 5 only is NOT a probability
distribution.
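The two characteristics of a discrete probability distribution can be tested mechanically. The sketch below is one possible check, not a standard routine; the function name `is_probability_distribution` is hypothetical, and exact fractions are used so that the comparison with 1 is not affected by rounding.

```python
from fractions import Fraction

def is_probability_distribution(probs):
    """Return (ok, reason) for a list of exact probabilities."""
    # Characteristic 2: every probability lies between 0 and 1, inclusive.
    if any(p < 0 or p > 1 for p in probs):
        return False, "a probability lies outside [0, 1]"
    # Characteristic 1: the probabilities sum to exactly 1.
    if sum(probs) != 1:
        return False, "the probabilities do not sum to 1"
    return True, "both characteristics hold"

# Example 7: P(X = x) = x/16 for x = 1, ..., 5.
example_7 = [Fraction(x, 16) for x in range(1, 6)]
print(sum(example_7), is_probability_distribution(example_7))

# A fair die passes both checks.
fair_die = [Fraction(1, 6)] * 6
print(is_probability_distribution(fair_die))
```

With floating-point inputs the exact comparison `sum(probs) != 1` would need to be replaced by a tolerance check.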
32- Expected Value E(X) of a random variable X
33Importance of E(X)
- One of the most important concepts in the
analysis of random phenomena is the notion of
expected value.
- Expected value is important because it is a
summary statistic for a probability distribution.
- It can also be used as a criterion for comparing
alternative decisions in the presence of
uncertainty.
34What is Expected Value?
- Conceptually, expected value is closely allied
with the notion of mean or average.
- The expected value is a weighted average, in
which each possible value of the random variable
is weighted by its probability.
- Definition
- The expected value of a random variable X is the
mean of the random variable X. It is denoted by
E(X) and is given by computing the following
expression:
- $E(X) = \sum_x x \, P(X = x) = \sum_x x \, P(x)$
35Digression on Weighted Averages
- The weighted average of any measurement (say prices
P_t) with weights w_t is always $\left(\sum_t w_t P_t\right) / \left(\sum_t w_t\right)$.
- Weighted averages are ubiquitous. The Dow Jones
Industrial Average is a weighted average; see
- http://www.indexarb.com/indexComponentWtsDJ.html
- The S&P 500 index is similar, with weights available
at
- http://www.indexarb.com/indexComponentWtsSP500.html
36Average Value
- The expected value of a random variable should be
very close to the average value of a large number
of observations from the random process.
- The larger the number of observations collected,
the more likely the average of the observations
will be close to the expected value.
- For discrete random variables the expected value
is rarely one of the possible outcomes of the
random variable.
37E(X) for Daily Sales
- The expected value of the probability
distribution given in Example 3 (Daily
Sales) is computed in the table.
- In the long run, data coming from a random
process with this distribution should average
about 2.1.
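The value 2.1 and the long-run interpretation can both be illustrated in Python. This is a sketch under stated assumptions: the probabilities are those implied by the sums in Example 3, and the simulation size and seed are arbitrary choices made only so the run is repeatable.

```python
import random

# Distribution implied by Example 3 (Daily Sales).
daily_sales = {0: 0.2, 1: 0.1, 2: 0.3, 3: 0.2, 4: 0.2}

# E(X) = sum of x * P(X = x): each value weighted by its probability.
expected_value = sum(x * p for x, p in daily_sales.items())

# Simulate many days of sales and average them (seed fixed for repeatability).
rng = random.Random(0)
days = rng.choices(list(daily_sales), weights=list(daily_sales.values()), k=100_000)
sample_mean = sum(days) / len(days)

print(round(expected_value, 4), round(sample_mean, 2))
```

The sample mean should land close to 2.1 even though 2.1 itself is never a possible number of computers sold.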
38Using Expected Values to Compare Alternatives
- Two Investment Opportunities
- By calculating the expected values of the two
alternatives, the information in each distribution
is condensed to a single point.
- This point characterizes the center of the
random process and facilitates comparison.
- In the long run, option B would be 500 more
profitable.
- But on any one investment in option B, you may
lose as much as 3000 or make as much as 4000.
39Symbols
- The expected value, E(X), is the center point
for the random process.
- The symbol μ_x is often used to represent E(X).
- μ_x = E(X)
40 - Variance and Standard Deviation of a Discrete
Random Variable
41Variance of a Discrete Random Variable
- The expected value of a distribution measures
only one dimension of the random variable (its
central value).
- To gauge the variability of a random variable we
need another measure similar to the variance
measure previously constructed but one which
accounts for the difference in probabilities of
the variable.
- The variance of a discrete random variable X is
given by
- $V(X) = \sum_x (x - \mu_x)^2 \, P(X = x)$
- The larger the variance, the more variability in
the outcomes.
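As an illustration, the probability-weighted variance, V(X) = sum of (x - mean)^2 P(X = x), can be computed for the daily sales distribution of Example 3. This is a sketch only; the probabilities are those implied by the sums in that example, and the variable names are illustrative.

```python
import math

# Distribution implied by Example 3 (Daily Sales).
daily_sales = {0: 0.2, 1: 0.1, 2: 0.3, 3: 0.2, 4: 0.2}

# Mean: each value weighted by its probability.
mean = sum(x * p for x, p in daily_sales.items())

# Variance: squared deviations from the mean, weighted by probability.
variance = sum((x - mean) ** 2 * p for x, p in daily_sales.items())

# Standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(round(mean, 4), round(variance, 4), round(std_dev, 4))
```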
42Standard Deviation as a measure of risk
- The standard deviation is computed by taking the
square root of the variance.
- In the Investment Opportunity problem the
variance and standard deviation are as follows.
- Option A
- V(X) = 3,090,000
- σ = 1,757.84
- Option B
- V(X) = 6,640,000
- σ = 2,576.82
- The larger standard deviation reflects greater variability
in profits and increased risk.
43Sharpe Ratio
- Risk-adjusted returns are compared by computing
the ratio
- Average return / std. dev. of returns
- Option A: 900 / 1,757.84
- = 0.5119920
- Option B: 1,400 / 2,576.82
- = 0.5433053
- Option B is slightly superior.
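The ratios can be recomputed directly from the means and variances quoted for the two options. The sketch below uses the slide's simplified ratio (average return divided by standard deviation, with no risk-free rate subtracted); the dictionary layout is just an illustrative choice.

```python
import math

# Means and variances quoted for the two investment options.
options = {
    "A": {"mean": 900, "variance": 3_090_000},
    "B": {"mean": 1400, "variance": 6_640_000},
}

ratios = {}
for name, opt in options.items():
    std_dev = math.sqrt(opt["variance"])   # risk measure
    ratios[name] = opt["mean"] / std_dev   # return per unit of risk
    print(name, round(std_dev, 2), round(ratios[name], 4))

best = max(ratios, key=ratios.get)
print("higher risk-adjusted return:", best)
```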
44Example 9
- Find the expected value, the variance, and the
standard deviation for a random variable with the
following probability distribution.
45Example 9 - Solution
46- Probability Distributions and their Functions
47Where do probability distributions come from?
- In previous examples the distribution is already
given.
- In the real world there will be very few
instances in which the probability distribution
will be conveniently available.
- Probabilities will have to be determined using
(i) classical, (ii) relative frequency, or (iii)
subjective probability.
- Probability distributions can be constructed from
relative frequency distributions (depicted in
histograms).
48Probability Distribution Functions (p.d.f.)
- Four well-known discrete distributions are the
discrete uniform, binomial, Poisson, and
hypergeometric.
- Each of the discrete distributions possesses a
probability distribution function.
- These math functions assign probabilities to each
value of the random variable.
49Discrete Probability Distribution Function
- Example of a discrete p.d.f.:
- P(X = x) = 1/4, if x = 1, 2, 3, 4
- P(X = x) = 0, otherwise
- This p.d.f. assigns a value to each possible
discrete number that can be the value of X.
- Not all probability values need be positive. Some
can be zero!
50Determining Probabilities for a Specific Value
(just plug into the formula)
- To determine the probability for a specific
value, use the value as the argument to the
function. Suppose the p.d.f. is P(X = x) = x^2/30 for x = 1, 2, 3, 4; the sum is unity.
- To determine the probability that X = 3,
- P(X = 3) = 3^2/30 = 9/30 = .3.
- To determine the probability that X = 4,
- P(X = 4) = 4^2/30 = 16/30 ≈ .533.
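Plugging a value into a probability distribution function is just a function call. The sketch below encodes P(X = x) = x^2/30; the support x = 1, 2, 3, 4 is an assumption inferred from the remark that the probabilities sum to unity (1 + 4 + 9 + 16 = 30).

```python
from fractions import Fraction

def pmf(x):
    """P(X = x) = x^2 / 30 on the assumed support {1, 2, 3, 4}, else 0."""
    if x in (1, 2, 3, 4):
        return Fraction(x * x, 30)
    return Fraction(0)

print(pmf(3), pmf(4))                      # exact fractions
print(float(pmf(3)), round(float(pmf(4)), 3))
print(sum(pmf(x) for x in range(1, 5)))    # total probability
```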
51- The Discrete Uniform Distribution
52Definition
- In the discrete uniform distribution each value
of the random variable is assigned the same
probability.
- This distribution is one of the simplest
probability distributions.
- There are many situations in which the discrete
uniform distribution arises.
53Example 10
- The outcome of the throw of a single die.
- If the die is fair, then each of the outcomes
is equally likely.
- Resulting probability distribution
54- The Binomial Distribution
55Definition
- A binomial experiment is a random experiment
that satisfies all of the following conditions:
- 1. There are only two outcomes on each trial of the
experiment. One of the outcomes is usually referred to as a
success, and the other as a failure.
- 2. The experiment consists of n identical trials as
described in (1).
56Definition Continued
- 3. The probability of success on any one trial is
denoted by p and does not change from trial to
trial. Note that the probability of a failure is 1 - p
and also does not change from trial to trial.
- 4. The trials are independent.
- The binomial random variable is the count of the
number of successes in n trials.
57Example 11 (r.v. is number of heads in 4 tosses)
- Toss a coin 4 times and record the number of
heads as the random variable.
- The number of heads in 4 tosses is a binomial
random variable.
58Pascal Triangle to compute nCx Binomial
coefficients
- 1
- 1 1
- 1 2 1
- 1 3 3 1
- 1 4 6 4 1
- 1 5 10 10 5 1
- 1 6 15 20 15 6 1
- For 4 coin tosses, x = 0, 1, 2, 3, 4 and the corresponding P(x) =
1/16, 4/16, 6/16, 4/16, 1/16.
- Note: each row is created from the previous row by
always starting and ending with ones and, in between, computing
sums of adjacent numbers from the previous row.
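The row-building rule can be written as a short Python function. This is a minimal sketch; the function name `pascal_rows` is illustrative, and rows are indexed from 0 so that row n holds the coefficients nC0, ..., nCn.

```python
def pascal_rows(n_rows):
    """Return the first n_rows rows of Pascal's triangle."""
    rows = [[1]]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        # Start and end with 1; interior entries are sums of adjacent
        # numbers in the previous row.
        middle = [a + b for a, b in zip(prev, prev[1:])]
        rows.append([1] + middle + [1])
    return rows

rows = pascal_rows(7)
for row in rows:
    print(row)

# Row 4 gives the counts for 4 fair coin tosses; divide by 2**4 = 16.
four_toss_probs = [c / 16 for c in rows[4]]
print(four_toss_probs)
```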
59Ex. 11 Continued (r.v. is number of heads in 4 tosses)
- There are only 2 outcomes, heads or not heads.
- The experiment will consist of 4 tosses of a
coin.
- The probability of getting a head (success) is
.5 and does not change from trial to trial.
- The outcome of one toss will not affect other
tosses.
- The variable of interest is the number of heads
in 4 tosses.
60The Binomial Probability Distribution Function
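- $P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$, for x = 0, 1, ..., n,
- where $\binom{n}{x}$ (also written nCx) represents the number of possible
combinations of n objects taken x at a time
(without replacement) and is given by
- $\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$, and 0! = 1,
- n = the number of trials, and
- p = the probability of a success.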
61Calculating a Binomial Probability by plugging in
the Binomial formula
- The parameters of the distribution (n and p) as
well as the value of the random variable must be
specified.
- For example, to determine the probability that
x = 3, given that n = 4 and p = 0.5, substitute those
values in the probability distribution function
as follows:
- $P(X = 3) = \binom{4}{3} (0.5)^3 (1 - 0.5)^{4 - 3} = 4 (0.125)(0.5) = 0.25$
- since $\binom{4}{3} = \frac{4!}{3!\,1!} = 4$.
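The same substitution can be done in Python with the standard binomial formula, P(X = x) = nCx p^x (1 - p)^(n - x). This is a sketch; `binomial_pmf` is an illustrative helper name, and `math.comb` supplies the binomial coefficient.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials, success prob p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Probability of exactly 3 heads in 4 tosses of a fair coin.
p_three_heads = binomial_pmf(3, n=4, p=0.5)
print(p_three_heads)

# The whole distribution for n = 4, p = 0.5.
distribution = [binomial_pmf(x, 4, 0.5) for x in range(5)]
print(distribution)
```

The resulting probabilities agree with the Pascal's-triangle values 1/16, 4/16, 6/16, 4/16, 1/16.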
62Example 12
- Calculate nCx for the following combinations of
x and n.
- A. n = 4 and x = 2
- B. n = 12 and x = 8
63Example 12 - Solution
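The worked solution is not reproduced in the text, so the following is only a sketch of how the two binomial coefficients could be computed, once with the factorial formula n!/(x!(n - x)!) and once with Python's built-in `math.comb` as a cross-check. The helper name `n_choose_x` is illustrative.

```python
from math import comb, factorial

def n_choose_x(n, x):
    """Number of combinations of n objects taken x at a time."""
    return factorial(n) // (factorial(x) * factorial(n - x))

part_a = n_choose_x(4, 2)    # A. n = 4, x = 2
part_b = n_choose_x(12, 8)   # B. n = 12, x = 8
print(part_a, part_b)

# Cross-check against the standard-library implementation.
print(part_a == comb(4, 2), part_b == comb(12, 8))
```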
64Binomial Tables
- In order to avoid tedious calculations, binomial
tables containing a large collection of binomial
distributions have been constructed.
- These tables are found in the Appendix (pp. 523-527).
65Example 13
- The random variable X is a binomial random
variable with n = 12 and p = .8.
- Using the tables, find the following:
- A. the probability that X is at most 4
- B. the probability that X is at least 1
- C. the probability that X is more than 10
66Example 13 - Solution
- A.
- P(X ≤ 4)
- = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)
- = .0000 + .0000 + .0000 + .0001 + .0005
- = .0006
- B.
- P(X ≥ 1) = 1 - P(X = 0) = 1 - .0000 = 1
- C.
- P(X > 10) = P(X = 11) + P(X = 12)
- = .2062 + .0687 = .2749
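The table lookups can be reproduced without the Appendix by evaluating the binomial formula directly. This is a sketch, with `binomial_pmf` as an illustrative helper; results are rounded to four places to match the precision of printed tables.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 12, 0.8

# A. P(X <= 4): sum the probabilities for x = 0, 1, 2, 3, 4.
p_at_most_4 = sum(binomial_pmf(x, n, p) for x in range(0, 5))

# B. P(X >= 1): complement of observing no successes.
p_at_least_1 = 1 - binomial_pmf(0, n, p)

# C. P(X > 10): only x = 11 and x = 12 qualify.
p_more_than_10 = binomial_pmf(11, n, p) + binomial_pmf(12, n, p)

print(round(p_at_most_4, 4), round(p_at_least_1, 4), round(p_more_than_10, 4))
```

Part B is not exactly 1; it only rounds to 1 because P(X = 0) is about 4 in a billion.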
67The Shape of a Binomial
- If p is small, the distribution tends to be
skewed with a tail on the right.
68The Shape of a Binomial
- If p is near .5, the distribution is symmetrical.
69The Shape of a Binomial
- If p is large, the distribution tends to be
skewed with a tail on the left.
70The Expected Value and Variance of a Binomial
Random Variable
- The expected value of a binomial random variable
can be computed using the simple analytic
expression
- E(X) = np.
- The variance of a binomial random variable can be
computed using the analytic expression
- V(X) = np(1 - p) = npq,
- where q = (1 - p) by definition.
71Example 14
- The random variable X is a binomial random
variable with n = 12 and p = .8.
- A. Find the expected value of X.
- B. Find the variance of X.
- C. Find the standard deviation of X.
72Example 14 - Solution
- A. μ = E(X) = np = (12)(.8) = 9.6
- B. V(X) = np(1 - p) = (12)(.8)(1 - .8)
- = 1.92
- C. σ = √1.92 ≈ 1.386
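The three answers follow from the closed-form expressions E(X) = np and V(X) = np(1 - p). A minimal Python check, with illustrative variable names:

```python
import math

n, p = 12, 0.8

mean = n * p                   # E(X) = np
variance = n * p * (1 - p)     # V(X) = np(1 - p)
std_dev = math.sqrt(variance)  # standard deviation = sqrt(V(X))

print(round(mean, 3), round(variance, 3), round(std_dev, 3))
```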
73 74Poisson vs. Binomial
- The Poisson distribution is similar to the
binomial in that the random variable represents a
count of the total number of successes.
- The major difference between the two
distributions is that the Poisson does not have a
fixed number of trials.
- Instead, the Poisson uses a fixed interval of
time or space in which the number of successes
is recorded.
75Definition
- In order to qualify as a Poisson random variable,
an experiment must meet two conditions:
- 1. Successes occur one at a time. That is, two or
more successes cannot occur at exactly the same
point in time or exactly at the same point in
space.
- 2. The occurrence of a success in any interval is
independent of the occurrence of a success in
any other interval.
76The Poisson Probability Distribution Function
- $P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$, for x = 0, 1, 2, ...,
- where the transcendental constant e is the limit
of $(1 + 1/n)^n$ as n becomes large without bound,
- e = 2.71828..., and
- λ = average number of successes.
- Note: The variance of the Poisson distribution
is equal to the mean (λ).
77The Shape of the Poisson Distribution
- As λ increases, the shape of the Poisson
distribution begins to resemble a bell-shaped
distribution.
[Figure: Poisson distributions for λ = .3, λ = 3, and λ = 12]
78Poisson Random Variables for Time
- The majority of Poisson applications are related
to the number of occurrences of some event in a
specific duration of time.
- The average number of successes that occur
within the duration of time will define the one
and only parameter λ of the Poisson random
variable.
79Example 15 (morning calls)
- The number of calls received by an office on
Monday morning between 8:00 AM and 9:00 AM has a
Poisson distribution with λ equal to 4.0.
- X = the number of calls received by an office on
Monday morning between 8:00 AM and 9:00 AM
- λ = 4.0
80Example 15 - A (No morning calls)
- A. Determine the probability of getting no calls
between eight and nine in the morning.
81Example 15 - B (exactly 5 morning calls)
- B. Calculate the probability of getting exactly
five calls between eight and nine in the morning.
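The worked answers for parts A and B are not reproduced in the text. The sketch below shows how they could be computed from the standard Poisson formula P(X = x) = e^(-λ) λ^x / x! with λ = 4.0; `poisson_pmf` is an illustrative helper name.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 4.0  # average number of calls between 8:00 AM and 9:00 AM

p_no_calls = poisson_pmf(0, lam)    # A. no calls
p_five_calls = poisson_pmf(5, lam)  # B. exactly five calls

print(round(p_no_calls, 4), round(p_five_calls, 4))
```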
82Example 15 - C (E(X) of morning calls)
- C. What will be the expected number of calls
received by the office during this time period?
What is the variance?
- Remember that λ = 4.0, and that λ is the mean, or
average number of successes.
- Also, remember that the variance of a Poisson
distribution is equal to the mean.
- Thus, the expected number of calls is 4.0, and
the variance is also 4.0.
83Example 15 - D (Plot of morning calls)
- D. Graph the probability distribution of the
number of calls using values from the Poisson
distribution tables in the Appendix (pp. 528-532).
84Poisson Random Variables for Length and Space
- There are a number of Poisson applications that
measure the number of successes in some area or
length.
- The average number of successes in the area or
length will define the λ parameter of the Poisson
random variable.
85Example 16 (carpet weaving errors)
- The number of weaving errors in a twenty foot by
ten foot roll of carpet has a Poisson
distribution with λ equal to 0.1.
- X = the number of weaving errors in a 20x10 foot
roll of carpet
- λ = 0.1
86Example 16 - A (carpet weaving errors p.d.f.)
- A. Using the distribution tables, construct the
probability distribution for the carpet. λ = 0.1
87Example 16 - B (<2 carpet weaving errors)
- B. What is the probability of observing fewer
than 2 errors in the carpet? λ = 0.1
- P(X < 2) = P(X = 0) + P(X = 1)
- = .9048 + .0905 = .9953
88Example 16 - C (>5 carpet weaving errors)
- C. What is the probability of observing more
than 5 errors in the carpet? λ = 0.1
- P(X > 5) ≈ 0 (to four decimal places)
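The carpet probabilities can be generated from the Poisson formula instead of the tables. This is a sketch with λ = 0.1; `poisson_pmf` is an illustrative helper, and the table is cut off at x = 5 only for display.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson random variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

lam = 0.1  # average number of weaving errors per 20x10 foot roll

# A. The distribution for x = 0, ..., 5, rounded like a printed table.
table = {x: round(poisson_pmf(x, lam), 4) for x in range(6)}
print(table)

# B. P(X < 2) = P(X = 0) + P(X = 1).
p_less_than_2 = poisson_pmf(0, lam) + poisson_pmf(1, lam)

# C. P(X > 5) = 1 - P(X <= 5); tiny but not exactly zero.
p_more_than_5 = 1 - sum(poisson_pmf(x, lam) for x in range(6))

print(round(p_less_than_2, 4), p_more_than_5)
```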
89- The Hypergeometric Distribution
90Hypergeometric vs. Binomial
- Similarities
- Both random variables have only two outcomes on
each trial of the experiment.
- They both count the number of successes in n
trials of an experiment.
91Hypergeometric vs. Binomial
- Differences
- The hypergeometric distribution differs from the
binomial distribution in the lack of independence
between trials: the probability of success
will vary between trials for the hypergeometric
p.d.f.
- In addition, hypergeometric distributions have
finite populations in which the total number of
successes and failures is known.
92The Hypergeometric Probability Distribution
Function
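- $P(X = x) = \frac{\binom{A}{x} \binom{N - A}{n - x}}{\binom{N}{n}}$, where
- A = the number of successes in the population
(the largest number of successes possible),
- N = the size of the total population, and
- n = the size of the sample drawn.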
93Example 17 (distn of memory chips)
- Suppose that a shipment from Matsua Semiconductor
contains 30 memory chips of which two are bad.
- If a memory board requires 16 chips, what is the
probability distribution for the number of
defective chips on the memory board?
94Example 17 - Solution
- The random variable under consideration is given
as
- X = number of defective chips on the memory
board.
- The three parameters of the distribution are
- A = 2 (a success in this case is a defective
chip),
- N = 30, and
- n = 16.
- The maximum value of X in this case is 2 =
min(2, 16).
95Example 17 - Solution
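The solution table is not reproduced in the text. As a sketch, the distribution can be computed from the standard hypergeometric formula, P(X = x) = C(A, x) C(N - A, n - x) / C(N, n), with A = 2, N = 30, and n = 16; `hypergeom_pmf` is an illustrative helper name.

```python
from math import comb

def hypergeom_pmf(x, A, N, n):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N that contains A successes."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

A, N, n = 2, 30, 16  # defective chips, shipment size, chips per board

# X can be 0, 1, or 2, since max X = min(A, n) = 2.
chip_dist = {x: hypergeom_pmf(x, A, N, n) for x in range(min(A, n) + 1)}

for x, prob in chip_dist.items():
    print(x, round(prob, 4))
```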
96The Expected Value and Variance of a
Hypergeometric Random Variable
- The expected value of a hypergeometric random
variable can be obtained using the expression
- $E(X) = n \frac{A}{N}$
- The variance of a hypergeometric random variable
is
- $V(X) = n \frac{A}{N} \left(1 - \frac{A}{N}\right) \frac{N - n}{N - 1}$
97Example 18, E(X) and V(X) for Hypergeomc
- Compute the expected value and variance for the
random variable for memory chips defined in
Example 17.
- A = 2, N = 30, and n = 16
- E(X) = 16 (2/30) = 1.067
- V(X) = 16 (2/30)(28/30)(14/29) ≈ .481
- If the experiment were repeated many times, the
average number of defective chips per board would
be slightly greater than 1.
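The mean and variance can be checked two ways: from the standard closed-form expressions E(X) = nA/N and V(X) = n(A/N)(1 - A/N)(N - n)/(N - 1), and directly from the distribution itself. This is a sketch; the helper name `hypergeom_pmf` is illustrative.

```python
from math import comb

def hypergeom_pmf(x, A, N, n):
    """Hypergeometric probability of x successes in a sample of n."""
    return comb(A, x) * comb(N - A, n - x) / comb(N, n)

A, N, n = 2, 30, 16

# Closed-form expressions.
mean_formula = n * A / N
var_formula = n * (A / N) * (1 - A / N) * (N - n) / (N - 1)

# Direct computation from the distribution: weighted sums over x = 0, 1, 2.
probs = {x: hypergeom_pmf(x, A, N, n) for x in range(min(A, n) + 1)}
mean_direct = sum(x * p for x, p in probs.items())
var_direct = sum((x - mean_direct) ** 2 * p for x, p in probs.items())

print(round(mean_formula, 3), round(var_formula, 3))
print(round(mean_direct, 3), round(var_direct, 3))
```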