Title: Probability distributions
1Probability distributions
- Binomial probability distribution (ASW, section 5.4)
- Using Excel for the binomial (ASW, pp. 222-223)
- Uniform probability distribution (ASW, section 6.1)
- Normal probability distribution (ASW, section 6.2)
- Bring the text to class on Monday and Wednesday,
Sept. 29 and October 1. We will be using Tables
1 and 5 of Appendix B of ASW.
Notes for September 29, 2008
2Variance (ASW, 195)
- The variance of a probability distribution is the
expected value of the squares of the differences
of the random variable x from the mean $\mu$.
Symbolically,
$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$
- The Greek symbol $\sigma$ is sigma.
- The variance can be difficult to calculate and
interpret. It is in units that are the square of
the units of the random variable x. Partly because
of this, in statistical work it is more common to
use the square root of the variance, the standard
deviation $\sigma$. The standard deviation has the
same units as x.
3Variance of x, number of females selected
If a random sample of 3 persons is obtained from
a large population composed of half females and
half males, the expected number of females
selected is $\mu = 1.5$. The variance of the number
of females selected is
$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x) = 0.75$.
The standard deviation is the square root of 0.75,
so that $\sigma = 0.866$.
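The mean, variance and standard deviation quoted for this example can be checked by listing every possible sample. The following is only a sketch: it assumes the population is large enough that each of the 3 selections is female with probability 0.5, independently of the others, so that the 8 ordered outcomes are equally likely. The variable names are illustrative.

```python
from itertools import product
from math import sqrt

# The 2**3 = 8 equally likely ordered samples of size 3.
# "F" is a female and "M" is a male (assumed probability 0.5 each).
outcomes = list(product("FM", repeat=3))

# f[x] = probability that exactly x females are selected.
f = {x: 0.0 for x in range(4)}
for outcome in outcomes:
    f[outcome.count("F")] += 1 / len(outcomes)

# Mean and variance from their definitions.
mu = sum(x * fx for x, fx in f.items())
variance = sum((x - mu) ** 2 * fx for x, fx in f.items())
sigma = sqrt(variance)

print(f)
print(mu, variance, round(sigma, 3))
```

The variance comes out in "females squared", while the standard deviation is back in the same units as x.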
4Sample and population variance
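- The variance of a probability distribution is (ASW, 195)
$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$
- This is equivalent to the variance of a
population (ASW, 92)
$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$
- Note that the variance of a sample is
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$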
5Unbiased estimator
- The expected value of $s^2$ is equal to $\sigma^2$, a
characteristic that makes $s^2$ an unbiased
estimator of $\sigma^2$. That is, $E(s^2) = \sigma^2$.
- Using (n-1) in the denominator of $s^2$, rather
than n, produces this unbiased estimate.
- The concept of biased and unbiased estimators is
important in constructing good estimators and is
a major consideration in econometric work.
- When using Excel to estimate mean and standard
deviation, make sure you use the proper formulae.
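The effect of the (n-1) denominator can be illustrated exactly on a tiny example. The sketch below assumes a hypothetical population of three values and samples of size 2 drawn with replacement, so that the draws are independent. It averages the sample variance over every possible sample, once with n-1 and once with n in the denominator. None of these numbers come from ASW.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population, chosen only for illustration.
population = [2, 4, 9]
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance

def sample_variance(sample, denominator):
    """Sum of squared deviations from the sample mean, over `denominator`."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / denominator

n = 2
# All N**n equally likely ordered samples of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Expected value of each estimator = its average over all samples.
mean_s2_unbiased = sum(sample_variance(s, n - 1) for s in samples) / len(samples)
mean_s2_biased = sum(sample_variance(s, n) for s in samples) / len(samples)

print(sigma2, mean_s2_unbiased, mean_s2_biased)
```

With n-1 the average equals the population variance exactly. With n it is too small by the factor (n-1)/n.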
6Binomial probability distribution (ASW, 200)
- A binomial experiment is a probability experiment
with the following characteristics:
- The experiment has n identical trials.
- Two outcomes are possible on each trial: one
outcome is termed a success and the other is
termed a failure.
- The probability of a success occurring on each
trial is p. This probability p is the same on
each trial.
- Since the outcome must be either a success or a
failure, a failure is the complement of a success
and the probability of a failure is 1-p. (Some
texts refer to this probability as q, that is,
q = 1-p.)
- The trials are independent of each other.
7Given the above conditions
- The binomial probability distribution provides
the probability of x successes in n trials, where
x = 0, 1, 2, 3, ..., n.
- Note that there are only two parameters that
determine binomial probabilities:
- n, the number of trials.
- p, the probability of success.
- Successive trials must be independent of each
other. That is, the outcome of any one trial
must not affect the probability of success or
failure for any other trial.
- P(success | failure on any other trial) = p
- P(success | success on any other trial) = p
8Example: number of females selected in a random
sample of size 3 from a large population of half
males and half females.
x     f(x)
0     0.125
1     0.375
2     0.375
3     0.125
x is the number of females selected and f(x) is
the probability of x females being selected.
The above distribution is a binomial probability
distribution with success defined as selecting a
female. There are n = 3 independent trials, the
probability of success is p = 0.5, and x is the
number of successes. In this experiment,
selecting a male is termed a failure, and the
probability of selecting a male is
1-p = 1-0.5 = 0.5.
9Formula for binomial probability
If n is the number of trials of the binomial
experiment and p is the probability of success,
then the probability of x successes in n trials
of the experiment is given by the probability
function f(x), defined as follows:
$f(x) = \binom{n}{x} p^x (1-p)^{n-x}$, where $\binom{n}{x} = \frac{n!}{x!(n-x)!}$
10Using the binomial formula
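As a sketch of how the binomial probability function is used, the snippet below codes the standard form f(x) = C(n, x) p^x (1-p)^(n-x) and applies it to the earlier example of the number of females in a sample of 3. The function name binomial_pmf is an illustrative choice, not something taken from ASW.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Number of females in a random sample of 3 (n = 3, p = 0.5).
for x in range(4):
    print(x, binomial_pmf(x, 3, 0.5))
```

The four probabilities sum to 1, as they must for a probability distribution.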
11Combinations and permutations (ASW, 146-147)
- Permutations: the number of ways of arranging N
objects, taken n at a time, where the order of
the objects is taken into account, is
$P^N_n = n! \binom{N}{n} = \frac{N!}{(N-n)!}$
- where $\binom{N}{n} = \frac{N!}{n!(N-n)!}$ is the
number of possible combinations of N objects,
taken n at a time, where the order of the objects
does not matter.
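Both counts are available in Python's standard library, which gives a quick way to check hand calculations. In the sketch below the values N = 5 and n = 3 are hypothetical. The last line uses the count needed later for 8 successes in 13 trials.

```python
from math import comb, factorial, perm

N, n = 5, 3  # hypothetical values, for illustration only

permutations = perm(N, n)   # order matters: N! / (N - n)!
combinations = comb(N, n)   # order does not matter: N! / (n! (N - n)!)

print(permutations, combinations)

# Each combination of n objects can be ordered in n! ways.
print(combinations * factorial(n) == permutations)

# Number of ways of placing 8 successes among 13 trials.
print(comb(13, 8))
```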
12Rationale for the binomial formula
- The probability of x successes followed by (n-x)
failures is
$p \cdot p \cdots p \cdot (1-p)(1-p) \cdots (1-p)$
- This is $p^x (1-p)^{n-x}$ and represents the
probability of any particular sequence of x
successes and (n-x) failures.
- And there are $\binom{n}{x}$ ways of arranging
these x successes and (n-x) failures. To obtain
the probability of x successes in n trials,
multiply the probability of any particular
sequence by this number of combinations.
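This argument can be checked by brute force for a small case. The sketch below uses hypothetical values n = 5, p = 0.3 and x = 2. It lists every sequence of successes and failures, keeps those with exactly x successes, and adds up their probabilities. The helper name sequence_probability is illustrative.

```python
from itertools import product
from math import comb, isclose

n, p, x = 5, 0.3, 2  # hypothetical values, for illustration only

def sequence_probability(seq, p):
    """Probability of one particular sequence (1 = success, 0 = failure)."""
    prob = 1.0
    for outcome in seq:
        prob *= p if outcome == 1 else 1 - p
    return prob

# Every ordered sequence of n trials with exactly x successes.
sequences = [s for s in product((0, 1), repeat=n) if sum(s) == x]

brute_force = sum(sequence_probability(s, p) for s in sequences)
formula = comb(n, x) * p**x * (1 - p) ** (n - x)

print(len(sequences), comb(n, x))
print(brute_force, formula)
```

The number of qualifying sequences matches the number of combinations, and the two probabilities agree up to floating-point rounding.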
13Example selection of Saskatchewan workers,
classified by years of education and wages and
salaries
- From all these workers, randomly select 13
workers with 14-17 years of education. What is
the probability that exactly 8 of these will have
incomes of 45,000 or more? What is the
probability of 8 or more?
- A random sample from a large population means
that successive selections are independent of
each other. There are n = 13 workers selected.
If success is defined as selecting a worker with
an income of 45,000 or more, the probability of
success is p = 82/230 = 0.357.
- The probability that exactly 8 have incomes of
45,000 or more is 0.0373. See the following
slides for the calculation.
14Using the formula
15Probabilities to 3 decimal places
The probability of 8 or more successes is the sum
of the probabilities of 8, 9, 10, 11, 12, or 13
successes. This is 0.0373 + 0.0115 + 0.0026 +
0.0004 + 0.0000 + 0.0000 = 0.0518.
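These two probabilities can be reproduced with a few lines of code. The sketch below assumes the rounded value p = 0.357 used in the notes. The function name binomial_pmf is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 13, 0.357  # 13 workers, p rounded to 3 decimals as in the notes

# Exactly 8 workers with incomes of 45,000 or more.
f8 = binomial_pmf(8, n, p)

# 8 or more: add the probabilities for x = 8, 9, ..., 13.
tail = sum(binomial_pmf(x, n, p) for x in range(8, n + 1))

print(round(f8, 4))    # 0.0373
print(round(tail, 4))  # 0.0518
```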
16Using an Excel worksheet to obtain the
probabilities
17Formula in Excel
- n = 13 is in cell A1 and p = 0.357 is in cell A2.
18Mean and standard deviation
- For a binomial distribution with n trials and p
as the probability of success, the mean or
expected value and the variance of the random
variable x are
$E(x) = \mu = np$
$\mathrm{Var}(x) = \sigma^2 = np(1-p)$
- For the sex distribution of n = 3 individuals,
the expected number of females selected is
$3 \times 0.5 = 1.5$ and the variance is
$3 \times 0.5 \times 0.5 = 0.75$, as we previously
determined.
- For the experiment of selecting 13 individuals
with 14-17 years of education, the mean number of
those with incomes of 45,000 or more is
$13 \times 0.357 = 4.64$, the variance is
$13 \times 0.357 \times 0.643 = 2.984$, and the
standard deviation is 1.727.
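The shortcut formulas np and np(1-p) can be compared with the general definitions of the mean and variance. The sketch below does this for the 13-worker example with p = 0.357. The function names are illustrative.

```python
from math import comb, isclose, sqrt

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binomial_summary(n, p):
    """Mean, variance and standard deviation from the shortcut formulas."""
    variance = n * p * (1 - p)
    return n * p, variance, sqrt(variance)

n, p = 13, 0.357
mean, variance, sd = binomial_summary(n, p)
print(round(mean, 2), round(variance, 3), round(sd, 3))

# The same quantities from the definitions, summing over x = 0, ..., n.
direct_mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
direct_var = sum((x - direct_mean) ** 2 * binomial_pmf(x, n, p)
                 for x in range(n + 1))
print(isclose(direct_mean, mean), isclose(direct_var, variance))
```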
19Examples where binomial could be applied
- The probability of ten or more heads when
flipping a coin twelve times.
- The probability of 6 threes in 15 rolls of a die.
- The probability of selecting 56 or more
unemployed persons in a random sample of 500
workers in the province of Saskatchewan.
- The probability that the tax form has been
correctly completed in a random sample of 500
Canadian taxpayers.
- The probability that more than 1/3 of a sample
of 1,000 Saskatchewan residents has a university
degree.
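The first two examples can be worked out directly, since p is known once the coin and the die are assumed to be fair. That assumption is not stated in the notes. The sketch below uses exact fractions so that no rounding is involved. The remaining examples would need a value of p from data.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Ten or more heads in twelve flips, assuming a fair coin (p = 1/2).
p_heads = sum(binomial_pmf(x, 12, Fraction(1, 2)) for x in range(10, 13))

# Exactly 6 threes in 15 rolls, assuming a fair six-sided die (p = 1/6).
p_threes = binomial_pmf(6, 15, Fraction(1, 6))

print(p_heads, float(p_heads))
print(p_threes, float(p_threes))
```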
20Why might the binomial not apply in the following?
- The probability that there will be snow on 20 or
more days in January.
- The probability of 6 threes and 7 fives in 25
rolls of a die.
- The probability that the UR Rams win all of their
remaining football games.
- The probability that the Conservatives win 155 or
more seats, among the 308 up for election, in the
coming federal election.
- The probability that 10 or more automobiles in a
car dealer's lot in Regina will have defective
transmissions.
- The probability that fifty or more clients of the
Regina Food Bank, during the month of October,
will be unemployed.
21Extending the binomial
- When the number of trials of a binomial
experiment is large, i.e., if n is large, then it
is time-consuming to compute binomial
probabilities without a computer.
- In this case, it is possible to use the normal
distribution to approximate the binomial
probabilities. See ASW, section 6.3.
- In addition, we may not be as interested in the
number of successes as in the proportion of
successes. In this case, the normal
approximation can be used to obtain probabilities
for the proportion $\bar{p}$ of the times that a
success occurs. See ASW, section 7.6.
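To give a feel for how close the normal approximation can be, the sketch below compares an exact binomial tail probability with its normal approximation. The values n = 500, p = 0.10 and the cutoff of 56 are hypothetical choices for illustration, not figures from the notes or from ASW. The continuity correction of 0.5 is the usual adjustment made when a continuous curve stands in for a count.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 500, 0.10  # hypothetical values, for illustration only
k = 56            # interested in P(x >= 56)

# Exact binomial probability: sum f(x) for x = 56, ..., 500.
exact = sum(binomial_pmf(x, n, p) for x in range(k, n + 1))

# Normal curve with the binomial mean np and standard deviation.
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Continuity correction: use the area above k - 0.5.
approx = 1 - approx_dist.cdf(k - 0.5)

print(round(exact, 4), round(approx, 4))
```

The exact sum needs hundreds of terms, while the approximation needs only the mean, the standard deviation and one lookup of the normal curve.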
22Later on Monday or on Wednesday
- Uniform probability distribution.
- Normal probability distribution.
- Normal approximation to the binomial probability
distribution.