Title: Commonly-Used Discrete Distributions
1Commonly-Used Discrete Distributions
- Stat 700 Lecture 06
- 9/25/2001
2Overview of Lecture
- Commonly-used discrete distributions and when they are appropriate
- Discrete Uniform
- Bernoulli
- Binomial
- Hypergeometric
- Poisson
3Commonly-Used Discrete Distributions
- A discrete random variable U is uniformly distributed over the set {1, 2, 3, ..., N}, denoted U ~ UNIF{1, 2, ..., N}, where N is a positive integer, if its probability function is given by
- p(u) = 1/N for u = 1, 2, 3, ..., N.
- This is the model which assigns equal probabilities to the possible values of U.
- Applicability: random allocation.
4Uniform continued
- For U ~ UNIF{1, 2, ..., N}:
- Mean: μ = (N + 1)/2.
- Variance: σ² = (N² - 1)/12.
- For example, suppose the number of students in Stat 700 who will be taking Stat 701 is uniformly distributed over {1, 2, ..., 12}. Then N = 12, so
- μ = (12 + 1)/2 = 6.5
- σ² = (12² - 1)/12 = 143/12 ≈ 11.92.
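As a quick check of the discrete uniform mean and variance, the following Python sketch enumerates the values 1, ..., N and computes both moments directly from the definition. The function name `discrete_uniform_moments` is illustrative and not part of the lecture.

```python
from fractions import Fraction

def discrete_uniform_moments(N):
    """Mean and variance of U ~ UNIF{1, ..., N}, by direct enumeration."""
    values = range(1, N + 1)
    p = Fraction(1, N)                      # p(u) = 1/N for every u
    mean = sum(u * p for u in values)
    variance = sum((u - mean) ** 2 * p for u in values)
    return mean, variance

mean, variance = discrete_uniform_moments(12)
print(mean, variance)                       # exact fractions
print(float(mean), float(variance))         # decimal values
```

Exact fractions are used so that the result can be compared with (N + 1)/2 and (N² - 1)/12 without rounding error.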
5Bernoulli Trials
- An experiment is a Bernoulli trial if its outcomes can be classified into two categories: Success (S) or Failure (F). We could also use other labels, such as Good or Defective, Female or Male, etc.
- We denote by p the probability of a Success and by q = 1 - p the probability of a Failure.
- If X = 1 denotes Success and X = 0 denotes Failure, X is called a Bernoulli random variable.
- Its probability function is p(0) = q and p(1) = p.
6Examples of Bernoulli Trials
- Observing the outcome of a surgery: Success or Failure. How do we determine p?
- Observing whether, for an insured person, the event insured against occurs or not. Again, what will p be?
- Observing whether a machine will function for a specified period.
- Observing whether a genetic mutation has occurred.
- Determining if a biological organism will survive for a specified time.
- Observing whether it rains or not on a given day.
7Mean and Variance for a Bernoulli Random Variable
- If X has a Bernoulli distribution with success probability p, we write X ~ BER(p).
- For X ~ BER(p):
- Mean: μ = p, since μ = (0)(q) + (1)(p) = p.
- Variance: σ² = pq, since by the definitional formula,
- σ² = (0 - p)²(q) + (1 - p)²(p) = p²q + q²p = pq(p + q) = pq, because p + q = 1.
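The Bernoulli mean and variance can be checked numerically from the two-point probability function p(0) = q, p(1) = p. The sketch below is only an illustration; the name `bernoulli_moments` and the value p = 0.3 are assumptions chosen for the example.

```python
def bernoulli_moments(p):
    """Mean and variance of X ~ BER(p), from the definitional formulas."""
    q = 1 - p
    pmf = {0: q, 1: p}                      # p(0) = q, p(1) = p
    mean = sum(x * px for x, px in pmf.items())
    variance = sum((x - mean) ** 2 * px for x, px in pmf.items())
    return mean, variance

p = 0.3                                     # illustrative success probability
mean, variance = bernoulli_moments(p)
print(mean, p)                              # mean should equal p
print(variance, p * (1 - p))                # variance should equal pq
```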
8Binomial Experiments and Variables
- Consider now an experiment with the following characteristics:
- a) it consists of n Bernoulli trials (n replications);
- b) each of the n Bernoulli trials has probability p of Success (constant probability of success);
- c) the n Bernoulli trials are independent (trials don't affect each other).
- Then such an experiment is called a binomial experiment with parameters n and p.
- If X is the random variable that counts the number of successes out of the n trials, then X is said to be a binomial random variable with parameters n and p.
9Examples of Binomial Expts/Variables
- Example 1: Draw a sample of size n = 10, with replacement, from a box containing 20 red and 30 blue balls, and let X denote the number of red balls in the sample.
- Note the following:
- success is getting a red ball
- n = 10 (number of draws)
- p = 20/50 = .4 (probability of red per trial)
- independence of trials holds because sampling is with replacement
- X counts the number of reds (successes)
- X is binomial with n = 10 and p = .4.
10Examples continued
- Example 2: A very large population of people is such that 5% have a certain type of disease and 95% do not have the disease. A sample of size n = 100 is drawn from this population (without replacement), and X denotes the number of people in the sample who have the disease.
- Note that:
- a trial is picking a person and determining if he/she has the disease (success)
- independence is approximately satisfied because of the large size of the population
- p = 0.05 (approximately) for each of the trials
- Therefore, X is (approximately) binomial with n = 100 and p = 0.05.
11Probability Function of a Binomial Random Variable
- Let X be a binomial random variable with parameters n and p, denoted X ~ BIN(n,p). Then, the range of X is R = {0, 1, 2, ..., n}.
- Its probability function is given by (recall that q = 1 - p)
- p(x) = (nCx) p^x q^(n-x), x = 0, 1, 2, ..., n.
12Parameters of Binomial Distribution
- For X a binomial random variable with parameters n and p, its mean, variance, and standard deviation are given by
- Mean: μ = np
- Variance: σ² = npq
- Standard Deviation: σ = (npq)^(1/2)
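The standard binomial results μ = np and σ² = npq can be confirmed by summing over the range of X. The following Python sketch does this for one illustrative choice of n and p; the function names `binomial_pmf` and `moments_from_pmf` are assumptions made for the example, not names from the course.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """p(x) = (nCx) p^x q^(n-x) for x = 0, 1, ..., n, with q = 1 - p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def moments_from_pmf(n, p):
    """Mean, variance, and standard deviation, summed over the range of X."""
    xs = range(n + 1)
    mean = sum(x * binomial_pmf(x, n, p) for x in xs)
    variance = sum((x - mean) ** 2 * binomial_pmf(x, n, p) for x in xs)
    return mean, variance, sqrt(variance)

n, p = 10, 0.4                              # illustrative values
mean, variance, sd = moments_from_pmf(n, p)
print(mean, n * p)                          # compare with np
print(variance, n * p * (1 - p))            # compare with npq
print(sd, sqrt(n * p * (1 - p)))            # compare with (npq)^(1/2)
```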
13Applications of the Binomial Distribution
- Example 1 continued: In this example, X, which counts the number of red balls in the sample of size 10, is binomial with n = 10 and p = 0.40. Therefore, the probability function is given by
- p(x) = (10Cx)(.4)^x(.6)^(10-x), x = 0, 1, 2, ..., 10.
14Example 1 continued
- Mean: μ = np = 10(.4) = 4
- Variance: σ² = npq = (10)(.4)(.6) = 2.4
- Std. Dev.: σ = (2.4)^(1/2) = 1.549.
- Could also compute probabilities such as
- P(X = 3) = (10C3)(.4)^3(.6)^(10-3) = (120)(.064)(.028) ≈ 0.215.
- P(X ≤ 1) = (10C0)(.4)^0(.6)^10 + (10C1)(.4)^1(.6)^9 = 0.0060 + 0.0403 = 0.0463.
- Such cumulative probabilities could be obtained using binomial tables or Minitab.
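For readers without binomial tables or Minitab, a table of the probability function and cumulative probabilities for BIN(10, 0.4) can be produced with a few lines of Python. This is a minimal sketch using only the standard library; the helper name `binomial_pmf` is an assumption for illustration.

```python
from math import comb

def binomial_pmf(x, n, p):
    """p(x) = (nCx) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.4
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

# Cumulative probabilities P(X <= x) as running sums of the pmf
cdf = []
running = 0.0
for px in pmf:
    running += px
    cdf.append(running)

print(" x    p(x)  P(X<=x)")
for x in range(n + 1):
    print(f"{x:2d}  {pmf[x]:.4f}  {cdf[x]:.4f}")
```

Here `pmf[3]` is P(X = 3) and `cdf[1]` is P(X ≤ 1). Because the terms are not rounded before adding, the cumulative value for x = 1 comes out as 0.0464 to four decimals rather than 0.0463.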
15The Probability Function and the Cumulative
Probabilities (obtained using Minitab)
16Graph of the Probability Function
17Binomial Examples continued
- Example 2 continued: In this situation, the variable X, which is the number of diseased individuals in a sample of size 100, is binomial with n = 100 and p = 0.05. Its probability function is therefore
- p(x) = (100Cx)(.05)^x(.95)^(100-x), x = 0, 1, 2, ..., 100.
18Graph of the Probability Function of a
BIN(100,0.05) Variable
19Graph of the Cumulative Distribution Function for
BIN(100,0.05)
20Example continued
- From the graph of BIN(100,0.05), notice that even though the distribution is right-skewed, when we focus on the small values (from 0 to 10) the shape is approximately mound-shaped. This is a manifestation of the normal approximation to the binomial.
- For this X ~ BIN(100,0.05):
- Mean: μ = (100)(0.05) = 5
- Variance: σ² = (100)(0.05)(0.95) = 4.75
- Standard Deviation: σ = (4.75)^(1/2) = 2.18
21Probabilities in Certain Intervals about the Mean
(Probs were calculated using computer)
- P{μ - σ < X < μ + σ} = P{5 - 2.18 < X < 5 + 2.18} = P{2.82 < X < 7.18} = p(3) + p(4) + p(5) + p(6) + p(7) = .1396 + .1781 + .1800 + .1500 + .1060 = 0.7537
- P{μ - 2σ < X < μ + 2σ} = P{5 - 2(2.18) < X < 5 + 2(2.18)} = P{0.64 < X < 9.36} = p(1) + p(2) + p(3) + p(4) + p(5) + p(6) + p(7) + p(8) + p(9) = 0.9659
- P{μ - 3σ < X < μ + 3σ} = P{5 - 3(2.18) < X < 5 + 3(2.18)} = P{-1.54 < X < 11.54} = p(0) + p(1) + p(2) + p(3) + p(4) + p(5) + p(6) + p(7) + p(8) + p(9) + p(10) + p(11) = 0.9957.
- Compare the agreement of these values with what is expected under the empirical rule (about 68%, 95%, and 99.7%). Quite good!
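The interval probabilities for BIN(100, 0.05) can be reproduced by summing the probability function over the integers that fall strictly inside each interval. The sketch below assumes nothing beyond the Python standard library; `prob_within` is a hypothetical helper name.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """p(x) = (nCx) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def prob_within(k, n, p):
    """P{mu - k*sigma < X < mu + k*sigma} for X ~ BIN(n, p)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    lo, hi = mu - k * sigma, mu + k * sigma
    return sum(binomial_pmf(x, n, p) for x in range(n + 1) if lo < x < hi)

n, p = 100, 0.05
within = {k: prob_within(k, n, p) for k in (1, 2, 3)}
for k, prob in within.items():
    print(f"within {k} standard deviation(s) of the mean: {prob:.4f}")
```

Small differences in the last decimal place relative to hand calculations can arise because the code adds unrounded terms.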
22Using the Binomial Tables
- The binomial tables in the back of the book provide the probabilities for certain values of n and p.
- They provide p(k) = P{X = k} for k = 0, 1, 2, ..., n.
- To compute P{a < X ≤ b} for integers a < b we use
- P{a < X ≤ b} = p(a+1) + ... + p(b), provided that the values of n and p are in the table.
- Otherwise, computer programs (Minitab) or calculators could be used to calculate the probabilities.
23Using the Binomial Tables
- Example: Assume that the proportion of all adult Americans who do not approve of President Bush's handling of the current crisis is 0.20. Suppose that a sample of size n = 20 is taken from this population, and we let X denote the number in this sample who do not approve of Bush's handling of the crisis.
24Example continued
- Since the population sampled is so large and sampling is random, the variable X will be binomial with parameters n = 20 and p = 0.20.
- μ = 20(.2) = 4 and σ = [20(.2)(.8)]^(1/2) = (3.2)^(1/2) = 1.79.
- Using the binomial table with n = 20 and p = 0.20, we also obtain P{X = 3} = .2054.
- Also, P{X ≤ 3} = p(0) + p(1) + p(2) + p(3) = .0115 + .0576 + .1369 + .2054 = .4114.
- If X is BIN(n,p) with p > .5, then Y = n - X is BIN(n,1-p), so convert the problem into one about Y.
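The table lookups for n = 20 and p = 0.20, and the conversion from X to Y = n - X when p > .5, can be sketched in Python as follows. The helper names and the particular event P{X ≥ 17} for X ~ BIN(20, 0.8) are assumptions chosen only to illustrate the conversion.

```python
from math import comb

def binomial_pmf(x, n, p):
    """p(x) = (nCx) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def binomial_cdf(k, n, p):
    """P{X <= k} = p(0) + p(1) + ... + p(k)."""
    return sum(binomial_pmf(x, n, p) for x in range(k + 1))

n, p = 20, 0.20
p3 = binomial_pmf(3, n, p)        # P{X = 3}
cum3 = binomial_cdf(3, n, p)      # P{X <= 3}
print(round(p3, 4), round(cum3, 4))

# Conversion when p > .5: if X ~ BIN(20, 0.8), then Y = 20 - X ~ BIN(20, 0.2),
# and the event {X >= 17} is the same as the event {Y <= 3}.
direct = sum(binomial_pmf(x, n, 0.8) for x in range(17, n + 1))
via_y = binomial_cdf(3, n, 0.2)
print(direct, via_y)
```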
25From Binomial to Poisson
- A box contains many, many balls, some of which are red and others blue. Let p be the proportion of red balls in the box. Suppose we draw n balls (with replacement) from this box, and X denotes the number of red balls in the sample.
- Then, from what we have studied so far, X has a binomial distribution with parameters n and p. That is,
- p(x) = (nCx) p^x q^(n-x), x = 0, 1, 2, ..., n.
26Continued ...
- Imagine the situation where we increase the sample size n but at the same time decrease the proportion p of red balls in the box such that the mean μ = np remains equal to some constant λ, that is, μ = np = λ.
- Notice that as we increase the value of n, the set of possible values of X also becomes larger.
- Clearly, the distribution of X as n increases still remains Binomial(n,p).
- But the question is:
- What happens to the Binomial(n,p) when we let n approach infinity under the constraint np = λ?
27Here comes the Poisson!
- It turns out that under such a situation, we obtain a new distribution, which is a limiting distribution of the binomial.
- This new distribution, called the Poisson distribution, is usually an excellent model for rare events.
- The relation with the binomial is as follows: as n approaches infinity with np = λ held fixed, (nCx) p^x q^(n-x) approaches e^(-λ) λ^x / x! for each x = 0, 1, 2, ...
28Poisson Distribution
- A discrete random variable X taking values in the set {0, 1, 2, 3, ...} is said to have a Poisson distribution with parameter or intensity rate λ > 0, written X ~ POI(λ), if its probability function is given by the formula
- p(x) = e^(-λ) λ^x / x!, x = 0, 1, 2, 3, ...
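The limiting relationship between the binomial and the Poisson can be seen numerically by holding λ = np fixed while n grows. The sketch below uses λ = 2 as an illustrative value and measures the largest gap between the two probability functions over x = 0, ..., 10; the function names are assumptions, not part of the lecture.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """p(x) = (nCx) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """p(x) = e^(-lam) lam^x / x! for x = 0, 1, 2, ..."""
    return exp(-lam) * lam**x / factorial(x)

lam = 2.0                                   # illustrative intensity rate
gaps = []
for n in (10, 100, 1000, 10000):
    p = lam / n                             # keeps np = lam as n grows
    gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam))
              for x in range(11))
    gaps.append(gap)
    print(f"n = {n:5d}, p = {p:.4f}, largest gap = {gap:.6f}")
```

The largest gap shrinks as n increases, which is the sense in which the Poisson is a limiting distribution of the binomial.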
29Some Properties of the Poisson Distribution
- It provides an approximation to the binomial probabilities when the parameters in BIN(n,p) are such that p is small and n is large. The Poisson approximation is the one with intensity λ = np.
- It is a model for counting the occurrence of rare events, such as epidemics, getting cancer, accidents, terrorist acts, etc.
- For X ~ POI(λ), the mean of X is μ = λ and, at the same time, the variance is σ² = λ. These results can be seen by recalling the mean and variance for the binomial distribution: np = λ, and npq = λ(1 - p) approaches λ as p becomes small.
30Applications of the Poisson
- Problem: Suppose that the proportion in the population who are infected with a deadly virus is 0.0005. If 10000 people are chosen at random, what is the probability that exactly 4 out of these 10000 people will be infected?
- Solution 1: Since X ~ BIN(10000,.0005), we could calculate the probability by making use of the binomial probability function to obtain
- P{X = 4} = (10000C4)(.0005)^4(.9995)^9996 = 0.1754936.
31Solution continued
- Solution 2: Since n = 10000 is very large and p = .0005 is very small, we could also approximate this probability by using the Poisson distribution with parameter or intensity rate λ = np = (10000)(.0005) = 5. Consequently, the approximate probability is
- P{X = 4} ≈ e^(-5) 5^4/4! = 0.175467.
- Comparing this approximate value with the exact value obtained from the binomial, notice that the approximation is very good.
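Both solutions to the virus problem can be evaluated directly, which makes the quality of the Poisson approximation easy to see. The following is a minimal Python sketch using only the standard library; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, x = 10000, 0.0005, 4

# Solution 1: exact binomial probability P{X = 4} for X ~ BIN(10000, .0005)
exact = comb(n, x) * p**x * (1 - p)**(n - x)

# Solution 2: Poisson approximation with intensity lam = np = 5
lam = n * p
approx = exp(-lam) * lam**x / factorial(x)

print(f"binomial (exact):      {exact:.7f}")
print(f"Poisson (approximate): {approx:.7f}")
print(f"absolute difference:   {abs(exact - approx):.7f}")
```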
32Graph of the Probability Function of a
Poisson(λ = 2) Distribution
33Sampling Without Replacement
- Let there be a group consisting of N objects, K of which are of Type I and (N-K) of Type II.
- Sample n objects, with replacement, and let X denote the number of Type I objects in the sample. Then X has a binomial distribution with parameters n and p = K/N.
- In this case, the mean is μ = n(K/N) and the variance is σ² = n(K/N)(1 - K/N).
- What happens to the distribution of X when sampling is without replacement?
34Hypergeometric Distribution
- When sampling is without replacement, the probability function of X, the number of Type I objects in the sample, is
- p(x) = (KCx)((N-K)C(n-x))/(NCn), x = 0, 1, 2, ..., n.
- This is called the hypergeometric distribution.
- Mean of X: μ = n(K/N), same as in the binomial.
- Var(X) = σ² = n(K/N)(1 - K/N)(N-n)/(N-1).
- This distribution is approximately BIN(n,K/N)
when N is large compared to n.
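The hypergeometric formulas, and the closeness to BIN(n, K/N) when N is large compared to n, can be checked with a short Python sketch. The numbers below (20 Type I objects among N = 50, sample of n = 10, and then a population 1000 times larger with the same proportion) are hypothetical values chosen for illustration, as are the function names.

```python
from math import comb

def hypergeom_pmf(x, N, K, n):
    """p(x) = (KCx)((N-K)C(n-x))/(NCn): sampling n objects without replacement."""
    return comb(K, x) * comb(N - K, n - x) / comb(N, n)

def binomial_pmf(x, n, p):
    """p(x) = (nCx) p^x (1 - p)^(n - x): sampling with replacement."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Hypothetical small population: mean and variance by direct summation
N, K, n = 50, 20, 10
xs = range(n + 1)
mean = sum(x * hypergeom_pmf(x, N, K, n) for x in xs)
variance = sum((x - mean) ** 2 * hypergeom_pmf(x, N, K, n) for x in xs)
print(mean, n * K / N)
print(variance, n * (K / N) * (1 - K / N) * (N - n) / (N - 1))

# Same proportion K/N = 0.4 in a much larger population
big_N, big_K = 50000, 20000
max_gap = max(abs(hypergeom_pmf(x, big_N, big_K, n)
                  - binomial_pmf(x, n, big_K / big_N)) for x in xs)
print(max_gap)                              # small when N is large relative to n
```

For the small population the variance is noticeably below the binomial value n(K/N)(1 - K/N) because of the factor (N-n)/(N-1); for the large population the two probability functions nearly coincide.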