Commonly-Used Discrete Distributions

Provided by: edsel

Learn more at: https://people.stat.sc.edu

1
Commonly-Used Discrete Distributions

Stat 700 Lecture 06
9/25/2001

2
Overview of Lecture

Commonly-used discrete distributions and when
they are appropriate
Discrete Uniform
Bernoulli
Binomial
Hypergeometric
Poisson

3
Commonly-Used Discrete Distributions

A discrete random variable U is uniformly
distributed over the set {1, 2, 3, ..., N}, denoted
U ~ UNIF{1, 2, ..., N}, where N is a positive integer,
if its probability function is given by
p(u) = 1/N for u = 1, 2, 3, ..., N.
This is the model which assigns equal
probabilities to the possible values of U.
Applicability: Random allocation.

4
Uniform continued

For U ~ UNIF{1, 2, ..., N}:
Mean: μ = (N + 1)/2.
Variance: σ² = (N^2 - 1)/12.
For example, suppose the number of students in
Stat 700 who will be taking Stat 701 is uniformly
distributed over {1, 2, ..., 12}. Then N = 12, so
μ = (12 + 1)/2 = 6.5
σ² = (12^2 - 1)/12 = 143/12 ≈ 11.92.
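
The mean and variance formulas for the discrete uniform can be checked by direct enumeration from the definition. The sketch below is only an illustration; the function name `uniform_mean_var` is an assumption of this example, not something from the lecture.

```python
from fractions import Fraction

def uniform_mean_var(N):
    """Mean and variance of UNIF{1, ..., N}, computed from the definition."""
    p = Fraction(1, N)                       # p(u) = 1/N for every u
    values = range(1, N + 1)
    mean = sum(u * p for u in values)        # sum of u * p(u)
    var = sum((u - mean) ** 2 * p for u in values)
    return mean, var

# The Stat 700 / Stat 701 example with N = 12.
mean, var = uniform_mean_var(12)
print(float(mean), float(var))   # compare with (N + 1)/2 and (N^2 - 1)/12
```

Exact fractions are used so that the comparison with (N + 1)/2 and (N^2 - 1)/12 is free of rounding error.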

5
Bernoulli Trials

An experiment is a Bernoulli trial if its
outcomes can be classified into two categories,
Success (S) or Failure (F). (We could also use
other labels such as Good or Defective,
Female or Male, etc.)
We denote by p the probability of a Success and
by q = 1 - p the probability of a Failure.
If X = 1 denotes Success and X = 0 denotes
Failure, X is called a Bernoulli random
variable.
Its probability function is p(0) = q and p(1) = p.

6
Examples of Bernoulli Trials

Observing the outcome of a surgery: Success or
Failure. How would we determine p?
Observing whether for an insured person the event
insured against occurs or not. Again, what will
be p?
Observing whether a machine will function for a
specified period.
Observing whether a genetic mutation has
occurred.
Determining if a biological organism will survive
for a specified time.
Observing whether it rains or not for a given day.

7
Mean and Variance for a Bernoulli Random Variable

If X has a Bernoulli distribution with success
probability p, we write X ~ BER(p).
For X ~ BER(p):
Mean: μ = p, since μ = (0)(q) + (1)(p) = p.
Variance: σ² = pq, since by the definitional
formula,
σ² = (0 - p)^2(q) + (1 - p)^2(p) = p^2 q + q^2 p
= pq(p + q) = pq, because p + q = 1.
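
The same definitional computation can be written out in a few lines of Python. This is a minimal sketch; the function name `bernoulli_mean_var` and the value p = 0.3 are illustrative assumptions, not from the lecture.

```python
def bernoulli_mean_var(p):
    """Mean and variance of BER(p) from the definitional formulas."""
    q = 1 - p
    pmf = {0: q, 1: p}                       # p(0) = q, p(1) = p
    mean = sum(x * prob for x, prob in pmf.items())
    var = sum((x - mean) ** 2 * prob for x, prob in pmf.items())
    return mean, var

ber_mean, ber_var = bernoulli_mean_var(0.3)
print(ber_mean, ber_var)   # should agree with p and p*q up to rounding
```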

8
Binomial Experiments and Variables

Consider now an experiment with the following
characteristics:
a) it consists of n Bernoulli trials (n
replications);
b) each of the n Bernoulli trials has
probability p of Success (constant probability
of success);
c) the n Bernoulli trials are independent
(trials don't affect each other).
Then such an experiment is called a binomial
experiment with parameters n and p.
If X is the random variable that counts the
number of successes out of the n trials, then X
is said to be a binomial random variable with
parameters n and p.

9
Examples of Binomial Expts/Variables

Example 1: Draw a sample of size n = 10, with
replacement, from a box containing 20 red and 30
blue balls, and let X denote the number of red
balls in the sample.
Note the following:
success is getting a red
n = 10 (number of draws)
p = 20/50 = .4 (probability of red per trial)
independence of the trials holds because sampling is with
replacement
X counts the number of reds (successes)
X is binomial with n = 10 and p = .4.

10
Examples continued

Example 2: A very large population of people is
such that 5% have a certain type of disease, and
95% do not have the disease. A sample of size
n = 100 is drawn from this population (without
replacement), and X denotes the number of people
in the sample who have the disease.
Note that
a trial is picking a person and determining if
he/she has the disease (success)
independence is approximately satisfied because
of the large size of the population
p = 0.05 (approximately) for each of the trials
Therefore, X is (approximately) binomial with n =
100 and p = 0.05.

11
Probability Function of a Binomial Random Variable

Let X be a binomial random variable with
parameters n and p, denoted X ~ BIN(n,p). Then,
the range of X is R = {0, 1, 2, ..., n}.
Its probability function is given by (recall that
q = 1 - p)
p(x) = (nCx) p^x q^(n-x), x = 0, 1, 2, ..., n.

12
Parameters of Binomial Distribution

For X a binomial random variable with parameters
n and p, its mean, variance, and standard
deviation are given by
μ = np, σ² = npq, σ = (npq)^(1/2).

13
Applications of the Binomial Distribution

Example 1 continued: In this example, X, which
counts the number of red balls in the sample of
size 10, is binomial with n = 10 and p = 0.40.
Therefore, the probability function is given by
p(x) = (10Cx)(.4)^x(.6)^(10-x), x = 0, 1, 2, ..., 10.

14
Example 1 continued

Mean: μ = np = 10(.4) = 4
Variance: σ² = npq = (10)(.4)(.6) = 2.4
Std. Dev.: σ = (2.4)^(1/2) = 1.549.
Could also compute probabilities such as
P(X = 3) = 10C3(.4)^3(.6)^(10-3) ≈ (120)(.064)(.0280)
≈ 0.2150.
P(X ≤ 1) = 10C0(.4)^0(.6)^10 + 10C1(.4)^1(.6)^9
≈ 0.00605 + 0.04031 ≈ 0.0464.
Such cumulative probabilities could be obtained
using binomial tables or Minitab.
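
As an alternative to tables or Minitab, the quantities in Example 1 can be reproduced with Python's standard library. This is a sketch only; the helper name `binom_pmf` is an assumption of this example. Because (.6)^7 is carried at full precision here, the last digit may differ slightly from a hand calculation that uses rounded factors.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """p(x) = (nCx) p^x q^(n-x) for X ~ BIN(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.4                      # Example 1: 10 draws, P(red) = .4
mean = n * p                        # np
var = n * p * (1 - p)               # npq
sd = sqrt(var)                      # (npq)^(1/2)

p_eq_3 = binom_pmf(3, n, p)                             # P(X = 3)
p_le_1 = sum(binom_pmf(x, n, p) for x in range(0, 2))   # P(X <= 1) = p(0) + p(1)

print(mean, var, round(sd, 3), round(p_eq_3, 4), round(p_le_1, 4))
```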

15
The Probability Function and the Cumulative
Probabilities (obtained using Minitab)

16
Graph of the Probability Function

17
Binomial Examples continued

Example 2 continued: In this situation, the
variable X, which is the number of diseased
individuals in a sample of size 100, is binomial
with n = 100 and p = 0.05. Its probability
function is therefore
p(x) = (100Cx)(.05)^x(.95)^(100-x), x = 0, 1, 2, ..., 100.

18
Graph of the Probability Function of a
BIN(100,0.05) Variable

19
Graph of the Cumulative Distribution Function for
BIN(100,0.05)

20
Example continued

From the graph of BIN(100,0.05), notice that even
though it is very right-skewed, when we focus on
the small values (from 0 to 10), the shape is
approximately mound-shaped. This is
a manifestation of the normal approximation to
the binomial.
For this X ~ BIN(100,0.05):
Mean: μ = (100)(0.05) = 5
Variance: σ² = (100)(0.05)(0.95) = 4.75
Standard Deviation: σ = (4.75)^(1/2) = 2.18

21
Probabilities in Certain Intervals about the Mean
(Probs were calculated using computer)

P{μ - σ < X < μ + σ} = P{5 - 2.18 < X < 5 + 2.18}
= P{2.82 < X < 7.18} = p(3) + p(4) + p(5) + p(6)
+ p(7) = .1396 + .1781 + .1800 + .1500 + .1060
= 0.7537
P{μ - 2σ < X < μ + 2σ} = P{5 - 2(2.18) < X < 5 +
2(2.18)} = P{0.64 < X < 9.36} = p(1) + p(2) +
p(3) + p(4) + p(5) + p(6) + p(7) + p(8) + p(9)
= 0.9659
P{μ - 3σ < X < μ + 3σ} = P{5 - 3(2.18) < X < 5 +
3(2.18)} = P{-1.54 < X < 11.54} = p(0) + p(1) +
p(2) + p(3) + p(4) + p(5) + p(6) + p(7) + p(8) +
p(9) + p(10) + p(11) = 0.9957.
Compare the agreement of these values with what
is expected under the empirical rule (about 0.68,
0.95, and 0.997). Quite good!
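
The three interval probabilities for BIN(100, 0.05) can be recomputed as follows. This is a minimal sketch; the function names `binom_pmf` and `prob_within` are assumptions of this example, and the strict inequalities mirror the intervals used on the slide.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """p(x) = (nCx) p^x q^(n-x) for X ~ BIN(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def prob_within(k, n, p):
    """P{mu - k*sigma < X < mu + k*sigma} for X ~ BIN(n, p)."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    lo, hi = mu - k * sigma, mu + k * sigma
    # Add up p(x) over the integers strictly inside the interval.
    return sum(binom_pmf(x, n, p) for x in range(n + 1) if lo < x < hi)

probs = [prob_within(k, 100, 0.05) for k in (1, 2, 3)]
for k, pr in zip((1, 2, 3), probs):
    print(k, round(pr, 4))
```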

22
Using the Binomial Tables

The binomial tables in the back of the book
provide the probabilities for certain values of n
and p.
They provide p(k) = P{X = k} for k = 0, 1, 2, ...,
n.
To compute P{a < X ≤ b} we use
P{a < X ≤ b} = p(a+1) + ... + p(b), provided that
the values of n and p are in the table.
Otherwise, computer programs (Minitab) or
calculators could be used to calculate the
probabilities.

23
Using the Binomial Tables

Example: Assume that the proportion of all adult
Americans who do not approve of President
Bush's handling of the current crisis is 0.20.
Suppose that a sample of size n = 20 is taken
from this population, and we let X denote the
number in this sample who do not approve of
Bush's handling of the crisis.

24
Example continued

Since the population sampled is so large and
sampling is random, the variable X will be
binomial with parameters n = 20 and p = 0.20.
μ = 20(.2) = 4 and σ = [20(.2)(.8)]^(1/2) =
(3.2)^(1/2) = 1.79.
Using the binomial table with n = 20 and p =
0.20, we also obtain P{X = 3} = .2054.
Also, P{X ≤ 3} = p(0) + p(1) + p(2) + p(3) =
.0115 + .0576 + .1369 + .2054 = .4114.
If X is BIN(n,p) with p > .5, then Y = n - X is
BIN(n,1-p), so convert the problem into one about Y.
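
The table lookups for n = 20 and p = 0.20, and the conversion to Y = n - X when p > .5, can be sketched in Python. The helper names `binom_pmf` and `binom_pmf_via_failures` are assumptions of this example, as is the choice of p = 0.8 used to illustrate the conversion.

```python
from math import comb

def binom_pmf(x, n, p):
    """p(x) = (nCx) p^x q^(n-x) for X ~ BIN(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 20, 0.20
p_eq_3 = binom_pmf(3, n, p)                         # P{X = 3}
p_le_3 = sum(binom_pmf(x, n, p) for x in range(4))  # p(0) + p(1) + p(2) + p(3)
print(round(p_eq_3, 4), round(p_le_3, 4))

def binom_pmf_via_failures(x, n, p):
    """P{X = x} for X ~ BIN(n, p), computed as P{Y = n - x} with Y ~ BIN(n, 1 - p).

    Useful when a table only lists success probabilities up to .5.
    """
    return binom_pmf(n - x, n, 1 - p)

# Illustration with p = 0.8: both routes give the same probability.
print(binom_pmf(15, 20, 0.8), binom_pmf_via_failures(15, 20, 0.8))
```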

25
From Binomial to Poisson

A box contains many, many balls some of which are
red and others blue. Let p be the proportion of
red balls in the box. Suppose we draw n balls
(with replacement) from this box, and X denotes
the number of red balls in the sample.
Then, from what we have studied so far, X has a
binomial distribution with parameters n and p.
That is,
p(x) = (nCx) p^x q^(n-x), x = 0, 1, 2, ..., n.

26
Continued ...

Imagine the situation where we increase the
sample size n but at the same time decrease the
proportion p of red balls in the box such that the
mean μ = np remains equal to some constant λ,
that is, μ = np = λ.
Notice that as we increase the value of n, the
set of possible values of X also becomes
larger.
Clearly, the distribution of X as n increases
still remains Binomial(n,p).
But the question is:
What happens to the Binomial(n,p) when we let n
approach infinity with the constraint np = λ?

27
Here comes the Poisson!

It turns out that under such a situation, we
obtain a new distribution, which is a limiting
distribution of the binomial.
This new distribution, called the Poisson
distribution, is usually an excellent model for
rare events.
The relation with the binomial is as follows:
for each fixed x, as n approaches infinity with np = λ,
(nCx) p^x q^(n-x) approaches e^(-λ) λ^x / x!.
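
The limiting relation between the binomial and the Poisson can be illustrated numerically: hold np = λ fixed, let n grow, and compare the binomial probability with the Poisson probability e^(-λ) λ^x / x!. In the sketch below, the values λ = 2 and x = 3 and the helper names are arbitrary choices made for illustration, not values from the lecture.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    """p(x) = (nCx) p^x q^(n-x) for X ~ BIN(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """p(x) = e^(-lam) lam^x / x! for X ~ POI(lam)."""
    return exp(-lam) * lam**x / factorial(x)

lam, x = 2.0, 3                     # illustrative choices
sizes = (10, 100, 1000, 10000)
errors = []
for n in sizes:
    p = lam / n                     # keeps np = lam as n grows
    b = binom_pmf(x, n, p)
    errors.append(abs(b - poisson_pmf(x, lam)))
    print(n, b, poisson_pmf(x, lam))
```

The gap between the two columns shrinks as n increases, which is the behavior the limit describes.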

28
Poisson Distribution

A discrete random variable X taking values in the
set {0, 1, 2, 3, ...} is said to have a Poisson
distribution with parameter or intensity rate λ >
0, written X ~ POI(λ), if its probability
function is given by the formula
p(x) = e^(-λ) λ^x / x!, x = 0, 1, 2, ....

29
Some Properties of the Poisson Distribution

Provides an approximation to the binomial
probabilities when the parameters in BIN(n,p) are
such that p is small and n is large. The Poisson
approximation is the one with intensity λ = np.
It is a model for counting the occurrences of
rare events, such as the occurrence of
epidemics, getting cancer, accidents, terrorist
acts, etc.
For X ~ POI(λ), the mean of X is μ = λ and at the
same time, the variance is σ² = λ. These results
can be seen by recalling the mean and variance
for the binomial distribution: with np = λ held
fixed, μ = np = λ and σ² = npq = λ(1 - p), which
approaches λ as p approaches 0.

30
Applications of the Poisson

Problem: Suppose that the proportion in the
population who are infected with a deadly virus
is 0.0005. If 10000 people are chosen at random,
what is the probability that exactly 4 out of
these 10000 people will be infected?
Solution 1: Since X ~ BIN(10000,.0005), we could
calculate the probability by making use of the
binomial probability function to obtain
P{X = 4} = (10000C4)(.0005)^4(.9995)^9996 =
0.1754936.

31
Solution continued

Solution 2: Since n = 10000 is very large, and p =
.0005 is very small, we could also approximate
this probability by using the Poisson
distribution with parameter or intensity rate λ =
np = (10000)(.0005) = 5. Consequently, the
approximate probability is
P{X = 4} ≈ (e^(-5))5^4/4! = 0.175467.
Comparing this approximate value with the exact
value obtained from the binomial, notice that the
approximation is very good.
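
Both solutions to the virus problem can be computed side by side. This is a minimal sketch using only Python's standard library; the variable names are illustrative.

```python
from math import comb, exp, factorial

n, p, x = 10000, 0.0005, 4

# Solution 1: exact binomial probability (nCx) p^x q^(n-x).
exact = comb(n, x) * p**x * (1 - p)**(n - x)

# Solution 2: Poisson approximation with intensity lam = np.
lam = n * p
approx = exp(-lam) * lam**x / factorial(x)

print(exact, approx, abs(exact - approx))
```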

32
Graph of the Probability Function of a
Poisson(λ = 2) Distribution

33
Sampling Without Replacement

Let there be a group consisting of N objects, K
of which are of Type I and (N-K) of Type II.
If we sample n objects with replacement and let X
denote the number of Type I objects in the sample,
then X has a binomial distribution with parameters
n and p = K/N.
In this case, the mean is μ = n(K/N) and the
variance is σ² = n(K/N)(1 - K/N).
What happens to the distribution of X when
sampling is without replacement?

34
Hypergeometric Distribution

When sampling is without replacement, the
probability function of X, the number of Type I
objects in the sample, is
p(x) = (KCx)((N-K)C(n-x))/(NCn), x = 0, 1, 2, ..., n.
This is called the hypergeometric distribution.
Mean of X: μ = n(K/N), same as in the binomial.
Var(X) = σ² = n(K/N)(1 - K/N)(N-n)/(N-1).
This distribution is approximately BIN(n,K/N)
when N is large compared to n.
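
The hypergeometric formulas can be checked by enumeration. The sketch below reuses the box of 20 red and 30 blue balls from the earlier binomial example, but draws the 10 balls without replacement; that reuse, the helper names, and the larger box (N = 5000, K = 2000) used to illustrate the binomial approximation are all assumptions of this example.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, K, n):
    """p(x) = (KCx)((N-K)C(n-x))/(NCn), for x = 0, 1, ..., n."""
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))

def binom_pmf(x, n, p):
    """p(x) = (nCx) p^x q^(n-x) for X ~ BIN(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

N, K, n = 50, 20, 10                 # 20 red, 30 blue, 10 draws
pmf = {x: hypergeom_pmf(x, N, K, n) for x in range(n + 1)}
hg_mean = sum(x * pr for x, pr in pmf.items())
hg_var = sum((x - hg_mean) ** 2 * pr for x, pr in pmf.items())
print(hg_mean, hg_var)               # compare with n(K/N) and n(K/N)(1-K/N)(N-n)/(N-1)

# With N large compared to n, the hypergeometric is close to BIN(n, K/N).
big = float(hypergeom_pmf(3, 5000, 2000, 10))
print(big, binom_pmf(3, 10, 0.4))
```

The finite-population factor (N-n)/(N-1) makes the variance smaller than the binomial value npq whenever n > 1.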
