Supplement 12: Finite-Population Correction Factor

1
Supplement 12: Finite-Population Correction Factor

Ka-fu WONG Yi KE

The ppt is a joint effort: Mr. KE Yi discussed
whether we should make the finite-population
correction with Dr. Ka-fu Wong on 19 April 2007;
Ka-fu explained the problem; Yi drafted the ppt;
Ka-fu revised it. Use it at your own risk.
Comments, if any, should be sent to
kafuwong_at_econ.hku.hk.
2
Binomial distribution

The binomial distribution has the following
characteristics:
An outcome of an experiment is classified into
one of two mutually exclusive categories, such as
a success or failure.
The data collected are the results of counts in a
series of trials.
The probability of success stays the same for
each trial.
The trials are independent.
For example, tossing an unfair coin three times.
H is labeled success and T is labeled failure.
The data collected are number of H in the three
tosses.
The probability of H stays the same for each
toss.
The results of the tosses are independent.

3
Hypergeometric Distribution

The hypergeometric distribution has the following
characteristics:
There are only 2 possible outcomes.
The probability of a success is not the same on
each trial.
It results from a count of the number of
successes in a fixed number of trials.

4
Example: Hypergeometric Distribution
In a bag containing 7 red chips and 5 blue chips,
you select 2 chips one after the other without
replacement.
First draw R1 (7/12): second draw R2 (6/11) or B2 (5/11).
First draw B1 (5/12): second draw R2 (7/11) or B2 (4/11).
The probability of a success (red chip) is not
the same on each trial.
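The tree for the 7-red, 5-blue bag can be checked with exact fractions. The following is a minimal sketch, not part of the original slides; the function name `two_draw_tree` is hypothetical.

```python
from fractions import Fraction

def two_draw_tree(red, blue):
    """Exact probability of each two-draw path, drawing without replacement."""
    total = red + blue
    paths = {}
    for first, n_first in (("R", red), ("B", blue)):
        p_first = Fraction(n_first, total)
        for second, n_second in (("R", red), ("B", blue)):
            # One chip of the first colour is already gone on the second draw.
            remaining = n_second - (1 if second == first else 0)
            paths[first + second] = p_first * Fraction(remaining, total - 1)
    return paths

paths = two_draw_tree(7, 5)
for path, prob in paths.items():
    print(path, prob)
```

The four path probabilities sum to 1, and the chance of red on the second draw depends on what was drawn first (6/11 after a red, 7/11 after a blue).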
5
Binomial vs. hypergeometric
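The formula for the binomial probability
distribution is $P(x) = C(n,x)\, p^{x} (1-p)^{n-x}$
The formula for the hypergeometric probability
distribution is $P(x) = \frac{C(S,x)\, C(N-S,\, n-x)}{C(N,n)}$
Here N is the population size, S is the number of
successes in the population, n is the sample size
and x is the number of successes in the sample.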
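The two formulas can be evaluated side by side. This is a minimal sketch using Python's standard `math.comb`; the function names are hypothetical. It takes N as the population size, S as the number of successes in the population, n as the sample size and x as the number of successes in the sample, and it borrows the 7-red, 5-blue bag from the earlier example, with p = 7/12 for the binomial.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(x) = C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def hypergeometric_pmf(x, n, S, N):
    """P(x) = C(S, x) * C(N - S, n - x) / C(N, n)."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)

# Probability of drawing 2 red chips in 2 draws from 7 red + 5 blue.
hyper = hypergeometric_pmf(x=2, n=2, S=7, N=12)   # without replacement
binom = binomial_pmf(x=2, n=2, p=7 / 12)          # constant p on each trial
print(f"hypergeometric: {hyper:.4f}, binomial: {binom:.4f}")
```

With such a small bag the two answers differ noticeably, because removing one chip changes the composition of the bag.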
6
Example: Hypergeometric Distribution
In a bag containing 700 red chips and 500 blue
chips, you select 2 chips one after the other
without replacement.
First draw R1 (700/1200): second draw R2 (699/1199) or B2 (500/1199).
First draw B1 (500/1200): second draw R2 (700/1199) or B2 (499/1199).
The probability of a success (red chip) is not
the same on each trial but the difference is very
small. So small that our conclusion will not be
affected much even if we ignore the difference in
probability in the two trials.
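To see how quickly the difference shrinks, one can compare the probability of two reds with and without replacement as the bag grows at a fixed 7:5 ratio. This is a minimal sketch, not from the original slides; the function names and the chosen scales are assumptions for illustration.

```python
from math import comb

def p_two_red_without_replacement(red, blue):
    """Hypergeometric: both chips red when drawing 2 without replacement."""
    return comb(red, 2) / comb(red + blue, 2)

def p_two_red_constant_p(red, blue):
    """Binomial approximation: treat p = red / total as fixed on both draws."""
    return (red / (red + blue)) ** 2

def gap(scale):
    """Absolute difference between the two answers for a 7:5 bag of a given scale."""
    red, blue = 7 * scale, 5 * scale
    return abs(p_two_red_constant_p(red, blue)
               - p_two_red_without_replacement(red, blue))

for scale in (1, 10, 100, 1000):
    print(f"N = {12 * scale:>6}: gap = {gap(scale):.6f}")
```

The gap at N = 1200 is far smaller than at N = 12, which is the sense in which the difference in probability between the two trials can be ignored.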
7
Example: Hypergeometric Distribution
In a bag containing 700 red chips and 500 blue
chips you select 2 chips one after the other
without replacement.
The probability of a success (red chip) is not
the same on each trial but the difference is very
small. So small that our conclusion will not be
affected even if we ignore the difference in
probability in the two trials.
Knowing when we can ignore the difference is
important when
1. we have limited computational power, or
2. N is unknown, but we know that N is
large relative to the sample size.
8
Finite? Infinite?

Actually, most populations we work with are
finite:
Number of goods produced in a workshop per hour
The number of times you visit a club per month
Even the population of a big city, say Shanghai

ALL FINITE!!
9
Cost and benefits

Would it be harmful to make the correction every
time you come across a finite population?
No harm!
But the correction can be costly:
We need to collect / know the population size.
We need additional computational power.
We need to remember one additional formula.
Whether to recognize the finite population (and
hence whether to use a different formula) depends
on the benefits and costs of doing so.

10
Cost and benefits

Whether to recognize the finite population (and
hence whether to use a different formula) depends
on the benefits and costs of doing so.
The benefit increases with the ratio of sample
size to population size (n/N).
The cost is higher when
we do not have a computer with us, or
we need to make an extra effort to find out N.

Use an approximation instead of the exact formula
if
cost of using the exact formula > benefit
11
Finite? Infinite?

Actually, most populations we work with are
finite:
Number of goods produced in a workshop per hour
The number of times you visit a club per month
Even the population of a big city, say Shanghai

ALL FINITE!!
But do we know N ?
12
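An Example: the benefit decreases when n/N decreases

A population of size N = 500 (finite), σ is known.
When the sample size is n = 300:
With the finite-population correction, the standard
error of the sample mean is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} = \frac{\sigma}{\sqrt{300}} \sqrt{\frac{500-300}{500-1}} \approx 0.633\, \frac{\sigma}{\sqrt{300}}$
Why? A correction factor of 0.633 makes a big difference.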

The benefit of recognizing the finite population
(in terms of the conclusion of testing hypothesis
and constructing CI) appears big.
13
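An Example: the benefit decreases when n/N decreases

A population of size N = 500 (finite), σ is known.
When the sample size is n = 5:
With the finite-population correction, the standard
error of the sample mean is
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}} = \frac{\sigma}{\sqrt{5}} \sqrt{\frac{500-5}{500-1}} \approx 0.996\, \frac{\sigma}{\sqrt{5}}$
Why? A correction factor of 0.996 makes a small difference.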

The benefit of recognizing the finite population
(in terms of the conclusion of testing hypothesis
and constructing CI) appears small.
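The factors 0.633 and 0.996 quoted in the two examples are consistent with the usual finite-population correction factor sqrt((N - n) / (N - 1)). The following minimal sketch reproduces them; the function names and the value of sigma are assumptions for illustration only.

```python
from math import sqrt

def fpc(N, n):
    """Finite-population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))

def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean; applies the correction if N is given."""
    se = sigma / sqrt(n)
    return se if N is None else se * fpc(N, n)

sigma = 10.0  # hypothetical population standard deviation
for n in (300, 5):
    print(f"n = {n:>3}: factor = {fpc(500, n):.3f}, "
          f"SE uncorrected = {standard_error_mean(sigma, n):.4f}, "
          f"SE corrected = {standard_error_mean(sigma, n, N=500):.4f}")
```

With n = 300 the corrected standard error is roughly a third smaller than the uncorrected one; with n = 5 the two are nearly identical.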
14
N relative to n

1st case: 500 is relatively small when the sample
size is 300, so the correction is needed.
2nd case: 500 is relatively large when the sample
size is 5, so the correction can be ignored (a finite
population can be treated as infinite).
Notes
Usually, if n/N < 0.05, the finite-population
correction factor is ignored.
This applies to sample proportions as well.
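The rule of thumb can be written as a small helper for sample proportions. This is a minimal sketch under stated assumptions: the function name, the `threshold` parameter and the value p = 0.5 are hypothetical, and the slides do not say how to treat n/N exactly equal to 0.05, so the sketch applies the correction in that case.

```python
from math import sqrt

def standard_error_proportion(p, n, N=None, threshold=0.05):
    """Standard error of a sample proportion.

    The finite-population correction is applied only when N is known
    and n / N is at least `threshold` (the 5% rule of thumb).
    """
    se = sqrt(p * (1 - p) / n)
    if N is not None and n / N >= threshold:
        se *= sqrt((N - n) / (N - 1))
    return se

p = 0.5  # hypothetical proportion of successes
small_sample = standard_error_proportion(p, n=5, N=500)    # n/N = 0.01, no correction
large_sample = standard_error_proportion(p, n=300, N=500)  # n/N = 0.60, corrected
print(f"n = 5: {small_sample:.4f}, n = 300: {large_sample:.4f}")
```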