Title: Sampling Distributions
1Sampling Distributions
- A review by Hieu Nguyen (03/27/06)
2Parameter vs Statistic
- A parameter is a description of the entire
population.
- Example: A parameter for the US population is the
proportion of all people who support President
Bush's nomination of Samuel Alito to the Supreme
Court.
- p = 0.74
3Parameter vs Statistic
- A statistic is a description of a sample taken
from the population. It is only an estimate of
the population parameter.
- Example: In a poll of 1001 Americans, 73% of
those surveyed supported Alito's nomination.
- p-hat = 0.73
4Bias
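- The bias of a statistic measures how far the
center of its sampling distribution is from the
population parameter.
- A statistic is unbiased if the mean of its
sampling distribution equals the population
parameter. A single sample does not have to match
the parameter exactly.
- Example: The poll result p-hat = 0.73 differs from
p = 0.74, but the sample proportion is still
unbiased if the results of many such polls
average out to 0.74.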
5Sampling Variability
- Samples naturally have varying results. The mean
or sample proportion of one sample may be
different from that of another.
- In the poll mentioned before, p-hat = 0.73.
- A repetition of the same poll may have
p-hat = 0.75.
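Sampling variability can be seen in a short simulation. The following is a minimal sketch, assuming the true proportion is p = 0.74 and each poll is a simple random sample of 1001 people; the name `run_poll` and the seed are illustrative choices, not part of the original review.

```python
import random

P_TRUE = 0.74   # assumed population proportion (from the Alito example)
N = 1001        # poll size used in the example

def run_poll(rng, p=P_TRUE, n=N):
    """Simulate one poll and return its sample proportion p-hat."""
    supporters = sum(1 for _ in range(n) if rng.random() < p)
    return supporters / n

rng = random.Random(2006)  # fixed seed so the run is repeatable
p_hats = [run_poll(rng) for _ in range(5)]
for i, p_hat in enumerate(p_hats, start=1):
    print(f"poll {i}: p-hat = {p_hat:.3f}")
```

Each simulated poll draws from the same population, yet the printed values of p-hat differ from one poll to the next.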
6Central Limit Theorem (CLT)
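- Populations that are wildly skewed may cause
samples to vary a great deal.
- However, the CLT states that when the sample size
is large enough, the sampling distribution of the
sample proportion (or mean) is approximately
normal and centered at the population parameter,
whatever the shape of the population.
- The CLT is related to the law of large numbers,
which says that the sample mean tends to get
closer to the population mean as the sample size
grows.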
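As a rough illustration of the CLT with a skewed population, the sketch below draws repeated samples from an exponential population (mean 1, standard deviation 1) and looks at the sample means. The population, the sample size of 50, and the 2000 repetitions are all assumptions chosen for the demonstration.

```python
import random
import statistics

rng = random.Random(1)
N = 50            # sample size (illustrative choice)
REPS = 2000       # number of repeated samples

# Exponential population with rate 1: strongly right-skewed, mean 1, SD 1.
sample_means = [
    statistics.fmean(rng.expovariate(1.0) for _ in range(N))
    for _ in range(REPS)
]

center = statistics.fmean(sample_means)
spread = statistics.stdev(sample_means)
predicted_spread = 1.0 / N ** 0.5   # sigma / sqrt(n)

# Under a normal model, about 95% of sample means fall within
# 1.96 standard deviations of the population mean.
within = sum(abs(m - 1.0) <= 1.96 * predicted_spread
             for m in sample_means) / REPS
print(f"center {center:.3f}, spread {spread:.3f} "
      f"(predicted {predicted_spread:.3f})")
print(f"share within 1.96 SDs of the population mean: {within:.3f}")
```

Even though individual values are skewed, the share of sample means inside the normal-model band should come out close to 95%.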
7CLT Example
- Imagine that many polls of 1001 Americans are
done to find the proportion of those who
supported Alito's nomination.
- Although the poll results vary, most samples have
a sample proportion that is close to the
population parameter p = 0.74.
8CLT Example
- Plot the sample proportions of all samples to see
the effects of the CLT. Notice how there are more
sample proportions near the population parameter
p = 0.74.
- This histogram is actually a sampling
distribution.
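The histogram described here can be approximated with a simulation. This is a sketch only, assuming p = 0.74, polls of 1001 people, and 500 repeated polls (the number of polls and the text-based histogram are illustrative choices).

```python
import random
from collections import Counter

rng = random.Random(42)
P_TRUE, N, POLLS = 0.74, 1001, 500   # assumed values from the running example

def poll(rng):
    """Return p-hat for one simulated poll of N people."""
    return sum(rng.random() < P_TRUE for _ in range(N)) / N

p_hats = [poll(rng) for _ in range(POLLS)]

# Bin the results to the nearest 0.01 and print a text histogram.
bins = Counter(round(p_hat, 2) for p_hat in p_hats)
for value in sorted(bins):
    print(f"{value:.2f} {'#' * (bins[value] // 5)}")

mean_p_hat = sum(p_hats) / POLLS
print(f"mean of all p-hats: {mean_p_hat:.4f}")
```

The bars pile up around 0.74 and thin out on both sides, which is the bell shape a sampling distribution is expected to show.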
9Sampling Distributions Definition
- Textbook definition: A sampling distribution is
the distribution of values taken by the statistic
in all possible samples of the same size from the
same population.
- In other words, a sampling distribution is a
histogram of the statistics from samples of the
same size from a population.
10Two Most Common Types of Sampling Distributions
- Sample Proportion Distribution
- Distribution of the sample proportions of samples
from a population
- Sample Mean Distribution
- Distribution of the sample means of samples from
a population
- For both types, the ideal shape is a normal
distribution
11Sampling Distributions Conditions
- Before assuming that a sampling distribution is
normal, check the following conditions:
- Plausible Independence
- Randomness
- Each sample is less than 10% of the population
12Sampling Distributions As Normal Distributions
- When all conditions are met, the sampling
distribution can be considered a normal
distribution with a center and a spread.
- Note: With sample proportion distributions,
another condition must be met.
- Success-failure condition: there must be at least
10 successes and 10 failures according to the
population parameter and sample size (np ≥ 10 and
n(1 - p) ≥ 10).
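The two numeric conditions can be checked mechanically. The helper below is a hypothetical sketch (the name `check_conditions` and the population size of 300 million are assumptions made for illustration); independence and randomness depend on how the data were collected and cannot be verified by arithmetic.

```python
def check_conditions(n, p, population_size):
    """Return the checkable conditions for a sample proportion model.

    Independence and randomness depend on the sampling design,
    so they are not checked here.
    """
    return {
        # 10% condition: the sample is less than 10% of the population.
        "ten_percent": n < 0.10 * population_size,
        # Success-failure condition: at least 10 expected of each outcome.
        "success_failure": n * p >= 10 and n * (1 - p) >= 10,
    }

# Poll example: n = 1001, p = 0.74; the population size is an assumed
# round figure, not a value given in the review.
result = check_conditions(n=1001, p=0.74, population_size=300_000_000)
print(result)
```

For the poll example both checks pass, since 1001 × 0.74 and 1001 × 0.26 are both far above 10.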
13Sampling Distributions As Normal Distributions
Equations
- Sample Proportion Distribution
- p = population proportion (given)
- Sample Mean Distribution
- µ = population mean (given)
- σ = population standard deviation (given)
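For reference, the standard normal models for these two distributions are usually written as follows, where n is the sample size and the second argument of N is the standard deviation. This is stated as the textbook form under the conditions above, not as a reproduction of the slide.

```latex
\hat{p} \sim N\!\left(p,\ \sqrt{\frac{p(1-p)}{n}}\right)
\qquad
\bar{x} \sim N\!\left(\mu,\ \frac{\sigma}{\sqrt{n}}\right)
```

Here \sigma denotes the population standard deviation. In both cases the spread shrinks as n grows.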
14Sampling Distributions As Normal Distributions
Note
- Note: If any of the parameters are unknown, use
the statistics from a sample to approximate them.
15Using Sampling Distributions
- Sampling distributions can be used to estimate the
probability of getting a certain statistic in a
random sample.
- Use z-scores or the NormalCDF function on the
TI-83/84.
16Using Sampling Distributions Z-Scores w/ Example
- Use the z-score table to find appropriate
probabilities.
- Example: Find the probability that a poll of
Americans on Alito's nomination will
return a sample proportion of 0.72.
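The z-score calculation for this example might look like the sketch below. Two assumptions are made that the slide does not state: the poll size is n = 1001 (taken from the earlier poll), and the question is read as "a sample proportion of 0.72 or lower", since a normal model assigns zero probability to any single exact value.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.74, 1001      # population proportion; n = 1001 is an assumption
p_hat = 0.72

sd_p_hat = sqrt(p * (1 - p) / n)       # standard deviation of p-hat
z = (p_hat - p) / sd_p_hat             # z-score of the observed proportion
prob_at_most = NormalDist().cdf(z)     # what a z-table gives for P(Z <= z)

print(f"z = {z:.2f}, P(p-hat <= {p_hat}) = {prob_at_most:.4f}")
```

`NormalDist().cdf(z)` plays the role of the z-table lookup, with z about -1.44 under these assumptions.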
17Using Sampling Distributions NormalCDF Function
w/ Example
- The syntax for the NormalCDF function is
- NormalCDF(lower limit, upper limit, µ, σ)
- Example: Find the probability that a sample of
size 25 will have a mean of 5 given that the
population has a mean of 7 and a standard
deviation of 3.
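Outside the calculator, the same computation can be sketched in Python. The `normalcdf` helper below is a hypothetical stand-in that mirrors the calculator's argument order, and the example is read as "a sample mean of 5 or less", which is an assumption about the intended question.

```python
from math import sqrt
from statistics import NormalDist

def normalcdf(lower, upper, mu=0.0, sigma=1.0):
    """Area under the N(mu, sigma) curve between lower and upper."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)

mu, sigma, n = 7, 3, 25
sd_mean = sigma / sqrt(n)                    # 3 / 5 = 0.6
# -1e99 stands in for negative infinity, as is common on the calculator.
prob = normalcdf(-1e99, 5, mu, sd_mean)      # P(sample mean <= 5)
print(f"P(x-bar <= 5) = {prob:.5f}")
```

Note that the fourth argument is the standard deviation of the sample mean (0.6), not the population standard deviation (3).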
18Sampling Distribution for Two Populations
- Use a difference sampling distribution if the
question presents 2 different populations.
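If the two quantities X and Y are independent, the usual model for their difference has the parameters below. Variances add; standard deviations do not.

```latex
\mu_{X-Y} = \mu_X - \mu_Y
\qquad
\sigma_{X-Y} = \sqrt{\sigma_X^2 + \sigma_Y^2}
```

When X and Y are each normally distributed, the difference X - Y is normal as well.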
19Sampling Distribution for Two Populations
Example
- (adapted from AP Statistics Chapter 9
Sampling Distribution Multiple Choice Questions)
- Medium oranges have a mean weight of 14 oz and a
standard deviation of 2 oz. Large oranges have a
mean weight of 18 oz and a standard deviation of
3 oz. Find the probability of finding a medium
orange that weighs more than a large orange.
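One way to work this example is sketched below. It assumes that both weights follow normal models and that the two oranges are chosen independently, neither of which is stated on the slide.

```python
from math import sqrt
from statistics import NormalDist

medium = NormalDist(mu=14, sigma=2)   # weights in oz
large = NormalDist(mu=18, sigma=3)

# Difference D = medium - large; variances add for independent variables.
diff = NormalDist(medium.mean - large.mean,
                  sqrt(medium.variance + large.variance))
prob_medium_heavier = 1 - diff.cdf(0)   # P(D > 0)
print(f"P(medium > large) = {prob_medium_heavier:.4f}")
```

The difference model is N(-4, √13), so the answer is the area above 0 under that curve.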
20Example Problem
- (adapted from De Veaux, Sampling Distribution Models,
Exercise 42)
- Ayrshire cows average 47 pounds of milk a day,
with a standard deviation of 6 pounds. For Jersey
cows, the mean daily production is 43 pounds,
with a standard deviation of 5 pounds. Assume
that Normal models describe milk production for
these breeds.
- A) We select an Ayrshire at random. What's the
probability that she averages more than 50 pounds
of milk a day?
- B) What's the probability that a randomly
selected Ayrshire gives more milk than a randomly
selected Jersey?
- C) A farmer has 20 Jerseys. What's the
probability that the average production for this
small herd exceeds 45 pounds of milk a day?
- D) A neighboring farmer has 10 Ayrshires. What's
the probability that his herd average is at least
5 pounds higher than the average for the Jersey
herd?
21Example Problem Solution
- First, check the assumptions
- Independent samples
- Randomness
- Sample represents less than 10% of population
22Example Problem Solution
- A) Use the normal model to estimate the
appropriate probability.
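Part A can be sketched with the N(47, 6) model given in the problem; the variable names are illustrative.

```python
from statistics import NormalDist

ayrshire = NormalDist(mu=47, sigma=6)   # pounds of milk per day
prob_a = 1 - ayrshire.cdf(50)           # P(X > 50), z = (50 - 47) / 6 = 0.5
print(f"A) P(X > 50) = {prob_a:.4f}")
```

This is a single-cow question, so no division by √n is involved.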
23Example Problem Solution
- B) Create a normal model for the difference
between Ayrshires and Jerseys. Use the model to
estimate the appropriate probability.
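Part B can be sketched as follows, assuming the two cows are selected independently so that the variances of the two breeds add.

```python
from math import sqrt
from statistics import NormalDist

# Difference D = Ayrshire - Jersey for one cow of each breed.
diff_b = NormalDist(mu=47 - 43, sigma=sqrt(6**2 + 5**2))   # N(4, sqrt(61))
prob_b = 1 - diff_b.cdf(0)                                  # P(D > 0)
print(f"B) P(Ayrshire > Jersey) = {prob_b:.4f}")
```

The Ayrshire gives more milk exactly when the difference D is positive.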
24Example Problem Solution
- C) Create a sampling distribution model for which
n = 20 Jerseys. Use the model to estimate the
appropriate probability.
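Part C can be sketched by scaling the Jersey standard deviation by √20 to get the model for the herd average; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_jerseys = 20
# Sampling distribution of the herd mean: N(43, 5 / sqrt(20)).
jersey_herd_mean = NormalDist(mu=43, sigma=5 / sqrt(n_jerseys))
prob_c = 1 - jersey_herd_mean.cdf(45)    # P(herd average > 45)
print(f"C) P(x-bar > 45) = {prob_c:.4f}")
```

Because the herd average varies much less than a single cow, an average above 45 pounds is a fairly unlikely outcome.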
25Example Problem Solution
- D) First create a sampling distribution model for
10 random Ayrshires and 20 random Jerseys. Then
create a normal model for the difference between
the 10 Ayrshires and 20 Jerseys.
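Part D combines the two previous ideas. The sketch below assumes the two herds are independent random samples, builds the model for each herd average, and then the model for their difference.

```python
from math import sqrt
from statistics import NormalDist

ayr_sd = 6 / sqrt(10)    # SD of the average of 10 Ayrshires
jer_sd = 5 / sqrt(20)    # SD of the average of 20 Jerseys

# Difference of herd averages: mean 47 - 43, variances add.
diff_d = NormalDist(mu=47 - 43, sigma=sqrt(ayr_sd**2 + jer_sd**2))
prob_d = 1 - diff_d.cdf(5)   # P(Ayrshire average - Jersey average >= 5)
print(f"D) P(difference >= 5) = {prob_d:.4f}")
```

The standard deviation of the difference is √(3.6 + 1.25) = √4.85, roughly 2.2 pounds.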