Title: How Likely Are the Possible Values of a Statistic?
1Section 6.4
- How Likely Are the Possible Values of a
Statistic?
- The Sampling Distribution
2Statistic
- Recall: A statistic is a numerical summary of
sample data, such as a sample proportion or a
sample mean.
3Parameter
- Recall: A parameter is a numerical summary of a
population, such as a population proportion or a
population mean.
4Statistics and Parameters
- In practice, we seldom know the values of
parameters.
- Parameters are estimated using sample data.
- We use statistics to estimate parameters.
5Example 2003 California Recall Election
- Prior to counting the votes, the proportion in
favor of recalling Governor Gray Davis was an
unknown parameter.
- An exit poll of 3160 voters reported that the
sample proportion in favor of a recall was 0.54.
6Example 2003 California Recall Election
- If a different random sample of about 3000 voters
were selected, a different sample proportion
would occur.
7Example 2003 California Recall Election
- Imagine all the distinct samples of 3000 voters
you could possibly get.
- Each such sample has a value for the sample
proportion.
8Statistics and Parameters
- How do we know that a sample statistic is a good
estimate of a population parameter?
- To answer this, we need to look at a probability
distribution called the sampling distribution.
9Sampling Distribution
- The sampling distribution of a statistic is the
probability distribution that specifies
probabilities for the possible values the
statistic can take.
10The Sampling Distribution of the Sample Proportion
- Look at each possible sample.
- Find the sample proportion for each sample.
- Construct the frequency distribution of the
sample proportion values.
- This frequency distribution is the sampling
distribution of the sample proportion.
11Example Sampling Distribution
- Which Brand of Pizza Do You Prefer?
- Two choices: A or D.
- Assume that half of the population prefers Brand
A and half prefers Brand D.
- Take a random sample of n = 3 tasters.
12Example Sampling Distribution
Sample     No. Prefer Pizza A   Proportion
(A,A,A)    3                    1
(A,A,D)    2                    2/3
(A,D,A)    2                    2/3
(D,A,A)    2                    2/3
(A,D,D)    1                    1/3
(D,A,D)    1                    1/3
(D,D,A)    1                    1/3
(D,D,D)    0                    0
13Example Sampling Distribution
Sample Proportion   Probability
0                   1/8
1/3                 3/8
2/3                 3/8
1                   1/8
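The two tables above can be reproduced by brute-force enumeration. The following is a minimal sketch, assuming each taster independently prefers A or D with probability 1/2, so that all eight ordered samples are equally likely; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

n = 3  # sample size from the pizza example

# All ordered samples of n tasters; equally likely only because p = 1/2.
samples = list(product("AD", repeat=n))

# Count how many samples give each value of the sample proportion preferring A.
counts = Counter(Fraction(s.count("A"), n) for s in samples)

# Relative frequency of each proportion is its probability.
sampling_dist = {
    prop: Fraction(c, len(samples)) for prop, c in sorted(counts.items())
}

for prop, prob in sampling_dist.items():
    print(f"proportion {prop}: probability {prob}")
```

If the population were not split evenly, the samples would no longer be equally likely and each one would need to be weighted by its own probability.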
14Example Sampling Distribution
15Mean and Standard Deviation of the Sampling
Distribution of a Proportion
- For a binomial random variable with n trials and
probability p of success for each, the sampling
distribution of the proportion of successes has
mean $p$ and standard deviation $\sqrt{p(1-p)/n}$.
- To obtain these values, take the mean $np$ and
standard deviation $\sqrt{np(1-p)}$ for the
binomial distribution of the number of successes
and divide by n.
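As a check on the "divide by n" rule, the sketch below computes the mean and standard deviation of the proportion of successes two ways: directly from the binomial probabilities, and by dividing the binomial mean and standard deviation by n. The function names are hypothetical, and the values n = 3, p = 0.5 are borrowed from the pizza example.

```python
from math import comb, sqrt


def proportion_moments_exact(n, p):
    """Mean and sd of X/n computed directly from the binomial probabilities."""
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * pmf[k] for k in range(n + 1))
    var = sum((k / n - mean) ** 2 * pmf[k] for k in range(n + 1))
    return mean, sqrt(var)


def proportion_moments_formula(n, p):
    """Binomial mean np and sd sqrt(np(1-p)), each divided by n."""
    return (n * p) / n, sqrt(n * p * (1 - p)) / n


n, p = 3, 0.5
exact = proportion_moments_exact(n, p)
formula = proportion_moments_formula(n, p)
print("exact:  ", exact)
print("formula:", formula)
```

Both routes should agree up to floating-point rounding, and the standard deviation simplifies to the square root of p(1-p)/n.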
16Example 2003 California Recall Election
- Sample: Exit poll of 3160 voters.
- Suppose that exactly 50% of the population of all
voters voted in favor of the recall.
17Example 2003 California Recall Election
- Describe the mean and standard deviation of the
sampling distribution of the number in the sample
who voted in favor of the recall.
- $\mu = np = 3160(0.50) = 1580$
- $\sigma = \sqrt{np(1-p)} = \sqrt{3160(0.50)(0.50)} \approx 28.1$
18Example 2003 California Recall Election
- Describe the mean and standard deviation of the
sampling distribution of the proportion in the
sample who voted in favor of the recall.
BE VERY CAREFUL
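The warning presumably concerns mixing up the number in favor with the proportion in favor. A small sketch of the arithmetic for both, assuming n = 3160 and p = 0.50 as stated in the example (variable names are illustrative):

```python
from math import sqrt

n = 3160  # exit-poll sample size
p = 0.50  # assumed population proportion in favor of recall

# Number in the sample in favor: binomial mean and standard deviation.
count_mean = n * p
count_sd = sqrt(n * p * (1 - p))

# Proportion in the sample in favor: divide both by n.
prop_mean = count_mean / n
prop_sd = count_sd / n

print(f"number:     mean = {count_mean:.0f}, sd = {count_sd:.1f}")
print(f"proportion: mean = {prop_mean:.2f}, sd = {prop_sd:.4f}")
```

The proportion has mean 0.50 and a standard deviation of roughly 0.0089, on a completely different scale from the mean of 1580 for the count.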
19The Standard Error
- To distinguish the standard deviation of a
sampling distribution from the standard deviation
of an ordinary probability distribution, we refer
to it as a standard error.
20Example 2003 California Recall Election
- If the population proportion supporting recall
was 0.50, would it have been unlikely to observe
the exit-poll sample proportion of 0.54?
- Based on your answer, would you be willing to
predict that Davis would be recalled from office?
21Example 2003 California Recall Election
- Fact: The sampling distribution of the sample
proportion has a bell shape with a mean µ = 0.50
and a standard deviation σ = 0.0089.
22Example 2003 California Recall Election
- Convert the sample proportion value of 0.54 to a
z-score
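A minimal sketch of the conversion, assuming the standard error is computed from p = 0.50 and n = 3160 as in the earlier slides. The names are illustrative, and the tail probability relies on the normal approximation to the sampling distribution.

```python
from math import erfc, sqrt

p_hat = 0.54  # observed exit-poll sample proportion
p0 = 0.50     # hypothesized population proportion
n = 3160      # exit-poll sample size

# Standard error of the sample proportion when the true proportion is p0.
se = sqrt(p0 * (1 - p0) / n)

# z-score: how many standard errors p_hat falls from p0.
z = (p_hat - p0) / se

# Upper-tail probability P(Z >= z) for a standard normal variable.
upper_tail = 0.5 * erfc(z / sqrt(2))

print(f"se = {se:.4f}, z = {z:.2f}, upper-tail probability = {upper_tail:.2e}")
```

The z-score comes out at about 4.5, so the upper-tail probability is only a few in a million.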
23Example 2003 California Recall Election
24Example 2003 California Recall Election
- The sample proportion of 0.54 is more than four
standard errors from the expected value of 0.50.
- The sample proportion of 0.54 voting for recall
would be very unlikely if the population support
were p = 0.50.
25Example 2003 California Recall Election
- A sample proportion of 0.54 would be even more
unlikely if the population support were less than
0.50.
- We therefore have strong evidence that the
population support was larger than 0.50.
- The exit poll gives strong evidence that Governor
Davis would be recalled.
27Summary of the Sampling Distribution of a
Proportion
- For a random sample of size n from a population
with proportion p, the sampling distribution of
the sample proportion has mean $p$ and standard
error $\sqrt{p(1-p)/n}$.
- If n is sufficiently large such that the expected
numbers of outcomes of the two types, np and
n(1-p), are both at least 15, then this sampling
distribution has a bell shape.
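The sample-size condition is easy to check mechanically. The helper below is a hypothetical sketch (its name and the `threshold` parameter are not from the slides), applied to the two examples used in this section.

```python
def bell_shape_ok(n, p, threshold=15):
    """True when both expected counts, np and n(1-p), reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold


# Exit poll: np = n(1-p) = 1580, far above 15.
print(bell_shape_ok(3160, 0.50))  # True

# Pizza tasters: np = n(1-p) = 1.5, far below 15.
print(bell_shape_ok(3, 0.50))  # False
```

This is consistent with the pizza example, where the sampling distribution takes only four values and is not usefully described as bell-shaped.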
28Section 6.5
- How Close Are Sample Means to Population Means?
29The Sampling Distribution of the Sample Mean
- The sample mean, $\bar{x}$, is a random variable.
- The sample mean varies from sample to sample.
- By contrast, the population mean, µ, is a single
fixed number.
30Mean and Standard Error of the Sampling
Distribution of the Sample Mean
- For a random sample of size n from a population
having mean µ and standard deviation σ, the
sampling distribution of the sample mean has:
- Center described by the mean µ (the same as the
mean of the population).
- Spread described by the standard error, which
equals the population standard deviation divided
by the square root of the sample size: $\sigma/\sqrt{n}$.
31Example How Much Do Mean Sales Vary From Week to
Week?
- Daily sales at a pizza restaurant vary from day
to day.
- The sales figures fluctuate around a mean µ =
900 with a standard deviation σ = 300.
32Example How Much Do Mean Sales Vary From Week to
Week?
- The mean sales for the seven days in a week are
computed each week.
- The weekly means are plotted over time.
- These weekly means form a sampling distribution.
33Example How Much Do Mean Sales Vary From Week to
Week?
- What are the center and spread of the sampling
distribution?
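A short sketch of the calculation, assuming the seven daily sales figures in a week behave like a random sample of size n = 7 from a population with mean 900 and standard deviation 300 (variable names are illustrative):

```python
from math import sqrt

mu = 900     # population mean of daily sales
sigma = 300  # population standard deviation of daily sales
n = 7        # days averaged in each weekly mean

center = mu               # mean of the sampling distribution
spread = sigma / sqrt(n)  # standard error of the weekly mean

print(f"center = {center}, standard error = {spread:.1f}")
```

The weekly means are centered at 900 with a standard error of about 113.4, much less than the day-to-day standard deviation of 300.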
34Sampling Distribution vs. Population Distribution
35Standard Error
- Knowing how to find a standard error gives us a
mechanism for understanding how much variability
to expect in sample statistics just by chance.
36Standard Error
- The standard error of the sample mean is
$\sigma/\sqrt{n}$.
- As the sample size n increases, the denominator
increases, so the standard error decreases.
- With larger samples, the sample mean is more
likely to fall close to the population mean.
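To see how quickly the standard error shrinks, the sketch below evaluates it for a few sample sizes. It reuses the standard deviation of 300 from the pizza sales example; the sample sizes and the function name are chosen only for illustration.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the sample mean for a random sample of size n."""
    return sigma / sqrt(n)


sigma = 300
for n in (7, 28, 112):
    print(f"n = {n:3d}: standard error = {standard_error(sigma, n):.1f}")
```

Each fourfold increase in n halves the standard error, because n enters through its square root.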
37Central Limit Theorem
- Question: How does the sampling distribution of
the sample mean relate, with respect to shape,
center, and spread, to the probability
distribution from which the samples were taken?
38Central Limit Theorem
- For random sampling with a large sample size n,
the sampling distribution of the sample mean is
approximately a normal distribution.
- This result applies no matter what the shape of
the probability distribution from which the
samples are taken.
39Central Limit Theorem How Large a Sample?
- The sampling distribution of the sample mean
takes more of a bell shape as the random sample
size n increases. The more skewed the population
distribution, the larger n must be before the
shape of the sampling distribution is close to
normal. In practice, the sampling distribution
is usually close to normal when the sample size n
is at least about 30.
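A small simulation can illustrate the effect of sample size on shape. The sketch below is an assumption-laden illustration, not part of the original slides: it uses an exponential population (mean 1, standard deviation 1, strongly right-skewed), a fixed random seed, and a simple moment-based skewness measure, with all names hypothetical.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so repeated runs give the same numbers


def sample_means(n, reps=2000):
    """Simulate `reps` sample means, each from n exponential(1) draws."""
    return [sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]


def skewness(xs):
    """Moment-based skewness; values near 0 indicate a symmetric shape."""
    m, s = mean(xs), stdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


results = {}
for n in (2, 30):
    means = sample_means(n)
    results[n] = (mean(means), stdev(means), skewness(means))
    print(f"n = {n:2d}: mean = {results[n][0]:.3f}, "
          f"sd = {results[n][1]:.3f}, skewness = {results[n][2]:.3f}")
```

Under these assumptions, the simulated means stay centered near the population mean of 1, their spread tracks the standard error formula, and the skewness for n = 30 should be noticeably closer to 0 than for n = 2.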
40A Normal Population Distribution and the Sampling
Distribution
- If the population distribution is approximately
normal, then the sampling distribution is
approximately normal for all sample sizes.
41How Does the Central Limit Theorem Help Us Make
Inferences?
- For large n, the sampling distribution is
approximately normal even if the population
distribution is not.
- This enables us to make inferences about
population means regardless of the shape of the
population distribution.