Title: What Is a Confidence Interval
1Chapter 21
- What Is a Confidence Interval?
2Recall from previous chapters
- Parameter
  - fixed, unknown number that describes the population
- Statistic
  - known value calculated from a sample
  - a statistic is used to estimate a parameter
- Sampling Variability
  - different samples from the same population may yield different values of the sample statistic
  - estimates from samples will be closer to the true values in the population if the samples are larger
3Recall from previous chapters
- Sampling Distribution
- tells what values a statistic takes and how often
it takes those values in repeated sampling.
4Overview
This chapter presents the beginning of inferential statistics.
- The two major applications of inferential statistics involve the use of sample data to (1) estimate the value of a population parameter, and (2) test some claim (or hypothesis) about a population.
- We introduce methods for estimating values of these important population parameters: proportions and means.
- We also present methods for determining the sample sizes necessary to estimate those parameters.
5Definition
- A point estimate is a single value (or point)
used to approximate a population parameter.
6Definition
- The sample proportion $\hat{p}$ is the best point estimate of the population proportion $p$.
7Sampling Distribution of Sample Proportions
If numerous simple random samples of size n are taken from the same population, the sample proportions $\hat{p}$ from the various samples will have an approximately normal distribution. The mean of the sample proportions will be p (the true population proportion). The standard deviation will be
$$\sqrt{\frac{p(1-p)}{n}}$$
8Rule Conditions
- For the sampling distribution of the sample proportions to be valid, we must have
  - Random samples
  - Large sample size
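The rule for sample proportions can be checked with a small simulation. The sketch below is illustrative only: the values p = 0.5, n = 200 and 2000 repetitions are assumptions chosen for the demonstration, not figures from the slides. It compares the mean and standard deviation of the simulated sample proportions with p and with the value sqrt(p(1-p)/n).

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` simple random samples of size n; return each sample proportion."""
    rng = random.Random(seed)
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]

# Illustrative values only (assumed, not from the slides).
p_true, n, reps = 0.5, 200, 2000
p_hats = simulate_sample_proportions(p_true, n, reps)

sim_mean = statistics.mean(p_hats)            # should be close to p
sim_sd = statistics.pstdev(p_hats)            # should be close to the formula value
theory_sd = math.sqrt(p_true * (1 - p_true) / n)

print(f"mean of sample proportions: {sim_mean:.4f} (true p = {p_true})")
print(f"sd of sample proportions:   {sim_sd:.4f} (formula gives {theory_sd:.4f})")
```

A histogram of `p_hats` would also show the roughly bell-shaped pattern the rule describes.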
9Definition
- A confidence interval (or interval estimate) is
a range (or an interval) of values used to
estimate the true value of a population
parameter. A confidence interval is sometimes
abbreviated as CI.
10Confidence Interval for a Population Proportion
- An interval of values, computed from sample data, that is almost sure to cover the true population proportion.
- We are highly confident that the true population proportion is contained in the calculated interval.
- Statistically (for a 95% C.I.): in repeated samples, 95% of the calculated confidence intervals should contain the true proportion.
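The repeated-sampling interpretation can be illustrated by simulation. This is a minimal sketch under assumed values (true proportion 0.4, samples of size 500, 2000 repetitions, none of which come from the slides), using the interval "sample proportion plus or minus two standard errors". It counts how often the interval covers the true proportion.

```python
import math
import random

def coverage(p, n, reps, seed=1):
    """Fraction of simulated intervals p_hat +/- 2*SE that contain the true p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        p_hat = sum(rng.random() < p for _ in range(n)) / n
        se = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard deviation
        if p_hat - 2 * se <= p <= p_hat + 2 * se:
            hits += 1
    return hits / reps

# Assumed illustrative values; any moderately large n should behave similarly.
rate = coverage(p=0.4, n=500, reps=2000)
print(f"share of intervals covering the true proportion: {rate:.3f}")
```

The share should land near 0.95, though it will vary a little from one seed to another.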
12Formula for a 95% Confidence Interval for the Population Proportion (Empirical Rule)
- sample proportion plus or minus two standard deviations of the sample proportion
- since we do not know the population proportion p (needed to calculate the standard deviation), we will use the sample proportion $\hat{p}$ in its place.
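13Formula for a 95% Confidence Interval for the Population Proportion (Empirical Rule)
$$\hat{p} \pm 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
where $\sqrt{\hat{p}(1-\hat{p})/n}$ is the standard error (estimated standard deviation of $\hat{p}$)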
14Margin of Error
$$E = 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
(the plus or minus part of the C.I.)
15Formula for a C-level Confidence Interval for the Population Proportion
$$\hat{p} \pm z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
where z is the critical value of the standard normal distribution for confidence level C
17Table 21.1 Common Values of z
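Critical values of this kind can also be computed rather than looked up. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_z` is hypothetical, and the printed values are exact normal quantiles rounded to two decimals, so they may differ in the last digit from a printed table.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Two-sided critical value: the central area `confidence` lies within +/- z."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)

for level in (0.70, 0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {critical_z(level):.2f}")
```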
18Example: 829 adult Nanaimo residents were surveyed, and 51% of them are opposed to the use of photo radars for issuing traffic tickets. Using the survey results:
- a) Find the margin of error E that corresponds to a 95% confidence level.
- b) Find the 95% confidence interval estimate of the population proportion p.
- c) Based on the results, can we safely conclude that the majority of adult Nanaimo residents oppose the use of photo radars?
19Example (cont.)
- a) Find the margin of error E that corresponds to a 95% confidence level.
Next, we calculate the margin of error. We have found that $\hat{p} = 0.51$, $\hat{q} = 1 - 0.51 = 0.49$, $z = 1.96$, and $n = 829$.
$$E = 1.96\sqrt{\frac{(0.51)(0.49)}{829}} = 0.03403$$
20Example (cont.)
- b) Find the 95% confidence interval for the population proportion p.
We substitute our values from part a) to obtain
$$0.51 - 0.03403 < p < 0.51 + 0.03403$$
$$0.476 < p < 0.544$$
21Example (cont.)
- c) Based on the results, can we safely conclude that the majority of adult Nanaimo residents oppose the use of photo radars?
Based on the survey results, we are 95% confident that the limits of 47.6% and 54.4% contain the true percentage of adult Nanaimo residents opposed to photo radar. The true percentage could plausibly be any value between 47.6% and 54.4%. However, a majority requires a percentage greater than 50%, so we cannot safely conclude that the majority is opposed (because the confidence interval is not entirely above 50%).
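The arithmetic in parts a) to c) can be reproduced in a few lines. This is a sketch using only the figures stated in the example (sample proportion 0.51, n = 829, z = 1.96); the variable names are illustrative.

```python
import math

p_hat = 0.51   # sample proportion opposed to photo radar
n = 829        # number of residents surveyed
z = 1.96       # critical value for 95% confidence

# a) margin of error
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)

# b) confidence interval limits
lower, upper = p_hat - margin, p_hat + margin

print(f"E = {margin:.5f}")
print(f"{lower:.3f} < p < {upper:.3f}")

# c) a majority claim needs the whole interval above 0.5
print("interval entirely above 0.5:", lower > 0.5)
```

Because the lower limit falls below 0.5, the check in part c) comes out false, matching the conclusion in the example.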
22Example: Page 444, 21.20
- A random sample of 1500 adults finds that 60% favour balancing the federal budget over cutting taxes. Use this poll result and Table 21.1 to give 70%, 80%, 90%, and 99% confidence intervals for the proportion of all adults who feel this way. What do your results show about the effect of changing the confidence level?
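One way to explore this exercise numerically is sketched below. It assumes the same interval formula as the earlier slides (sample proportion plus or minus z times the standard error) and computes z from `statistics.NormalDist` instead of reading it from Table 21.1, so the limits may differ slightly from a hand calculation with rounded table values.

```python
import math
from statistics import NormalDist

p_hat, n = 0.60, 1500
se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of the sample proportion

intervals = {}
for level in (0.70, 0.80, 0.90, 0.99):
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided critical value
    intervals[level] = (p_hat - z * se, p_hat + z * se)
    lo, hi = intervals[level]
    print(f"{level:.0%}: {lo:.3f} to {hi:.3f} (margin {z * se:.3f})")
```

Comparing the printed margins across the four levels shows how the interval width responds to the confidence level.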
23Sample Size
Suppose we want to collect sample data with the
objective of estimating some population
proportion. The question is how many sample
items must be obtained? The required sample size
depends on the confidence level and the desired
margin of error.
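24Determining Sample Size
$$n = \frac{z^2\,\hat{p}\,\hat{q}}{E^2}$$
Always round up.
25Sample Size for Estimating Proportion p
When no estimate of p is available, use $\hat{p}\hat{q} = 0.25$, the worst case value:
$$n = \frac{z^2\,(0.25)}{E^2}$$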
26Example: Suppose a sociologist wants to determine the current percentage of Nanaimo households using e-mail. How many households must be surveyed in order to be 95% confident that the sample percentage is in error by no more than four percentage points?
- a) Use this result from an earlier study: in 1997, 16.9% of Nanaimo households used e-mail.
- b) Assume that we have no prior information suggesting a possible value of p.
27Example (cont.)
- a) Use this result from an earlier study: in 1997, 16.9% of Nanaimo households used e-mail.
$$n = \frac{(1.96)^2 (0.169)(0.831)}{(0.04)^2} = 337.19 \rightarrow 338 \text{ households}$$
28Example (cont.)
- b) Assume that we have no prior information suggesting a possible value of p.
With no prior information, we need a larger sample to achieve the same results with 95% confidence and an error of no more than 4 percentage points.
$$n = \frac{(1.96)^2 (0.25)}{(0.04)^2} = 600.25 \rightarrow 601 \text{ households}$$
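The sample-size calculation for both parts of the e-mail example can be sketched as a small function. The function name and parameters are hypothetical; the formula assumed is n = z² · p · (1 - p) / E², with p(1 - p) replaced by 0.25 when there is no prior estimate, and the result always rounded up.

```python
import math

def sample_size_for_proportion(z, margin, p_guess=None):
    """Required n to estimate a proportion; uses p*q = 0.25 when no guess is given."""
    pq = 0.25 if p_guess is None else p_guess * (1 - p_guess)
    return math.ceil(z**2 * pq / margin**2)   # always round up

n_prior = sample_size_for_proportion(z=1.96, margin=0.04, p_guess=0.169)  # part a)
n_worst = sample_size_for_proportion(z=1.96, margin=0.04)                 # part b)
print(n_prior, n_worst)
```

The worst-case value is the larger of the two, which is why a prior estimate of p, when one can be trusted, reduces the number of households needed.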
29Key Concepts (1st half of Ch. 21)
- Different samples (of the same size) will generally give different results.
- We can specify what these results look like in the aggregate.
- Rule for Sample Proportions
- Compute and interpret confidence intervals for population proportions based on sample proportions
30Inference for Population Means: Sampling Distribution, Confidence Intervals
- The remainder of this chapter discusses the situation when interest is in making conclusions about population means rather than population proportions
  - includes the rule for the sampling distribution of sample means ($\bar{x}$)
  - includes confidence intervals for a mean
31Thought Question 1 (from Seeing Through Statistics, 2nd Edition, by Jessica M. Utts, p. 316)
- Suppose the mean weight of all women at a university is 135 pounds, with a standard deviation of 10 pounds.
- Recalling the material from Chapter 13 about bell-shaped curves, in what range would you expect 95% of the women's weights to fall?
115 to 155 pounds (135 plus or minus 2 × 10)
32Thought Question 1 (cont.)
- If you were to randomly sample 10 women at the university, how close do you think their average weight would be to 135 pounds?
- If you randomly sample 1000 women, would you expect the average to be closer to 135 pounds than it would be for the sample of 10 women?
33Thought Question 2
A study compared the serum HDL cholesterol levels in people with low-fat diets to people with diets high in fat intake. From the study, a 95% confidence interval for the mean HDL cholesterol for the low-fat group extends from 43.5 to 50.5...
a. Does this mean that 95% of all people with low-fat diets will have HDL cholesterol levels between 43.5 and 50.5? Explain.
34Thought Question 2 (cont.)
A 95% confidence interval for the mean HDL cholesterol for the low-fat group extends from 43.5 to 50.5. A 95% confidence interval for the mean HDL cholesterol for the high-fat group extends from 54.5 to 61.5.
b. Based on these results, would you conclude that people with low-fat diets have lower HDL cholesterol levels, on average, than people with high-fat diets?
35Thought Question 3
The first confidence interval in Question 2 was based on results from 50 people. The confidence interval spans a range of 7 units. If the results had been based on a much larger sample, would the confidence interval for the mean cholesterol level have been wider, narrower, or about the same? Explain.
36The Central Limit Theorem (CLT)
If simple random samples of size n (n large) are taken from the same population, the sample means from the various samples will have an approximately normal distribution. The mean of the sample means will be µ (the population mean). The standard deviation will be
$$\frac{\sigma}{\sqrt{n}}$$
(σ is the population s.d.)
38Conditions for the Rule for Sample Means
- Random sample
- Population of measurements
  - Follows a bell-shaped curve, or
  - Not bell-shaped, but sample size is large
- When σ is not known, we use the sample standard deviation s to estimate σ. In this case, we use a t-distribution to find the critical value if the underlying distribution is normal.
39Case Study: Weights, Sampling Distribution (for n = 10)
40Case Study: Weights, Answer to Question (for n = 10)
- Where should 95% of the sample mean weights fall (from samples of size n = 10)?
  - mean plus or minus two standard deviations of the sample mean (10/√10 = 3.16)
  - 135 - 2(3.16) = 128.68
  - 135 + 2(3.16) = 141.32
- 95% should fall between 128.68 lb and 141.32 lb
41Case Study: Weights, Sampling Distribution (for n = 25)
42Case Study: Weights, Answer to Question (for n = 25)
- Where should 95% of the sample mean weights fall (from samples of size n = 25)?
  - mean plus or minus two standard deviations of the sample mean (10/√25 = 2)
  - 135 - 2(2) = 131
  - 135 + 2(2) = 139
- 95% should fall between 131 lb and 139 lb
43Sampling Distribution of Mean (n = 25)
44Case Study: Weights, Sampling Distribution (for n = 100)
45Case Study: Weights, Answer to Question (for n = 100)
- Where should 95% of the sample mean weights fall (from samples of size n = 100)?
  - mean plus or minus two standard deviations of the sample mean (10/√100 = 1)
  - 135 - 2(1) = 133
  - 135 + 2(1) = 137
- 95% should fall between 133 lb and 137 lb
46Sampling Distribution of Mean (n = 100)
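The three weight calculations follow one pattern, which the sketch below repeats for n = 10, 25 and 100. It assumes the population values from the thought question (mean 135 lb, standard deviation 10 lb) and the "mean plus or minus two standard deviations" rule; the variable names are illustrative.

```python
import math

mu, sigma = 135, 10   # population mean and standard deviation (pounds)

ranges = {}
for n in (10, 25, 100):
    sd_mean = sigma / math.sqrt(n)   # standard deviation of the sample mean
    ranges[n] = (mu - 2 * sd_mean, mu + 2 * sd_mean)
    print(f"n = {n:3d}: sd of mean = {sd_mean:.2f}, "
          f"95% of sample means between {ranges[n][0]:.2f} and {ranges[n][1]:.2f} lb")
```

The range narrows as n grows because the standard deviation of the sample mean shrinks with the square root of the sample size.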
47Case Study
Exercise and Pulse Rates
Hypothetical
Is the mean resting pulse rate of adult subjects
who regularly exercise different from the mean
resting pulse rate of those who do not regularly
exercise?
Find Confidence Intervals for the means
48Case Study Results
Exercise and Pulse Rates
A random sample of $n_1 = 29$ exercisers yielded a sample mean of $\bar{x}_1 = 66$ beats per minute (bpm) with a sample standard deviation of $s_1 = 8.6$ bpm.
A random sample of $n_2 = 31$ nonexercisers yielded a sample mean of $\bar{x}_2 = 75$ bpm with a sample standard deviation of $s_2 = 9.0$ bpm.
49The Rule for Sample Means
We do not know the value of σ! We assume that the pulse rates are normally distributed. We need to use the t distribution to find the critical value. For large samples, we can use the normal distribution as an approximation.
50Standard Error of the (Sample) Mean
- SEM = standard error of the mean
  = (standard deviation from the sample) divided by (square root of the sample size)
$$\text{SEM} = \frac{s}{\sqrt{n}}$$
51Case Study Results
Exercise and Pulse Rates
- Typical deviation of an individual pulse rate (for Exercisers) is s = 8.6
- Typical deviation of a mean pulse rate (for Exercisers) is
$$\frac{s}{\sqrt{n}} = \frac{8.6}{\sqrt{29}} = 1.6$$
52Case Study Confidence Intervals
Exercise and Pulse Rates
- 95% C.I. for the population mean
  - sample mean ± z × (standard error)
  - Exercisers: 66 ± 2(1.6) = 66 ± 3.2 = (62.8, 69.2)
  - Non-exercisers: 75 ± 2(1.6) = 75 ± 3.2 = (71.8, 78.2)
- Do you think the population means are different?
Yes, because the intervals do not overlap
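The two pulse-rate intervals can be reproduced with a short helper. This sketch takes the sample sizes as 29 and 31 and the sample standard deviations as 8.6 and 9.0 bpm, which is consistent with the s = 8.6 and the "samples of size 29" mentioned on the slides; the function name `mean_ci` and the multiplier of 2 are illustrative choices that follow the slide's rule of thumb.

```python
import math

def mean_ci(mean, s, n, multiplier=2):
    """Interval mean +/- multiplier * s / sqrt(n)."""
    sem = s / math.sqrt(n)   # standard error of the mean
    return mean - multiplier * sem, mean + multiplier * sem

exercisers = mean_ci(66, 8.6, 29)
non_exercisers = mean_ci(75, 9.0, 31)

print("exercisers:     (%.1f, %.1f)" % exercisers)
print("non-exercisers: (%.1f, %.1f)" % non_exercisers)
print("intervals overlap:", exercisers[1] >= non_exercisers[0])
```

For large samples the multiplier 2 is close to the normal critical value 1.96; for samples this small a t critical value would be slightly larger, so these intervals should be read as approximate.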
53Careful Interpretation of a Confidence Interval
- We are 95% confident that the mean resting pulse rate for the population of all exercisers is between 62.8 and 69.2 bpm. (We feel that plausible values for the population of exercisers' mean resting pulse rate are between 62.8 and 69.2.)
- This does not mean that 95% of all people who exercise regularly will have resting pulse rates between 62.8 and 69.2 bpm.
- Statistically: 95% of all samples of size 29 from the population of exercisers should yield a sample mean within two standard errors of the population mean; i.e., in repeated samples, 95% of the C.I.s should contain the true population mean.
54Case Study Confidence Intervals
Exercise and Pulse Rates
- 95% C.I. for the difference in population means (nonexercisers minus exercisers)
  - (difference in sample means) ± 2 × (SE of the difference)
  - Difference in sample means = 9
  - SE of the difference = 2.26 (given)
  - 95% confidence interval: (4.4, 13.6)
  - interval does not include zero (so the means are different)
55Example: Page 445, 21.26
- The NAEP test (Example 7, page 439) was given to a sample of 1077 women of ages 21 to 25 years. Their mean quantitative score was 275 and the standard deviation was 58.
- Give a 95% confidence interval for the mean score µ in the population of all young women.
56Example: Page 445, 21.26 (cont.)
- Give the 90% and 99% confidence intervals for µ.
57Example: Page 445, 21.26 (cont.)
- What are the margins of error for 90%, 95%, and 99% confidence? How does increasing the confidence level affect the margin of error of a confidence interval?
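A sketch for working through this exercise numerically is given below. It assumes the large-sample interval (sample mean plus or minus z times s/√n) and takes z from `statistics.NormalDist` rather than from a printed table, so hand calculations with rounded critical values may differ slightly.

```python
import math
from statistics import NormalDist

mean, s, n = 275, 58, 1077   # NAEP sample mean, standard deviation, sample size
sem = s / math.sqrt(n)       # standard error of the mean

margins = {}
for level in (0.90, 0.95, 0.99):
    z = NormalDist().inv_cdf(0.5 + level / 2)   # two-sided critical value
    margins[level] = z * sem
    print(f"{level:.0%}: {mean - margins[level]:.1f} to {mean + margins[level]:.1f} "
          f"(margin of error {margins[level]:.2f})")
```

Reading the three margins side by side gives the pattern the last part of the exercise asks about.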
58Sample Size for Estimating Mean µ
$$n = \left(\frac{z\,\sigma}{E}\right)^2$$
where z = critical z score based on the desired confidence level, E = desired margin of error, and σ = population standard deviation
59Example: Assume that we want to estimate the mean IQ score for the population of statistics professors. How many statistics professors must be randomly selected for IQ tests if we want 95% confidence that the sample mean is within 2 IQ points of the population mean? Assume that σ = 15, as is found in the general population.
z = 1.96, E = 2, σ = 15
$$n = \left(\frac{1.96 \times 15}{2}\right)^2 = 216.09 \rightarrow 217$$
With a simple random sample of only 217 statistics professors, we will be 95% confident that the sample mean will be within 2 IQ points of the true population mean µ.
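The IQ calculation can be checked with a short function. This is a sketch assuming the formula n = (z · σ / E)², rounded up; the function name is hypothetical.

```python
import math

def sample_size_for_mean(z, sigma, margin):
    """Required n to estimate a mean to within `margin`, rounded up."""
    return math.ceil((z * sigma / margin) ** 2)

n_required = sample_size_for_mean(z=1.96, sigma=15, margin=2)
print(n_required)
```

Because the margin of error is squared in the denominator, halving it to 1 IQ point would roughly quadruple the required sample.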
60Key Concepts (2nd half of Ch. 21)
- Sampling distribution of sample means
- The Central Limit Theorem
- Compute confidence intervals for means
- Interpret Confidence Intervals for Means