Title: Chapter 7 Statistical Inference: Confidence Intervals
1 Chapter 7 Statistical Inference: Confidence
Intervals
- Learn how to estimate a population parameter
using sample data
2 Section 7.1
- What Are Point and Interval Estimates of
Population Parameters?
3 Point Estimate
- A point estimate is a single number that is our
best guess for the parameter
4 Interval Estimate
- An interval estimate is an interval of numbers
within which the parameter value is believed to
fall.
6 Point Estimate vs Interval Estimate
- A point estimate doesn't tell us how close the
estimate is likely to be to the parameter
- An interval estimate is more useful
- It incorporates a margin of error which helps us
to gauge the accuracy of the point estimate
7 Point Estimation: How Do We Make a Best Guess
for a Population Parameter?
- Use an appropriate sample statistic
- For the population mean, use the sample mean
- For the population proportion, use the sample
proportion
8 Point Estimation: How Do We Make a Best Guess
for a Population Parameter?
- Point estimates are the most common form of
inference reported by the mass media
9 Properties of Point Estimators
- Property 1: A good estimator has a sampling
distribution that is centered at the parameter
- An estimator with this property is unbiased
- The sample mean is an unbiased estimator of the
population mean
- The sample proportion is an unbiased estimator of
the population proportion
10 Properties of Point Estimators
- Property 2: A good estimator has a small
standard error compared to other estimators
- This means it tends to fall closer than other
estimators to the parameter
11 Interval Estimation: Constructing an Interval
that Contains the Parameter (We Hope!)
- Inference about a parameter should provide not
only a point estimate but should also indicate
its likely precision
12 Confidence Interval
- A confidence interval is an interval containing
the most believable values for a parameter
- The probability that this method produces an
interval that contains the parameter is called
the confidence level
- This is a number chosen to be close to 1, most
commonly 0.95
13 What is the Logic Behind Constructing a
Confidence Interval?
- To construct a confidence interval for a
population proportion, start with the sampling
distribution of a sample proportion
14 The Sampling Distribution of the Sample Proportion
- Gives the possible values for the sample
proportion and their probabilities
- Is approximately a normal distribution for large
random samples
- Has a mean equal to the population proportion
- Has a standard deviation called the standard
error
15 A 95% Confidence Interval for a Population
Proportion
- Fact: Approximately 95% of a normal distribution
falls within 1.96 standard deviations of the mean
- That means: With probability 0.95, the sample
proportion falls within about 1.96 standard
errors of the population proportion
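The 1.96 figure can be checked numerically. The following is a minimal sketch using the standard-library `statistics.NormalDist`; the variable names are illustrative and do not come from the slides.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability that a normal variable falls within 1.96
# standard deviations of its mean
prob_within = std_normal.cdf(1.96) - std_normal.cdf(-1.96)

# Going the other way: the z-score that leaves 0.025 in each tail
z_95 = std_normal.inv_cdf(0.975)

print(f"P(-1.96 < Z < 1.96) = {prob_within:.4f}")
print(f"z-score for 95% confidence = {z_95:.2f}")
```

The same `inv_cdf` call gives the z-score for any other confidence level by changing the tail probability.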
16 Margin of Error
- The margin of error measures how accurate the
point estimate is likely to be in estimating a
parameter
- The distance of 1.96 standard errors is the
margin of error for a 95% confidence interval
17 Confidence Interval
- A confidence interval is constructed by adding
and subtracting a margin of error from a given
point estimate
- When the sampling distribution is approximately
normal, a 95% confidence interval has margin of
error equal to 1.96 standard errors
18 Section 7.2
- How Can We Construct a Confidence Interval to
Estimate a Population Proportion?
19 Finding the 95% Confidence Interval for a
Population Proportion
- We symbolize a population proportion by p
- The point estimate of the population proportion
is the sample proportion
- We symbolize the sample proportion by $\hat{p}$
20 Finding the 95% Confidence Interval for a
Population Proportion
- A 95% confidence interval uses a margin of error
= 1.96(standard errors)
- The interval is: point estimate ± margin of error
21 Finding the 95% Confidence Interval for a
Population Proportion
- The exact standard error of a sample proportion
equals $\sqrt{p(1-p)/n}$
- This formula depends on the unknown population
proportion, p
- In practice, we don't know p, and we need to
estimate the standard error
22 Finding the 95% Confidence Interval for a
Population Proportion
- In practice, we use an estimated standard error,
$se = \sqrt{\hat{p}(1-\hat{p})/n}$
23 Finding the 95% Confidence Interval for a
Population Proportion
- A 95% confidence interval for a population
proportion p is $\hat{p} \pm 1.96(se)$, with
$se = \sqrt{\hat{p}(1-\hat{p})/n}$
24 Example: Would You Pay Higher Prices to Protect
the Environment?
- In 2000, the GSS asked "Are you willing to pay
much higher prices in order to protect the
environment?"
- Of n = 1154 respondents, 518 were willing to do so
25 Example: Would You Pay Higher Prices to Protect
the Environment?
- Find and interpret a 95% confidence interval for
the population proportion of adult Americans
willing to do so at the time of the survey
26 Example: Would You Pay Higher Prices to Protect
the Environment?
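The calculation for this example can be sketched as follows, using only the counts given above (n = 1154, 518 willing) and the large-sample formula of point estimate ± 1.96 standard errors. Variable names are illustrative.

```python
from math import sqrt

n = 1154          # respondents
successes = 518   # willing to pay much higher prices

p_hat = successes / n                 # point estimate of p
se = sqrt(p_hat * (1 - p_hat) / n)    # estimated standard error
margin = 1.96 * se                    # margin of error for 95% confidence
lower, upper = p_hat - margin, p_hat + margin

print(f"p_hat = {p_hat:.3f}, se = {se:.4f}, margin = {margin:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```

Read this way, we can be 95% confident that the population proportion willing to pay much higher prices was between about 0.42 and 0.48 at the time of the survey.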
27 Sample Size Needed for Large-Sample Confidence
Interval for a Proportion
- For the 95% confidence interval for a proportion
p to be valid, you should have at least 15
successes and 15 failures
28 95% Confidence
- With probability 0.95, a sample proportion value
occurs such that the confidence interval contains
the population proportion, p
- With probability 0.05, the method produces a
confidence interval that misses p
29 How Can We Use Confidence Levels Other than 95%?
- In practice, the confidence level 0.95 is the
most common choice
- But some applications require greater confidence
- To increase the chance of a correct inference, we
use a larger confidence level, such as 0.99
30 A 99% Confidence Interval for p
32 Different Confidence Levels
- In using confidence intervals, we must compromise
between the desired margin of error and the
desired confidence of a correct inference
- As the desired confidence level increases, the
margin of error gets larger
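To see the trade-off numerically, the sketch below recomputes the margin of error for the GSS environment example (518 of 1154) at three confidence levels. The helper `z_score` is a hypothetical name introduced here; it obtains the z-score from the standard normal distribution with the standard-library `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def z_score(confidence_level):
    """z-score leaving (1 - confidence_level) / 2 in each tail."""
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)

n = 1154
p_hat = 518 / n
se = sqrt(p_hat * (1 - p_hat) / n)

margins = {}
for level in (0.90, 0.95, 0.99):
    z = z_score(level)
    margins[level] = z * se
    print(f"{level:.0%} confidence: z = {z:.3f}, "
          f"margin of error = {margins[level]:.3f}")
```

The z-score, and with it the margin of error, grows as the confidence level moves from 90% to 99%.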
33 What is the Error Probability for the Confidence
Interval Method?
- The general formula for the confidence interval
for a population proportion is
- Sample proportion ± (z-score)(std. error)
- which in symbols is $\hat{p} \pm z(se)$
35 Summary: Confidence Interval for a Population
Proportion, p
- A confidence interval for a population proportion
p is $\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$,
where z depends on the confidence level
36 Summary: Effects of Confidence Level and Sample
Size on Margin of Error
- The margin of error for a confidence interval:
- Increases as the confidence level increases
- Decreases as the sample size increases
37 What Does It Mean to Say that We Have 95%
Confidence?
- If we used the 95% confidence interval method to
estimate many population proportions, then in the
long run about 95% of those intervals would give
correct results, containing the population
proportion
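The long-run interpretation can be illustrated with a small simulation. This is only a sketch under made-up settings: the true proportion (0.45), the sample size (500) and the number of repetitions (2000) are hypothetical values chosen for illustration, not figures from the slides.

```python
import random
from math import sqrt

def covers(p, n, rng):
    """Draw one random sample of size n; report whether its 95% CI contains p."""
    successes = sum(rng.random() < p for _ in range(n))
    p_hat = successes / n
    margin = 1.96 * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin <= p <= p_hat + margin

rng = random.Random(7)                      # fixed seed so the run is repeatable
true_p, n, repetitions = 0.45, 500, 2000    # hypothetical settings

coverage = sum(covers(true_p, n, rng) for _ in range(repetitions)) / repetitions
print(f"Share of intervals containing p: {coverage:.3f}")
```

The printed share should land near 0.95. Any single interval either contains p or misses it; the 95% describes how the method performs over many samples.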
38 A recent survey asked "During the last year,
did anyone take something from you by force?"
- Of 987 subjects, 17 answered yes
- Find the point estimate of the proportion of the
population who were victims
- .17
- .017
- .0017
39 Section 7.3
- How Can We Construct a Confidence Interval To
Estimate a Population Mean?
40 How to Construct a Confidence Interval for a
Population Mean
- Point estimate ± margin of error
- The sample mean is the point estimate of the
population mean
- The exact standard error of the sample mean is
$\sigma/\sqrt{n}$
- In practice, we estimate σ by the sample standard
deviation, s
41 How to Construct a Confidence Interval for a
Population Mean
- For large n, and also for small n from an
underlying population that is normal, the
confidence interval for the population mean
is $\bar{x} \pm z(\sigma/\sqrt{n})$
42 How to Construct a Confidence Interval for a
Population Mean
- In practice, we don't know the population
standard deviation
- Substituting the sample standard deviation s for
σ to get $se = s/\sqrt{n}$ introduces extra error
- To account for this increased error, we replace
the z-score by a slightly larger score, the
t-score
43 How to Construct a Confidence Interval for a
Population Mean
- In practice, we estimate the standard error of
the sample mean by $se = s/\sqrt{n}$
- Then, we multiply se by a t-score from the
t-distribution to get the margin of error for a
confidence interval for the population mean
44 Properties of the t-distribution
- The t-distribution is bell shaped and symmetric
about 0
- The probabilities depend on the degrees of
freedom, df
- The t-distribution has thicker tails and is more
spread out than the standard normal distribution
45 t-Distribution
46 Summary: 95% Confidence Interval for a
Population Mean
- A 95% confidence interval for the population mean
µ is $\bar{x} \pm t_{.025}(s/\sqrt{n})$, with
df = n - 1
- To use this method, you need:
- Data obtained by randomization
- An approximately normal population distribution
47 Example: eBay Auctions of Palm Handheld Computers
- Do you tend to get a higher, or a lower, price if
you give bidders the buy-it-now option?
48 Example: eBay Auctions of Palm Handheld Computers
- Consider some data from sales of the Palm M515
PDA (personal digital assistant)
- During the first week of May 2003, 25 of these
handheld computers were auctioned off, 7 of which
had the buy-it-now option
49 Example: eBay Auctions of Palm Handheld Computers
- Buy-it-now option:
- 235 225 225 240 250 250 210
- Bidding only:
- 250 249 255 200 199 240 228 255
232 246 210 178 246 240 245 225
246 225
50 Example: eBay Auctions of Palm Handheld Computers
- Summary of selling prices for the two types of
auctions:
buy_now   N    Mean  StDev  Minimum      Q1  Median      Q3  Maximum
no       18  231.61  21.94   178.00  221.25  240.00  246.75   255.00
yes       7  233.57  14.64   210.00  225.00  235.00  250.00   250.00
52 Example: eBay Auctions of Palm Handheld Computers
- To construct a confidence interval using the
t-distribution, we must assume a random sample
from an approximately normal population of
selling prices
53 Example: eBay Auctions of Palm Handheld Computers
- Let µ denote the population mean for the
buy-it-now option
- The estimate of µ is the sample mean:
- $\bar{x}$ = 233.57
- The sample standard deviation is:
- s = 14.64
54 Example: eBay Auctions of Palm Handheld Computers
- The 95% confidence interval for the buy-it-now
option is $\bar{x} \pm t_{.025}(s/\sqrt{n})$
= 233.57 ± 2.447(14.64/$\sqrt{7}$), with df = 6
- which is 233.57 ± 13.54 or (220.03, 247.11)
55 Example: eBay Auctions of Palm Handheld Computers
- The 95% confidence interval for the mean sales
price for the bidding only option is
- (220.70, 242.52)
56 Example: eBay Auctions of Palm Handheld Computers
- Notice that the two intervals overlap a great
deal
- Buy-it-now: (220.03, 247.11)
- Bidding only: (220.70, 242.52)
- There is not enough information for us to
conclude that one probability distribution
clearly has a higher mean than the other
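Both intervals can be reproduced from the raw selling prices listed earlier. In the sketch below, the t-scores are typed in from a standard t table (2.447 for df = 6 and 2.110 for df = 17) rather than computed, and the function name `t_interval_95` is a hypothetical helper.

```python
from math import sqrt
from statistics import mean, stdev

buy_it_now = [235, 225, 225, 240, 250, 250, 210]
bidding_only = [250, 249, 255, 200, 199, 240, 228, 255, 232,
                246, 210, 178, 246, 240, 245, 225, 246, 225]

# t.025 values taken from a t table, keyed by df = n - 1
T_025 = {6: 2.447, 17: 2.110}

def t_interval_95(data):
    """95% CI for a mean: x_bar +/- t.025 * s / sqrt(n), df = n - 1."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)                      # sample SD (n - 1 divisor)
    margin = T_025[n - 1] * s / sqrt(n)
    return x_bar - margin, x_bar + margin

ci_buy = t_interval_95(buy_it_now)
ci_bid = t_interval_95(bidding_only)
print(f"Buy-it-now:   ({ci_buy[0]:.2f}, {ci_buy[1]:.2f})")
print(f"Bidding only: ({ci_bid[0]:.2f}, {ci_bid[1]:.2f})")
```

The buy-it-now interval is wider mainly because it rests on only 7 observations, which gives both a larger t-score and a larger standard error.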
57 How Do We Find a t-Confidence Interval for Other
Confidence Levels?
- The 95% confidence interval uses $t_{.025}$ since
95% of the probability falls between $-t_{.025}$
and $t_{.025}$
- For 99% confidence, the error probability is 0.01
with 0.005 in each tail and the appropriate
t-score is $t_{.005}$
58 If the Population is Not Normal, is the Method
Robust?
- A basic assumption of the confidence interval
using the t-distribution is that the population
distribution is normal
- Many variables have distributions that are far
from normal
59 If the Population is Not Normal, is the Method
Robust?
- How problematic is it if we use the t-confidence
interval even if the population distribution is
not normal?
60 If the Population is Not Normal, is the Method
Robust?
- For large random samples, it's not problematic
- The Central Limit Theorem applies: for large n,
the sampling distribution is bell-shaped even
when the population is not
61 If the Population is Not Normal, is the Method
Robust?
- What about a confidence interval using the
t-distribution when n is small?
- Even if the population distribution is not
normal, confidence intervals using t-scores
usually work quite well
- We say the t-distribution method is robust with
respect to the normality assumption
62 Cases Where the t-Confidence Interval Does Not
Work
- With binary data
- With data that contain extreme outliers
63 The Standard Normal Distribution is the
t-Distribution with df = ∞
64 The 2002 GSS asked "What do you think is the
ideal number of children in a family?"
- The 497 females who responded had a median of 2,
mean of 3.02, and standard deviation of 1.81.
What is the point estimate of the population
mean?
- 497
- 2
- 3.02
- 1.81
65 Section 7.4
- How Do We Choose the Sample Size for a Study?
66 How are the Sample Sizes Determined in Polls?
- It depends on how much precision is needed as
measured by the margin of error
- The smaller the margin of error, the larger the
sample size must be
67 Choosing the Sample Size for Estimating a
Population Proportion
- First, we must decide on the desired margin of
error
- Second, we must choose the confidence level for
achieving that margin of error
- In practice, 95% confidence intervals are most
common
68 Example: What Sample Size Do You Need For An
Exit Poll?
- A television network plans to predict the outcome
of an election between two candidates, Levin and
Sanchez
- They will do this with an exit poll that randomly
samples voters on election day
69 Example: What Sample Size Do You Need For An
Exit Poll?
- The final poll a week before election day
estimated Levin to be well ahead, 58% to 42%
- So the outcome is not expected to be close
- The researchers decide to use a sample size for
which the margin of error is 0.04
70 Example: What Sample Size Do You Need For An
Exit Poll?
- What is the sample size for which a 95%
confidence interval for the population proportion
has margin of error equal to 0.04?
71 Example: What Sample Size Do You Need For An
Exit Poll?
- The 95% confidence interval for a population
proportion p is $\hat{p} \pm 1.96(se)$
- If the sample size is such that 1.96(se) = 0.04,
then the margin of error will be 0.04
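72 Example: What Sample Size Do You Need For An
Exit Poll?
- Find the value of the sample size n for which
0.04 = 1.96(se), with
$se = \sqrt{\hat{p}(1-\hat{p})/n}$
- Solving for n with the guess $\hat{p}$ = 0.58 gives
$n = (1.96)^2(0.58)(0.42)/(0.04)^2 \approx 585$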
73 Example: What Sample Size Do You Need For An
Exit Poll?
- A random sample of size n = 585 should give a
margin of error of about 0.04 for a 95%
confidence interval for the population proportion
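The arithmetic behind n = 585 can be sketched as below. It solves 0.04 = 1.96·se for n, using the pre-election poll figure of 0.58 as the guess for the sample proportion. Rounding up to a whole voter is an assumption about how the fractional result is treated.

```python
from math import ceil

z = 1.96          # z-score for 95% confidence
m = 0.04          # desired margin of error
p_guess = 0.58    # Levin's share in the final pre-election poll

# m = z * sqrt(p(1 - p) / n)  =>  n = z^2 * p(1 - p) / m^2
n_exact = z**2 * p_guess * (1 - p_guess) / m**2
n_needed = ceil(n_exact)   # round up to a whole number of voters

print(f"n = {n_exact:.2f}, so sample about {n_needed} voters")
```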
74 How Can We Select a Sample Size Without Guessing
a Value for the Sample Proportion?
- In the formula for determining n, setting
$\hat{p}$ = 0.50 gives the largest value for n
out of all the possible values to substitute for
$\hat{p}$
- Doing this is the safe approach that guarantees
we'll have enough data
75 Sample Size for Estimating a Population Parameter
- The random sample size n for which a confidence
interval for a population proportion p has margin
of error m (such as m = 0.04) is
$n = \hat{p}(1-\hat{p})z^2/m^2$
76 Sample Size for Estimating a Population Parameter
- The z-score is based on the confidence level,
such as z = 1.96 for 95% confidence
- You either guess the value you'd get for the
sample proportion based on other information or
take the safe approach of setting $\hat{p}$ = 0.50
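A short sketch makes the "safe approach" concrete by comparing the required n for several guesses of the sample proportion at a margin of error of 0.04. The function name `sample_size_for_proportion` and the list of guesses are hypothetical, and results are rounded up to whole subjects.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(m, confidence_level=0.95, p_guess=0.50):
    """Smallest whole n giving margin of error at most m for the guessed proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil(p_guess * (1 - p_guess) * z**2 / m**2)

guesses = (0.10, 0.30, 0.50, 0.58, 0.90)
sizes = {p: sample_size_for_proportion(0.04, p_guess=p) for p in guesses}

for p, n in sizes.items():
    print(f"guess {p:.2f} -> n = {n}")
```

The product p(1 - p) peaks at 0.50, so that guess always yields the largest n. A guess far from 0.50 saves data only if it turns out to be roughly right.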
77 Sample Size for Estimating a Population Mean
- The random sample size n for which a 95%
confidence interval for a population mean has
margin of error approximately equal to m is
$n = z^2 s^2/m^2$ with z = 1.96, or roughly
$4s^2/m^2$
- To use this formula, you guess the value you'll
get for the sample standard deviation, s
78 Sample Size for Estimating a Population Mean
- In practice, since you don't yet have the data,
you don't know the value of the sample standard
deviation, s
- You must substitute an educated guess for s
- You can use the sample standard deviation from a
similar study
79 Example: Finding n to Estimate Mean Education in
South Africa
- A social scientist plans a study of adult South
Africans to investigate educational attainment in
the black community
- How large a sample size is needed so that a 95%
confidence interval for the mean number of years
of education has margin of error equal to 1 year?
80 Example: Finding n to Estimate Mean Education in
South Africa
- No prior information about the standard deviation
of educational attainment is available
- We might guess that the sample education values
fall within a range of about 18 years
81 Example: Finding n to Estimate Mean Education in
South Africa
- If the data distribution is bell-shaped, the
range from $\bar{x} - 3s$ to $\bar{x} + 3s$ will
contain nearly all the distribution
- The distance from $\bar{x} - 3s$ to $\bar{x} + 3s$
equals 6s
- Solving 18 = 6s for s yields s = 3
- So 3 is a crude estimate of s
82 Example: Finding n to Estimate Mean Education in
South Africa
- The desired margin of error is m = 1 year
- The required sample size is
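The numeric result did not survive on this slide, so the sketch below recomputes it from the values given (s = 3, m = 1) rather than quoting it. It assumes the formula n = z²s²/m² and shows two common choices of z: 1.96, and the rounded value 2. Which one the original slide used is not recoverable from the text.

```python
from math import ceil

s_guess = 3.0   # crude guess for s: range of 18 years divided by 6
m = 1.0         # desired margin of error, in years

# n = z^2 * s^2 / m^2, rounded up to a whole subject
n_with_z_196 = ceil(1.96**2 * s_guess**2 / m**2)
n_with_z_2 = ceil(2**2 * s_guess**2 / m**2)

print(f"z = 1.96 -> n = {n_with_z_196}")
print(f"z = 2    -> n = {n_with_z_2}")
```

Either way, a sample of roughly 35 to 36 people is enough under the guessed standard deviation. A larger true s would call for a larger n.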
83 What Factors Affect the Choice of the Sample Size?
- The first is the desired precision, as measured
by the margin of error, m
- The second is the confidence level
84 What Other Factors Affect the Choice of the
Sample Size?
- A third factor is the variability in the data
- If subjects have little variation (that is, s is
small), we need fewer data than if they have
substantial variation
- A fourth factor is financial
- Cost is often a major constraint
85 What if You Have to Use a Small n?
- The t-methods for a mean are valid for any n
- However, you need to be extra cautious to look
for extreme outliers or great departures from the
normal population assumption
86 What if You Have to Use a Small n?
- In the case of the confidence interval for a
population proportion, the method works poorly
for small samples
87 Constructing a Small-Sample Confidence Interval
for a Proportion
- Suppose a random sample does not have at least 15
successes and 15 failures
- The confidence interval formula
$\hat{p} \pm z\sqrt{\hat{p}(1-\hat{p})/n}$
is still valid if we use it after adding 2 to
the original number of successes and 2 to the
original number of failures
- This results in adding 4 to the sample size n