Title: Confidence Intervals
1MATH 1530 Elements of Statistics, Dr. Kirsten Boyd
- Chapter 8
- Confidence Intervals
Slides adapted from Ms Smyth, Dr. Griffy and the
Weiss Text
3Sec. 8.1 Estimating a Population Mean
- If we want to use a statistic (x̄ or s) to
estimate a parameter (µ or σ), can we be 100%
confident that our sample accurately represents
the population?
4Estimating a Population Mean
- x̄ is a point estimate for µ
- s is a point estimate for σ
5How Confident Can We Be?
- Can we assign a level of confidence to our point
estimate?
- How certain are we that our statistic is a good
estimator of the parameter?
- What is the probability that our sample
accurately represents the population?
6Confidence-Interval
7Building a 95.44% CI
- 95.44% of data in a normal distribution are
within two standard deviations of the mean
- The sampling distribution of x̄ will be normal if the original
data is normal, and approximately normal if n is large
- Therefore, if the original data is normal or if n is
large, 95.44% of samples of size n will have a
mean that is within two standard deviations of x̄
(that is, within 2σ/√n) of the population mean
8Building a 95.44% CI
- Recall: the standard deviation of x̄ is σ/√n
- Construct the interval consisting of all numbers
that are within 2 standard deviations of the
sample mean, as follows
- left endpoint of interval is x̄ - 2(σ/√n)
- right endpoint of interval is x̄ + 2(σ/√n)
- NOTE: For this process, you must be given the
sample size, the sample mean and the population
standard deviation
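9The 95.44% CI
- If you take all possible sample means and their
CIs, 95.44% of these CIs would contain µ
- Thus, before the sample is drawn, there is a 95.44% chance
that the resulting CI will contain µ; once a CI has been
computed, we say we are 95.44% confident that it contains µ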
10Mobile Home Prices of 20 Random Samples (computer
simulation, see page 341)
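A simulation of this kind can be sketched in a few lines of Python. This is only an illustration, not the textbook's simulation: the population values (µ = 100, σ = 15), the sample size, and the function name `coverage` are hypothetical choices. It draws many samples from a normal population, builds the interval x̄ ± 2σ/√n for each, and reports the fraction of intervals that contain µ.

```python
import random
from math import sqrt
from statistics import mean

def coverage(mu, sigma, n, trials, seed=0):
    """Fraction of simulated intervals x-bar +/- 2*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    half_width = 2 * sigma / sqrt(n)     # 2 standard deviations of x-bar
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials

# Hypothetical population (not from the slides): mu = 100, sigma = 15, n = 36.
rate = coverage(mu=100, sigma=15, n=36, trials=5000)
print(f"Fraction of CIs containing mu: {rate:.3f}")
```

With many trials the fraction should settle near 0.9544, which is what "95.44% of these CIs would contain µ" means in practice.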
11Problems 8.4 and 8.6 (page 342)
- The following list shows the number of baby
snakes per litter, for 44 randomly chosen litters
of eastern cottonmouths.
- 5, 12, 7, 7, 6, 8, 12, 9, 7, 4, 9, 6, 12, 7, 5,
6, 10, 3, 10, 8, 8, 12, 5, 6, 10, 11, 3, 8, 4, 5,
7, 6, 11, 7, 6, 8, 8, 14, 8, 7, 11, 7, 5, 4
- 4.a. Use the data to obtain a point estimate
for µ, the mean number of snakes per litter born
to all eastern cottonmouths.
- 4.b. Is your answer to Part a. likely to equal
µ exactly?
- 6. Assume that σ, the population st. dev. for
the number of snakes per litter, is 2.4.
- 6.a. Obtain an approx. 95.44% CI for the mean
number of snakes per litter born to all eastern
cottonmouths.
- 6.b. Interpret your answer to 6.a.
- 6.c. Why is the 95.44% CI that you obtained in
Part a. not necessarily exact?
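One way to check Problems 8.4.a and 8.6.a is the short Python sketch below. It uses the litter data and σ = 2.4 from the slide; the variable names are just illustrative.

```python
from math import sqrt
from statistics import mean

# Litter sizes from the slide (44 randomly chosen litters).
litters = [5, 12, 7, 7, 6, 8, 12, 9, 7, 4, 9, 6, 12, 7, 5,
           6, 10, 3, 10, 8, 8, 12, 5, 6, 10, 11, 3, 8, 4, 5,
           7, 6, 11, 7, 6, 8, 8, 14, 8, 7, 11, 7, 5, 4]
sigma = 2.4                        # population st. dev., given in Problem 6

n = len(litters)
x_bar = mean(litters)              # 4.a: point estimate for mu
half_width = 2 * sigma / sqrt(n)   # 2 standard deviations of x-bar
lower, upper = x_bar - half_width, x_bar + half_width   # 6.a

print(n, round(x_bar, 2), round(lower, 2), round(upper, 2))
# 44 7.59 6.87 8.31
```

So the point estimate is about 7.59 snakes per litter, and the approximate 95.44% CI runs from about 6.87 to 8.31 snakes per litter.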
12Sec. 8.2 Confidence Intervals for One Population
Mean When σ Is Known
- We don't want to be limited to just a 95.44% CI!
- If the sampling distribution is normal, we can
compute a CI with any confidence level using
z-scores.
- Recall from Sec. 6.2
- z_α means the z-score that has area α to its right
(under the standard normal curve).
- Beware: Table II and the calculator both use areas to
the left.
- Example
- z_{0.025} = invNorm(1 - 0.025) = 1.96
- Or, use Table II to find the z-score that has area
0.975 (because 0.975 = 1 - 0.025) to its left.
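Outside the calculator, the same lookup can be sketched with Python's standard library. `NormalDist().inv_cdf` takes an area to the left, just like invNorm; the helper name `z_alpha` is only illustrative.

```python
from statistics import NormalDist

def z_alpha(alpha):
    """z-score with area alpha to its RIGHT under the standard normal curve."""
    # inv_cdf expects the area to the LEFT, so convert first.
    return NormalDist().inv_cdf(1 - alpha)

print(round(z_alpha(0.025), 2))  # 1.96
```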
13Confidence Intervals When σ Is Known
95.44% of all samples have means within 2σ/√n
of µ. An interval for any confidence
level can be calculated by replacing 2 with z_{α/2},
where α is obtained by subtracting the confidence
level from 1. Example: for 95.44% confidence, α =
1 - 0.9544 = 0.0456. But now we can do this for
any confidence level.
14In Step 1, you can also use the calculator to find z_{α/2},
using invNorm(1 - α/2)
15Finding Alpha, α
- Lower case Greek letter alpha, α
- α = 1 - (level of confidence)
- Must write the level of confidence as a number
between 0 and 1 (NOT as a percentage)
- Example: for a 97% confidence interval,
- α = 1 - 0.97 = 0.03
16Example Step 1
- For a level of confidence of 95%
- Determine α
- Determine α/2
- Determine z_{α/2}
- For a level of confidence of 99%
- Determine α
- Determine α/2
- Determine z_{α/2}
17Example Step 2
- Assume that for a certain sample of size 24, the
mean of the x-values is 12. Suppose that we know
that σ = 2.3.
- Part a. Find the C.I. for a 95% confidence
level.
- Part b. Find the C.I. for a 99% confidence level.
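Steps 1 and 2 can be worked together in a short Python sketch. The function name `z_interval` is illustrative, and the z-values come from the standard library rather than Table II, so the last digit may differ slightly from a table-based answer.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, conf_level):
    """Return (alpha, z_{alpha/2}, lower, upper) for a z-interval."""
    alpha = 1 - conf_level                    # Step 1: alpha
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Step 1: z_{alpha/2}
    margin = z * sigma / sqrt(n)              # Step 2: z_{alpha/2} * sigma / sqrt(n)
    return alpha, z, x_bar - margin, x_bar + margin

results = {}
for cl in (0.95, 0.99):
    alpha, z, lower, upper = z_interval(x_bar=12, sigma=2.3, n=24, conf_level=cl)
    results[cl] = (lower, upper)
    print(f"CL={cl}: alpha={alpha:.2f}, alpha/2={alpha / 2:.3f}, "
          f"z={z:.3f}, CI=({lower:.2f}, {upper:.2f})")
```

For these numbers the 95% interval is roughly 11.08 to 12.92 and the 99% interval is roughly 10.79 to 13.21. The higher confidence level gives the wider interval.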
18Guidelines for Using z-interval CI
- Use only when
- n < 15 and the data is normal,
- 15 ≤ n < 30 and the data is approximately normal, or
- n ≥ 30
19Confidence and Precision
- For a fixed sample size,
- lower CL means better precision (shorter CI)
- higher CL means worse precision (longer CI)
20East German Prisoners PTSD
- Page 351, 8.35 (Apply the procedure on page 345 to
get a 95% CI)
- x = number of months imprisoned
- Assume that σ = 42. For a sample of 32 patients
with chronic PTSD, the mean duration of
imprisonment was 33.4 months.
- Step 1: Find z_{α/2}
- Step 2: Find the CI
- Step 3: Interpret the CI
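The three steps can be followed line by line in a small Python sketch, using the numbers given on the slide (n = 32, x̄ = 33.4, σ = 42). The wording of the printed interpretation is one reasonable phrasing, not the textbook's.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma, conf_level = 32, 33.4, 42, 0.95

# Step 1: find z_{alpha/2}
alpha = 1 - conf_level
z = NormalDist().inv_cdf(1 - alpha / 2)

# Step 2: find the CI, x-bar +/- z_{alpha/2} * sigma / sqrt(n)
margin = z * sigma / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

# Step 3: interpret the CI in context
print(f"We can be 95% confident that the mean duration of imprisonment "
      f"is between {lower:.1f} and {upper:.1f} months.")
```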
21Problem 8.35 by Calculator
Using the calculator, you still must interpret
your results!
- Stat > Tests > 7: ZInterval
- Highlight Stats > Enter
- Fill in σ, x̄, n, and C-Level (as a decimal, not a %)
- Highlight Calculate > Enter
22Sec. 8.3 Margin of Error
- The margin of error, E, is the part of the formula
that is being added to and subtracted from x̄ in
calculating the CI: E = z_{α/2} · σ/√n.
23Meaning of margin of error
- The margin of error is the wiggle room around the
sample mean
- If the CI is from 3 (which is 5 - 2) to 7 (which is
5 + 2), then the wiggle room is 2
- The margin of error is 2
24Visual Interpretation of Margin of Error
Width of CI = CI max - CI min = 2E
25Precision
High precision is the same thing as the CI
being short, so it's the same as E being small
(since the length of the CI is 2E).
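26Sample Size
- To obtain a margin of error E at a given confidence level,
use n = (z_{α/2} · σ / E)², rounded up to the nearest whole number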
27Prisoners PTSD
- Page 358, 8.65, continuation of 8.35 (Slide 20)
- x̄ = 33.4, n = 32, σ = 42
- Part a. Determine E, the margin of error.
- Part b. Explain what E means in terms of the
accuracy of the estimate (in this context).
- Part c. Find the sample size needed to obtain a
margin of error of 12 months and a 99% CL.
- Part d. Find a 99% CI for µ (the mean duration
of imprisonment) if it is known that a certain
sample of the size determined in Part c. has mean
36.2 months.
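Parts a, c and d can be checked with the sketch below. It assumes the standard formulas E = z_{α/2} · σ/√n and n = (z_{α/2} · σ / E)², rounded up. The function names are illustrative, and Part a uses the 95% level from Problem 8.35.

```python
from math import ceil, sqrt
from statistics import NormalDist

def z_crit(conf_level):
    """z_{alpha/2} for the given confidence level (a number between 0 and 1)."""
    return NormalDist().inv_cdf(1 - (1 - conf_level) / 2)

def margin_of_error(sigma, n, conf_level):
    return z_crit(conf_level) * sigma / sqrt(n)

def sample_size(sigma, margin, conf_level):
    # Always round UP so the margin of error is no larger than requested.
    return ceil((z_crit(conf_level) * sigma / margin) ** 2)

sigma = 42
e_95 = margin_of_error(sigma, n=32, conf_level=0.95)        # Part a
n_needed = sample_size(sigma, margin=12, conf_level=0.99)   # Part c
e_99 = margin_of_error(sigma, n_needed, conf_level=0.99)
ci_99 = (36.2 - e_99, 36.2 + e_99)                          # Part d

print(f"E = {e_95:.2f} months, n = {n_needed}, "
      f"99% CI = ({ci_99[0]:.2f}, {ci_99[1]:.2f})")
```

Under these assumptions E is about 14.55 months, the required sample size is 82, and the 99% CI is roughly 24.25 to 48.15 months.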
28Sec. 8.4 Confidence Intervals for One Population
Mean When σ Is Unknown
The studentized version of x̄ is called t, where
t = (x̄ - µ) / (s/√n). It is like the standardized
version, z = (x̄ - µ) / (σ/√n),
except that it uses s (the sample st. dev.) in
place of σ (the population st. dev.). So, use t
if you have only sample data and don't know σ.
29The distribution of t is not normal, but the larger n
is, the closer it is to normal, because then s is
closer to σ
30Key Fact
31t-scores (Table IV or calculator)
- Find the t-value that is larger than 95% of all
t-values, for a sample size of 14, so df = n - 1 = 13. (Answer is
t_{0.05} = 1.771)
On the calculator: 2nd > VARS to get DISTR, then
option 4, invT(area to left, df). For example,
t_{0.05} = invT(1 - 0.05, 13) = 1.771
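Python's standard library has no built-in t-distribution, so the sketch below rebuilds an invT-style lookup from the t density. It uses Simpson's rule for the area and bisection for the inverse. This is a teaching sketch and not how the calculator computes invT; the function names `t_pdf`, `t_cdf` and `inv_t` are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Area to the LEFT of x, via Simpson's rule on [0, |x|] plus symmetry about 0."""
    b = abs(x)
    h = b / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

def inv_t(area_left, df):
    """t-value with the given area to its left (same argument order as invT)."""
    lo, hi = -50.0, 50.0               # wide bracket; bisection narrows it down
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(inv_t(1 - 0.05, 13), 3))  # 1.771
```

As with invT, the first argument is an area to the left, so a right-tail area such as 0.05 has to be converted with 1 - 0.05 first.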
33When can we use the t-interval procedure?
- Only when x̄ has an (approximately) normal distribution, which
happens when
- n < 15 and the data is normal,
- 15 ≤ n < 30 and the data is approximately normal, or
- n ≥ 30
- If we have a sample of size less than 30 and we
want to figure out whether we can use the
t-interval procedure, we can use a normal
probability plot (Sec. 6.4) to assess whether the
data is normal.
34Example 8.9, page 363: Dollar amounts lost in 25
pickpocket crimes
n = 25 is not a large enough sample to
automatically make the distribution of x̄ approximately normal.
We need to check whether x (the amount lost) is approximately normal.
Conclusion: the t-interval procedure is appropriate.
35Example 8.10, page 364: Amount of chicken
consumed last year by 17 randomly chosen people
Sometimes a variable is normal only if you remove
outlier(s), which can be done only if there's a
good reason!
- With the full sample (n = 17): the t-interval procedure is not appropriate
- With the outlier removed (sample size
is now 16): the t-interval procedure is appropriate
36Page 368
8.82 For a t-curve with df = 8, find each t-value
and illustrate your results with a graph.
- Part a. The t-value that has area 0.05 to its
right.
- Part b. t_{0.10}
- Part c. The t-value that
has area 0.01 to its left. (Hint, if using the
table: a t-curve is symmetric about 0.)
- Part d. The two t-values that divide the area under the
curve into a middle area of 0.95 and two outer
areas of 0.025.
37Page 369
8.94 25 families of four that visited amusement
parks were selected. The cost of each visit
(rounded to the nearest dollar) is recorded
below.
156, 212, 218, 189, 172, 221, 175, 208,
152, 184, 209, 195, 207, 179, 181, 202, 166, 213,
221, 237, 130, 217, 161, 208, 220
(Note: Using the 1-Var Stats calculator function, you can
find the sample mean x̄ = 193.32 and the sample st.
dev. s = 26.73.)
Obtain and interpret a 95%
confidence interval for the mean cost for a
family of 4 to spend the day at an amusement
park.
Answer: we can be 95% confident that the
mean cost for a family of 4 to spend the day
at an amusement park is between about $182.30 and
$204.40.
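The arithmetic behind this answer can be sketched in Python. The critical value t_{0.025} = 2.064 for df = 24 is hard-coded from a standard t-table, because the standard library has no t-distribution. The rest uses the data on the slide, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Cost of each visit, in dollars, from the slide (25 families of four).
costs = [156, 212, 218, 189, 172, 221, 175, 208,
         152, 184, 209, 195, 207, 179, 181, 202, 166, 213,
         221, 237, 130, 217, 161, 208, 220]

n = len(costs)
x_bar = mean(costs)       # sample mean
s = stdev(costs)          # sample st. dev. (divides by n - 1)
t_crit = 2.064            # t_{0.025} with df = n - 1 = 24, from a t-table

margin = t_crit * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"x-bar = {x_bar:.2f}, s = {s:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The endpoints come out near 182.3 and 204.4 dollars. Small differences in the last digit depend on how much s and the t-value are rounded.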
38Problem 8.94 (previous slide), calculator method
- Enter the data in a list (Stat > Edit)
- Stat > Tests > 8: TInterval
- Highlight Data, press Enter
- List: name of the list where you put the data
- Freq: 1
- C-Level: 0.95
- Highlight Calculate, press Enter
If you use the calculator, you still must be able
to interpret your results!