Title: Do Now
1Do Now
- If I just sampled 1000 people and I found their
average body temperature to be 98.2°F, what else
do I need to construct a confidence interval for
the average body temperature of all human beings?
2Things to Watch Out For When Working With Sample
Means
- Beware of observations that are not independent.
- The CLT depends crucially on the assumption of
independence.
- You can't check this with your data; you have to
think about how the data were gathered.
- Watch out for small samples from skewed
populations.
- The more skewed the distribution, the larger the
sample size we need for the CLT to work.
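The warning about skewed populations can be illustrated with a small simulation. This is only a sketch: the exponential population, the sample sizes, and the helper names `skewness` and `simulated_sample_means` are assumptions chosen for illustration, not part of the lesson.

```python
import random
import statistics

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - m) / sd) ** 3 for v in values) / len(values)

def simulated_sample_means(n, trials=5000, seed=1):
    """Means of `trials` independent samples of size n drawn from a
    right-skewed (exponential) population. The population is an assumption."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(trials)]

# Skewness of the simulated sampling distribution for several sample sizes.
skew_by_n = {n: skewness(simulated_sample_means(n)) for n in (2, 10, 50)}
for n, sk in skew_by_n.items():
    print(f"n = {n:>2}: skewness of sample means = {sk:.2f}")
```

The skewness of the sample means should shrink toward 0 as n grows, which is why a larger sample is needed when the population is strongly skewed.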
3Confidence Intervals for Sample Means
- Just as we did before, we will base our
confidence interval on the sampling distribution
model.
- The Central Limit Theorem told us that the
sampling distribution model for means is Normal
with mean µ and standard deviation
$SD(\bar{y}) = \sigma / \sqrt{n}$.
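The CLT result for means says the standard deviation of the sample mean is σ/√n. A quick simulation can check that claim. The population values below (µ = 98.6, σ = 0.7) and the function name `sd_of_sample_means` are hypothetical, picked only to make the sketch concrete.

```python
import math
import random
import statistics

def sd_of_sample_means(n, mu, sigma, trials=20000, seed=2):
    """Standard deviation of the means of many independent samples of size n."""
    rng = random.Random(seed)
    means = [statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
             for _ in range(trials)]
    return statistics.stdev(means)

mu, sigma, n = 98.6, 0.7, 25          # hypothetical population and sample size
simulated = sd_of_sample_means(n, mu, sigma)
predicted = sigma / math.sqrt(n)      # what the CLT says SD(y-bar) should be
print(f"simulated SD of sample means: {simulated:.4f}")
print(f"sigma / sqrt(n):              {predicted:.4f}")
```

The two printed values should agree closely, but only because σ is known in the simulation. In practice it is not.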
4Confidence Intervals for Sample Means
- All we need is a random sample of quantitative
data.
- And the true population standard deviation, σ.
- Well, that's a problem.
5Confidence Intervals for Sample Means
- Proportions have a link between the proportion
value and the standard deviation of the sample
proportion.
- This is not the case with means: knowing the
sample mean tells us nothing about the standard
deviation of the sample mean.
- We'll do the best we can: estimate the population
parameter σ with the sample statistic s.
6Confidence Intervals for Sample Means
- We now have extra variation in our standard error
from s, the sample standard deviation (since the
sample standard deviation will change from sample
to sample).
- If we don't account for this extra variation, we
will provide an incorrect confidence interval
(which depends on an accurate standard error).
7Confidence Intervals for Sample Means
- And, the shape of the sampling model changes: the
model is no longer Normal. So, what is the
sampling model?
- William S. Gosset, an employee of the Guinness
Brewery in Dublin, Ireland, worked long and hard
to find out what the sampling model was.
8Gosset's t
- The sampling model that Gosset found has become
known as Student's t.
- The Student's t-models form a whole family of
related distributions that depend on a parameter
known as degrees of freedom.
- We often denote degrees of freedom as df, and the
model as $t_{df}$.
9The Sampling Distribution Model for Means
- When the conditions are met, the standardized
sample mean
$t = \dfrac{\bar{y} - \mu}{SE(\bar{y})}$
follows a Student's t-model with n - 1 degrees
of freedom.
- We estimate the standard error with
$SE(\bar{y}) = s / \sqrt{n}$.
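As a minimal sketch, the standardized sample mean can be computed directly from a sample: subtract the hypothesized mean from the sample mean and divide by the estimated standard error s/√n. The data values and the function name `t_statistic` are hypothetical.

```python
import math
import statistics

def t_statistic(sample, mu):
    """Return (t, df) for a sample and a hypothesized population mean mu."""
    n = len(sample)
    y_bar = statistics.mean(sample)
    s = statistics.stdev(sample)      # sample SD (divides by n - 1)
    se = s / math.sqrt(n)             # estimated standard error of the mean
    return (y_bar - mu) / se, n - 1

# Hypothetical body temperatures in degrees Fahrenheit.
temps = [98.4, 98.0, 98.6, 97.9, 98.3, 98.1, 98.5, 98.2]
t, df = t_statistic(temps, mu=98.6)
print(f"t = {t:.3f} with {df} degrees of freedom")
```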
10The Sampling Distribution Model for Means
- When Gosset corrected the model for the extra
uncertainty, the margin of error got bigger.
- Your confidence intervals will be just a bit
wider and your P-values just a bit larger than
they were with the Normal model.
- By using the t-model, you've compensated for the
extra variability in precisely the right way.
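One way to see why the wider interval is the right one is to simulate coverage. The sketch below draws many small samples (n = 5) from a Normal population, builds an interval from each using s, and counts how often the interval captures µ. The sample size, the population, and the function name `coverage` are assumptions for illustration; the critical values 1.960 and 2.776 (df = 4) are standard table values for 95% confidence.

```python
import math
import random
import statistics

def coverage(critical_value, n=5, mu=0.0, sigma=1.0, trials=20000, seed=3):
    """Fraction of simulated intervals y-bar +/- critical_value * s/sqrt(n)
    that contain the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        y_bar = statistics.fmean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        if abs(y_bar - mu) <= critical_value * se:
            hits += 1
    return hits / trials

z_star = 1.960   # Normal critical value for 95% confidence
t_star = 2.776   # t critical value for 95% confidence with df = 4
cov_z = coverage(z_star)
cov_t = coverage(t_star)
print(f"coverage using z*: {cov_z:.3f}")
print(f"coverage using t*: {cov_t:.3f}")
```

With so few observations, the z-based intervals should capture µ noticeably less often than the advertised 95%, while the t-based intervals should come out close to 95%.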
11The Sampling Distribution Model for Means
- Student's t-models are unimodal, symmetric, and
bell shaped, just like the Normal.
- But t-models with only a few degrees of freedom
have much fatter tails than the Normal.
- http://www.stat.tamu.edu/jhardin/applets/signed/T.html
12The Sampling Distribution Model for Means
- As the degrees of freedom increase, the t-models
look more and more like the Normal (when n equals
30, it's hard to tell the two apart).
- In fact, the t-model with infinite degrees of
freedom is exactly Normal.
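The fat tails, and the way they thin out as the degrees of freedom grow, can be checked numerically. This sketch uses the standard density formula for Student's t and compares its height three units from the center with the Normal curve's height at the same point. The comparison point x = 3 and the chosen df values are arbitrary choices for illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard Normal model."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# How many times taller the t curve is than the Normal curve at x = 3.
tail_ratio = {df: t_pdf(3.0, df) / normal_pdf(3.0) for df in (2, 5, 30, 1000)}
for df, ratio in tail_ratio.items():
    print(f"df = {df:>4}: t density / Normal density at x = 3 is {ratio:.2f}")
```

A ratio well above 1 means a much fatter tail. The ratio should fall toward 1 as df increases.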
13Finding t-Values By Hand
- The Student's t-model is different for each value
of degrees of freedom.
- Because of this, Statistics books usually have
one table of t-model critical values for selected
confidence levels.
14Finding t-Values By Hand (cont.)
- What if we don't have the degrees of freedom in
our t-table?
- Option 1: Interpolate the value using the two
surrounding values. (Example: find the critical
t-value for 38 d.f. that cuts off the top and
bottom 5%.)
- Option 2: Just use STAT > TESTS > TInterval
to generate the confidence interval. Just make
sure that you always write the formula for the
C.I. and plug in as much as you can.
- To recover the critical t-value, you can take the
width of the confidence interval, divide it by 2
to get the margin of error, and divide that by
the S.E.
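Both options can be sketched in a few lines. The first function does the linear interpolation for 38 d.f., assuming "top and bottom 5%" means 5% in each tail and using the usual table rows for df = 30 (1.697) and df = 40 (1.684). The second recovers the critical value from a calculator interval. The function names and the interval in the second example are hypothetical.

```python
def interpolate_critical_t(df, df_low, t_low, df_high, t_high):
    """Linear interpolation between two neighboring rows of a t-table."""
    fraction = (df - df_low) / (df_high - df_low)
    return t_low + fraction * (t_high - t_low)

def critical_t_from_interval(lower, upper, se):
    """Recover t* from an interval: half the width is the margin of error,
    and the margin of error divided by the standard error is t*."""
    margin_of_error = (upper - lower) / 2
    return margin_of_error / se

# Table rows for a one-tail area of 0.05 (90% confidence): df = 30 and df = 40.
t_38 = interpolate_critical_t(38, 30, 1.697, 40, 1.684)
print(f"interpolated t* for 38 d.f.: {t_38:.4f}")

# Hypothetical calculator output: interval (97.9, 98.5) with S.E. = 0.15.
t_star = critical_t_from_interval(97.9, 98.5, 0.15)
print(f"recovered t*: {t_star:.3f}")
```

The interpolated value is about 1.687, which sits between the two table rows and closer to the df = 40 row, as it should.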
15One-Sample t-Interval
- When the conditions are met, we are ready to find
the confidence interval for the population mean,
µ.
- The confidence interval is
$\bar{y} \pm t^{*}_{n-1} \times SE(\bar{y})$
- where the standard error of the mean is
$SE(\bar{y}) = s / \sqrt{n}$.
- The critical value $t^{*}_{n-1}$ depends on the particular
confidence level, C, that you specify and on the
number of degrees of freedom, n - 1, which we get
from the sample size.
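A minimal sketch of the one-sample t-interval: sample mean plus or minus the critical value times s/√n. The data are hypothetical, the function name `one_sample_t_interval` is made up for this example, and the critical value 2.262 is the table value for 95% confidence with 9 degrees of freedom.

```python
import math
import statistics

def one_sample_t_interval(sample, t_star):
    """Confidence interval y-bar +/- t* * s / sqrt(n) for the population mean."""
    n = len(sample)
    y_bar = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean
    margin_of_error = t_star * se
    return y_bar - margin_of_error, y_bar + margin_of_error

# Hypothetical body temperatures in degrees Fahrenheit (n = 10, so df = 9).
temps = [98.4, 98.0, 98.6, 97.9, 98.3, 98.1, 98.5, 98.2, 98.7, 97.8]
t_star = 2.262   # 95% confidence, df = 9, read from a t-table
low, high = one_sample_t_interval(temps, t_star)
print(f"95% confidence interval: ({low:.2f}, {high:.2f})")
```

The critical value is passed in on purpose, since it has to be looked up for the chosen confidence level and n - 1 degrees of freedom.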
16Finding t-Values By Hand (cont.)
- Do example on p.449 using the calculator to find
the t-interval.
- Try the 1-proportion z-interval for the last
problem on the test.
- Remember, it's okay to use these as long as you
write out the formula and plug the numbers in.
17Assumptions and Conditions
- Gosset found the t-model by simulation.
- Years later, when Sir Ronald A. Fisher showed
mathematically that Gosset was right, he needed
to make some assumptions to make the proof work.
- We will use these assumptions when working with
Student's t.
18Assumptions and Conditions (cont.)
- Independence Assumption
- Randomization Condition: The data arise from a
random sample or suitably randomized experiment.
Randomly sampled data (particularly from an SRS)
are ideal.
- 10% Condition: When a sample is drawn without
replacement, the sample should be no more than
10% of the population.
19Assumptions and Conditions (cont.)
- Normal Population Assumption
- We can never be certain that the data are from a
population that follows a Normal model, but we
can check the Nearly Normal Condition.
- Nearly Normal Condition: The data come from a
distribution that is unimodal and symmetric.
- Check this condition by making a histogram or
Normal probability plot.
20Make a Picture, Make a Picture, Make a Picture
- Pictures tell us far more about our data set than
a list of the data ever could.
- The only reasonable way to check the Nearly
Normal Condition is with graphs of the data.
- Make a histogram of the data and verify that its
distribution is unimodal and symmetric with no
outliers.
- You may also want to make a Normal probability
plot to see that it's reasonably straight.
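If no graphing tool is handy, even a rough text histogram is enough to eyeball shape. The sketch below is one simple way to do it. The data, the bin width, and the function name `text_histogram` are hypothetical, and values that fall exactly on a bin edge may land in either neighboring bin because of floating-point rounding.

```python
import math

def text_histogram(data, bin_width):
    """Return (left_edge, count) pairs for equal-width bins covering the data."""
    start = math.floor(min(data) / bin_width) * bin_width
    counts = {}
    for value in data:
        # Clamp at 0 so rounding can never push the smallest value below bin 0.
        index = max(0, int((value - start) // bin_width))
        counts[index] = counts.get(index, 0) + 1
    return [(start + i * bin_width, counts.get(i, 0))
            for i in range(max(counts) + 1)]

# Hypothetical body temperatures in degrees Fahrenheit.
temps = [97.6, 97.9, 98.0, 98.1, 98.1, 98.2, 98.2, 98.2, 98.3,
         98.3, 98.4, 98.4, 98.5, 98.6, 98.9]
rows = text_histogram(temps, bin_width=0.3)
for left, count in rows:
    print(f"{left:6.1f} | {'*' * count}")
```

Look for a single hump, rough symmetry, and no bars sitting far from the rest.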
21Assumptions and Conditions (cont.)
- Nearly Normal Condition
- The smaller the sample size (n < 15 or so), the
more closely the data should follow a Normal
model.
- For moderate sample sizes (n between 15 and 40 or
so), the t works well as long as the data are
unimodal and reasonably symmetric.
- For larger sample sizes, the t methods are safe
to use even if the data are skewed.
22Sample Size
- To find the sample size needed for a particular
confidence level with a particular margin of
error (ME), solve this equation for n:
$ME = t^{*}_{n-1} \times \dfrac{s}{\sqrt{n}}$
- The problem with using the equation above is that
we don't know most of the values (and t* is a
function of the sample size!). We can overcome
this:
- We can use s from a small pilot study.
- We can use z* in place of the necessary t* value.
23Sample Size
- Example: Say you want an M.E. of 0.5°F and you
think the standard deviation of body temps is
0.75°F. What sample size should you use?
- First, use a critical z-score to find what sample
size would create that margin of error (with
95% confidence).
- Now, use that sample size to determine the d.f.
for the critical t value.
- Last, calculate the sample size required using
the critical t.
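The three steps of this example can be worked through as follows. Solving ME = (critical value) × s/√n for n gives n = (critical value × s / ME)², rounded up. The function name `sample_size` and the one-row table `T_TABLE_95` are made up for this sketch; 2.306 is the standard table value for 95% confidence with 8 degrees of freedom.

```python
import math
from statistics import NormalDist

def sample_size(critical_value, s, me):
    """Smallest whole n with critical_value * s / sqrt(n) <= me."""
    return math.ceil((critical_value * s / me) ** 2)

s, me = 0.75, 0.5                        # values given in the example

# Step 1: start with the critical z-score for 95% confidence (about 1.96).
z_star = NormalDist().inv_cdf(0.975)
n_first = sample_size(z_star, s, me)

# Step 2: use that sample size to get the degrees of freedom.
df = n_first - 1

# Step 3: redo the calculation with the critical t for that df.
T_TABLE_95 = {8: 2.306}                  # one t-table row, 95% confidence
t_star = T_TABLE_95[df]
n_final = sample_size(t_star, s, me)

print(f"first pass with z*: n = {n_first}")
print(f"second pass with t* (df = {df}): n = {n_final}")
```

The first pass gives n = 9, so df = 8, and the second pass with t* = 2.306 gives n = 12. The t-based answer is larger because t* is larger than z*.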
24Homework
- Add the following problems to p. 462-464 10, 11,
23 p. 323 33, p. 343 26, p. 362-365 2, 4, 6,
8, 13, 17, 22, 29, 31, 33 p. 378-381 1, 3, 4,
8, 10, 12, 14, 18, 25 (This will be due on Monday
and worth 15 points)
- Correct tests for half the points back as per the
test correction policies (due Monday)
- Make sure you have given your AP exam payment to
Mrs. Cohan in the Guidance Office by Friday!