Title: Estimating a Population Mean
1. SECTION 10.2
- Estimating a Population Mean
2. What's the difference between what we did in
Section 10.1 and what we are beginning in Section
10.2?
- In reality, the standard deviation σ of the
population is unknown, so the procedures from the
last section cannot be used as they stand. However,
understanding the logic of those procedures will
continue to be of use.
- To be more realistic, we estimate σ from the
collected data using the sample standard deviation s.
3. Conditions for Inference about a Population Mean
- The data are an SRS from the population.
- Observations from the population have a Normal
distribution with an unknown mean (μ) and unknown
standard deviation (σ).
- Independence is assumed for the individual
observations when calculating a confidence
interval. When we are sampling without
replacement from a finite population, it is
sufficient to verify that the population is at
least 10 times the sample size.
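The independence check above comes down to one comparison. A minimal sketch, assuming a hypothetical helper name `population_large_enough` (not from the textbook):

```python
def population_large_enough(population_size: int, sample_size: int) -> bool:
    """Return True when the population is at least 10 times the sample size."""
    return population_size >= 10 * sample_size


# A sample of 10 needs a population of at least 100 units.
print(population_large_enough(100, 10))
print(population_large_enough(50, 10))
```

This only covers the sampling-without-replacement check; it says nothing about whether the data are an SRS or come from a Normal population.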
4. CAUTION
- Be sure to check that the conditions for
constructing a confidence interval for the
population mean are satisfied before you perform
any calculations.
5. ROBUSTNESS
- ROBUST: Confidence levels do not change very much
when certain assumptions are violated.
- Fortunately for us, the t procedures are robust
in certain situations.
- Therefore . . .
6. This is when we use the t procedures
- It's more important for the data to be an SRS
from the population than for the population to
have a Normal distribution.
- If n is less than 15, the data must be close to
Normal to use t procedures.
- If n is at least 15, the t procedures can be used
except if there are outliers or strong skewness.
- If n ≥ 30, t procedures can be used even in the
presence of strong skewness, but outliers must
still be examined.
- Essentially, as long as there are no significant
departures from Normality (especially outliers),
the t procedures still work quite well.
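The sample-size guidelines above can be sketched as a small decision function. This is only an illustration: the function name and its flags are hypothetical, the large-sample cutoff is read as n of at least 30, and judging "outliers" or "strong skewness" still requires looking at a plot of the data.

```python
def t_procedures_reasonable(n: int, close_to_normal: bool,
                            has_outliers: bool, strongly_skewed: bool) -> bool:
    """Rough rule of thumb for whether t procedures are safe to use."""
    if n < 15:
        # Small samples: the data themselves must look close to Normal.
        return close_to_normal and not has_outliers
    if n < 30:
        # Moderate samples: fine unless outliers or strong skewness appear.
        return not has_outliers and not strongly_skewed
    # Large samples: skewness is tolerated. Outliers are treated here as a
    # signal to stop and examine the data (a simplifying assumption).
    return not has_outliers


print(t_procedures_reasonable(10, close_to_normal=True,
                              has_outliers=False, strongly_skewed=False))
print(t_procedures_reasonable(20, close_to_normal=False,
                              has_outliers=False, strongly_skewed=True))
print(t_procedures_reasonable(45, close_to_normal=False,
                              has_outliers=False, strongly_skewed=True))
```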
7. Standard Error
- In this setting, the sample mean x̄ has a
sampling distribution that is Normal, with a
mean equal to the population mean μ.
- Since we do not know σ, we replace the
standard deviation of x̄, σ/√n, with the
formula s/√n.
- This is called the standard error of the
sample mean.
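As a quick illustration, the standard error s/√n can be computed with Python's standard library. The sketch below uses the ten bacteria counts from the GOT MILK example later in these notes; the function name `standard_error` is just an illustrative choice.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(len(sample))


counts = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]
print(round(standard_error(counts), 2))
```

Note that `statistics.stdev` divides by n - 1, which matches the s reported by a calculator's one-variable statistics.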
8. Degrees of Freedom
- Commonly listed as df
- Equal to n - 1
- When a t distribution has k degrees of freedom,
we will write this as t(k).
- When the actual df does not appear in Table C,
use the greatest df available that is less than
your desired df.
- This guarantees a wider confidence interval than
needed to justify a given confidence level.
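The "round down" rule for Table C can be sketched as a lookup. The list of tabulated df values below is an assumption about a typical t table (rows 1 through 30, then 40, 50, 60, 80, 100, 1000); check it against your own copy of Table C.

```python
import bisect

# Assumed df rows of a typical t table; verify against your own Table C.
TABLE_DF = list(range(1, 31)) + [40, 50, 60, 80, 100, 1000]


def conservative_df(desired_df: int) -> int:
    """Greatest tabulated df that does not exceed desired_df."""
    if desired_df < TABLE_DF[0]:
        raise ValueError("df must be at least 1")
    i = bisect.bisect_right(TABLE_DF, desired_df)
    return TABLE_DF[i - 1]


print(conservative_df(9))   # 9 appears in the table, so it is used as is
print(conservative_df(47))  # 47 is not tabulated, so fall back to 40
```

Rounding down is the safe direction because a smaller df gives a larger critical value, and so a wider interval.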
9. Density Curves for t Distributions
- Bell-shaped and symmetric
- Greater spread than a Normal curve
- As the degrees of freedom (or sample size) increase,
the t density curves look more like a Normal
curve
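These three properties can be checked numerically. The sketch below evaluates the standard t density formula and the standard Normal density at a point in the tail; the function names are illustrative, and `math.gamma` limits this version to df values in the low hundreds.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def normal_pdf(x):
    """Density of the standard Normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Height of each curve 3 units from the center: the t curves carry more
# area in the tails, and the gap shrinks as df grows.
for df in (2, 9, 30, 100):
    print(df, round(t_pdf(3.0, df), 5), round(normal_pdf(3.0), 5))
```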
10. Confidence Intervals
- A level C confidence interval for μ is
x̄ ± t* · s/√n
- t* is the upper (1 - C)/2 critical value for the
t(n - 1) distribution.
- We find t* using the table or our calculator:
t* = invT(area to left of t*, df)
- We interpret these the same way we did in the
last chapter.
- This interval is exactly correct when the
population distribution is Normal and is
approximately correct for large n in other cases.
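For readers without a calculator at hand, here is a rough stand-in for invT using only the standard library. It is a sketch, not the calculator's actual algorithm: it integrates the t density with Simpson's rule and searches by bisection. The names `t_cdf`, `inv_t`, and `t_interval` are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Area to the left of x: Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def inv_t(area_left, df):
    """Value of t with the given area to its left (bisection search)."""
    if not 0 < area_left < 1:
        raise ValueError("area_left must be strictly between 0 and 1")
    if area_left < 0.5:
        return -inv_t(1 - area_left, df)
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < area_left:  # widen the bracket until it contains t
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(xbar, s, n, conf_level):
    """One-sample t interval: xbar +/- t* s / sqrt(n), with df = n - 1."""
    t_star = inv_t((1 + conf_level) / 2, n - 1)
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# For a 90% interval the area to the left of t* is 0.95.
print(round(inv_t(0.95, 9), 3))
```

The area passed to `inv_t` is (1 + C)/2 because the upper (1 - C)/2 critical value leaves that much area to its left.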
11. INFERENCE TOOLBOX (p 631)
DO YOU REMEMBER WHAT THE STEPS ARE???
Steps for constructing a CONFIDENCE INTERVAL
- 1. PARAMETER: Identify the population of interest
and the parameter you want to draw a conclusion
about.
- 2. CONDITIONS: Choose the appropriate inference
procedure. VERIFY conditions (SRS, Normality,
Independence) before using it.
- 3. CALCULATIONS: If the conditions are met, carry
out the inference procedure.
- 4. INTERPRETATION: Interpret your results in the
context of the problem. CONCLUSION, CONNECTION,
CONTEXT (meaning that our conclusion about the
parameter connects to our work in part 3 and
includes appropriate context).
12. Example: GOT MILK?
A milk processor monitors the number of bacteria
per milliliter in raw milk received for
processing. A random sample of 10 one-milliliter
specimens from milk supplied by one producer gives
the following data: 5370, 4890, 5100, 4500,
5260, 5150, 4900, 4760, 4700, 4870. Construct a
90% confidence interval.
- We want to estimate μ, the mean number of
bacteria per milliliter in all of the milk from
this supplier.
- Since we don't know σ, we should construct a
one-sample t interval for μ.
- We must be confident that the data are an SRS
from the producer's milk. We must learn how the
sample was chosen to see if it can be regarded as
an SRS (we are only told that it is a random
sample).
- A boxplot and a Normal probability plot of the
data show no outliers and no strong skewness.
This gives us little reason to doubt the
Normality of the population from which this
sample was drawn. In practice, we would probably
rely on the fact that past measurements of this
type have been roughly Normal.
- Since these measurements came from a random
sample of specimens, they should be independent
(assuming that there were many, at least 100,
one-milliliter specimens available at the milk
processing facility).
13. Example: GOT MILK? Cont.
- Entering these data into a calculator gives
x̄ = 4950 and s = 268.45. So a 90% confidence
interval for the mean bacteria count per
milliliter in this producer's milk is
x̄ ± t* · s/√n = 4950 ± 1.833 · 268.45/√10
= 4950 ± 155.6, or (4794.4, 5105.6).
- We can say that we are 90% confident that the
actual mean number of bacteria per milliliter of
milk from this supplier is between 4794.4 and
5105.6, because we used a method that yields
intervals such that 90% of all these intervals
will capture the true mean.
df = 10 - 1 = 9
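The calculation step of this example can be reproduced as a check. The sketch below uses the ten data values from the problem and takes the critical value t* = 1.833 (df = 9, 90% confidence) from a standard t table instead of computing it; variable names are illustrative.

```python
import math
import statistics

counts = [5370, 4890, 5100, 4500, 5260, 5150, 4900, 4760, 4700, 4870]
T_STAR = 1.833  # table value for df = 9 and 90% confidence

n = len(counts)
xbar = statistics.mean(counts)
s = statistics.stdev(counts)          # sample standard deviation
margin = T_STAR * s / math.sqrt(n)    # t* times the standard error
lower, upper = xbar - margin, xbar + margin

print(f"x-bar = {xbar}, s = {s:.2f}, df = {n - 1}")
print(f"90% CI: ({lower:.1f}, {upper:.1f})")
```

Writing down x̄, s, df, and t* alongside the final interval mirrors what should appear on paper when a calculator does the arithmetic.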
14. Paired t Procedures
- Recall, matched pairs studies are a form of block
design in which just two treatments are being
compared.
- Also, experiments are rarely done on randomly
selected subjects. Random selection allows us to
generalize results to a larger population, but
random assignment of treatments to subjects
allows us to compare treatments.
- Be careful to distinguish a matched pairs setting
from a two-sample setting.
- The real key is independence.
- TREAT THE DIFFERENCES from a matched pairs study
as a single sample.
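To make "treat the differences as a single sample" concrete, here is a minimal sketch. The before/after numbers are made up purely for illustration and do not come from the textbook; the point is only that the paired data collapse to one list of differences, which then feeds the usual one-sample t procedure with df = n - 1.

```python
import math
import statistics

# Hypothetical measurements on the same five subjects (made-up numbers).
before = [12.0, 15.5, 9.8, 14.2, 11.1]
after = [13.1, 15.0, 11.2, 15.9, 12.0]

# One difference per pair; from here on this is a one-sample problem.
differences = [a - b for a, b in zip(after, before)]

n = len(differences)
mean_diff = statistics.mean(differences)
se_diff = statistics.stdev(differences) / math.sqrt(n)
df = n - 1

print(f"mean difference = {mean_diff:.2f}, SE = {se_diff:.3f}, df = {df}")
```

Treating `before` and `after` as two independent samples would be a mistake here, because the two measurements in each pair come from the same subject.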
15. TECHNOLOGY
- As always, you will be allowed unrestricted use
of your calculator on quizzes and tests (as well
as the actual AP Exam). For this reason, ALWAYS
be certain to write down the values of key
numbers that are being used (means, standard
deviations, degrees of freedom, significance
levels, etc.) along with results of the
calculator procedures in order to receive full
credit.
- The calculator information is available in your
book on pages 661-662.
- We are now using the T Interval instead of the Z
Interval.
- Plug in exactly what you are asked for.