Title: Statistics 303
1Statistics 303
- Chapter 7
- Inference for Means
2 Inference for Means
- To this point, when examining the mean of a
population we have always assumed that the
population standard deviation (σ) was known.
- In practice this is seldom the case.
- We usually must estimate the population standard
deviation σ with the sample standard deviation s
(for a review of s, see pp. 49-50 of the book).
- When we do this, the standardized sample mean
no longer follows a normal distribution,
because of the adjustment for estimating σ with
s.
- Thus, instead of using Z, the standard normal
distribution, we must use the appropriate
t-distribution.
3Inference for Means
- The t-distribution
- Although there is only one Z-distribution, there
are many, many t-distributions.
- In fact, there is a different t-distribution for
each sample size used.
- The shape of each t-distribution is very similar
to the Z-distribution, but is slightly flatter,
with more area in the tails.
- The larger the sample size, the closer the
t-distribution is to the Z-distribution.
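The comparison of shapes can be checked numerically. The sketch below is illustrative only: it evaluates the standard density formulas for the t and Z distributions at the center (0) and in the tail (3) for a few df values chosen arbitrarily here. The function names `t_pdf` and `z_pdf` are hypothetical helpers, not from the book.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def z_pdf(x):
    """Density of the standard normal (Z) distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Height at the center and in the tail for several (arbitrary) df values.
for df in (2, 10, 30, 100):
    print(f"df={df:>3}  center={t_pdf(0, df):.4f}  tail={t_pdf(3, df):.4f}")
print(f"Z       center={z_pdf(0):.4f}  tail={z_pdf(3):.4f}")
```

As df grows, the center height rises toward the Z value and the tail height falls toward it, which is the "flatter, but closer to Z for larger samples" behavior described above.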
4Inference for Means
- The t-distribution
- The way we distinguish between various
t-distributions is by finding the degrees of
freedom (df) that correspond to the sample size.
- When we are looking at only one sample, the
degrees of freedom are the sample size minus one:
df = n - 1.
- We say that the one-sample t-statistic
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
has the t distribution with n - 1 degrees of
freedom.
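A minimal sketch of the one-sample t-statistic and its degrees of freedom, using only the Python standard library. The data values and the hypothesized mean are made up for illustration, and `one_sample_t` is a hypothetical helper name.

```python
import math
import statistics

def one_sample_t(data, mu0):
    """Return (t, df) for the one-sample t-statistic against mean mu0."""
    n = len(data)
    xbar = statistics.mean(data)
    s = statistics.stdev(data)          # sample standard deviation (divides by n - 1)
    t = (xbar - mu0) / (s / math.sqrt(n))
    return t, n - 1                     # df = n - 1

# Hypothetical data, not from the slides.
hours = [5.0, 9.5, 7.0, 12.0, 6.5, 8.0]
t, df = one_sample_t(hours, mu0=7.0)
print(f"t = {t:.3f} with df = {df}")
```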
5Inference for Means
- The t-distribution
- A table of t distribution critical values can be
found in Table D (the last page of the book).
- Note that these values are areas to the right,
not areas to the left as in the Z-table.
- In Table D, the degrees of freedom are listed in
the left column.
- The probabilities are on top (these probabilities
are inside for the Z-table).
- The individual t-values are inside the table.
- Make sure to get acquainted with this table and
how it differs from the Z-table.
6Inference for Means
- The t-distribution
- In the book, p.452, we see an example of how the
distributions compare
7Inference for Means
- The t-distribution
- With the change from σ to s, and the change from
z to t, the steps in producing confidence
intervals and hypothesis tests are the same as we
have seen previously.
- In Chapter 1, p. 50, we find that s is calculated
from the data using the formula
$s = \sqrt{\dfrac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
This formula is very cumbersome to evaluate by hand. Ideally, a
computer is used to calculate s, particularly for
large data sets.
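As an illustration of letting the computer do this work, the sketch below computes the sample standard deviation directly from its definition and checks it against Python's built-in `statistics.stdev`. The data are hypothetical, and `sample_sd` is a helper name introduced here.

```python
import math
import statistics

def sample_sd(data):
    """Sample standard deviation: square root of the sum of squared
    deviations from the mean, divided by n - 1."""
    n = len(data)
    xbar = sum(data) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))

# Hypothetical data, not from the slides.
values = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_sd(values), statistics.stdev(values))
```

Both calls should agree up to floating-point rounding, since `statistics.stdev` also divides by n - 1.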
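8Confidence Interval for μ with Unknown σ
- The formula for a confidence interval for μ with
unknown σ is
$\bar{x} \pm t^* \dfrac{s}{\sqrt{n}}$
where t* is the critical value from Table D for the
desired confidence level, with df = n - 1.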
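A small sketch of this interval in Python. The data are hypothetical; the critical value 2.571 is the standard t-table entry for 95% confidence with df = 5 and is passed in by hand, as it would be when reading Table D. `t_confidence_interval` is a helper name introduced here.

```python
import math
import statistics

def t_confidence_interval(data, t_star):
    """Interval xbar +/- t_star * s / sqrt(n); t_star is looked up for df = n - 1."""
    n = len(data)
    xbar = statistics.mean(data)
    margin = t_star * statistics.stdev(data) / math.sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical data (n = 6, so df = 5).
sample = [5.0, 9.5, 7.0, 12.0, 6.5, 8.0]
T_STAR_95_DF5 = 2.571   # 95% confidence, df = 5, from a t-table
low, high = t_confidence_interval(sample, T_STAR_95_DF5)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```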
9Confidence Interval for μ with Unknown σ
- Confidence Interval Example
- An economist wants to determine the average
amount a family of four in the United States
spends on housing annually. He randomly selects
85 families of size four and finds the amount
they spent on housing the previous year.
- The economist wishes to estimate the mean with
99% confidence.
10Confidence Interval for μ with Unknown σ
- Confidence Interval Example
- Information given
Sample size n = 85.
Data: 6,789, 8,233, 4,784, ..., 5,974 (85
numbers)
df = n - 1 = 85 - 1 = 84
11Confidence Interval for μ with Unknown σ
- Confidence Interval Example
This is a 99% confidence interval for the true
average amount a family of four in the United
States spends on housing annually.
12Hypothesis Test for μ with Unknown σ
- The steps for a hypothesis test are the same as
those seen previously, namely,
- 1. State the null hypothesis.
- 2. State the alternative hypothesis.
- 3. State the level of significance (e.g., α =
0.05).
- 4. Calculate the test statistic (note the change from z to t):
$t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
13Hypothesis Test for μ with Unknown σ
- 5. Find the P-value
- For a two-sided test
- Ha: μ ≠ μ0, P-value = 2P(T ≥ |t|)
- For a one-sided test
- Ha: μ > μ0, P-value = P(T ≥ t)
- For a one-sided test
- Ha: μ < μ0, P-value = P(T ≤ t)
Because of the limited number of t-values given
in Table D, it is more common to find a range for
the P-value, rather than the exact value (as will
be seen in the example). Computers can be used
to obtain exact values.
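One way a computer can produce such a P-value is sketched below, under stated assumptions: it integrates the t density numerically with Simpson's rule instead of calling a statistics package, so the result is an approximation. The names `t_pdf`, `t_upper_tail`, and `p_value` are hypothetical helpers introduced here; statistical software would normally provide this directly.

```python
import math

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    upper = 0.5 - total * h / 3          # area to the right of |t|
    return upper if t >= 0 else 1 - upper

def p_value(t, df, alternative):
    """P-value for 'greater', 'less', or 'two-sided' alternatives."""
    if alternative == "greater":
        return t_upper_tail(t, df)
    if alternative == "less":
        return 1 - t_upper_tail(t, df)
    if alternative == "two-sided":
        return 2 * t_upper_tail(abs(t), df)
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")

# Illustrative call: t = 2.042 is the tabled critical value for area 0.025, df = 30.
print(round(p_value(2.042, 30, "greater"), 4))
```

The table lookup brackets the P-value between two column headings; the numerical version returns a single approximate value for the same tail area.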
14Hypothesis Test for μ with Unknown σ
- 6. Reject or fail to reject H0 based on the
P-value.
- If the P-value is less than or equal to α, reject
H0.
- If the P-value is greater than α, fail to reject
H0.
- 7. State your conclusion.
- If H0 is rejected, "There is significant
statistical evidence that the population mean is
different from μ0."
- If H0 is not rejected, "There is not significant
statistical evidence that the population mean is
different from μ0."
Notice that these last two steps are exactly the
same as for the case where σ is known.
15Hypothesis Test for μ with Unknown σ
- T.V. Example
- Suppose that the data collected from our class
survey is a random sample from the entire
university (which it obviously is not). We wish
to see if there is evidence that the average
amount of television watched for students here is
more than 7 hours per week.
16Hypothesis Test for μ with Unknown σ
- T.V. Example
- Information given
Sample size n = 38.
17Hypothesis Test for μ with Unknown σ
- T.V. Example
- 1. State the null hypothesis: H0: μ = 7
- 2. State the alternative hypothesis: Ha: μ > 7 (from "is more than")
- 3. State the level of significance: assume α = 0.05
18Hypothesis Test for μ with Unknown σ
- T.V. Example
- 4. Calculate the test statistic.
- 5. Find the P-value.
Remember the table gives probabilities to the
right so we do not use the technique of
subtracting from 1.
Use df = 30 (n - 1 = 37 is not listed in Table D, so we round down
to the nearest df that is listed)
19Hypothesis Test for μ with Unknown σ
- T.V. Example
- 6. Do we reject or fail to reject H0 based on the
P-value?
- 7. State the conclusion.
The P-value (between 0.15 and 0.20) is greater than α =
0.05.
Therefore, we fail to reject H0.
There is not significant statistical evidence
that the average amount of television watched is
more than 7 hours per week at the 0.05 level of
significance.
20Matched Pairs t-test
- To this point we have only looked at tests for
single samples.
- Soon we will look at confidence intervals and
hypothesis tests for comparing two groups.
- When each individual can be given both
treatments, we can reduce the two samples to a
single sample using a matched pairs design.
- Examples
- Students are each given a pre-test and a
post-test to determine the amount of material
learned in a given time interval.
- To examine the effect of a new drug, a large
group of identical twins is identified. One twin
is given a treatment and the other a placebo.
- An ophthalmologist is examining the importance of
the dominant eye in reading. A large group of
subjects is asked to read a passage with the dominant
eye covered and again with the non-dominant eye
covered.
- It can be seen in each of these examples that
something pairs the two responses.
21Matched Pairs t-test
- To analyze matched pairs data, we first reduce
the data from two samples to one sample and then
analyze the data using one-sample techniques.
- The data is reduced from two samples to one by
subtracting one of the responses from the other.
- We could subtract each pre-test score from each
post-test score.
- We could subtract each placebo response from each
treatment response.
- We could subtract the time taken to read the
passage with the non-dominant eye from the time
taken to read the passage with the dominant eye.
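The reduction to one sample can be sketched as follows. The pre-test and post-test scores are invented for illustration, and `paired_t` is a hypothetical helper; it subtracts within each pair and then applies the usual one-sample t-statistic to the differences.

```python
import math
import statistics

def paired_t(second, first, mu0=0.0):
    """Reduce paired data to differences (second - first) and return (t, df)."""
    if len(second) != len(first):
        raise ValueError("each individual needs exactly two responses")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    dbar = statistics.mean(diffs)        # mean of the differences
    sd = statistics.stdev(diffs)         # standard deviation of the differences
    return (dbar - mu0) / (sd / math.sqrt(n)), n - 1

# Hypothetical pre-test and post-test scores for five students.
pre = [60, 72, 55, 80, 68]
post = [66, 75, 54, 88, 70]
t, df = paired_t(post, pre)
print(f"t = {t:.3f} with df = {df}")
```

The direction of subtraction only changes the sign of t, so it must match the direction stated in the alternative hypothesis.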
22Matched Pairs t-test
- Example Keyboards
- Suppose we want to compare two brands of
computer keyboards, which we will denote as
keyboard 1 and keyboard 2. Keyboard 1 is a
standard keyboard, while keyboard 2 is specially
designed so that the keys need very little
pressure to make them respond. The manufacturer
of keyboard 2 would like to claim that typing can
be done faster using keyboard 2. A simple random
sample of n = 30 teachers was selected from a
population of high-school teachers attending a
national conference. Each teacher typed the same
page of text once using keyboard 1 and once using
keyboard 2. For each teacher the order in which
the keyboards were used was determined by the
toss of a coin. For each teacher the variable
measured was the time (in seconds) to correctly
type the page of text (from Graybill, Iyer and
Burdick, Applied Statistics, 1998).
23Matched Pairs t-test
Reduction to one sample
- Example Keyboards
- Information given
Sample size n = 30.
24Matched Pairs t-test
- Example Keyboards
- 1. State the null hypothesis
- 2. State the alternative hypothesis
- 3. State the level of significance
(from carefully reading the problem)
Assume α = 0.05
25Matched Pairs t-test
- Example Keyboards
- 4. Calculate the test statistic.
- 5. Find the P-value.
Remember the table gives probabilities to the
right.
26Matched Pairs t-test
- Example Keyboards
- 6. Do we reject or fail to reject H0 based on the
P-value?
- 7. State the conclusion.
The P-value (between 0.01 and 0.02) is less than α =
0.05.
Therefore, we reject H0.
There is significant statistical evidence that
the average amount of time needed to type the
passage is lower for keyboard 2 than keyboard 1
at the 0.05 level of significance.
27Matched Pairs Confidence Interval
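- After reducing the data to a single sample, we
use the same formula as for a confidence interval
for μ with unknown σ, namely,
$\bar{x}_d \pm t^* \dfrac{s_d}{\sqrt{n}}$
using the mean $\bar{x}_d$ and standard deviation $s_d$ of the
differences, with df = n - 1, where n is the number of pairs.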
28Matched Pairs Confidence Interval
- Example Golf Balls
- In the manufacture of golf balls two procedures
are used. Method I utilizes a liquid center and
method II, a solid center. To compare the
distance obtained using both types of balls, 12
golfers are allowed to drive a ball of each type,
and the length of the drive (in yards) is
measured. (from Milton, McTeer, and Corbet,
Introduction to Statistics, 1997)
- The manufacturer wants to estimate the mean
difference with 90% confidence.
29Matched Pairs Confidence Interval
- Example Golf Balls
- Information given
Sample size n = 12.
df = n - 1 = 12 - 1 = 11
30Matched Pairs Confidence Interval
This is a 90% confidence interval for the true
average difference for the distance traveled for
the two types of golf balls.
31Comparing Two Means
- We use the same basic principles for comparing
two population means as those used for examining
one population mean.
- If the standard deviations σ1 and σ2 for each of
the two populations are known, the two-sample
z-statistic is then
$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
But it is very rare that both population standard
deviations are known. We will examine the
situation in which they are not known.
32Comparing Two Means
- When we are interested in comparing two
population means and we are estimating the
population standard deviations σ1 and σ2 with s1
and s2, the two-sample t-statistic is then
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
with degrees of freedom equal to the smaller of
n1 - 1 and n2 - 1 (or an appropriate estimate using
computer software).
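A sketch of the two-sample t-statistic for testing H0: μ1 = μ2, with hypothetical data. It returns the conservative df (the smaller of n1 - 1 and n2 - 1). It also returns the Welch-Satterthwaite approximation, which is assumed here to be the kind of software estimate meant above; the slides do not name a specific formula. `two_sample_t` is a helper name introduced for illustration.

```python
import math
import statistics

def two_sample_t(x1, x2):
    """Return (t, conservative df, Welch-Satterthwaite df) for H0: mu1 = mu2."""
    n1, n2 = len(x1), len(x2)
    v1 = statistics.variance(x1) / n1    # s1^2 / n1
    v2 = statistics.variance(x2) / n2    # s2^2 / n2
    t = (statistics.mean(x1) - statistics.mean(x2)) / math.sqrt(v1 + v2)
    df_conservative = min(n1 - 1, n2 - 1)
    # Assumed software-style estimate (Welch-Satterthwaite); not from the slides.
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df_conservative, df_welch

# Hypothetical measurements for two independent groups.
group1 = [3.1, 2.8, 3.5, 3.0]
group2 = [2.5, 2.9, 2.4, 2.7, 2.6]
t, df_conservative, df_welch = two_sample_t(group1, group2)
print(f"t = {t:.3f}, conservative df = {df_conservative}, Welch df = {df_welch:.2f}")
```

The Welch df always lies between the conservative df and n1 + n2 - 2, so using the smaller of n1 - 1 and n2 - 1 errs on the side of larger P-values.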
33Comparing Two Means
- The null hypothesis can be any of the following
- The alternative hypothesis can be any of the
following (depending on the question being
asked)
The other steps are the same as those used for
the tests we have looked at previously.
34Comparing Two Means
- Example Tomatoes
- There has been some discussion among amateur
gardeners about the virtues of black plastic
versus newspapers as weed inhibitors for growing
tomatoes. To compare the two, several rows of
tomatoes are planted. Black plastic is used
around nine randomly selected plants and
newspaper around the remaining ten. All plants
start at virtually the same height and receive
the same care. The response of interest is the
height in feet after a month's growth. (from
Milton, McTeer, and Corbet, Introduction to
Statistics, 1997).
- Perform a test to see if there is any difference
between the average heights with significance
level 0.10.
35Comparing Two Means
- Example Tomatoes
- Information given
Sample sizes n1 = 9, n2 = 10.
36Comparing Two Means
- Example Tomatoes
- 1. State the null hypothesis: H0: μ1 = μ2
- 2. State the alternative hypothesis: Ha: μ1 ≠ μ2 (from "any difference between")
- 3. State the level of significance: α = 0.10
37Comparing Two Means
- Example Tomatoes
- 4. Calculate the test statistic.
- 5. Find the P-value.
38Comparing Two Means
- Example Tomatoes
- 6. Do we reject or fail to reject H0 based on the
P-value?
- 7. State the conclusion.
The P-value (between 0.10 and 0.20) is greater than α =
0.10.
Therefore, we fail to reject H0.
There is not significant statistical evidence
that the average tomato plant heights are
different for the two types of weed inhibitors at
the 0.10 level of significance.
39Comparing Two Means
- The confidence interval for the difference of two
population means (μ1 - μ2) is
$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
where t* comes from Table D and corresponds to
the confidence level desired and df = the smaller of
n1 - 1 and n2 - 1.
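A sketch of this interval with hypothetical data. The critical value 3.182 is the standard t-table entry for 95% confidence with df = 3, which is the smaller of n1 - 1 and n2 - 1 for the made-up samples below. `two_sample_ci` is a helper name introduced here.

```python
import math
import statistics

def two_sample_ci(x1, x2, t_star):
    """Interval (xbar1 - xbar2) +/- t_star * sqrt(s1^2/n1 + s2^2/n2)."""
    diff = statistics.mean(x1) - statistics.mean(x2)
    se = math.sqrt(statistics.variance(x1) / len(x1)
                   + statistics.variance(x2) / len(x2))
    return diff - t_star * se, diff + t_star * se

# Hypothetical samples: n1 = 4, n2 = 5, so the conservative df is 3.
group1 = [3.1, 2.8, 3.5, 3.0]
group2 = [2.5, 2.9, 2.4, 2.7, 2.6]
T_STAR_95_DF3 = 3.182   # 95% confidence, df = 3, from a t-table
low, high = two_sample_ci(group1, group2, T_STAR_95_DF3)
print(f"95% CI for mu1 - mu2: ({low:.3f}, {high:.3f})")
```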
40Comparing Two Means
- Example Commercials
- There is some concern that TV commercial breaks
are becoming longer. The observations on the
following slide are obtained on the length in
minutes of commercial breaks for the 1984 viewing
season and the current season. (from Milton,
McTeer, and Corbet, Introduction to Statistics,
1997)
- Find a 95% confidence interval for the difference
between the true averages of the two seasons.
41Comparing Two Means
- Example Commercials
- Information given
Sample sizes n1 = 16, n2 = 16.
42Comparing Two Means
This is a 95% confidence interval for the true
difference of average length in minutes for
commercials between 1984 and the present.
43Pooled t test Comparing Two Means
- The null hypothesis can be any of the following
- The alternative hypothesis can be any of the
following (depending on the question being
asked)
44Pooled Estimator
- Previously, we discussed two-sample t procedures
from two populations with two unknown standard
deviations. We then used the sample standard
deviations to estimate the population standard
deviations. But what about when the two
populations have the same standard deviation σ?
In that case both sample variances estimate the same σ², and we combine them:
$s_p^2 = \dfrac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$
This estimate is called the pooled estimator of
σ² because it combines the information in both
samples.
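A minimal sketch of the pooled estimator, assuming the usual weighting of each sample variance by its degrees of freedom. The data are hypothetical, and `pooled_variance` is a helper name introduced here.

```python
import statistics

def pooled_variance(x1, x2):
    """Weighted average of the two sample variances, weighted by n1 - 1 and n2 - 1."""
    n1, n2 = len(x1), len(x2)
    return (((n1 - 1) * statistics.variance(x1)
             + (n2 - 1) * statistics.variance(x2))
            / (n1 + n2 - 2))

# Hypothetical samples, not from the slides.
a = [1, 2, 3]
b = [2, 4, 6, 8]
print(pooled_variance(a, b))
```

Because it is a weighted average, the pooled value always falls between the two individual sample variances, closer to the one from the larger sample.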
45Test Statistic
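- Suppose that an SRS of size n1 is drawn from a
normal population with unknown mean μ1 and that
an independent SRS of size n2 is drawn from
another normal population with unknown mean μ2.
Suppose also that the two populations have the
SAME standard deviation. Then the pooled two-sample t
statistic is
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$
where s_p is the square root of the pooled estimator of σ².
- With degrees of freedom equal to n1 + n2 - 2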
46Confidence Interval
- A level C confidence interval for μ1 - μ2 is
$(\bar{x}_1 - \bar{x}_2) \pm t^* s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$
- where t* comes from Table D and corresponds to
the confidence level desired and df = n1 + n2 - 2
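The pooled test statistic and interval can be sketched together. The data are hypothetical; the critical value 2.571 is the standard t-table entry for 95% confidence with df = 5, matching n1 + n2 - 2 for the made-up samples. `pooled_t` and `pooled_ci` are helper names introduced here.

```python
import math
import statistics

def pooled_t(x1, x2):
    """Return (t, df, standard error) for H0: mu1 = mu2, assuming equal sigmas."""
    n1, n2 = len(x1), len(x2)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * statistics.variance(x1)
           + (n2 - 1) * statistics.variance(x2)) / df   # pooled variance
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    t = (statistics.mean(x1) - statistics.mean(x2)) / se
    return t, df, se

def pooled_ci(x1, x2, t_star):
    """Interval (xbar1 - xbar2) +/- t_star * s_p * sqrt(1/n1 + 1/n2)."""
    _, _, se = pooled_t(x1, x2)
    diff = statistics.mean(x1) - statistics.mean(x2)
    return diff - t_star * se, diff + t_star * se

# Hypothetical samples: n1 = 3, n2 = 4, so df = 5.
a = [1, 2, 3]
b = [2, 4, 6, 8]
t, df, se = pooled_t(a, b)
low, high = pooled_ci(a, b, t_star=2.571)   # 95% confidence, df = 5
print(f"t = {t:.3f}, df = {df}, 95% CI: ({low:.3f}, {high:.3f})")
```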
47Comparing Two Means: Pooled t Procedures
- Example Tomatoes
- Information given
Sample sizes n1 = 9, n2 = 10.
48Comparing Two Means: Pooled t Procedures
- Example Tomatoes
- 1. State the null hypothesis: H0: μ1 = μ2
- 2. State the alternative hypothesis: Ha: μ1 ≠ μ2 (from "any difference between")
- 3. State the level of significance: α = 0.10
49Comparing Two Means: Pooled t Procedures
- Example Tomatoes
- 4. Calculate the test statistic,
with df = n1 + n2 - 2 = 9 + 10 - 2 = 17.
- 5. Find the P-value.
50Comparing Two Means: Pooled t Procedures
- Example Tomatoes
- 6. Do we reject or fail to reject H0 based on the
P-value?
- 7. State the conclusion.
The P-value (between 0.10 and 0.20) is greater than α =
0.10.
Therefore, we fail to reject H0.
There is not significant statistical evidence
that the average tomato plant heights are
different for the two types of weed inhibitors at
the 0.10 level of significance.
51Comparing Two Means: Pooled t Procedures
- Example Tomatoes
- Compute a 99% confidence interval for the
difference between the true means, given that the
standard deviations are equal.