Statistics 303 - PowerPoint PPT Presentation

Provided by: erich2
1
Statistics 303
  • Chapter 7
  • Inference for Means

2
Inference for Means
  • To this point, when examining the mean of a
    population we have always assumed that the
    population standard deviation (σ) was known.
  • In practice this is seldom the case.
  • We usually must estimate the population standard
    deviation σ with the sample standard deviation s
    (for a review of s, see pp. 49-50 of the book).
  • When we do this, the standardized sample mean no
    longer follows the standard normal distribution,
    because of the extra variability introduced by
    estimating σ with s.
  • Thus, instead of using Z, the standard normal
    distribution, we must use the appropriate
    t-distribution.

3
Inference for Means
  • The t-distribution
  • Although there is only one Z-distribution, there
    are many, many t-distributions.
  • In fact, there is a different t-distribution for
    each sample size used.
  • The shape of each t-distribution is very similar
    to the Z-distribution, but is slightly flatter,
    with more area in the tails.
  • The larger the sample size, the closer the
    t-distribution is to the Z-distribution.
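
As a rough numerical illustration of the last two points, the height of each curve at its center can be compared. This is only a sketch using the Python standard library; the helper name `t_pdf` and the chosen degrees of freedom are illustrative and do not come from the slides.

```python
import math
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


z_peak = NormalDist().pdf(0.0)  # height of the Z curve at its center
t_peaks = {df: t_pdf(0.0, df) for df in (2, 5, 30, 1000)}

for df, peak in t_peaks.items():
    # The t curve is lower at the center (flatter) and rises toward Z as df grows.
    print(f"df={df:>4}: t peak = {peak:.4f}, Z peak = {z_peak:.4f}")
```

A lower peak means more of the probability sits in the tails, which is why t critical values are larger than the corresponding z values for small samples.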

4
Inference for Means
  • The t-distribution
  • The way we distinguish between various
    t-distributions is by finding the degrees of
    freedom (df) that correspond to the sample size.
  • When we are looking at only one sample, the
    degrees of freedom are the sample size minus one:
    df = n - 1.
  • We say that the one-sample t-statistic

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

has the t distribution with n - 1 degrees of
freedom.
5
Inference for Means
  • The t-distribution
  • A table of t distribution critical values can be
    found in Table D (the last page of the book).
  • Note that these values are areas to the right,
    not areas to the left as in the Z-table.
  • In Table D, the degrees of freedom are listed in
    the left column.
  • The probabilities are on top (these probabilities
    are inside for the Z-table)
  • The individual t-values are inside the table.
  • Make sure to get acquainted with this table and
    how it differs from the Z-table.

6
Inference for Means
  • The t-distribution
  • In the book, p.452, we see an example of how the
    distributions compare

7
Inference for Means
  • The t-distribution
  • With the change from σ to s, and the change from
    z to t, the steps in producing confidence
    intervals and hypothesis tests are the same as we
    have seen previously.
  • In Chapter 1, p. 50, we find that s is calculated
    from the data using the formula

$$s = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

This formula is cumbersome to evaluate by hand.
Ideally, a computer is used to calculate s,
particularly for large data sets.
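
As a small sketch of letting a computer do this work, the sample standard deviation can be computed both directly from its definition and with Python's standard library. The five data values below are hypothetical and chosen only for illustration.

```python
import math
import statistics

# Hypothetical data, for illustration only (not from the slides).
data = [4.0, 7.0, 6.0, 3.0, 5.0]

n = len(data)
x_bar = sum(data) / n
# Sum of squared deviations from the mean, divided by n - 1, then square-rooted.
s_by_hand = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))
s_library = statistics.stdev(data)  # uses the same n - 1 definition

print(s_by_hand, s_library)
```

Both values agree because `statistics.stdev` divides by n - 1, the same definition of s used here.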
8
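Confidence Interval for µ with Unknown σ
  • The formula for a confidence interval for µ with
    unknown σ is

$$\bar{x} \pm t^{*} \frac{s}{\sqrt{n}}$$

    where t* is the critical value from the
    t-distribution with n - 1 degrees of freedom.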
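
The interval is the sample mean plus or minus t* times s divided by the square root of n. A minimal sketch of that calculation follows, assuming the critical value t* has already been read from a t table. The function name and the summary statistics are hypothetical, not values from the slides.

```python
import math


def t_confidence_interval(x_bar, s, n, t_star):
    """Return (lower, upper) for x_bar +/- t_star * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical summary statistics; t_star = 2.064 is the 95% table value
# for n - 1 = 24 degrees of freedom.
lower, upper = t_confidence_interval(x_bar=50.0, s=10.0, n=25, t_star=2.064)
print(f"({lower:.3f}, {upper:.3f})")
```

Passing t* in as an argument keeps the sketch independent of any particular table or software package.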

9
Confidence Interval for µ with Unknown σ
  • Confidence Interval Example
  • An economist wants to determine the average
    amount a family of four in the United States
    spends on housing annually. He randomly selects
    85 families of size four and finds the amount
    they spent on housing the previous year.
  • The economist wishes to estimate the mean with
    99% confidence.

10
Confidence Interval for µ with Unknown σ
  • Confidence Interval Example
  • Information given

Sample size: n = 85.
Data: 6,789, 8,233, 4,784, ..., 5,974 (85
numbers)
df = n - 1 = 85 - 1 = 84
11
Confidence Interval for µ with Unknown σ
  • Confidence Interval Example

This is a 99% confidence interval for the true
average amount a family of four in the United
States spends on housing annually.
12
Hypothesis Test for µ with Unknown σ
  • The steps for a hypothesis test are the same as
    those seen previously, namely,
  • 1. State the null hypothesis.
  • 2. State the alternative hypothesis.
  • 3. State the level of significance (e.g., α =
    0.05).
  • 4. Calculate the test statistic (note the change
    from z to t):

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

13
Hypothesis Test for µ with Unknown σ
  • 5. Find the P-value
  • For a two-sided test
  • Ha: µ ≠ µ0, P-value = 2P(T ≥ |t|)
  • For a one-sided test
  • Ha: µ > µ0, P-value = P(T ≥ t)
  • For a one-sided test
  • Ha: µ < µ0, P-value = P(T ≤ t)

Because of the limited number of t-values given
in Table D, it is more common to find a range for
the P-value, rather than the exact value (as will
be seen in the example). Computers can be used
to obtain exact values.
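
One way a computer can produce such a P-value is by numerically integrating the t density. The sketch below uses only the Python standard library and Simpson's rule. The function names and the `alternative` labels are assumptions made for this illustration, and statistical software would normally supply a ready-made routine instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def right_tail(t, df, steps=2000):
    """P(T >= t), using Simpson's rule on [0, |t|] and symmetry about 0."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_a = total * h / 3
    upper = 0.5 - area_0_to_a  # P(T >= |t|)
    return upper if t >= 0 else 1.0 - upper


def p_value(t, df, alternative):
    """alternative is 'greater', 'less', or 'two-sided'."""
    if alternative == "greater":
        return right_tail(t, df)
    if alternative == "less":
        return 1.0 - right_tail(t, df)
    return 2.0 * right_tail(abs(t), df)


print(round(p_value(2.042, 30, "two-sided"), 4))
```

With df = 30, the table value t* = 2.042 leaves 0.025 in each tail, so the printed two-sided P-value should come out very close to 0.05.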
14
Hypothesis Test for µ with Unknown σ
  • 6. Reject or fail to reject H0 based on the
    P-value.
  • If the P-value is less than or equal to α, reject
    H0.
  • If the P-value is greater than α, fail to reject
    H0.
  • 7. State your conclusion.
  • If H0 is rejected: "There is significant
    statistical evidence that the population mean is
    different from µ0."
  • If H0 is not rejected: "There is not significant
    statistical evidence that the population mean is
    different from µ0."

Notice that these last two steps are exactly the
same as for the case where σ is known.
15
Hypothesis Test for µ with Unknown σ
  • T.V. Example
  • Suppose that the data collected from our class
    survey is a random sample from the entire
    university (which it obviously is not). We wish
    to see if there is evidence that the average
    amount of television watched for students here is
    more than 7 hours per week.

16
Hypothesis Test for µ with Unknown σ
  • T.V. Example
  • Information given

Sample size: n = 38.
17
Hypothesis Test for µ with Unknown σ
  • T.V. Example
  • 1. State the null hypothesis: H0: µ = 7
  • 2. State the alternative hypothesis: Ha: µ > 7
    (from the phrase "is more than")
  • 3. State the level of significance: assume α =
    0.05
18
Hypothesis Test for µ with Unknown σ
  • T.V. Example
  • 4. Calculate the test statistic.
  • 5. Find the P-value.

Remember the table gives probabilities to the
right, so we do not use the technique of
subtracting from 1.
Use df = 30 (rounding down from n - 1 = 37 to the
nearest df listed in Table D)
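
The rounding-down rule can be sketched as a small lookup. The list of degrees of freedom below is an assumption about which rows the printed table contains (1 through 30, then 40, 50, 60, 80, 100, 1000) and should be adjusted to match the actual Table D.

```python
import bisect

# Degrees of freedom assumed to have rows in the printed table;
# this list is an assumption and should be checked against the real table.
TABLE_DF = list(range(1, 31)) + [40, 50, 60, 80, 100, 1000]


def table_df(df):
    """Largest tabled df that does not exceed the actual df (round down)."""
    if df < 1:
        raise ValueError("df must be at least 1")
    return TABLE_DF[bisect.bisect_right(TABLE_DF, df) - 1]


print(table_df(38 - 1))  # n = 38 in the T.V. example
```

Rounding down rather than up is the conservative choice: a smaller df gives a larger critical value, so the reported P-value range is never too small.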
19
Hypothesis Test for µ with Unknown σ
  • T.V. Example
  • 6. Do we reject or fail to reject H0 based on the
    P-value?
  • 7. State the conclusion.

The P-value, between 0.15 and 0.20, is greater
than α = 0.05.
Therefore, we fail to reject H0
There is not significant statistical evidence
that the average amount of television watched is
more than 7 hours per week at the 0.05 level of
significance.
20
Matched Pairs t-test
  • To this point we have only looked at tests for
    single samples.
  • Soon we will look at confidence intervals and
    hypothesis tests for comparing two groups.
  • When each individual can be given both
    treatments, we can reduce the two samples to a
    single sample using a matched pairs design.
  • Examples
  • Students are each given a pre-test and a
    post-test to determine the amount of material
    learned in a given time interval.
  • To examine the effect of a new drug, a large
    group of identical twins is identified. One twin
    is given a treatment and the other a placebo.
  • An ophthalmologist is examining the importance of
    the dominant eye in reading. A large group of
    subjects is asked to read a passage with the
    dominant eye covered and again with the
    non-dominant eye covered.
  • It can be seen in each of these examples that
    something pairs the two responses.

21
Matched Pairs t-test
  • To analyze matched pairs data, we first reduce
    the data from two samples to one sample and then
    analyze the data using one-sample techniques.
  • The data is reduced from two samples to one by
    subtracting one of the responses from the other.
  • We could subtract each pre-test score from each
    post-test score.
  • We could subtract each placebo response from each
    treatment response.
  • We could subtract the time taken to read the
    passage with the non-dominant eye from the time
    taken to read the passage with the dominant eye.
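
The reduction from two samples to one can be sketched as follows for the pre-test/post-test case. The scores are hypothetical and serve only to show the mechanics; the one-sample t statistic is then computed on the differences.

```python
import math
import statistics

# Hypothetical pre-test and post-test scores for five students.
pre = [70, 65, 80, 75, 60]
post = [74, 68, 79, 81, 66]

# Reduce two samples to one: post-test minus pre-test for each student.
diffs = [b - a for a, b in zip(pre, post)]

n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)

# One-sample t statistic for H0: mean difference = 0, with n - 1 df.
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))
print(diffs, round(t_stat, 3), n - 1)
```

Because each difference comes from one student, the degrees of freedom are based on the number of pairs, not the total number of scores.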

22
Matched Pairs t-test
  • Example Keyboards
  • Suppose we want to compare two brands of
    computer keyboards, which we will denote as
    keyboard 1 and keyboard 2. Keyboard 1 is a
    standard keyboard, while keyboard 2 is specially
    designed so that the keys need very little
    pressure to make them respond. The manufacturer
    of keyboard 2 would like to claim that typing can
    be done faster using keyboard 2. A simple random
    sample of n = 30 teachers was selected from a
    population of high-school teachers attending a
    national conference. Each teacher typed the same
    page of text once using keyboard 1 and once using
    keyboard 2. For each teacher the order in which
    the keyboards were used was determined by the
    toss of a coin. For each teacher the variable
    measured was the time (in seconds) to correctly
    type the page of text (from Graybill, Iyer and
    Burdick, Applied Statistics, 1998).

23
Matched Pairs t-test
Reduction to one sample
  • Example Keyboards
  • Information given

Sample size: n = 30.
24
Matched Pairs t-test
  • Example Keyboards
  • 1. State the null hypothesis
  • 2. State the alternative hypothesis
  • 3. State the level of significance

(The direction of the alternative hypothesis comes
from carefully reading the problem.)
Assume α = 0.05
25
Matched Pairs t-test
  • Example Keyboards
  • 4. Calculate the test statistic.
  • 5. Find the P-value.

Remember the table gives probabilities to the
right.
26
Matched Pairs t-test
  • Example Keyboards
  • 6. Do we reject or fail to reject H0 based on the
    P-value?
  • 7. State the conclusion.

The P-value, between 0.01 and 0.02, is less than
α = 0.05.
Therefore, we reject H0
There is significant statistical evidence that
the average amount of time needed to type the
passage is lower for keyboard 2 than keyboard 1
at the 0.05 level of significance.
27
Matched Pairs Confidence Interval
  • After reducing the data to a single sample, we
    use the same formula as for a confidence interval
    for µ with unknown σ, namely,

$$\bar{x} \pm t^{*} \frac{s}{\sqrt{n}}$$

using the mean and standard deviation of the
differences.
28
Matched Pairs Confidence Interval
  • Example Golf Balls
  • In the manufacture of golf balls two procedures
    are used. Method I utilizes a liquid center and
    method II, a solid center. To compare the
    distance obtained using both types of balls, 12
    golfers are allowed to drive a ball of each type,
    and the length of the drive (in yards) is
    measured. (from Milton, McTeer, and Corbet,
    Introduction to Statistics, 1997)
  • The manufacturer wants to estimate the mean
    difference with 90% confidence.

29
Matched Pairs Confidence Interval
  • Example Golf Balls
  • Information given

Sample size: n = 12.
df = n - 1 = 12 - 1 = 11
30
Matched Pairs Confidence Interval
  • Example Golf Balls

This is a 90% confidence interval for the true
average difference for the distance traveled for
the two types of golf balls.
31
Comparing Two Means
  • We use the same basic principles for comparing
    two population means as those used for examining
    one population mean.
  • If the standard deviations σ1 and σ2 for each of
    the two populations are known, the two-sample
    z-statistic is then

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

But it is very rare that both population standard
deviations are known. We will examine the
situation in which they are not known.
32
Comparing Two Means
  • When we are interested in comparing two
    population means and we are estimating the
    population standard deviations σ1 and σ2 with s1
    and s2, the two-sample t-statistic is then

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

with degrees of freedom equal to the smaller of
n1-1 and n2-1 (or an appropriate estimate using
computer software).
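
A minimal sketch of this statistic for testing H0: µ1 = µ2 is given below, together with both choices of degrees of freedom. The software estimate is assumed here to be the Welch-Satterthwaite approximation, which the slides do not name. The function name and the summary statistics are hypothetical.

```python
import math


def two_sample_t(x1_bar, s1, n1, x2_bar, s2, n2):
    """Two-sample t statistic for H0: mu1 = mu2, with two choices of df."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    t_stat = (x1_bar - x2_bar) / math.sqrt(v1 + v2)
    df_conservative = min(n1 - 1, n2 - 1)
    # Welch-Satterthwaite approximation (assumed form of the software estimate).
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df_conservative, df_welch


# Hypothetical summary statistics, for illustration only.
t_stat, df_small, df_welch = two_sample_t(10.0, 2.0, 9, 8.0, 3.0, 10)
print(round(t_stat, 3), df_small, round(df_welch, 2))
```

The approximate df always falls between the smaller of n1-1 and n2-1 and n1 + n2 - 2, so using the smaller value errs on the side of a larger P-value.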
33
Comparing Two Means
  • The null hypothesis can be any of the following
  • The alternative hypothesis can be any of the
    following (depending on the question being
    asked)

The other steps are the same as those used for
the tests we have looked at previously.
34
Comparing Two Means
  • Example Tomatoes
  • There has been some discussion among amateur
    gardeners about the virtues of black plastic
    versus newspapers as weed inhibitors for growing
    tomatoes. To compare the two, several rows of
    tomatoes are planted. Black plastic is used
    around nine randomly selected plants and
    newspaper around the remaining ten. All plants
    start at virtually the same height and receive
    the same care. The response of interest is the
    height in feet after a month's growth. (from
    Milton, McTeer, and Corbet, Introduction to
    Statistics, 1997).
  • Perform a test to see if there is any difference
    between the average heights with significance
    level 0.10.

35
Comparing Two Means
  • Example Tomatoes
  • Information given

Sample sizes: n1 = 9, n2 = 10.
36
Comparing Two Means
  • Example Tomatoes
  • 1. State the null hypothesis: H0: µ1 = µ2
  • 2. State the alternative hypothesis: Ha: µ1 ≠ µ2
    (from the phrase "any difference between")
  • 3. State the level of significance: α = 0.10
37
Comparing Two Means
  • Example Tomatoes
  • 4. Calculate the test statistic.
  • 5. Find the P-value.

38
Comparing Two Means
  • Example Tomatoes
  • 6. Do we reject or fail to reject H0 based on the
    P-value?
  • 7. State the conclusion.

The P-value, between 0.10 and 0.20, is greater
than α = 0.10.
Therefore, we fail to reject H0
There is not significant statistical evidence
that the average tomato plant heights are
different for the two types of weed inhibitors at
the 0.10 level of significance.
39
Comparing Two Means
  • The confidence interval for the difference of two
    population means (µ1 - µ2) is

$$(\bar{x}_1 - \bar{x}_2) \pm t^{*} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

where t* comes from Table D and corresponds to
the confidence level desired and df = smaller of
n1-1 and n2-1.
40
Comparing Two Means
  • Example Commercials
  • There is some concern that TV commercial breaks
    are becoming longer. The observations on the
    following slide are obtained on the length in
    minutes of commercial breaks for the 1984 viewing
    season and the current season. (from Milton,
    McTeer, and Corbet, Introduction to Statistics,
    1997)
  • Find a 95% confidence interval for the difference
    between the true averages of the two seasons.

41
Comparing Two Means
  • Example Commercials
  • Information given

Sample sizes: n1 = 16, n2 = 16.
42
Comparing Two Means
  • Example Commercials

This is a 95% confidence interval for the true
difference of average length in minutes for
commercials between 1984 and the present.
43
Pooled t Test: Comparing Two Means
  • The null hypothesis can be any of the following
  • The alternative hypothesis can be any of the
    following (depending on the question being
    asked)

44
Pooled Estimator
  • Previously, we discussed two-sample t procedures
    for two populations with two unknown standard
    deviations. We then used the sample standard
    deviations to estimate the population standard
    deviations. But what about when the two
    populations have the same standard deviation?
    In that case, both sample variances estimate the
    common variance σ², and we combine them:

$$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$

    This estimate is called the pooled estimator of
    σ² because it combines the information in both
    samples.

45
Test Statistic
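  • Suppose that an SRS of size n1 is drawn from a
    normal population with unknown mean µ1 and that
    an independent SRS of size n2 is drawn from
    another normal population with unknown mean µ2.
    Suppose also that the two populations have the
    SAME standard deviation. Then the two-sample t
    statistic is

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

  • with degrees of freedom equal to n1 + n2 - 2,
    where s_p is the square root of the pooled
    estimator of the variance.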

46
Confidence Interval
  • A level C confidence interval for µ1 - µ2 is

$$(\bar{x}_1 - \bar{x}_2) \pm t^{*} s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

  • where t* comes from Table D and corresponds to
    the confidence level desired and df = n1 + n2 - 2
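
The pooled procedures can be sketched as two small functions: one for the pooled standard deviation and test statistic, and one for the interval. The function names and the summary statistics are hypothetical, and t* is assumed to have been read from a t table.

```python
import math


def pooled_t(x1_bar, s1, n1, x2_bar, s2, n2):
    """Pooled standard deviation, t statistic for H0: mu1 = mu2, and df."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    return sp, (x1_bar - x2_bar) / se, df


def pooled_interval(x1_bar, s1, n1, x2_bar, s2, n2, t_star):
    """(x1_bar - x2_bar) +/- t_star * sp * sqrt(1/n1 + 1/n2)."""
    sp, _, _ = pooled_t(x1_bar, s1, n1, x2_bar, s2, n2)
    margin = t_star * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = x1_bar - x2_bar
    return diff - margin, diff + margin


# Hypothetical summary statistics; t_star = 2.110 is the 95% table value
# for df = 17.
sp, t_stat, df = pooled_t(10.0, 2.0, 9, 8.0, 3.0, 10)
lower, upper = pooled_interval(10.0, 2.0, 9, 8.0, 3.0, 10, t_star=2.110)
print(round(sp, 3), round(t_stat, 3), df, (round(lower, 3), round(upper, 3)))
```

The pooled approach gains degrees of freedom (n1 + n2 - 2 instead of the smaller of n1-1 and n2-1), but only under the assumption that the two population standard deviations really are equal.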

47
Comparing Two Means: Pooled t Procedures
  • Example Tomatoes
  • Information given

Sample sizes: n1 = 9, n2 = 10.
48
Comparing Two Means: Pooled t Procedures
  • Example Tomatoes
  • 1. State the null hypothesis: H0: µ1 = µ2
  • 2. State the alternative hypothesis: Ha: µ1 ≠ µ2
    (from the phrase "any difference between")
  • 3. State the level of significance: α = 0.10
49
Comparing Two Means: Pooled t Procedures
  • Example Tomatoes
  • 4. Calculate the test statistic.

  • with df = n1 + n2 - 2 = 9 + 10 - 2 = 17
  • 5. Find the P-value.

50
Comparing Two Means: Pooled t Procedures
  • Example Tomatoes
  • 6. Do we reject or fail to reject H0 based on the
    P-value?
  • 7. State the conclusion.

The P-value (between 0.1 and 0.2) is greater than
α = 0.10.
Therefore, we fail to reject H0
There is not significant statistical evidence
that the average tomato plant heights are
different for the two types of weed inhibitors at
the 0.10 level of significance.
51
Comparing Two Means: Pooled t Procedures
  • Example Tomatoes
  • Compute a 99% confidence interval for the
    difference between the true means, given the
    standard deviations are equal.