Comparing more than two means: Analysis of Variance

Provided by: dyanaha

Transcript and Presenter's Notes

1
Comparing more than two means (Analysis of Variance)
  • Chapter 22

2
Example
  • Using data from an alumni survey, the starting
    salaries (in thousands of dollars) and the area
    of specialization of recent graduates of
    Tulipville University are studied.
  • Areas of specialty include computer science,
    business, liberal arts, and education.
  • Does it appear that starting salary varies
    according to area of specialty?

3
Data

4
Side-by-Side Box Plots

5
Sample Size
  • There are between three and seven data points in
    each box plot.
  • Suppose instead there are 100 data points in each
    box plot. In this case, would the observed
    variation be more or less significant?

6
Comparing variability.
  • The box plots on the right are box plots of our
    data. The box plots on the left are box plots for
    another university.
  • Assuming the sample sizes are the same, in which
    graph are the differences between the means more
    significant?

7
Comparing means.
  • To determine whether differences in means are
    significant, one must:
  • Consider what the sample sizes are.
  • Consider what the variation is between data in
    the same sample (underlying variation) and
    compare this to the variation between sample
    means.
  • Analysis of variance (ANOVA) is a procedure that
    does this. The main idea is to compare the
    underlying variation to the variation between
    sample means, taking sample size into account.

8
Hypotheses
  • Analysis of variance is an overall test. It
    tests the following hypotheses:
  • Null Hypothesis: all of the population means are
    equal.
  • Alternate Hypothesis: not all of the population
    means are equal.
  • If we reject the null, we will do follow-up
    analysis.

9
Notation
  • N: the total number of observations.
  • I: the number of groups.
  • Sample means.
  • For each group.
  • Overall.
  • Sample standard deviations.
  • For each group.

10
Measuring variation.
  • MSE measures the variation between data that come
    from the same sample.
  • MSG measures the variation between sample means.
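
As a rough sketch of how these two quantities can be computed, the Python snippet below rebuilds MSG and MSE from the group sizes, means, and standard deviations reported in the Minitab output for this example. The formulas it uses are the standard one-way ANOVA definitions, MSG = sum of n_i (mean_i - overall mean)^2 divided by I - 1, and MSE = sum of (n_i - 1) s_i^2 divided by N - I. They are assumed here, since the slide does not spell them out, and the variable names are illustrative.

```python
# Summary statistics copied from the Minitab output: (n, mean, standard deviation).
groups = {
    "Computer": (6, 33.633, 3.583),
    "Business": (7, 34.586, 2.827),
    "Liberal":  (3, 23.567, 3.550),
    "Early Ch": (4, 23.625, 2.514),
}

N = sum(n for n, _, _ in groups.values())   # total number of observations
I = len(groups)                             # number of groups

# Overall mean: the sample-size-weighted average of the group means.
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / N

# MSG: variation between sample means (assumed standard definition).
ssg = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
msg = ssg / (I - 1)

# MSE: variation within samples, pooling the group variances (assumed standard definition).
sse = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())
mse = sse / (N - I)

print(f"MSG = {msg:.2f}, MSE = {mse:.2f}")
```

Because the inputs are rounded summary statistics, the results should agree with the MS column of the Minitab output only up to rounding.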

11
The test statistic
  • The two types of variation are compared by taking
    their ratio, F = MSG/MSE.
  • What is the F statistic for our example?
  • Sketch of the F distribution. The larger F is, the
    more significant the variation between sample
    means is. (The P-value is the area to the right
    of the observed F.)
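
To make "area to the right" concrete, the sketch below estimates that area by simulation rather than from a table. This is an illustration, not the procedure used in these slides. It repeatedly generates data for which the null hypothesis is true (all four groups drawn from one normal population), computes F each time, and counts how often the simulated F is at least as large as the F = 17.37 reported in the Minitab output. The helper f_statistic, the seed, and the number of simulations are assumptions made for this example.

```python
import random

def f_statistic(samples):
    """One-way ANOVA F = MSG / MSE for a list of samples (lists of numbers)."""
    n_total = sum(len(s) for s in samples)
    k = len(samples)
    means = [sum(s) / len(s) for s in samples]
    grand_mean = sum(sum(s) for s in samples) / n_total
    msg = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means)) / (k - 1)
    mse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s) / (n_total - k)
    return msg / mse

random.seed(0)                 # fixed seed so the run is repeatable
group_sizes = [6, 7, 3, 4]     # sample sizes from the Minitab output
observed_f = 17.37             # F value reported in the Minitab output
n_sims = 20000                 # number of simulated data sets (arbitrary choice)

# Simulate data for which the null hypothesis is true: every group comes
# from the same normal population, so any spread in the means is chance.
exceed = 0
for _ in range(n_sims):
    samples = [[random.gauss(0.0, 1.0) for _ in range(n)] for n in group_sizes]
    if f_statistic(samples) >= observed_f:
        exceed += 1

p_value_estimate = exceed / n_sims   # estimated area to the right of the observed F
print(p_value_estimate)
```

An estimate at or near zero says that an F this large almost never arises by chance when the population means are equal, which is consistent with the P value of 0.000 in the Minitab output.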

12
Degrees of freedom
  • The F statistic has two degrees of freedom.
  • Numerator degrees of freedom are I-1.
  • Denominator degrees of freedom are N-I.
  • What are the degrees of freedom in our example?

13
P-value / Conclusion.
  • Use Table D to estimate the P-value.
  • What is the conclusion?

14
Minitab output
Analysis of Variance
Source   DF      SS      MS      F      P
Factor    3  508.99  169.66  17.37  0.000
Error    16  156.32    9.77
Total    19  665.31

Level      N    Mean   StDev
Computer   6  33.633   3.583
Business   7  34.586   2.827
Liberal    3  23.567   3.550
Early Ch   4  23.625   2.514

Pooled StDev = 3.126

(Character plot: Individual 95% CIs For Mean, Based on Pooled StDev, one interval per level; axis ticks at 20.0, 25.0, 30.0, 35.0.)

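
The pieces of the ANOVA table fit together arithmetically. As a quick check, the sketch below recomputes the degrees of freedom, the mean squares, F, and the pooled standard deviation from the SS column and the group sizes in the output above. It assumes the usual relations MS = SS/DF and pooled StDev = square root of MSE; the variable names are illustrative.

```python
import math

# Values copied from the Minitab output.
ss_factor, ss_error = 508.99, 156.32
n_total, n_groups = 20, 4          # N = 6 + 7 + 3 + 4 observations, I = 4 groups

df_factor = n_groups - 1           # numerator degrees of freedom, I - 1
df_error = n_total - n_groups      # denominator degrees of freedom, N - I
df_total = n_total - 1

ms_factor = ss_factor / df_factor  # MSG
ms_error = ss_error / df_error     # MSE
f_stat = ms_factor / ms_error      # F = MSG / MSE
pooled_sd = math.sqrt(ms_error)    # pooled standard deviation

print(df_factor, df_error, df_total)
print(round(ms_factor, 2), round(ms_error, 2), round(f_stat, 2), round(pooled_sd, 3))
```

The printed values should match the DF, MS, F, and Pooled StDev entries in the output, up to rounding.
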
15
Follow-up analysis.
  • Using the original box plots and/or the
    confidence intervals in the Minitab output, what
    do the departures from the null hypothesis appear
    to be (which means appear to differ)?

16
Assumptions for ANOVA
  • A simple random sample (SRS) is taken from each
    population. These SRSs are independent (as in the
    two-sample t test).
  • Each population has a normal distribution. The
    test is robust to departures from this assumption,
    since what really matters is that the sampling
    distribution of the means be normal. If there are
    no outliers and the distributions are roughly
    symmetric, you can safely use ANOVA with sample
    sizes as small as 4 or 5.
  • All of the populations have the same standard
    deviation σ, whose value is unknown. Check
    reasonableness by showing that the largest sample
    standard deviation is no more than two times the
    smallest sample standard deviation.
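
For this example, the equal standard deviation check can be sketched in a few lines of Python, using the sample standard deviations from the Minitab output. The dictionary and variable names are illustrative.

```python
# Sample standard deviations copied from the Minitab output.
sample_sds = {"Computer": 3.583, "Business": 2.827, "Liberal": 3.550, "Early Ch": 2.514}

largest = max(sample_sds.values())
smallest = min(sample_sds.values())
ratio = largest / smallest

# Rule of thumb from the slide: largest SD no more than twice the smallest.
equal_sd_reasonable = ratio <= 2
print(f"ratio = {ratio:.2f}, reasonable: {equal_sd_reasonable}")
```

The ratio is well under 2, so the equal standard deviation assumption looks reasonable for these data.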

17
Summary
  • ANOVA is an overall test used to test the null
    hypothesis that all the population means are
    equal against the alternative that not all of the
    means are equal.
  • The idea is to compare variation between means
    (MSG) to variation of data within the same sample
    (MSE). The test statistic is the ratio
    F = MSG/MSE. (You will not need to compute it by
    hand, but you should be able to interpret the
    various pieces of Minitab output.)
  • The larger the F test statistic, the smaller the
    P-value. If the P-value is small and the null
    hypothesis is rejected, follow-up analysis should
    be done.
  • One should always check the assumptions of a
    statistical test.