Title: One-Way Analysis of Variance
Recapitulation

1. Comparing differences among three or more subsamples requires a different statistical test than either z-tests or t-tests.
2. The solution is to perform an analysis of variance (ANOVA).
3. ANOVA involves the comparison of two estimates for the population variance.
4. One variance estimate captures only the random differences among sampled units; the other, these random differences plus the effects of being in the different subsamples.
5. The ratio between the two estimated variances is evaluated using the F-statistic sampling distributions.
Recapitulation (continued)

6. ANOVA is based on the general linear model.
7. The general linear model is $Y_{ij} = \mu + \beta_j X_{ij} + \varepsilon_{ij}$, where $X_{ij}$ is the subgroup difference and $\beta_j$ is a constant estimating its effect on $Y_{ij}$.
8. When subgroup differences do not exist, $\beta_j = 0.0$.
9. The null hypothesis is $H_0$: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_j$.
As an example, consider an experiment on worker productivity in an introductory psychology class. Thirty students were randomly selected for the experiment from PSYCH 100 and randomly assigned to one of three subgroups. The productivity measure (Yij) was the number of puzzles that these students solved in a fixed period of time. The three experimental conditions (treatments, Xij) were: being left alone to solve puzzles; solving puzzles in the presence of the other nine group members (so that each subject could observe her or his own rate of puzzle solving); and solving puzzles in the presence of other subjects AND in the presence of a monitor, meant to simulate a supervisor. The results look like this:
            Not Monitored,   Not Monitored,
            Alone            Together         Monitored
Subject     Yi,1             Yi,2             Yi,3
1           13               9                8
2           14               11               6
3           10               10               9
4           11               8                7
5           12               10               8
6           10               12               10
7           12               11               8
8           12               10               9
9           13               9                6
10          11               10               11

N           10               10               10
Sum         118              100              82
Mean        11.8             10.0             8.2

Grand mean = 10.0
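The subgroup sizes, sums, and means in the table can be reproduced with a few lines of Python. This is only an illustrative sketch (the slides themselves use SAS); the dictionary keys and variable names are assumptions chosen for readability, and the scores are transcribed from the table.

```python
# Puzzle counts transcribed from the data table (ten subjects per condition).
scores = {
    "alone":     [13, 14, 10, 11, 12, 10, 12, 12, 13, 11],
    "together":  [9, 11, 10, 8, 10, 12, 11, 10, 9, 10],
    "monitored": [8, 6, 9, 7, 8, 10, 8, 9, 6, 11],
}

# Subgroup size, sum, and mean for each condition.
summary = {
    name: {"n": len(ys), "sum": sum(ys), "mean": sum(ys) / len(ys)}
    for name, ys in scores.items()
}

# Grand mean over all 30 scores.
all_scores = [y for ys in scores.values() for y in ys]
grand_mean = sum(all_scores) / len(all_scores)

for name, s in summary.items():
    print(f"{name:10s} n={s['n']} sum={s['sum']} mean={s['mean']:.1f}")
print(f"grand mean = {grand_mean:.1f}")
```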
Our hypothesis (H1) is that working conditions affect worker performance in ways that we do not fully understand:

$H_1$: $\mu_1 \neq \mu_2 \neq \mu_3$ (more precisely, not all three subgroup means are equal)

Our null hypothesis (H0) is that worker performance is unaffected by working conditions:

$H_0$: $\mu_1 = \mu_2 = \mu_3$

Since a comparison of THREE subgroup means is required, t-tests are inappropriate. The approach known generically as the analysis of variance must be used.
First we calculate the total sum of squares. We begin with the first score in the first group and continue through the 30th score in the third group, subtracting the grand mean (10.0) from each score and squaring the difference, as follows:

$$SS_{Total} = (13 - 10.0)^2 + (14 - 10.0)^2 + \dots + (11 - 10.0)^2 + (9 - 10.0)^2 + \dots + (10 - 10.0)^2 + (8 - 10.0)^2 + \dots + (11 - 10.0)^2 = 116$$
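As a check on the arithmetic, the total sum of squares can be computed directly from the 30 scores. The following is a minimal Python sketch, not part of the original slides; the list names are illustrative.

```python
# Scores for the three conditions, in the order used in the slides.
alone     = [13, 14, 10, 11, 12, 10, 12, 12, 13, 11]
together  = [9, 11, 10, 8, 10, 12, 11, 10, 9, 10]
monitored = [8, 6, 9, 7, 8, 10, 8, 9, 6, 11]

all_scores = alone + together + monitored
grand_mean = sum(all_scores) / len(all_scores)

# Sum of squared deviations of every score from the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_scores)
print(f"SS total = {ss_total:.1f}")
```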
Next we calculate the sum of squares between, as follows. For the first of the three subgroups, we find the difference between the group mean and the grand mean, square that difference, then multiply it by the size of the subgroup; then we do the same for the other two subgroups. Then we sum these three products, as follows:

$$SS_{Between} = 10(11.8 - 10.0)^2 + 10(10.0 - 10.0)^2 + 10(8.2 - 10.0)^2 = 64.8$$
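The same calculation can be sketched in Python from the subgroup means and sizes alone. The names below are illustrative assumptions, not taken from the slides.

```python
# Subgroup means and sizes from the data table.
group_means = {"alone": 11.8, "together": 10.0, "monitored": 8.2}
group_sizes = {"alone": 10, "together": 10, "monitored": 10}

# Grand mean as the size-weighted average of the subgroup means.
n_total = sum(group_sizes.values())
grand_mean = sum(group_sizes[g] * group_means[g] for g in group_means) / n_total

# Each subgroup contributes n_j * (mean_j - grand mean)^2.
ss_between = sum(
    group_sizes[g] * (group_means[g] - grand_mean) ** 2 for g in group_means
)
print(f"SS between = {ss_between:.1f}")
```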
Finally, we calculate the sum of squares within. This means that we find the squared difference between each of the ten scores in the first group and the mean for the first group, then the squared difference between the ten scores in the second group and the mean for the SECOND group, then the squared difference between the ten scores in the third group and the mean for the THIRD group, and finally add all 30 squared differences together:

$$SS_{Within} = (13 - 11.8)^2 + (14 - 11.8)^2 + \dots + (11 - 11.8)^2 + (9 - 10.0)^2 + \dots + (10 - 10.0)^2 + (8 - 8.2)^2 + \dots + (11 - 8.2)^2 = 51.2$$
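A short Python sketch of this step, computing each subgroup's squared deviations from its own mean and then adding them up. It is offered only as an illustration; the helper name `within_ss` is hypothetical.

```python
scores = {
    "alone":     [13, 14, 10, 11, 12, 10, 12, 12, 13, 11],
    "together":  [9, 11, 10, 8, 10, 12, 11, 10, 9, 10],
    "monitored": [8, 6, 9, 7, 8, 10, 8, 9, 6, 11],
}

def within_ss(ys):
    """Sum of squared deviations of one subgroup's scores from its own mean."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

# One within-group sum of squares per condition, then their total.
ss_by_group = {name: within_ss(ys) for name, ys in scores.items()}
ss_within = sum(ss_by_group.values())

for name, ss in ss_by_group.items():
    print(f"{name:10s} {ss:.1f}")
print(f"SS within = {ss_within:.1f}")
```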
To check our calculations, remember the identity:

$$SS_{Total} = SS_{Between} + SS_{Within}$$
$$116 = 64.8 + 51.2$$

Next, we need the degrees of freedom. Total degrees of freedom is simply the number of cases less one, N - 1. Here, there are 30 cases, so there are 29 total degrees of freedom. For degrees of freedom between, the three subgroup means are treated as scores, so there are J - 1 across subgroups, here 3 - 1, giving us 2 degrees of freedom between. Finally, for degrees of freedom within, we lose one degree of freedom for each subgroup, i.e., N - J. Here we have three subgroups, giving us 30 - 3 or 27 degrees of freedom within.
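Both the sum-of-squares identity and the degrees of freedom are easy to verify mechanically. The sketch below assumes the values already computed in the slides; variable names are illustrative.

```python
import math

# Values from the worked example.
ss_total, ss_between, ss_within = 116.0, 64.8, 51.2
n_cases, n_groups = 30, 3

# Degrees of freedom: N - 1 total, J - 1 between, N - J within.
df_total = n_cases - 1
df_between = n_groups - 1
df_within = n_cases - n_groups

# Sums of squares and degrees of freedom both partition additively.
identity_holds = math.isclose(ss_total, ss_between + ss_within)
df_partition_holds = df_total == df_between + df_within

print(df_total, df_between, df_within, identity_holds, df_partition_holds)
```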
Analysis of variance results by convention are reported in what is called an "ANOVA summary table":

Source            SS       df   Mean Square   F
Between Groups    64.80    2    32.40         17.05
Within Groups     51.20    27   1.90
Total             116.00   29
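Each mean square in the summary table is a sum of squares divided by its degrees of freedom, and F is the ratio of the mean square between to the mean square within. A minimal Python sketch (names are illustrative) shows where the table's entries come from, and why the table's F of 17.05, which uses the rounded mean square within of 1.90, differs slightly from the 17.09 that SAS reports from unrounded values.

```python
ss_between, df_between = 64.8, 2
ss_within, df_within = 51.2, 27

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F computed from unrounded mean squares.
f_ratio = ms_between / ms_within

# F computed the way the summary table does, with MS within rounded to 1.90.
f_ratio_rounded = ms_between / round(ms_within, 2)

print(f"MS between = {ms_between:.2f}")
print(f"MS within  = {ms_within:.4f}")
print(f"F (unrounded) = {f_ratio:.2f}")
print(f"F (rounded MS within) = {f_ratio_rounded:.2f}")
```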
We perform a significance test in the usual way, first by selecting alpha, then locating the appropriate sampling distribution, finding the critical value, and comparing this value to the value of the F-statistic. With alpha = 0.05, we consult the table of critical values of F in Appendix 3, p. 544. In this example we have 2 and 27 degrees of freedom. The table of critical values has degrees of freedom between as COLUMN headings (n1) and degrees of freedom within as ROW headings (n2). In column 2, row 27 we find the critical value to be 3.35. Since our F-value is 17.05, GREATER than 3.35, we know that it lies well inside the region of rejection; hence we REJECT the null hypothesis at the 0.05 level. Substantively, this means that we infer that the conditions under which one performs a task DO have an effect on performance.
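The tabled critical value can be cross-checked without a table in this particular case. When the numerator (between) degrees of freedom equal 2, the upper-tail probability of the F distribution has the closed form $P(F > x) = (1 + 2x/d)^{-d/2}$, where $d$ is the within degrees of freedom. The Python sketch below relies on that special case only; it is not a general F-distribution routine, and the function names are hypothetical.

```python
def f_upper_tail_df1_is_2(x, df_within):
    """P(F > x) for an F distribution with 2 and df_within degrees of freedom.

    Valid ONLY when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * x / df_within) ** (-df_within / 2)

def f_critical_df1_is_2(alpha, df_within):
    """Critical value of F(2, df_within): the x with P(F > x) = alpha."""
    return (df_within / 2) * (alpha ** (-2 / df_within) - 1)

df_within = 27
critical_05 = f_critical_df1_is_2(0.05, df_within)
p_value = f_upper_tail_df1_is_2(17.05, df_within)

print(f"critical F(2, 27) at alpha 0.05 = {critical_05:.2f}")
print(f"reject H0: {17.05 > critical_05}")
print(f"p-value is below 0.0001: {p_value < 0.0001}")
```

For other numerator degrees of freedom, a statistics library or a printed table is needed instead.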
The F-test is a significance test, an inferential statistic. It tells us only whether or not exposure to the treatment variable has measurable consequences that are different from chance. It does NOT tell us about the strength of association between the treatment (Xij) and the dependent variable, Yij. For this we need a measure of association.

The sum of squares BETWEEN represents the variance attributable to the treatment variable, Xij. The TOTAL sum of squares expresses the total amount of variance in the dependent variable, Yij, that is, the total variance "to be explained" statistically. A ratio of the two is a straightforward description of the percentage of variance in Yij accounted for by its association with Xij. Statistically this is called R-square.
From the example above, the sum of squares between is 64.80 and the total sum of squares is 116.00. Thus, R-square is

$$R^2 = \frac{SS_{Between}}{SS_{Total}} = \frac{64.80}{116.00} = 0.56$$

The F-test tells us that treatment categories (working conditions) differ in ways that cannot be explained as chance. R-square tells us that 56 percent of the variation in task performance is associated with differences in working conditions.
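As a one-line check in Python (an illustrative sketch; the variable names are assumptions), the ratio agrees with the R-Square value of 0.558621 printed in the SAS output for this example.

```python
ss_between = 64.80
ss_total = 116.00

# R-square: share of total variation associated with the treatment variable.
r_squared = ss_between / ss_total
print(f"R-square = {r_squared:.6f} ({100 * r_squared:.0f} percent)")
```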
Knowing that the treatment variable has a statistically significant effect does not tell us WHICH specific treatment category or categories have greater impact than others. In our example, we know only that AT LEAST ONE of the puzzle-solving conditions differs from one (or both) of the remaining two, but we do not know which. In other words, we do not know which of the following alternative hypotheses is (are) supported:

$\mu_1 \neq \mu_2 = \mu_3$
$\mu_1 = \mu_2 \neq \mu_3$
$\mu_1 = \mu_3 \neq \mu_2$
$\mu_1 \neq \mu_2 \neq \mu_3$

We need a way to statistically compare the subgroups.
There are two strategies: comparisons explicitly planned in advance are called a priori tests; those performed after an initial ANOVA are called post hoc comparison tests. Of the latter, we will use only the method known as the Scheffé test.

The Scheffé method creates a threshold for comparing subgroup means (once an ANOVA null hypothesis has been rejected) called the minimum significant difference. Differences between two subgroup means that exceed this minimum significant difference are statistically significant; that is, their difference appears to be real rather than due to chance.

The algorithm is in Sirkin (1999), p. 333.
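The difference between two subsample means is significant when

$$\left|\bar{Y}_j - \bar{Y}_{j+1}\right| > \sqrt{df_{Between}\; F_\alpha\; MS_{Within}\left(\frac{1}{n_j} + \frac{1}{n_{j+1}}\right)}$$

where
$\bar{Y}_j$ and $\bar{Y}_{j+1}$ are the subsample means being compared;
$df_{Between}$ is the degrees of freedom between in the ANOVA;
$F_\alpha$ is the critical value of F at the significance level ($\alpha$) chosen for the comparison;
$MS_{Within}$ is the ANOVA mean square within; and
$n_j$ and $n_{j+1}$ are the sizes of the two subsamples being compared.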
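In the puzzle-solving example,
$\bar{Y}_1 = 11.8$, $\bar{Y}_2 = 10.0$, and $\bar{Y}_3 = 8.2$;
$df_{Between} = 2$;
$F_\alpha = 2.51$ ($\alpha = .10$, df = 2, 27);
$MS_{Within} = 1.90$; and
$n_1 = n_2 = n_3 = 10$.

Hence,

$$\sqrt{2 \times 2.51 \times 1.90 \times \left(\frac{1}{10} + \frac{1}{10}\right)} = \sqrt{1.9076} = 1.381$$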
The value 1.381 is the minimum significant difference, the threshold we use to compare subsample means with $\alpha$ set at 0.10. Sirkin (1999) contains no 0.10 F table.

Here is how Sirkin would organize our comparison tests:

H0          Ȳj      Ȳj+1    Difference vs. Critical Value   Conclusion
μ1 = μ2     11.8    10.0    1.80 > 1.381                    Reject H0
μ2 = μ3     10.0    8.2     1.80 > 1.381                    Reject H0
μ1 = μ3     11.8    8.2     3.60 > 1.381                    Reject H0
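The Scheffé comparisons can be sketched in Python as a loop over all pairs of subgroup means. This is an illustrative sketch under the values given in the slides (critical F of 2.51 at alpha 0.10, mean square within of 1.90); the function name `minimum_significant_difference` and the dictionary keys are assumptions.

```python
from itertools import combinations
from math import sqrt

means = {"alone": 11.8, "together": 10.0, "monitored": 8.2}
sizes = {"alone": 10, "together": 10, "monitored": 10}

df_between = 2
f_alpha = 2.51      # critical F at alpha = 0.10 with 2 and 27 df (from the slides)
ms_within = 1.90    # ANOVA mean square within, rounded as in the slides

def minimum_significant_difference(n_a, n_b):
    """Scheffe threshold for comparing two subgroup means."""
    return sqrt(df_between * f_alpha * ms_within * (1 / n_a + 1 / n_b))

# Compare every pair of subgroup means against the threshold.
results = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    msd = minimum_significant_difference(sizes[a], sizes[b])
    results[(a, b)] = (diff, msd, diff > msd)
    verdict = "reject H0" if diff > msd else "fail to reject H0"
    print(f"{a} vs {b}: {diff:.2f} vs {msd:.3f} -> {verdict}")
```

Because all three subgroups have the same size, the threshold is identical for every pair; with unequal subgroup sizes it would be recomputed per comparison, as the function allows.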
Sample SAS Program: Puzzle-Solving Example

    libname old 'a:\';
    libname library 'a:\';
    options nodate nonumber ps=66;

    proc glm data=old.example;
      class setting;
      model puzzles = setting;
      means setting / scheffe alpha=0.1;
      contrast 'Alone vs. Together' setting 1 -1 0;
      contrast 'Alone vs. Monitor' setting 1 0 -1;
      contrast 'Together vs. Monitor' setting 0 1 -1;
      contrast 'Alone vs. Others' setting 2 -1 -1;
      contrast 'Together vs. Others' setting -1 2 -1;
      title1 'ANOVA With Comparison Tests';
    run;
ANOVA With Comparison Tests
General Linear Models Procedure
Class Level Information

Class      Levels   Values
SETTING    3        (1) alone (2) monitor (3) together

Number of observations in data set = 30
Dependent Variable: PUZZLES

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model             2    64.80000000      32.40000000   17.09     0.0001
Error             27   51.20000000      1.89629630
Corrected Total   29   116.00000000

R-Square   C.V.       Root MSE    PUZZLES Mean
0.558621   13.77061   1.3770607   10.000000

Source    DF   Type I SS     Mean Square   F Value   Pr > F
SETTING   2    64.80000000   32.40000000   17.09     0.0001

Source    DF   Type III SS   Mean Square   F Value   Pr > F
SETTING   2    64.80000000   32.40000000   17.09     0.0001
Scheffe's test for variable: PUZZLES

NOTE: This test controls the type I experimentwise error rate but generally has a higher type II error rate than REGWF for all pairwise comparisons.

Alpha = 0.1   df = 27   MSE = 1.896296
Critical Value of F = 2.51061
Minimum Significant Difference = 1.38

Means with the same letter are not significantly different.

Scheffe Grouping   Mean      N    SETTING
A                  11.8000   10   alone
B                  10.0000   10   together
C                  8.2000    10   monitor
How the grouping letters would read if "together" and "monitor" did not differ significantly from each other (both carry the letter B):

Scheffe Grouping   Mean      N    SETTING
A                  11.8000   10   alone
B                  10.0000   10   together
B
B                  8.2000    10   monitor
How the grouping letters would read if none of the three means differed significantly (all carry the letter A):

Scheffe Grouping   Mean      N    SETTING
A                  11.8000   10   alone
A
A                  10.0000   10   together
A
A                  8.2000    10   monitor
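Dependent Variable: PUZZLES

Contrast               DF   Contrast SS   Mean Square   F Value   Pr > F
Alone vs. Together     1    64.80000000   64.80000000   34.17     0.0001
Alone vs. Monitor      1    16.20000000   16.20000000   8.54      0.0069
Together vs. Monitor   1    16.20000000   16.20000000   8.54      0.0069
Alone vs. Others       1    48.60000000   48.60000000   25.63     0.0001
Together vs. Others    1    48.60000000   48.60000000   25.63     0.0001

Note that SAS applies the contrast coefficients in the order of the class levels shown in the Class Level Information (alone, monitor, together). The coefficients 1 -1 0 therefore contrast alone with monitor, which is why the row labeled "Alone vs. Together" has a contrast sum of squares of 64.80, the value implied by the means 11.8 and 8.2. Likewise, the rows labeled "Alone vs. Monitor" and "Together vs. Others" report alone versus together and monitor versus others, respectively.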
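The contrast sums of squares can be reproduced by hand with the usual formula for a single-degree-of-freedom contrast, $SS = (\sum c_j \bar{Y}_j)^2 / \sum (c_j^2 / n_j)$. The Python sketch below is illustrative only. It assumes, per the Class Level Information output, that the levels are ordered alone, monitor, together, and it keeps the labels exactly as written in the SAS program so that the results line up with the printed rows.

```python
# Level order as listed in the SAS class level information.
levels = ["alone", "monitor", "together"]
means = {"alone": 11.8, "monitor": 8.2, "together": 10.0}
n_per_group = 10
ms_within = 51.2 / 27  # error mean square from the ANOVA

def contrast_ss(coeffs):
    """Sum of squares for a 1-df contrast with equal subgroup sizes."""
    estimate = sum(c * means[lvl] for c, lvl in zip(coeffs, levels))
    return estimate ** 2 / sum(c ** 2 / n_per_group for c in coeffs)

# Labels and coefficients exactly as written in the SAS program.
contrasts = {
    "Alone vs. Together":   (1, -1, 0),
    "Alone vs. Monitor":    (1, 0, -1),
    "Together vs. Monitor": (0, 1, -1),
    "Alone vs. Others":     (2, -1, -1),
    "Together vs. Others":  (-1, 2, -1),
}

table = {}
for label, coeffs in contrasts.items():
    ss = contrast_ss(coeffs)
    table[label] = (ss, ss / ms_within)
    print(f"{label:22s} SS = {ss:6.2f}  F = {ss / ms_within:5.2f}")
```

With the coefficients applied in this level order, the row labeled "Alone vs. Together" actually compares the alone mean (11.8) with the monitor mean (8.2), which accounts for its sum of squares of 64.80.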
One-Way Analysis of Variance Exercise

Four groups of randomly selected and randomly assigned students were taught a basic course in statistics by four different methods. A standardized test was given at the end of the semester to all four groups. Evaluate the differences in teaching approaches using the Analysis of Variance. Assume that $\alpha$ = 0.05, and use the F distribution (Appendix 3, p. 544).

Group 1   Group 2   Group 3   Group 4
20        15        22        19
22        18        21        23
21        20        24        20
20        18        25        18
19        19        24        15

1. Expressed symbolically, what is the null hypothesis? ______________
2. What is the value of the sum of squares between? ______________
3. What is the value of the sum of squares within? ______________
4. How many degrees of freedom between? ______________
5. How many degrees of freedom within? ______________
6. What is the value of the mean square between? ______________
7. What is the value of the mean square within? ______________
8. What is the value of the F-ratio? ______________
9. What is the critical value of F? ______________
10. Do you reject the null hypothesis? ______________
One-Way Analysis of Variance Exercise Answers

1. Expressed symbolically, what is the null hypothesis? $\mu_1 = \mu_2 = \mu_3 = \mu_4$
2. What is the value of the sum of squares between? 76.55
3. What is the value of the sum of squares within? 64.00
4. How many degrees of freedom between? 3
5. How many degrees of freedom within? 16
6. What is the value of the mean square between? 25.517
7. What is the value of the mean square within? 4.000
8. What is the value of the F-ratio? 6.379
9. What is the critical value of F? 3.24
10. Do you reject the null hypothesis? Yes, Reject
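The exercise answers can be checked with a short, general one-way ANOVA routine. The Python sketch below is illustrative only: the function name `one_way_anova` is hypothetical, the scores are transcribed from the exercise table, and the critical value 3.24 is taken from the answers rather than computed.

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares, and F."""
    all_scores = [y for g in groups for y in g]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "f": ms_between / ms_within,
    }

# Test scores for the four teaching methods, read down each column.
groups = [
    [20, 22, 21, 20, 19],
    [15, 18, 20, 18, 19],
    [22, 21, 24, 25, 24],
    [19, 23, 20, 18, 15],
]

result = one_way_anova(groups)
critical_f = 3.24  # tabled value for 3 and 16 df at alpha = 0.05
reject_null = result["f"] > critical_f

for key, value in result.items():
    print(f"{key:11s} {value:.3f}")
print(f"reject H0: {reject_null}")
```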