Title: ANOVA Procedures
1ANOVA Procedures
- ANOVA (1 way)
- An extension of the 2 sample t-test used to
determine if there are differences among gt 2
group means - ANOVA with trend test
- Test for polynomial trend in the group means
- ANOVA (2 way)
- Evaluate the combined effect of 2 experimental
factors - ANOVA repeated measures
- Extension of paired t-test for same or related
subjects over time or in differing circumstances - ANCOVA
- 1 way ANOVA in which group means are adjusted by
a covariate
21 way ANOVA Assumptions
- Independent Samples
- Normality within each group
- Equal variances
- Within group variances are the same for each of
the groups - Becomes less important if your sample sizes are
similar among your groups - Ho µ1 µ2 µ3 µk
- Ha the population means of at least 2 groups
are different
3What is ANOVA?
- One-Way ANOVA allows us to compare the means of
two or more groups (the independent variable) on
one dependent variable to determine if the group
means differ significantly from one another. - In order to use ANOVA, we must have a categorical
(or nominal) variable that has at least two
independent groups (e.g. nationality, grade
level) as the independent variable and a
continuous variable (e.g., IQ score) as the
dependent variable. - At this point, you may be thinking that ANOVA
sounds very similar to what you learned about t
tests. This is actually true if we are only
comparing two groups. But when were looking at
three or more groups, ANOVA is much more
effective in determining significant group
differences. The next slide explains why this is
true.
4Why use ANOVA vs a t-test
- ANOVA preserves the significance level
- If you took 4 independent samples from the same
population and made all possible comparisons
using t-tests (6 total comparisons) at .05 alpha
level there is a 1-(0.95)6 0.26 probability
that at least 1/6 of the comparisons will result
in a significant difference - gt 25 chance we reject Ho when it is true
5Homogeneity of Variance
- It is important to determine whether there are
roughly an equal number of cases in each group
and whether the amount of variance within each
group is roughly equal - The ideal situation in ANOVA is to have roughly
equal sample sizes in each group and a roughly
equal amount of variation (e.g., the standard
deviation) in each group - If the sample sizes and the standard deviations
are quite different in the various groups, there
is a problem
6t Tests vs. ANOVA
- As you may recall, t tests allow us to decide
whether the observed difference between the means
of two groups is large enough not to be due to
chance (i.e., statistically significant). - However, each time we reach a conclusion about
statistical significance with t tests, there is a
slight chance that we may be wrong (i.e., make a
Type I errorsee Chapter 7). So the more t tests
we run, the greater the chances become of
deciding that a t test is significant (i.e., that
the means being compared are really different)
when it really is not. - This is why one-way ANOVA is important. ANOVA
takes into account the number of groups being
compared, and provides us with more certainty in
concluding significance when we are looking at
three or more groups. Rather than finding a
simple difference between two means as in a t
test, in ANOVA we are finding the average
difference between means of multiple independent
groups using the squared value of the difference
between the means.
7How does ANOVA work?
- The question that we can address using ANOVA is
this Is the average amount of difference, or
variation, between the scores of members of
different samples large or small compared to the
average amount of variation within each sample,
otherwise known as random error? - To answer this question, we have to determine
three things. - First, we have to calculate the average amount of
variation within each of our samples. This is
called the mean square within (MSw) or the mean
square error (MSe). - This is essentially the same as the standard
error that we use in t tests. - Second, we have to find the average amount of
variation between the group means. This is
called the mean square between (MSb). - This is essentially the same as the numerator in
the independent samples t test. - Third Now that weve found these two statistics,
we must find their ratio by dividing the mean
square between by the mean square error. This
ratio provides our F value. - This is our old formula of dividing the statistic
of interest (i.e., the average difference between
the group means) by the standard error. - When we have our F value we can look at our
family of F distributions to see if the
differences between the groups are statistically
significant
8Calculating the SSe and the SSb
- The sum of squares error (SSe) represents the sum
of the squared deviations between individual
scores and their respective group means on the
dependent variable. - To find the SSe we
F MSb/MSe
- The SSb represents the sum of the squared
deviations between group means and the grand mean
(the mean of all individual scores in all the
groups combined on the dependent variable) - To find the SSb we
- The only real differences between the formula for
calculating the SSe and the SSb are - 1. In the SSe we subtract the group mean from
the individual scores in each group, whereas in
the SSb we subtract the grand mean from each
group mean. - 2. In the SSb we multiply each squared deviation
by the number of cases in each group. We must do
this to get an approximate deviation between the
group mean and the grand mean for each case in
every group.
9Finding The MSe and MSb
- To find the MSe and the MSb, we have to find the
sum of squares error (SSe) and the sum squares
between (SSb). - The SSe represents the sum of the squared
deviations between individual scores and their
respective group means on the dependent variable.
- The SSb represents the sum of the squared
deviations between group means and the grand mean
(the mean of all individual scores in all the
groups combined on the dependent variable
represented by the symbol Xt ).
MSe SSe / (N-K) Where KThe number of
groups NThe number of cases in all the group
combined.
- Once we have calculated the SSb and the
SSe, we can convert these numbers into average
squared deviation scores (our MSb and MSe). - To do this, we need to divide our SS scores by
the appropriate degrees of freedom. - Because we are looking at scores between
groups, our df for MSb is K-1. - And because we are looking at individual
scores, our df for MSe is N-K.
MSb SSb / (K-1) Where K The number of groups
10Calculating an F -Value
- Once weve found our MSe and MSb, calculating our
F Value is simple. We simply divide one value by
the other. - F MSb/MSe
- After an observed F value (Fo) is determined, you
need to check in Appendix C for to find the
critical F value (Fc). Appendix C provides a
chart that lists critical values for F associated
with different alpha levels. - Using the two degrees of freedom you have already
determined, you can see if your observed value
(Fo) is larger than the critical value (Fc). If
Fo is larger, than the value is statistically
significant and you can conclude that the
difference between group means is large enough to
not be due to chance.
11Example of Dividing the Variance between an
Individual Score and the Grand Mean into
Within-Group and Between-Group Components
12Example
- Suppose I want to know whether certain species of
animals differ in the number of tricks they are
able to learn. I select a random sample of 15
tigers, 15 rabbits, and 15 pigeons for a total of
45 research participants. - After training and testing the animals, I collect
the data presented in the chart to the right.
Animal Average Number of Tricks Learned
Tiger 12.3
Rabbit 3.1
Pigeon 9.7
SSe 110.6 SSb 37.3
13Example (continued)
- Step 1 Degrees of Freedom
- Now that you have your SSe and your SSb the
next step is to determine your two df values.
Remember, our df for MSb is K-1 and our df for
MSe is N-K - df for (MSb) 3-1 2
- df for (MSe) 45-3 42
- Step 2 Critical F-Value
- Using these two df values, look in Appendix
C to determine your critical F value. Based on
the F-Value equation, our numerator is 2 and our
denominator is 42. With an alpha value of .05,
our critical F value is 3.22. Therefore, if our
observed F value is larger than 3.22 we can
conclude there is a significant difference
between our three groups - Step 3 Calcuating MSb and MSe
- Using our formulas from the previous
slideMSe SSe / (N-K) AND MSb SSb / (K-1) - MSe 110.6/42 2.63
- MSb 37.3/2 18.65
- Step 4 Calculating an Observed F-Value
- Finally, we can calculate our Fo using the
formula Fo MSb / MSe - Fo 18.65 / 2.63 7.09
14Interpreting Our Results
- Our Fo of 7.09 is larger than our Fc of 3.22.
Thus, we can conclude that our results are
statistically significanct and the difference
between the number of tricks that tigers, rabbits
and pigeons can learn is not due to chance. - However, we do not know where this difference
exists. In other words, although we know there is
a significant difference between the groups, we
do not know which groups differ significantly
from one another. - In order to answer this question we must conduct
additional post-hoc tests. Specifically, we need
to conduct a Tukey HSD post-hoc test to determine
which group means are significantly different
from each other.
15Tukey HSD
- The Tukey test compares each group mean to every
other group mean by using the familiar formula
described for t tests in Chapter 9.
Specifically, it is the mean of one group minus
the mean of a second group divided by the
standard error, represented by the following
formula.
Where ng the number of cases in each group
- The Tukey HSD allows us to compares groups of
the same size. - Just as with our t tests and F values, we must
first determine a critical Tukey value to compare
our observed Tukey values with. Using Appendix
D, locate the number of groups we are comparing
in the top row of the table, and then locate the
degrees of freedom, error (dfe) in the left-hand
column (The same dfe we used to find our F
value).
16Tukey HSD for our Example
- Now lets use Tukey HSD to determine which
animals differ significantly from each other. - First Determine your critical Tukey Value using
Appendix D. There are three groups, and your dfe
is still 42. Thus, your critical Tukey value is
approximately 3.44. - Now, calculate observed Tukey values to compare
group means. Lets look at Tigers and Rabbits in
detail. Then you can calculate the other two
Tukeys on your own! -
Step 1 Calculate your standard error.
2.63/15 .18 v.175 .42
Step 2 Subtract the mean number of tricks
learned by rabbits by the mean number of tricks
learned by tigers. 12.3-3.1 9.2
Step 3 Divide this number by the standard
error. 9.2 / .42 21.90 Step 4 Conclude that
tigers learned significantly more tricks than
rabbits.
Where ng the number of cases in each group
Repeat this process for the other group
comparisons
17Interpreting our Results. . . Again
- The table to the right lists the observed Tukey
values you should get for each animal pair. - Based on our critical Tukey value of 3.42, all of
these observed values are statistically
significant. - Now we can conclude that each of these groups
differ significantly from one another. Another
way to say this is that tigers learned
significantly more tricks than pigeons and
rabbits, and pigeons learned significantly more
tricks than rabbits.
Animal Groups Being Compared Observed Tukey Value
Tigers and Rabbits 21.90
Rabbits and Pigeons 15.71
Pigeons and Tigers 6.19
181 Way ANOVA with Trend Analysis
- Trend Analysis
- Groups have an order
- Ordinal categories or ordinal groups
- Testing hypothesis that means of ordered groups
change in a linear or higher order (cubic or
quadratic)
19Factorial or 2 way ANOVA
- Evaluate combined effect of 2 experimental
variables (factors) on DV - Variables are categorical or nominal
- Are factors significant separately (main effects)
or in combination (interaction effects) - Examples
- How do age and gender affect the salaries of 10
year employees? - Investigators want to know the effects of dosage
and gender on the effectiveness of a
cholesterol-lowering drug
202 way ANOVA hypothesis
- Test for interaction
- Ho there is no interaction effect
- Ha there is an interaction effect
- Test for Main Effects
- If there is not a significant interaction
- Ho population means are equal across levels of
Factor A, Factor B etc - Ha population means are not equal across levels
of Factor A, Factor B etc
http//www.uwsp.edu/PSYCH/stat/13/anova-2w.htm
21Factorial ANOVA in Depth
- When dividing up the variance of a dependent
variable, such as hours of television watched per
week, into its component parts, there are a
number of components that we can examine - The main effects,
- interaction effects,
- simple effects, and
- partial and controlled effects
Main effects
Interaction effects
Simple effects
Partial and Controlled effects
22Main Effects and Controlled or Partial Effects
- When looking at the main effects, it is possible
to test whether there are significant differences
between the groups of one independent variable on
the dependent variable while controlling for, or
partialing out, the effects of the other
independent variable(s) on the dependent variable - Example - Boys watch significantly more
television than girls. In addition, suppose that
children in the North watch, on average, more
television than children in the South.
- Now, suppose that, in my sample of children from
the Northern region of the country, there are
twice as many boys as girls, whereas in my sample
from the South there are twice as many girls as
boys. This could be a problem. - Once we remove that portion of the total variance
that is explained by gender, we can test whether
any additional part of the variance can be
explained by knowing what region of the country
children are from.
23When to use Factorial ANOVA
- Use when you have one continuous (i.e., interval
or ratio scaled) dependent variable and two or
more categorical (i.e., nominally scaled)
independent variables - Example - Do boys and girls differ in the amount
of television they watch per week, on average?
Do children in different regions of the United
States (i.e., East, West, North, and South)
differ in their average amount of television
watched per week? The average amount of
television watched per week is the dependent
variable, and gender and region of the country
are the two independent variables.
24Results of a Factorial ANOVA
- Two main effects, one for my comparison of boys
and girls and one for my comparison of children
from different regions of the country - Definition Main effects are differences between
the group means on the dependent variable for any
independent variable in the ANOVA model. - An interaction effect, or simply an interaction
- Definition An interaction is present when the
differences between the group means on the
dependent variable for one independent variable
varies according to the level of a second
independent variable.
- Interaction effects are also known as moderator
effects
25Interactions
- Factorial ANOVA allows researchers to test
whether there are any statistical interactions
present - The level of possible interactions increases as
the number of independent variables increases - When there are two independent variables in the
analysis, there are two possible main effects and
one possible two-way interaction effect
26Example of an Interaction
- The relationship between gender and amount of
television watched depends on the region of the
country - We appear to have a two-way interaction here
- We can see is that there is a consistent pattern
for the relationship between gender and amount of
television viewed in three of the regions (North,
East, and West), but in the fourth region (South)
the pattern changes somewhat
Mean Amounts of Television Viewed by Gender and Region. Mean Amounts of Television Viewed by Gender and Region. Mean Amounts of Television Viewed by Gender and Region. Mean Amounts of Television Viewed by Gender and Region. Mean Amounts of Television Viewed by Gender and Region. Mean Amounts of Television Viewed by Gender and Region.
North East West South Overall Averages by Gender
Girls 20 15 15 10 15
Boys 25 20 20 25 22.5
Overall Averages 22.5 17.5 17.5 17.5
27Graph of an Interaction
28Example of a Factorial ANOVA
- We conducted a study to see whether high school
boys and girls differed in their self-efficacy,
whether students with relatively high GPAs
differed from those with relatively low GPAs in
their self-efficacy, and whether there was an
interaction between gender and GPA on
self-efficacy - Students self-efficacy was the dependent
variable. Self-efficacy means how confident
students are in their ability to do their
schoolwork successfully - The results are presented on the next 2 slides
29Example (continued)
SPSS results for gender by GPA factorial ANOVA. SPSS results for gender by GPA factorial ANOVA. SPSS results for gender by GPA factorial ANOVA. SPSS results for gender by GPA factorial ANOVA. SPSS results for gender by GPA factorial ANOVA.
Gender GPA Mean Std. Dev. N
Girl 1.00 3.6667 .7758 121
Girl 2.00 4.0050 .7599 133
Girl Total 3.8438 .7845 254
Boy 1.00 3.9309 .8494 111
Boy 2.00 4.0809 .8485 103
Boy Total 4.0031 .8503 214
Total 1.00 3.7931 .8208 232
Total 2.00 4.0381 .7989 236
Total Total 3.9167 .8182 468
These descriptive statistics reveal that
high-achievers (group 2 in the GPA column) have
higher self-efficacy than low achievers (group 1)
and that the difference between high and low
achievers appears to be larger among girls than
among boys
30Example (continued)
ANOVA Results ANOVA Results ANOVA Results ANOVA Results ANOVA Results ANOVA Results ANOVA Results
Source Type III Sum of Squares df Mean Square F Sig. Eta Square
Corrected Model 11.402 3 3.801 5.854 .001 .036
Intercept 7129.435 1 7129.435 10981.566 .000 .959
Gender 3.354 1 3.354 5.166 .023 .011
GPA 6.912 1 6.912 10.646 .001 .022
Gender GPA 1.028 1 1.028 1.584 .209 .003
Error 301.237 464 .649
Total 7491.889 468
Corrected Total 312.639 467
These ANOVA statistics reveal that there is a
statistically significant main effect for gender,
another significant main effect for GPA group,
but no significant gender by GPA group
interaction. Combined with the descriptive
statistics on the previous slide we can conclude
that boys have higher self-efficacy than girls,
high GPA students have higher self-efficacy than
low GPA students, and there is no interaction
between gender and GPA on self-efficacy. Looking
at the last column of this table we can also see
that the effect sizes are quite small (eta
squared .011 for the gender effect and .022 for
the GPA effect).
31Example Burger (1986)
- Examined the effects of choice and public versus
private evaluation on college students
performance on an anagram-solving task - One dependent and two independent variables
- Dependent variable the number of anagrams solved
by participants in a 2-minute period - Independent variables choice or no choice
public or private
Mean number of anagrams solved for four treatment groups. Mean number of anagrams solved for four treatment groups. Mean number of anagrams solved for four treatment groups. Mean number of anagrams solved for four treatment groups. Mean number of anagrams solved for four treatment groups.
Public Public Private Private
Choice No Choice Choice No Choice
Number of anagrams solved 19.50 14.86 14.92 15.36
32Burger (1986) Concluded
- Found a main effect for choice, with students in
the two choice groups combined solving more
anagrams, on average, than students in the two
no-choice groups combined - Found a main effect for public over private
performance - Found an interaction between choice and
public/private. Note the difference between the
public and private performance groups in the
Choice condition.
33Repeated-Measures ANOVA versus Paired t Test
- Similar to a paired, or dependent samples t test,
repeated-measures ANOVA allows you to test
whether there are significant differences between
the scores of a single sample on a single
variable measured at more than one time. - Unlike paired t tests, repeated-measures ANOVA
lets you - Examine change on a variable measured across more
than two time points - Include a covariate in the model
- Include categorical (i.e., between-subjects)
independent variables in the model - Examine interactions between within-subject and
between-subject independent variables on the
dependent variable
34Different Types of Repeated Measures ANOVAs (and
when to use each type)
- The most basic model
- One sample
- One dependent variable measured on an interval or
ratio scale - The dependent variable is measured at least two
different times - Example Measuring the reaction time of a sample
of people before they drink two beers and after
they drink two beers
Time 1 Reaction time with no drinks
Time, or trial
Time 2 Reaction time after two beers
35Different Types of Repeated Measures ANOVAs (and
when to use each type)
- Adding a between-subjects independent variable to
the basic model - One sample with multiple categories (e.g., boys
and girls American, French, and Greek), or
multiple samples - One dependent variable measured on an interval or
ratio scale - The dependent variable is measured at least two
different times. (Time, or trial, is the
independent, within-subjects variable) - Example Measuring the reaction time of men and
women before they drink two beers and after they
drink two beers
Time 1, Group 2 Reaction time of women after
two beers
Time, or trial
Time 2, Group 1 Reaction time of men after
two beers
Time 2, Group 2 Reaction time of women after
two beers
36Different Types of Repeated Measures ANOVAs (and
when to use each type)
- Adding a covariate to the between-subjects
independent variable to the basic model - One sample with multiple categories (e.g., boys
and girls American, French, and Greek), or
multiple samples - One dependent variable measured on an interval or
ratio scale - One covariate measured on an interval/ratio scale
or dichotomously - The dependent variable is measured at least two
different times. (Time, or trial, is the
independent, within-subjects variable) - Example Measuring the reaction time of men and
women before they drink two beers and after they
drink two beers, controlling for weight of the
participants
Time 1, Group 2 Reaction time of women after
two beers
Time, or trial
Covariate
Time 2, Group 1 Reaction time of men after
two beers
Time 2, Group 2 Reaction time of women after
two beers
37Types of Variance in Repeated-Measures ANOVA
- Within-subject
- Variance in the dependent variable attributable
to change, or difference, over time or across
trials (e.g., changes in reaction time within the
sample from the first test to the second test) - Between-subject
- Variance in the dependent variable attributable
to differences between groups (e.g., men and
women). - Interaction
- Variance in the dependent across time, or trials,
that differs by levels of the between-subjects
variable (i.e., groups). For example, if womens
reaction time slows after drinking two beers but
mens reaction time does not. - Covariate
- Variance in the dependent variable that is
attributable to the covariate. Covariates are
included to see whether the independent variables
are related to the dependent variable after
controlling for the covariate. For example, does
the reaction time of men and women change after
drinking two beers once we control for the weight
of the individuals?
38Repeated Measures Summary
- Repeated-measures ANOVA should be used when you
have multiple measures of the same interval/ratio
scaled dependent variable over multiple times or
trials. - You can use it to partition the variance in the
dependent variable into multiple components,
including - Within-subjects (across time or trials)
- Between-subjects (across multiple groups, or
categories of an independent variable) - Covariate
- Interaction between the within-subjects and
between-subjects independent predictors - As with all ANOVA procedures, repeated-measures
ANOVA assumes that the data are normally
distributed and that there is homogeneity of
variance across trials and groups.
39ANCOVA
- 1 way ANOVA with a twist
- Means not compared directly
- Means adjusted by a covariate
- Covariate is not controlled by researcher
- It is intrinsic to the subject observed
40Analysis of Covariance
- In ANOVA, the idea is to test whether there are
differences between groups on a dependent
variable after controlling for the effects of a
different variable, or set of variables - We have already discussed how we can examine the
effects of one independent variable on the
dependent variable after controlling for the
effects of another independent variable - We can also control for the effects of a
covariate. Unlike independent variables in
ANOVA, covariates do not have to be categorical,
or nominal, variables
41ANCOVA examples
- A car dealership wants to know if placing trucks,
SUVs, or sports cars in the display area impacts
the number of customers who enter the showroom.
Other factors may also impact the number such as
the outside weather conditions. Thus outside
weather becomes the covariate.
42Assumptions
- Covariate must be quantitative an linearly
related to the outcome measure in each group - Regression line relating covariate to the
response variable - Slopes of regression lines are equal
- Ho regression lines for each group are parallel
- Ha at least two of the regression lines are not
parallel - Ho all group means adjusted by the covariate
are equal - Ha at least 2 means adjusted by the covariate
are not equal