ANOVA - PowerPoint PPT Presentation

Slides: 91
Provided by: jasona1
Transcript and Presenter's Notes

Title: ANOVA


1
ANOVA
  • The stuff you should have learned your first
    semester of grad school but didn't

2
ANOVA
  • ANOVA is an extension of the t-test...
  • It allows us to compare several means and
    manipulate lots of independent variables

3
ANOVA
  • Regression: individual differences
  • ANOVA: mean differences
  • Individual differences still play a role, but the
    outcome is expressed as MEAN DIFFERENCES

4
What does the ANOVA tell us?
  • Null hypothesis
  • Like a t-test, ANOVA tests the null hypothesis
    that the means are the same
  • Alternative (experimental) hypothesis
  • Experimental hypothesis is that the means differ
  • ANOVA is an omnibus test (it tests for an overall
    difference between groups)
  • A significant result tells us that the group
    means are different
  • It doesn't tell us exactly which means differ

5
ANOVA Theory
  • We calculate how much variability there is
    between scores (you can think of this like the
    standard deviation of a measure in the total
    sample): the Total Sum of Squares (SST)
  • We then calculate how much of this variability is
    due to the experimental manipulation: the Model
    Sum of Squares (SSM). In regression, we call this
    the Explained or Regression Sum of Squares (SSREG).
  • ...and how much variability cannot be explained:
    the Residual Sum of Squares (SSRES) or Error Sum
    of Squares (SSE)
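Written out as formulas, the three quantities look roughly like this. This is a sketch in standard one-way ANOVA notation; the symbols ($x_{ij}$ for person $j$'s score in group $i$, $\bar{x}_i$ for a group mean, $\bar{x}_{grand}$ for the grand mean, $n_i$ for a group size, $k$ for the number of groups) are assumed here and do not appear on the slides.

```latex
\begin{align*}
SS_T     &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_{grand})^2 \\
SS_M     &= \sum_{i=1}^{k} n_i (\bar{x}_i - \bar{x}_{grand})^2 \\
SS_{RES} &= \sum_{i=1}^{k}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \\
SS_T     &= SS_M + SS_{RES}
\end{align*}
```

The last line is the partition the rest of the deck relies on: whatever the model does not explain ends up in the residual.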

6
ANOVA Theory
  • We compare the amount of variability explained by
    the Model (experiment) to the error in the model
    (individual differences): this is the F-ratio
  • If the model explains a lot more variability than
    it can't explain, then the experimental
    manipulation has had a significant effect on the
    outcome (DV)

7
SST: Total variance in the data
SSM: Variance explained by the Model
SSRES: Error (unexplained by the Model)
  • If the experiment is successful, then the model
    will explain more variance than it fails to
    explain: SSM is greater than SSRES

8
Calculating ANOVA by hand
  • Example: Danny decides to empirically determine
    which gang has the coolest members. So he
    creates the Zuko-Kenickie Coolness Test (ZKCT)
    and administers it to 5 members each from the
    T-Birds, Pink Ladies, and Scorpions

10
The data
[Figure: plot of the scores, with Mean 1, Mean 2, Mean 3, and the Grand Mean marked]
11
Total Sum of Squares (SST): deviation of each case
from the grand mean
[Figure: plot of the scores with the Grand Mean marked]
Squared difference between each person's score
and the overall, or grand, mean
12
Model Sum of Squares (SSM): deviation of each
group mean from the grand mean
[Figure: plot of the scores with the Grand Mean marked]
Squared difference between each group's mean and
the overall, or grand, mean. This represents the
effect of the experimental manipulation (variance
explained by experimental group membership)
13
Residual Sum of Squares (SSRES): deviation of
each score from its group mean
[Figure: plot of the scores with the Grand Mean marked]
Squared difference between each person's score
and their group mean. When I am not AT my
group's mean, then my model (group membership) is
not doing a perfect job of explaining my score.
Thus, any deviation between my score and my
group's mean is taken as an indication of error: a
failure of the model to precisely pinpoint my
score.
14
SUMS OF SQUARES HELL: The foundation of regression
Bigger this gets
The smaller this gets
$R^2 = SS_R / SS_T$ (on this slide SSR is the regression, or model, sum of squares)
15
Schematic of Sums of Squares
16
BAD!!!!
17
GOOD!!!
18
Grand mean was 3.467, grand SD was 1.767, and
grand variance was 3.124
19
Step 1: Calculate SST
  • $SS_T = s^2_{grand}(N - 1)$
  • $SS_T = 3.124(15 - 1)$
  • $SS_T = 43.74$
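Step 1 works because a variance is just a sum of squares divided by N - 1, so multiplying back recovers the sum of squares. A minimal sketch using only the grand variance and N given on the slides (the variable names are mine):

```python
# Grand-level summary statistics reported on the slides.
grand_variance = 3.124
n_total = 15

# variance = SS / (N - 1), so SS = variance * (N - 1)
ss_total = grand_variance * (n_total - 1)

print(round(ss_total, 2))  # 43.74
```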

20
Step 2: Calculate SSM
  • $SS_M = \sum n_i(\bar{x}_i - \bar{x}_{grand})^2$
  • $SS_M = 5(2.2 - 3.467)^2 + 5(3.2 - 3.467)^2 + 5(5.0 - 3.467)^2$
  • $SS_M = 5(-1.267)^2 + 5(-0.267)^2 + 5(1.533)^2$
  • $SS_M = 8.025 + 0.355 + 11.755$
  • $SS_M = 20.135$
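The same calculation can be sketched in a few lines of Python, using the three group means and the group size of 5 from the slide. The variable names are my own. Carried at full precision, the result comes out near 20.133; the slide's 20.135 reflects rounding in the intermediate terms.

```python
# Group means and sizes as given on the slide.
group_means = [2.2, 3.2, 5.0]
group_sizes = [5, 5, 5]

n_total = sum(group_sizes)

# Grand mean as the size-weighted average of the group means.
grand_mean = sum(n * m for n, m in zip(group_sizes, group_means)) / n_total

# SSM: squared distance of each group mean from the grand mean,
# weighted by the number of people in that group.
ss_model = sum(n * (m - grand_mean) ** 2
               for n, m in zip(group_sizes, group_means))

print(round(grand_mean, 3), round(ss_model, 3))
```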

21
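Step 3: Calculate SSRES
  • $SS_{RES} = s^2_{group1}(n_1 - 1) + s^2_{group2}(n_2 - 1) + s^2_{group3}(n_3 - 1)$
  • $SS_{RES} = (1.70)(5 - 1) + (1.70)(5 - 1) + (2.50)(5 - 1)$
  • $SS_{RES} = 1.70 \times 4 + 1.70 \times 4 + 2.50 \times 4$
  • $SS_{RES} = 6.8 + 6.8 + 10$
  • $SS_{RES} = 23.60$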
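As a quick check, the residual sum of squares can be rebuilt from the group variances in the same way SST was rebuilt from the grand variance. This sketch uses only the variances and group sizes shown on the slide; the names are mine.

```python
# Within-group variances and sizes as given on the slide.
group_variances = [1.70, 1.70, 2.50]
group_sizes = [5, 5, 5]

# Each group's sum of squares is its variance times (n - 1);
# SSRES is the total of those within-group sums of squares.
ss_residual = sum(v * (n - 1) for v, n in zip(group_variances, group_sizes))

print(round(ss_residual, 2))  # 23.6
```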

22
Step 4: Calculate Mean SS
  • SST = 43.74
  • SSM = 20.135
  • SSR = 23.60
  • DFT = N - 1 = 14
  • DFM = K (levels of IV) - 1 = 2
  • DFR = N - K = 12

$MS_M = 20.135 / 2 = 10.067$
$MS_R = 23.60 / 12 = 1.967$
23
Step 5: Calculate F-test
  • The F-ratio is the ratio of the variation
    explained by the model to the variation
    explained by unsystematic factors. It is
    calculated by dividing the model mean squares
    by the residual mean squares

$F = MS_M / MS_R = 10.067 / 1.967 = 5.12$
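Steps 4 and 5 can be sketched together. The sums of squares, N, and K below are the slide values; the p-value line is my own addition and uses a closed form for the upper tail of the F distribution that holds only when the numerator df is exactly 2, as it is here.

```python
# Values from Steps 1 to 4 on the slides.
ss_model, ss_residual = 20.135, 23.60
n_total, k = 15, 3

df_model = k - 1             # 2
df_residual = n_total - k    # 12

ms_model = ss_model / df_model
ms_residual = ss_residual / df_residual
f_ratio = ms_model / ms_residual

# Closed-form upper-tail probability, valid ONLY for 2 numerator df:
# P(F > f) = (1 + 2f / df_residual) ** (-df_residual / 2)
assert df_model == 2
p_value = (1 + df_model * f_ratio / df_residual) ** (-df_residual / 2)

print(round(f_ratio, 2), round(p_value, 3))
```

With other numerator df there is no such shortcut, and a statistics package would be the sensible route.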
24
Assumptions
  • ANOVA is pretty robust to small sample sizes
  • Homogeneity of variance!
  • The variance within each level of the IV is
    statistically similar
  • If a group has a large variance, the deviations
    from the mean will be large (Larger error
    variance).
  • If it has a small variance, the deviations will
    be small (smaller error variance).
  • A large variance (e.g., from an outlier) can
    influence the group mean

25
Homogeneity of variance!
[Figure: plot of the scores with the Grand Mean marked]
26
Homogeneity of variance!
  • Levene's test is the test we use to check whether
    our data violate homogeneity of variance
  • Ways to avoid a problem
  • Large N in each cell (level of IV)
  • Equal sample sizes in each cell
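To make the idea concrete, here is a minimal sketch of the mean-centred version of Levene's test: run a one-way ANOVA on each score's absolute deviation from its group mean. The scores are hypothetical. The actual ZKCT data are not in the transcript, so I picked values whose group means (2.2, 3.2, 5.0) and variances (1.70, 1.70, 2.50) match the hand-calculation slides. The function names are mine as well.

```python
from statistics import mean


def one_way_f(groups):
    """Return (F, df_model, df_residual) for a one-way ANOVA."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ss_model = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_resid = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_m = len(groups) - 1
    df_r = len(scores) - len(groups)
    return (ss_model / df_m) / (ss_resid / df_r), df_m, df_r


def levene_f(groups):
    """Levene's test (mean-centred): ANOVA on absolute deviations."""
    abs_dev = [[abs(x - mean(g)) for x in g] for g in groups]
    return one_way_f(abs_dev)


# HYPOTHETICAL scores, chosen only to match the slide's summary stats.
groups = [[3, 2, 1, 1, 4], [5, 2, 4, 2, 3], [7, 4, 5, 3, 6]]

f_levene, df1, df2 = levene_f(groups)
print(round(f_levene, 3), df1, df2)
```

A small F here means the spread is similar across groups, which is what a non-significant Levene's test is telling you.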

27
SPSS EXAMPLE
  • There are two ways to do a one-way ANOVA
  • FIRST WAY
  • ANALYZE → COMPARE MEANS → ONE-WAY ANOVA
  • SECOND WAY
  • ANALYZE → GENERAL LINEAR MODEL → UNIVARIATE

28
SPSS EXAMPLE
  • IV: Dementia Rating
  • 0 = normal
  • 0.5 = possible impairment
  • 1 = probable impairment
  • DV: EPCCE
  • Measure of everyday cognitive abilities
  • Research question: We know that demented elders
    perform more poorly on basic cognitive tests like
    memory. This study wanted to determine if they
    also performed more poorly on EVERYDAY kinds of
    cognitive tasks

31
Always ask for the means of your IV
This gives you Levene's test
32
Eta-squared
  • Each significant main effect and interaction will
    have its own effect strength (measure of
    association with the multivariate dependent
    variable)
  • More on this later
  • $\eta^2_{effect} = 1 - \Lambda_{effect}$
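For a one-way design with a single DV, Wilks' lambda reduces to SSRES / (SSM + SSRES), so 1 - lambda is the same thing as SSM / SST. A small sketch using the sums of squares from the hand-calculated example earlier in the deck (the variable names are mine; the two routes differ only by rounding in the slide values):

```python
# Sums of squares from the hand-calculated one-way example.
ss_model, ss_residual, ss_total = 20.135, 23.60, 43.74

# Route 1: proportion of total variability explained by the model.
eta_sq_from_ss = ss_model / ss_total

# Route 2: 1 - lambda, with lambda = SSRES / (SSM + SSRES)
# in the univariate one-way case.
wilks_lambda = ss_residual / (ss_model + ss_residual)
eta_sq_from_lambda = 1 - wilks_lambda

print(round(eta_sq_from_ss, 2), round(eta_sq_from_lambda, 2))  # 0.46 0.46
```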

33
Check your descriptives to make sure everything
is groovy
Levene's is NS, so we did not violate homogeneity
of variance
34
Main effect of CDR is significant
54105.565 / 43.856 97.712
35
Writing that up APA style
  • Results from a one-way ANOVA indicated that there
    was a significant difference in EPCCE performance
    across the three levels of dementia rating, F(2,
    770) = 97.71, p < .05.

36
The means are presented here. But which ones are
different from one another?
37
Omnibus test
  • A significant ANOVA for an IV with 3 levels can
    mean several things
  • One possibility is that all means are
    significantly different
  • A second possibility is that the means of groups
    1 and 2 are the same but 3 is different
  • Another possibility is that groups 2 and 3 are
    the same but 1 is different
  • Another possibility is that groups 1 and 3 are
    the same but 2 is different

38
So we have a significant difference between the
three groups, now what?
  • We have to determine which means are different
    from one another
  • You have a few options!

39
Follow-up tests
  • Multiple t-tests
  • Orthogonal planned contrasts/comparisons
  • hypothesis driven
  • planned a priori
  • Post-hoc tests
  • Not planned (no hypothesis)
  • compare all pairs of means
  • the family-wise error problem
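The family-wise error problem is easy to see with a little arithmetic. This sketch assumes the tests are independent, which is a simplification for pairwise comparisons on the same data, and the function name is my own.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one Type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


alpha = 0.05
k = 3                       # levels of the IV
n_pairs = k * (k - 1) // 2  # all pairwise comparisons: 3

print(n_pairs, round(familywise_error(alpha, n_pairs), 3))  # 3 0.143
```

So three uncorrected t-tests at .05 each carry roughly a 14% chance of at least one false positive, not 5%.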

40
Commonly used planned contrasts/comparisons
  • Simple: Each level of the IV is compared to a
    reference level (first or last)
  • 1 vs 2
  • 1 vs 3
  • Difference: Each level is compared to the mean of
    the previous categories
  • 2 vs 1
  • 3 vs mean of 1 and 2

41
Commonly used planned contrasts/comparisons
  • Helmert: Each level is compared to the mean of the
    subsequent levels
  • 1 vs mean of 2 and 3
  • 2 vs 3
  • The one you choose depends on the question you
    want to answer
  • E.g., I want to know if normal is significantly
    different from any dementia AND whether the 2
    dementia groups differ
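One way to see how these contrast types differ is to write each as a set of weights over the three levels. The weights below are one common coding that I am assuming for illustration; SPSS may scale them differently internally. With equal group sizes, two contrasts are orthogonal when the products of their weights sum to zero.

```python
# Assumed weight codings for a 3-level IV (levels 1, 2, 3).
contrasts = {
    "simple":     [(1, -1, 0), (1, 0, -1)],         # 1 vs 2, 1 vs 3
    "difference": [(-1, 1, 0), (-0.5, -0.5, 1)],    # 2 vs 1, 3 vs mean(1, 2)
    "helmert":    [(1, -0.5, -0.5), (0, 1, -1)],    # 1 vs mean(2, 3), 2 vs 3
}


def dot(u, v):
    """Sum of the products of two sets of contrast weights."""
    return sum(a * b for a, b in zip(u, v))


for name, (c1, c2) in contrasts.items():
    # Every contrast's weights must sum to zero.
    assert abs(sum(c1)) < 1e-12 and abs(sum(c2)) < 1e-12
    label = "orthogonal" if dot(c1, c2) == 0 else "not orthogonal"
    print(name, label)
```

Under this coding the Helmert and Difference sets are orthogonal, while the Simple set is not, because both of its contrasts reuse the reference level.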

42
This drop down menu gives you all the contrasts
44
Post-hoc tests
  • There are a TON to choose from
  • Assumptions met: REGWQ or Tukey HSD
  • Safe option: Bonferroni
  • Unequal sample sizes: Gabriel's (small) or
    Hochberg's GT2 (large)
  • Unequal variances: Games-Howell

45
$\alpha_{Bonferroni} = \alpha / \text{number of tests}$
Use these if Levene's is sig
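The Bonferroni correction on this slide can be sketched as follows, taking three pairwise comparisons among the three dementia levels as the family of tests. The check on the resulting family-wise rate assumes independent tests, which is my simplification.

```python
alpha = 0.05
n_tests = 3  # pairwise comparisons among 3 groups

# Bonferroni: split alpha evenly across the tests.
alpha_per_test = alpha / n_tests

# Resulting family-wise rate if the tests were independent.
familywise = 1 - (1 - alpha_per_test) ** n_tests

print(round(alpha_per_test, 4), familywise < alpha)
```

Each comparison is then judged against roughly .0167 rather than .05, which keeps the overall rate at or below .05.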
46
All pairwise comparisons are listed here
47
Writing that up APA style
  • Results from a one-way ANOVA indicated that there
    was a significant difference in EPCCE performance
    across the three levels of dementia rating, F(2,
    770) = 97.71, p < .05.
  • Post-hoc tests revealed that all three levels of
    CDR significantly differed from one another (show
    table)

48
Time for Computer Lovin'
49
What is the 2-way independent ANOVA?
  • Two independent variables
  • 2-way = 2 IVs
  • 3-way = 3 IVs
  • 4-way = 4 IVs
  • Different participants in all conditions
  • Independent = different participants, just like
    in the t-test
  • A design with several independent variables is
    known as a factorial design

50
Benefit of factorial design
  • We can look at how variables interact
  • Interactions
  • show how the effect of one IV might depend on
    levels of the other
  • are often more interesting than main effects

51
Example of Interaction
  • Interaction between prior emotional state
    (positive and negative) and lecture content
    (statistical versus pragmatic) on well-being
    during lecture
  • a normal reduction in well-being when moving from
    statistical to pragmatic may be exacerbated in
    persons in a negative affective state

52
SPSS Example
  • Testing the effects of alcohol and gender on the
    "beer goggles" effect
  • IV: Alcohol (None, 2 pints, 4 pints)
  • IV: Gender (Male, Female)
  • Dependent measure was an objective measure of the
    attractiveness of the partner selected at the end
    of the evening
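The layout of this design can be sketched by crossing the two IVs. The factor levels are from the slide. The error df of 42 is taken from the F-tests reported later in the deck, and the total N it implies is my inference, assuming a standard between-subjects factorial with every cell included in the model.

```python
from itertools import product

alcohol = ["None", "2 pints", "4 pints"]
gender = ["Male", "Female"]

# Every combination of levels is one cell of the design.
cells = list(product(gender, alcohol))

df_alcohol = len(alcohol) - 1              # 2
df_gender = len(gender) - 1                # 1
df_interaction = df_alcohol * df_gender    # 2

# Error df as reported later in the deck, e.g. F(2, 42).
df_error = 42
# df_error = N - number of cells, so N follows directly.
n_total = df_error + len(cells)

print(len(cells), df_alcohol, df_gender, df_interaction, n_total)  # 6 2 1 2 48
```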

53
SST: Total variance in the DV
SSRES: Total residual (error) variance
SSM: Total variance explained by the model
SSA: Effect of alcohol
SSB: Effect of gender
SSAxB: Effect of the interaction
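The partition on this slide can be sketched for a balanced design as below. The scores are hypothetical, invented purely for illustration, since the study's raw data are not in the transcript. The formulas assume equal cell sizes, and the variable names are mine.

```python
from statistics import mean

# HYPOTHETICAL balanced data (3 scores per cell), for illustration only.
data = {
    "male":   {"none": [66, 70, 62], "2 pints": [65, 68, 71], "4 pints": [35, 40, 30]},
    "female": {"none": [60, 65, 58], "2 pints": [62, 66, 64], "4 pints": [57, 61, 59]},
}
genders = list(data)
doses = list(data["male"])
n_cell = 3

scores = [x for g in genders for d in doses for x in data[g][d]]
grand = mean(scores)

ss_total = sum((x - grand) ** 2 for x in scores)
# SSM: each cell mean's deviation from the grand mean, weighted by cell size.
ss_model = n_cell * sum((mean(data[g][d]) - grand) ** 2
                        for g in genders for d in doses)
# SSRES: each score's deviation from its own cell mean.
ss_res = sum((x - mean(data[g][d])) ** 2
             for g in genders for d in doses for x in data[g][d])

# Main effects: marginal means against the grand mean.
ss_alcohol = n_cell * len(genders) * sum(
    (mean(x for g in genders for x in data[g][d]) - grand) ** 2 for d in doses)
ss_gender = n_cell * len(doses) * sum(
    (mean(x for d in doses for x in data[g][d]) - grand) ** 2 for g in genders)
# The interaction is what remains of the model after both main effects.
ss_interaction = ss_model - ss_alcohol - ss_gender

print(round(ss_total, 2), round(ss_model + ss_res, 2))
```

The two printed numbers agree, which is the SST = SSM + SSRES identity, with SSM itself split into SSA, SSB, and SSAxB.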
56
It is always good to ask for the means!
58
Check it!!!
No violation of the Homogeneity of Variance
assumption
59
There was a non-significant influence of gender
on date attractiveness, F(1, 42) = 2.03, p = .16.
When we ignore how much alcohol was consumed, the
gender of the subject didn't matter much in terms
of the attractiveness of the date.
60
There was a significant main effect of the amount
of alcohol consumed, F(2, 42) = 20.07, p < .001.
When we ignore whether the subject was male
or female, the amount of alcohol consumed
influenced the attractiveness of their date
selection.
61
Interaction
  • There was a significant interaction between the
    gender of a subject and the amount of alcohol
    consumed in the attractiveness of date selection,
    F(2, 42) = 11.91, p < .001.
  • What this actually means is that the effect of
    alcohol on date selection was different for males
    and females
  • The plot gives us an idea about the nature of
    this interaction

62
Post-hoc tests
63
At four pints, you pick significantly less
attractive partners than at lower levels of
consumption.
64
What about the interaction? Always graph it!!
66
This is the analysis we have done
    * Two-way independent ANOVA of attractiveness by gender and alcohol.
    UNIANOVA attract BY gender alcohol
      /METHOD=SSTYPE(3)
      /INTERCEPT=INCLUDE
      /POSTHOC=alcohol(BONFERRONI)
      /PLOT=PROFILE(alcohol*gender)
      /EMMEANS=TABLES(gender)
      /EMMEANS=TABLES(alcohol)
      /EMMEANS=TABLES(gender*alcohol)
      /PRINT=DESCRIPTIVE ETASQ HOMOGENEITY
      /CRITERIA=ALPHA(.05)
      /DESIGN=gender alcohol gender*alcohol.

67
Now, decomposing the interaction
You'll find useful tips like this at
http://www.utexas.edu/cc/faqs/stat/spss/spss50.html
    * Same model, with simple-effects comparisons on the interaction cells.
    UNIANOVA attract BY gender alcohol
      /METHOD=SSTYPE(3)
      /INTERCEPT=INCLUDE
      /POSTHOC=alcohol(BONFERRONI)
      /EMMEANS=TABLES(gender*alcohol) COMPARE(gender)
      /EMMEANS=TABLES(gender*alcohol) COMPARE(alcohol)
      /PRINT=DESCRIPTIVE ETASQ HOMOGENEITY
      /CRITERIA=ALPHA(.05)
      /DESIGN=gender alcohol gender*alcohol.
    EXECUTE.

68
/EMMEANS=TABLES(gender*alcohol) COMPARE(gender)
This shows us that there is a significant male-female
difference in date attractiveness, but it
emerges only at the 4-pint level.
69
/EMMEANS=TABLES(gender*alcohol) COMPARE(alcohol)
This is informative, because it shows us that
four pints are significantly different from no
pints or two pints, BUT these pairwise
differences are actually only true for males!
Men end up with SIGNIFICANTLY less attractive
dates after 4 pints; women don't.
71
Describing the Interaction
  • There was a significant interaction between the
    gender of a subject and the amount of alcohol
    consumed in the attractiveness of date selection,
    F(2, 42) = 11.91, p < .001.
  • The interaction is presented in Figure 1.
    Post-hoc tests indicated that males selected
    significantly less attractive partners when they
    had four pints (insert mean) relative to no
    drinks (insert mean), while the same pattern was
    not found for females. Moreover, there was a
    significant difference between males and females
    only in the 4-pint condition, with males selecting
    significantly less attractive partners (insert
    mean) than female participants (insert mean).

72
Covariates
  • Covariate should be at least moderately
    correlated with the DV
  • Power is gained only if the COV accounts for a
    substantial amount of variance
  • Multiple covariates should be un-correlated with
    one another
  • Relationship between the COV and DV should be
    linear
  • Otherwise the COV predicts less variance

73
ANCOVA
  • An extension of ANOVA in which main effects and
    interactions are assessed on DV scores after the
    DV has been adjusted for its relationship with
    one or more covariates (CVs)

74
3 Uses of ANCOVA
  • (1) Increase test sensitivity
  • Reduce the error term and increase the
    significance of our main effects and interactions
  • (2) Look at mean differences REMOVING the effect
    of another variable or variables
  • (3) If you find a difference among means, you
    might use ANCOVA to EXPLAIN the differences

75
Use 1 Removing Error
  • In both regression and ANOVA, a common issue is
    the residual.
  • In regression, we talked about that in terms of
    UNEXPLAINED VARIANCE in the dependent variable
  • In ANOVA, we said that UNEXPLAINED VARIANCE
    constitutes the denominator of the F-ratio. The
    bigger the residual, the smaller the
    F-ratio...the less likely it will be significant

76
Use 1 Removing Error
  • If we include more predictors in any model, it
    will tend to EXPLAIN MORE VARIANCE. In other
    words, it will tend to reduce the denominator of
    ALL F-ratios
  • In other words, including covariates will tend to
    help us get more significant results.

77
Use 1 Removing Error
SST Total variance in the data
SSM Variance explained by the Model
SSRES Error (unexplained by the Model)
ANCOVA reduces this term
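A minimal sketch of why the error term shrinks, assuming one covariate and a common within-group slope. The (covariate, DV) pairs are hypothetical and invented for illustration; the group labels and variable names are mine.

```python
from statistics import mean

# HYPOTHETICAL (covariate, DV) pairs per group, for illustration only.
groups = {
    "zero": [(2, 3), (4, 4), (5, 6), (3, 3)],
    "low":  [(3, 5), (5, 7), (6, 8), (2, 4)],
    "high": [(4, 7), (6, 9), (3, 6), (5, 9)],
}

ss_x = ss_y = sp_xy = 0.0
for pairs in groups.values():
    mx = mean(x for x, _ in pairs)
    my = mean(y for _, y in pairs)
    ss_x += sum((x - mx) ** 2 for x, _ in pairs)          # within-group SS, covariate
    ss_y += sum((y - my) ** 2 for _, y in pairs)          # within-group SS, DV
    sp_xy += sum((x - mx) * (y - my) for x, y in pairs)   # within-group cross-products

# ANOVA residual: all within-group variability in the DV.
ss_res_anova = ss_y
# ANCOVA residual: what is left after the pooled within-group
# regression of the DV on the covariate is removed.
ss_res_ancova = ss_y - sp_xy ** 2 / ss_x

print(round(ss_res_anova, 2), round(ss_res_ancova, 2))
```

The adjusted residual can never be larger than the unadjusted one, but the covariate also costs one error df, which is why a weakly related covariate buys little.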
78
Use 2 Controlling for Other Variables
  • Are there differences among groups after
    controlling for another variable(s)?
  • If everyone scored the same on the covariate is
    there still a difference among the groups?
  • By controlling known confounds, we gain greater
    insight into the unique effects of our
    experimental IVs

79
Use 3 Explaining Differences
  • Suppose we were to compare men's and women's
    average faculty salary at NC State
  • Looking for a difference could involve an ANOVA
  • Trying to explain that difference could involve
    an ANCOVA

80
Use 3 Explaining Differences
  • As in regression, in explaining the difference
    between men and women, we want to account for
    certain variables
  • experience
  • rank
  • performance record
  • etc.
  • That is, we would like to show that a salary
    difference is due to performance variables. If
    it is not, then we open the possibility that
    there is a serious bias problem.
  • So, just like regression, we want to see if an IV
    remains important after controlling for another
    variable

81
ANCOVA Example Viagra
  • A researcher looked at the effects of zero, low,
    and high doses of Viagra on libido
  • The researcher found a significant main effect
    of dosage (the low and high doses produced
    higher libido)
  • What other factors might affect libido?
  • partner's libido
  • interacting medications
  • We could conduct the SAME experiment, but this
    time measure the partner's libido over the same
    period, following the dose of Viagra

82
Viagra study, controlling for partner's libido
  • The Viagra experiment
  • the DV was an objective measure of libido
  • What might we expect to happen?
  • (Dose response hypothesis)

83
Not controlling for partner libido
85
Only thing that is different is you put in the
COVARIATE!!
87
Notice how much more important the partner's
libido is than the Viagra dose!
88
Means before covariate
Means with covariate adjusted
NOW...notice that the means are all different.
Why? Because these are now covariate-adjusted
means. We see that, in the earlier analysis, we
underestimated the effect of low-dose Viagra and
overestimated the effect of high-dose Viagra.
The partner is very influential on this DV; not
controlling for the partner mis-estimates the
impact of the IV.
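The adjustment itself can be sketched in a few lines. The group means and the slope below are hypothetical numbers I made up to show the mechanics, since the study's actual values are not in the transcript. Equal group sizes are assumed so that the grand covariate mean is the plain average of the group covariate means.

```python
# HYPOTHETICAL values: (mean libido, mean partner libido) per dose group.
group_means = {"zero": (3.0, 2.5), "low": (4.5, 2.0), "high": (5.0, 4.5)}
# HYPOTHETICAL pooled within-group slope of libido on partner libido.
slope_within = 0.4

# Grand mean of the covariate (equal group sizes assumed).
grand_cov = sum(x for _, x in group_means.values()) / len(group_means)

# Adjusted mean: where each group would sit if it had the
# grand-mean level of the covariate.
adjusted = {
    dose: y - slope_within * (x - grand_cov)
    for dose, (y, x) in group_means.items()
}

for dose, (y, _) in group_means.items():
    print(dose, y, round(adjusted[dose], 2))
```

In this made-up example the low-dose group had below-average partner libido, so its adjusted mean rises, while the high-dose group had above-average partner libido, so its adjusted mean falls.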
89
Without covariate
With covariate
90
Without covariate
With covariate