ANOVA - PowerPoint PPT Presentation

Provided by: drmarkjk
Transcript and Presenter's Notes

Title: ANOVA


1
ANOVA
  • HED 489 Biostatistics

2
Randomized Design ANOVA
  • Unlike Gosset, the inventor of analysis of
    variance, Sir Ronald Fisher, has been inextricably
    linked to his invention. It is not by accident
    that the value calculated by ANOVA is called "F."
    In addition, again unlike Gosset, Fisher was a
    trained mathematician and worked as a
    statistician. About 1920, while working in
    agricultural research, Fisher developed analysis
    of variance. While Gosset was forced, by the
    nature of his brewery employer, to work with
    small samples, Fisher was under no such
    constraints. He was able to compare the effect of
    large numbers of various combinations of
    agricultural independent variables on his
    dependent variable, yield. All Fisher needed was
    farmland, of which England had plenty. What
    Fisher didn't have before he invented ANOVA was
    a way to compare mean yield among three or more
    different plots of land.
  • So, while there are many different types of
    analysis of variance, two things set it apart
    from the t-test: you can analyze the effect of
    more than one independent variable, and you can
    study more than two samples. For example, with a
    t-test, you could determine whether students
    learned more from Teacher A or Teacher B. But
    you couldn't determine whether they learned more
    from Teacher A using the chalkboard or an
    overhead projector, or from Teacher B using the
    chalkboard or an overhead projector. These are
    the kinds of studies that ANOVA was invented to
    analyze.
  • This first form of ANOVA is called "Randomized,"
    or "Completely Randomized." It assumes there is
    only one independent variable and, while it is
    usually shown being used with three or more
    groups, there is nothing to stop you from using
    it with two groups. Think of the Randomized ANOVA
    design as a large-sample t-test.
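
One way to see why the randomized design behaves like a t-test when there are only two groups: in that case the F ratio equals the square of the pooled-variance t statistic. The sketch below is only an illustration. The scores and the function names are made up for this example and do not come from the presentation.

```python
from statistics import mean

def pooled_t(a, b):
    """Independent-samples t statistic using the pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss_a = sum((x - ma) ** 2 for x in a)
    ss_b = sum((x - mb) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    return (ma - mb) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

def one_way_f(groups):
    """F ratio for a completely randomized (one-way) design."""
    scores = [x for g in groups for x in g]
    grand = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical test scores for students of Teacher A and Teacher B.
teacher_a = [72, 85, 78, 90, 66]
teacher_b = [80, 88, 95, 79, 91]

t = pooled_t(teacher_a, teacher_b)
f = one_way_f([teacher_a, teacher_b])
print(f"t squared = {t * t:.4f}, F = {f:.4f}")  # the two values agree
```

With three or more groups there is no single t to square, which is where the F ratio earns its keep.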

3
(No Transcript)
4
Factorial Design ANOVA
  • A factorial design analysis of variance allows
    you to compare the effect of more than one
    independent variable on your dependent variable
    or compare the effect of more than one level of
    an independent variable on your dependent variable.
    If you look at the table, notice that we have two
    independent variables: the teacher, and the method
    they use to teach. Note that each teacher will
    use each method. The six cells represent mean
    test scores for six different groups of students.
    Each group needs to have at least 20 students in
    order to use analysis of variance. It is
    important that you take particular note of how
    quickly the required number of students increases
    as you add independent variables or levels of
    independent variables. This has no statistical
    implications, but it sure makes it harder to
    conduct the study!
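
To get a feel for how fast the head count grows, here is a small sketch. It assumes a fully crossed design and the 20-students-per-cell rule of thumb stated above. The function name is hypothetical, and the 2 x 3 split of the six cells is an assumption, since the table itself is not in the transcript.

```python
from math import prod

def cells_and_students(levels_per_factor, students_per_cell=20):
    """Return (number of cells, total students) for a fully crossed design."""
    cells = prod(levels_per_factor)
    return cells, cells * students_per_cell

# One factor with two levels crossed with one factor with three levels
# gives six cells (which factor has three levels is assumed here).
print(cells_and_students([2, 3]))      # (6, 120)

# Adding a hypothetical third factor with two levels doubles the requirement.
print(cells_and_students([2, 3, 2]))   # (12, 240)
```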

5
  • Analysis of variance, or "ANOVA" as it's commonly
    called, is a parametric test of differences you
    will use when:
  • You have more than two groups or samples of data.
  • You have one or two large groups of data, say
    more than 25 per group.
  • You have more than one independent variable.
  • Your independent variable has more than two
    levels.
  • You probably understand the first three
    conditions but perhaps not the fourth. For
    instance, you could use a t-test to determine
    whether there is any difference in weight loss
    between people who exercise and those who don't.
    But, if you want to know whether there's any
    difference among those who exercise a little,
    those who exercise a lot, and those who don't
    exercise at all, you can't use a t-test. See why
    not? Because you have three groups of data, and
    you can't use a t-test with more than two groups.
    Enter ANOVA.

6
Types of ANOVAs
  • There are a LOT of different ways to use ANOVA.
    However, we're going to study only two
    types: Completely Randomized Design and
    Two-Factor Design. These are sufficiently
    complicated, and once you've learned them, you'll
    have a good handle on ANOVA. It doesn't matter if
    your sample includes three, four, five, or more
    groups. The data are arrayed like an extended
    t-test for two independent samples.

7
(No Transcript)
8
What ANOVA Does
  • Look at the two little groups of data at the
    right. Consider how these data can vary:
  • Within each group, the values can vary from each
    other, and they do: no two values in either group
    are the same.
  • Between the groups, the means of the groups can
    vary, and, again, in this example, they do.
  • And finally, in total, each value can vary from
    every other value in both groups.
  • The question is: are these variances sufficiently
    large that we can assume the differences did
    not occur by chance alone? That is, are they
    statistically significantly different? By
    calculating and comparing these three sources of
    variance (total, between, and within), we can make
    this determination.
  • If you're confused, let me try to help you. Look
    at the little table below. Assume it represents
    the number of right answers from a 10-item quiz
    for three groups of students who were taught
    statistics by three different methods.

9
  • Now, do you see any variance in these scores?
    (The correct answer is, "No, I don't see any
    variance in these scores.") There isn't any. Are
    there any differences in how well the three
    groups did? Of course not; they each did equally
    well, perfectly well, in fact. You don't have to
    be a rocket scientist to be able to figure out
    that there's no statistical difference among the
    three groups, right? You accept the null
    hypothesis: no difference in teaching method.
  • The chances of this happening are, obviously,
    very small. And, the greater the differences (that
    is, variance), the greater the chances of
    statistical significance. But, how much variance
    is needed to reach statistical significance?
    That's where ANOVA comes in.
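
The three sources of variance can be sketched in a few lines of Python. This is an illustration under assumptions: the group size of three and the "varied" scores are invented, and the function name is hypothetical. It shows the all-perfect case producing no variance at all, and a second case where the total splits exactly into a between part and a within part.

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (SS total, SS between, SS within) for a list of groups."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    # Total: every score against the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in scores)
    # Between: every group mean against the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within: every score against its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return ss_total, ss_between, ss_within

# Every student answers all 10 items correctly: no variance anywhere.
perfect = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
print(sums_of_squares(perfect))

# Hypothetical scores where both the group means and the individuals differ.
varied = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
total, between, within = sums_of_squares(varied)
print(total, between, within)  # total equals between plus within
```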

10
Degrees of Freedom
  • Before we get more information about ANOVA, we
    need to talk about degrees of freedom. Degrees
    of freedom (df) are important in determining
    whether there is a difference between groups.

11
Calculating Degrees of Freedom
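
The content of this slide did not survive in the transcript. As a hedged stand-in, the sketch below uses the standard rules for a one-way (completely randomized) design: groups minus one for between, total scores minus groups for within, and total scores minus one overall. The function name is hypothetical. The total of 48 scores is an assumption, chosen because four groups and 48 scores give the 3 and 44 degrees of freedom used in the worked example later in the presentation.

```python
def degrees_of_freedom(n_groups, n_total):
    """Degrees of freedom for a one-way (completely randomized) ANOVA."""
    df_between = n_groups - 1          # number of groups minus one
    df_within = n_total - n_groups     # total scores minus number of groups
    df_total = n_total - 1             # total scores minus one
    return df_between, df_within, df_total

# Four groups and 48 scores in total (48 is an illustrative assumption).
print(degrees_of_freedom(4, 48))  # (3, 44, 47)
```

Note that the between and within values always add up to the total.
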
12
The Formula
  • In just a second, you're going to see the ANOVA
    formula. It's pretty big, but it's not that bad.
    However, be sure you write down the notation or
    you'll be lost until you get familiar with it. An
    example or two:
  • If you see the summation sign followed by j, you
    would sum the scores in each group.
  • If you see the summation sign followed by ij, you
    would sum all the scores, across all groups.
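
In code, the two kinds of summation described above might look like this. The scores are invented for illustration; each inner list stands for one group j, and each value for one person i.

```python
# Hypothetical scores: each inner list is one group j, each value one person i.
groups = [
    [3, 4, 5],   # group 1
    [6, 7, 8],   # group 2
    [1, 2, 3],   # group 3
]

# Summation within each group: a separate sum for every group.
group_sums = [sum(g) for g in groups]

# Summation over ij: one sum over every score in every group.
grand_sum = sum(x for g in groups for x in g)

print(group_sums)  # [12, 21, 6]
print(grand_sum)   # 39
```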

14
The Formula
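
The formula on this slide is not in the transcript. One standard way to write the one-way ANOVA F ratio, consistent with the description in the notes that follow (a variance in the numerator, and one variance subtracted from another in the denominator), is sketched below. The symbols are assumptions for illustration: X_ij is the score of person i in group j, \bar{X}_j is the mean of group j, \bar{X} is the grand mean, n_j is the size of group j, k is the number of groups, and N is the total number of scores.

```latex
SS_{\text{total}} = \sum_{ij} \left( X_{ij} - \bar{X} \right)^2
\qquad
SS_{\text{between}} = \sum_{j} n_j \left( \bar{X}_j - \bar{X} \right)^2

F = \frac{SS_{\text{between}} / (k - 1)}
         {\left( SS_{\text{total}} - SS_{\text{between}} \right) / (N - k)}
```

The slide's own layout and notation may well have differed.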
15
  • I know the formula looks a little spooky, but
    look at the formula for standard deviation. Now,
    look at the ANOVA formula. Isn't it true that the
    numerator and the two major areas in the
    denominator are very similar to the standard
    deviation formula? What we're really doing is
    calculating three variances, subtracting one from
    the other in the denominator, and dividing it
    into the numerator.
  • There are three steps in calculating ANOVA, and
    these are shown to the left of the formula:
  • Because we do so much squaring and summing of
    values, the first step is to calculate "Sum of
    Squares" for the total, between-groups, and
    within-groups variance.
  • We then calculate the average or "Mean Square"
    variances by dividing each Sum of Squares by its
    respective degrees of freedom (essentially, by
    the number of values associated with each source
    of variance).
  • Finally, we divide the Mean Square Between Groups
    variance by the Mean Square Within Groups to
    calculate F. Because of the different values you
    calculate, you construct a "Summary Table" that
    shows Sum of Squares, Mean Squares, degrees of
    freedom, F (or F's if more than one), and the
    probability level or levels. From there, it's a
    simple matter to go to the F table and check for
    significance.
  • I'm going to treat ANOVA just like I did the
    t-test. I'm going to take you through each step,
    explaining what's being calculated along the way.
    I'll use the example in the book and repeat the
    values calculated at each step in the middle
    frame at the right. BUT REMEMBER: this can be
    done using Excel!
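
The three steps and the summary table can be sketched as follows. This is a minimal illustration, not the book's worked example: the data are invented, the function name is hypothetical, and the probability column is left out because it still has to be looked up in an F table.

```python
from statistics import mean

def anova_summary(groups):
    """Build the rows of a one-way ANOVA summary table."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    k, n = len(groups), len(scores)

    # Step 1: sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_total = ss_between + ss_within

    # Step 2: mean squares = sum of squares / degrees of freedom.
    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within

    # Step 3: the F ratio.
    f_ratio = ms_between / ms_within

    return [
        ("Between", ss_between, df_between, ms_between, f_ratio),
        ("Within", ss_within, df_within, ms_within, None),
        ("Total", ss_total, n - 1, None, None),
    ]

# Hypothetical scores for three groups (not the book's data).
data = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
table = anova_summary(data)

print(f"{'Source':<8}{'SS':>8}{'df':>4}{'MS':>8}{'F':>8}")
for source, ss, df, ms, f in table:
    ms_txt = f"{ms:8.2f}" if ms is not None else " " * 8
    f_txt = f"{f:8.2f}" if f is not None else " " * 8
    print(f"{source:<8}{ss:8.2f}{df:4d}{ms_txt}{f_txt}")
```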

16
Step 1
  • The example the book uses sets out to determine
    if different levels of shocking movies result in
    any difference in the number of mistakes students
    will make during a driving simulation.
  • The null hypothesis for this study is:
  • H0: There is no difference in the number of
    mistakes made during the driving simulation among
    different levels of shocking movies.
  • The alternative hypothesis is:
  • H1: There is a difference in the number of
    mistakes made during the driving simulation among
    different levels of shocking movies.

17
This particular driver education teacher wants to
see if watching a video with various levels of
shock will reduce mistakes on the driving
simulator. One film has no shock value; the
other three have varying levels of shock, with
the high shock showing graphic blood, accidents,
and dismembered body parts due to poor driving.
18
  • Click here to go through an Excel demonstration
    on how to determine the F in this case. Note the
    bottom has folders with the steps. Start with
    Step 1 and end with Step 13.

19
  • Go find the value associated with 3 and 44
    degrees of freedom. Note that this table doesn't
    have the value for 44 degrees of freedom. When
    this happens, use the next smaller degrees of
    freedom. This will result in larger critical
    values, and will minimize your risk of Type I
    error.
  • When you get to the cell that intersects df1 and
    df2, look at the value associated with the
    probability that you've selected. Usually, this
    will be .05. If the value you calculated is
    greater than the tabled value, you have found a
    statistically significant difference.
  • The reason tabled degrees of freedom are called 1
    and 2, rather than b and w, is that other,
    more complicated forms of ANOVA actually produce
    more than one F-ratio and, therefore, different
    pairs of degrees of freedom that have to be
    applied to the F-distribution table.
  • Finally, notice that although the example in
    Kuzma probably used a probability level of .05,
    you report the most stringent probability level
    at which the result is significant. The calculated
    F-ratio in this example is 14.71, and the largest
    critical value under df 3 and 44 is 6.60 at
    p = .001. So, while the calculated F is larger
    than the critical value at p = .05, it's also
    larger than the value at p = .001. Therefore, you
    report that your calculated ratio is significant
    at p < .001.
  • Before we forget, you would reject the null
    hypothesis, accept the alternative hypothesis,
    and conclude that different shock levels do,
    indeed, have an impact on the number of mistakes
    made during the driving simulation.
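
The table lookup described above can be sketched in Python. Treat this as an illustration under assumptions: the function names and the abbreviated list of tabled df rows are hypothetical, and of the three critical values only 6.60 appears in the presentation. The other two (2.84 and 4.31) are assumed here from a standard F table for 3 and 40 degrees of freedom.

```python
def tabled_df(df, available):
    """Pick the largest tabled df that does not exceed the one you need."""
    return max(d for d in available if d <= df)

def strictest_significance(f_calculated, critical_values):
    """Return the smallest p level whose critical value F exceeds, or None."""
    passed = [p for p, crit in critical_values.items() if f_calculated > crit]
    return min(passed) if passed else None

# Denominator df rows of a hypothetical, abbreviated F table.
df2_rows = [30, 40, 60, 120]
print(tabled_df(44, df2_rows))  # 40: the next smaller tabled value

# Critical values for df = (3, 40). Only 6.60 comes from the presentation;
# 2.84 and 4.31 are assumed from a standard F table.
critical = {0.05: 2.84, 0.01: 4.31, 0.001: 6.60}
print(strictest_significance(14.71, critical))  # 0.001
```

Because 14.71 exceeds even the .001 critical value, the result is reported as significant at p < .001.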

20
  • OK, now what idiot would do this by hand when we
    have computers? So, let's go through how we can
    do this via Excel.
  • Click here to go to that tutorial (if you are
    using Office 2007, click here).
  • After reviewing the above tutorial, go to next
    slide for assignment.

21
  • Click here to go to an Excel file with data.
    Calculate the results.