Testing Multiple Means and the Analysis of Variance - PowerPoint PPT Presentation

About This Presentation
Title:

Testing Multiple Means and the Analysis of Variance

Slides: 22
Provided by: KennethM162

Transcript and Presenter's Notes



1
Testing Multiple Means and the Analysis of
Variance
  • Situations where comparing more than two means is
    important.
  • The approach to testing equality of more than two
    means.
  • Introduction to the analysis of variance table,
    its construction and use.

2
Study Designs and Analysis Approaches
  • Simple random sample from a population with known
    $\sigma$, continuous response: one-sample z-test.
  • Simple random sample from a population with
    unknown $\sigma$, continuous response: one-sample t-test.
  • Simple random samples from 2 populations with known
    $\sigma$: two-sample z-test.
  • Simple random samples from 2 populations with unknown
    $\sigma$: two-sample t-test.

3
Sampling Study with t > 2 Populations
One sample is drawn independently and randomly
from each of $t > 2$ populations.
Objective: to compare the means of the t
populations for statistically significant
differences in responses.
Initially we will assume all populations have a
common variance; later, we will test whether
this is indeed true (homogeneity of variance
tests).
4
Sampling Study
[Diagram: a random sample is drawn from each of three populations (Vegetarians, Meat & Potato Eaters, Health Eaters), and cholesterol levels are measured in each sample.]
5
Experimental Study with t > 2 Treatments
Experimental units: t samples (of sizes $n_1, \ldots, n_t$) are independently
and randomly drawn from one population. Because
of this, we can safely assume that each sample
has the same mean and variance.
Separate treatments are applied to each sample.
A treatment is something done to the experimental
units which would be expected to change the
distribution (usually only the mean) of the
response(s).
6
Experimental Study
[Diagram: Male College Undergraduate Students are split by random sampling into three sets of experimental units, which receive the Veg. Diet, the M & P Diet, and the Health Diet; responses are then measured.]
7
Hypothesis
Let $\mu_i$ be the true mean of treatment group i (or
population i).
$H_0: \mu_1 = \mu_2 = \cdots = \mu_t$
Hence we are interested in whether all the groups
(populations) have exactly the same true means.
$H_A$: not all of the $\mu_i$ are equal.
The alternative is that some of the groups
(populations) differ from the others in their
means.
8
A Simple Model
Let $y_{ij}$ be the response for experimental unit j
in group i, with $i = 1, 2, \ldots, t$ and $j = 1, 2, \ldots, n_i$.
$E(y_{ij}) = \mu_i$
Another way of saying that we expect the group
mean to be $\mu_i$.
Let $\epsilon_{ij} = y_{ij} - \mu_i$ be the residual/deviation
from the group mean.
Each population has normally distributed
responses around its own mean, but the
variances are the same across all populations.
Assuming $y_{ij} \sim N(\mu_i, \sigma^2)$, then $\epsilon_{ij} \sim N(0, \sigma^2)$.
If H0 holds, $y_{ij} = \mu_0 + \epsilon_{ij}$, where $\mu_0$ is the common mean; that is, all groups
have the same mean and variance.
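The model can be made concrete with a small simulation. The sketch below draws responses from $N(\mu_i, \sigma^2)$ for each group and checks that the residuals $y_{ij} - \mu_i$ centre on zero with spread close to $\sigma$. The group means, the value of sigma, the sample size, and the helper name `simulate_groups` are hypothetical choices for illustration only, not values from the slides.

```python
import random
import statistics

def simulate_groups(means, sigma, n, seed=0):
    """Draw n responses per group from N(mu_i, sigma^2)."""
    rng = random.Random(seed)
    return [[rng.gauss(mu, sigma) for _ in range(n)] for mu in means]

# Hypothetical values, chosen only for illustration.
group_means = [10.0, 12.0, 15.0]   # mu_1, mu_2, mu_3
sigma = 2.0                        # common standard deviation
groups = simulate_groups(group_means, sigma, n=200)

for mu, y in zip(group_means, groups):
    residuals = [y_ij - mu for y_ij in y]   # epsilon_ij = y_ij - mu_i
    print(f"mu={mu:.1f}  sample mean={statistics.mean(y):.3f}  "
          f"mean residual={statistics.mean(residuals):.3f}  "
          f"residual sd={statistics.stdev(residuals):.3f}")
```

Setting all entries of `group_means` to the same value simulates data under H0.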
9
A Naïve Testing Approach
Test each possible pair of groups by performing
all pair-wise t-tests. With three groups, that means three tests.
  • Assume each test is performed at the $\alpha = 0.05$
    level.
  • The probability of not rejecting H0 when H0 is
    true is 0.95 ($1 - \alpha$).
  • The probability of not rejecting H0 when H0 is
    true for all three tests is $(0.95)^3 = 0.857$,
    treating the tests as independent.
  • Thus the true significance level for the overall
    test of no difference in the means will be
    $1 - 0.857 = 0.143$, NOT the $\alpha = 0.05$ level we thought
    it would be.
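The same arithmetic extends to any number of groups. The following is a minimal sketch, assuming (as the slide's calculation does) that the pairwise tests can be treated as independent, which is only an approximation. The function name `familywise_error_rate` is made up for this example.

```python
from math import comb

def familywise_error_rate(t, alpha=0.05):
    """Chance of at least one false rejection across all pairwise tests,
    treating the tests as independent (an approximation)."""
    k = comb(t, 2)                  # number of pairwise comparisons among t groups
    return 1 - (1 - alpha) ** k

for t in (3, 4, 5, 10):
    print(t, comb(t, 2), round(familywise_error_rate(t), 3))
```

For t = 3 this reproduces the 0.143 figure above, and the inflation grows quickly as more groups are compared.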

In each individual t-test, only part of the
information available to estimate the underlying
variance is actually used. This is inefficient.
WE CAN DO MUCH BETTER!
10
Testing Approaches - Analysis of Variance
The term analysis of variance comes from the
fact that this approach compares the variability
observed among sample means to a pooled estimate
of the variability among observations within each
group.
11
Extreme Situations
12
Pooled Variance
From the two-sample t-test with assumed equal
variance, $\sigma^2$, we produced a pooled
(within-group) sample variance estimate.
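For reference, the standard two-sample pooled estimate and its natural extension to t groups can be written as below. The slide's own formula did not survive in this transcript, so the notation is an assumption: $s_i^2$ is the sample variance of group i, $n_i$ its size, and $s_W^2$ the pooled within-group estimate.

```latex
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
s_W^2 = \frac{\sum_{i=1}^{t} (n_i - 1)\,s_i^2}{\sum_{i=1}^{t} n_i - t}
```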
13
Variance among Group Means
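Consider the variance among the group means
computed as
$s_{\bar{y}}^2 = \frac{1}{t-1} \sum_{i=1}^{t} (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$
where $\bar{y}_{i\cdot}$ is the mean of group i and $\bar{y}_{\cdot\cdot}$ is the overall mean.
If we assume each group is of the same size, say
n, then under H0, $s_{\bar{y}}^2$ is an estimate of $\sigma^2/n$.
Hence, n times $s_{\bar{y}}^2$ is an estimate of $\sigma^2$. When the
sample sizes are unequal, the estimate is given
by
$s_B^2 = \frac{1}{t-1} \sum_{i=1}^{t} n_i (\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$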
14
F-test
Now we have two estimates of $\sigma^2$: the between-group
estimate $s_B^2$ and the pooled within-group estimate $s_W^2$.
An F-test can be used to determine if the two statistics are
equal. Note that if the groups truly have
different means, $s_B^2$ will tend to be greater than $s_W^2$.
Hence the F-statistic is written as
$F = s_B^2 / s_W^2$
If H0 holds, the computed F-statistic should be
close to 1. If HA holds, the computed F-statistic
should be much greater than 1. We use the
appropriate critical value from the F-table
(with $t - 1$ and $N - t$ degrees of freedom, where N is
the total number of observations) to help make this decision.
Hence, the F-test is really a test of equality of
means under the assumption of normal populations
and homogeneous variances.
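As a rough check, the two variance estimates and their ratio can be computed from group summary statistics alone. The sketch below uses the group sizes, means, and standard deviations reported in the Minitab output later in this presentation. Those standard deviations are rounded, so the resulting F only approximately matches the reported 5545.45. The variable names are chosen for this example.

```python
# Summary statistics as reported in the Minitab output of this presentation.
n = [5, 5, 5]
means = [5.9, 5.5, 5.0]
sds = [0.0158, 0.0071, 0.0158]   # rounded, so results are approximate

t = len(n)                       # number of groups
N = sum(n)                       # total number of observations
grand_mean = sum(ni * m for ni, m in zip(n, means)) / N

# Between-group estimate of sigma^2
s_b2 = sum(ni * (m - grand_mean) ** 2 for ni, m in zip(n, means)) / (t - 1)
# Pooled within-group estimate of sigma^2
s_w2 = sum((ni - 1) * s ** 2 for ni, s in zip(n, sds)) / (N - t)

F = s_b2 / s_w2
print(f"s_B^2 = {s_b2:.6f}, s_W^2 = {s_w2:.8f}, F = {F:.1f}")
```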
15
Partition of Sums of Squares
TSS = SSB + SSW
TSS: Total Sums of Squares
SSB: Sums of Squares Between Means
SSW: Sums of Squares Within Groups
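The partition can be verified numerically on any data set. A minimal sketch follows, using made-up raw data (the three groups below are hypothetical and deliberately of unequal size) to show that the total sum of squares splits exactly into the between and within parts.

```python
# Hypothetical raw data: three groups, made up for illustration only.
groups = [
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0, 8.0],
    [1.0, 2.0, 3.0],
]

all_obs = [y for g in groups for y in g]
grand_mean = sum(all_obs) / len(all_obs)

# Total: squared deviations of every observation from the grand mean
tss = sum((y - grand_mean) ** 2 for y in all_obs)
# Between: squared deviations of group means from the grand mean, weighted by n_i
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# Within: squared deviations of observations from their own group mean
ssw = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

print(f"TSS = {tss:.4f}, SSB + SSW = {ssb + ssw:.4f}")
```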


16
The AOV (Analysis of Variance) Table
The computations needed to perform the F-test for
equality of means are organized into a table.
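The table itself is not preserved in this transcript. As a sketch of its usual layout, the code below builds the rows of a one-way AOV table from the two sums of squares and the group counts. The input values are the sums of squares reported in the SAS output later in this presentation. The function name `aov_table` and the row labels are choices made for this example.

```python
def aov_table(ssb, ssw, t, N):
    """Return rows (source, df, SS, MS, F) of a one-way AOV table."""
    df_b, df_w = t - 1, N - t
    msb, msw = ssb / df_b, ssw / df_w       # mean square = SS / df
    return [
        ("Between", df_b, ssb, msb, msb / msw),
        ("Within", df_w, ssw, msw, None),
        ("Total", N - 1, ssb + ssw, None, None),
    ]

# Sums of squares as reported in the SAS output of this presentation.
rows = aov_table(ssb=2.03333333, ssw=0.00220000, t=3, N=15)

print(f"{'Source':<8}{'DF':>4}{'SS':>14}{'MS':>14}{'F':>10}")
for source, df, ss, ms, f in rows:
    ms_txt = f"{ms:.8f}" if ms is not None else ""
    f_txt = f"{f:.2f}" if f is not None else ""
    print(f"{source:<8}{df:>4}{ss:>14.8f}{ms_txt:>14}{f_txt:>10}")
```

The degrees of freedom add up in the same way as the sums of squares: (t - 1) + (N - t) = N - 1.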
17
Example-Excel
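=AVERAGE(B6:B10)                 group mean
=VAR(B6:B10)                     group variance
=SQRT(B13)                       group standard deviation
=COUNT(B6:B10)                   group size
=(B15-1)*B13                     within-group sum of squares for one group
=(SUM(B15:D15)-1)*VAR(B6:D10)    total sum of squares (TSS)
=SUM(B16:D16)                    sum of squares within groups (SSW)
=B18-B19                         SSB = TSS - SSW, assuming B18 holds TSS and B19 holds SSW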
18
Excel Analysis ToolPak
19
Example SAS
proc anova;
  class popn;
  model resp = popn;
  title 'Table 13.1 in Ott - Analysis of Variance';
run;


Table 13.1 in Ott - Analysis of Variance

Analysis of Variance Procedure

Dependent Variable: RESP

                           Sum of         Mean
Source            DF      Squares       Square    F Value    Pr > F
Model              2   2.03333333   1.01666667    5545.45    0.0001
Error             12   0.00220000   0.00018333
Corrected Total   14   2.03553333

R-Square        C.V.    Root MSE    RESP Mean
0.998919    0.247684    0.013540     5.466667

Source            DF     Anova SS  Mean Square    F Value    Pr > F
POPN               2   2.03333333   1.01666667    5545.45    0.0001

20
GLM in SAS


General Linear Models Procedure

Dependent Variable: RESP

                           Sum of         Mean
Source            DF      Squares       Square    F Value    Pr > F
Model              2   2.03333333   1.01666667    5545.45    0.0001
Error             12   0.00220000   0.00018333
Corrected Total   14   2.03553333

R-Square        C.V.    Root MSE    RESP Mean
0.998919    0.247684    0.013540     5.466667

Source            DF    Type I SS  Mean Square    F Value    Pr > F
POPN               2   2.03333333   1.01666667    5545.45    0.0001

Source            DF  Type III SS  Mean Square    F Value    Pr > F
POPN               2   2.03333333   1.01666667    5545.45    0.0001

                               T for H0:    Pr > |T|    Std Error of
Parameter    Estimate        Parameter=0                    Estimate
INTERCEPT    5.000000000 B        825.72      0.0001      0.00605530
POPN      1  0.900000000 B        105.10      0.0001      0.00856349
          2  0.500000000 B         58.39      0.0001      0.00856349
          3  0.000000000 B             .           .               .

NOTE: The X'X matrix has been found to be singular and a
generalized inverse was used to solve the normal equations.
Estimates followed by the letter 'B' are biased, and are not
unique estimators of the parameters.
proc glm;
  class popn;
  model resp = popn / solution;
  title 'Table 13.1 in Ott';
run;
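The parameter estimates in the GLM output can be read as a reference-level coding. In the output above the last level (POPN 3) is set to zero, so the intercept is that group's mean and the other estimates are differences from it. The sketch below reproduces the estimates, standard errors, and t values from the group means (Minitab output), the error sum of squares, and its degrees of freedom (SAS output). It is a check on the reported numbers under those assumptions, not a description of how SAS computes them internally.

```python
from math import sqrt

# Values as reported in the SAS and Minitab output of this presentation.
means = {1: 5.9, 2: 5.5, 3: 5.0}   # group means by POPN level
n = 5                              # observations per group
mse = 0.0022 / 12                  # Error SS / Error DF

ref = max(means)                   # last level is the reference (estimate 0)
intercept = means[ref]
effects = {lvl: m - intercept for lvl, m in means.items()}

se_intercept = sqrt(mse / n)       # SE of a single group mean
se_effect = sqrt(2 * mse / n)      # SE of a difference between two group means

print(f"INTERCEPT {intercept:.1f}  SE {se_intercept:.8f}  "
      f"t {intercept / se_intercept:.2f}")
for lvl in sorted(means):
    if lvl != ref:
        print(f"POPN {lvl}  {effects[lvl]:.1f}  SE {se_effect:.8f}  "
              f"t {effects[lvl] / se_effect:.2f}")
```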
21
Minitab Example
STAT > ANOVA > One-Way (Unstacked)

One-way Analysis of Variance

Analysis of Variance
Source    DF         SS         MS         F        P
Factor     2   2.033333   1.016667   5545.45    0.000
Error     12   0.002200   0.000183
Total     14   2.035533

Level     N      Mean     StDev
EG1       5    5.9000    0.0158
EG2       5    5.5000    0.0071
EG3       5    5.0000    0.0158

Pooled StDev = 0.0135

[Character plot: Individual 95% CIs For Mean, Based on Pooled StDev; axis ticks at 5.10, 5.40, 5.70, 6.00]