CEP 933: Planned orthogonal comparisons

Provided by: Bet32
Learn more at: https://www.msu.edu

1
CEP 933 Planned orthogonal comparisons
To test POCs we first construct a contrast
and its standard error. Recall our notation for
the contrast is $L = \sum_j c_j \bar{X}_j$. We also
need $SE(L) = \sqrt{MS_W \sum_j (c_j^2 / n_j)}$, which is the
standard error of a contrast. This standard
error is determined by the coefficients used in
the contrast, and by the sample sizes. Note the
similarity of this formula to the formula for the
standard error of a mean or of the difference
between two means.
2
CEP 933 Planned orthogonal comparisons
To do the planned comparisons we can use either a
critical t (tc) (from the "real" t table) or a
critical F to do the test. We would use either a
$t_c$ with $df_{within}$ degrees of freedom or $F_c(1, df_{within})$
as the critical value (if we use F we
must square the contrast sample t value). To
use a t test we compute $t = L / SE(L)$ and we
compare this to a critical t.
3
CEP 933 Planned orthogonal comparisons
Squaring the contrast t value gives us an F test
that we can put into the anova table, and we can
obtain the appropriate SS and MS terms as well
(which are equal since each contrast test has
only 1 degree of freedom). Or we can get F
by computing sums of squares. We get $SS_{contrast} = MS_{contrast}$
by dividing the squared contrast $L^2$ by
the sum of weight terms used in the standard
error of the contrast.
4
CEP 933 Planned orthogonal comparisons
Thus $SS_{contrast} = L^2 / \sum_j (c_j^2 / n_j)$. Since
there are k-1 contrasts there will be k-1 of
these "contrast sums of squares". Each contrast
has one degree of freedom. The values of the k-1
contrast sums of squares should sum to
$SS_{Between}$. If we use the t-test approach to do
contrasts for our data, we find that the critical
t would have 45 degrees of freedom (N - k = 50 - 5), so if we want
a test at the .05 level, we'd use t = 2.01
(approximately). Let's make the contrasts and the
test statistics.
5
CEP 933 Planned orthogonal comparisons
A: Interactive versus standard video (1 vs. 3)
$L_A = \bar{X}_1 - \bar{X}_3 = 48.7 - 47.2 = 1.5$
$SE(L_A) = \sqrt{MS_W \sum_j (c_j^2 / n_j)}$
$SE(L_A) = \sqrt{28.8 \times [1/10 + 0/10 + (-1)^2/10 + 0/10 + 0/10]} = 2.4$
(In each term the squared coefficient $c_j^2$ is on top and the $n_j$ is below.)
So $t_A = 1.5/2.4 = 0.62$. This is not
significant because it is less than the critical
t = 2.01. The data are consistent with the two types of
video presentation having the same average achievement
in the population of students. So we retain $H_0: \mu_1 = \mu_3$.
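The arithmetic for contrast A can be checked with a short Python sketch. The function name `contrast_test` and the variable names are illustrative assumptions, not part of the course materials; the five group means, the group size of 10, and MSW = 28.8 are the values used in these slides. The sketch also confirms that the contrast F (SScontrast / MSW) equals the squared t.

```python
import math

def contrast_test(means, weights, ns, ms_within):
    """Return (L, SE(L), t) for a contrast among group means."""
    L = sum(c * m for c, m in zip(weights, means))
    se = math.sqrt(ms_within * sum(c ** 2 / n for c, n in zip(weights, ns)))
    return L, se, L / se

# Values taken from the slides: groups 1..5, n = 10 per group, MSW = 28.8
means = [48.7, 43.4, 47.2, 36.7, 40.3]
ns = [10] * 5
ms_within = 28.8
weights_a = [1, 0, -1, 0, 0]  # contrast A: group 1 vs. group 3

L_a, se_a, t_a = contrast_test(means, weights_a, ns, ms_within)

# SScontrast = L^2 / sum(c_j^2 / n_j); with 1 df this is also MScontrast
ss_a = L_a ** 2 / sum(c ** 2 / n for c, n in zip(weights_a, ns))
f_a = ss_a / ms_within  # same test as t_a squared

print(f"L = {L_a:.2f}, SE = {se_a:.2f}, t = {t_a:.3f}, SS = {ss_a:.2f}, F = {f_a:.3f}")
```

Unrounded, t is 0.625, which the slides report as 0.62.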
6
CEP 933 Planned orthogonal comparisons
B: Video (I or S) versus all others (1 and 3 vs. rest)
$L_B = (\bar{X}_1 + \bar{X}_3)/2 - (\bar{X}_2 + \bar{X}_4 + \bar{X}_5)/3$
$= (48.7 + 47.2)/2 - (43.4 + 36.7 + 40.3)/3 = 47.95 - 40.13 = 7.82$
$SE(L_B) = \sqrt{28.8 \times [0.5^2/10 + 0.33^2/10 + 0.5^2/10 + 0.33^2/10 + 0.33^2/10]}$
So $t_B = 7.82 / \sqrt{28.8 \times 0.083} = 7.82/1.54 = 5.08$.
This is significant because 5.08 is greater
than $t_c$ = 2.01. We reject the idea that the
population mean of the video groups is equal to
the mean of the other groups combined. That is,
we reject $H_0: (\mu_1 + \mu_3)/2 = (\mu_2 + \mu_4 + \mu_5)/3$.
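As a rough check on contrast B, the sketch below recomputes it with unrounded weights (1/2 and -1/3) and also verifies that A and B are orthogonal, meaning the sum of $c_{Aj} c_{Bj} / n_j$ is zero. The helper name `is_orthogonal` is an assumption made for illustration; the data are the values from the slides.

```python
import math

means = [48.7, 43.4, 47.2, 36.7, 40.3]   # groups 1..5, from the slides
ns = [10, 10, 10, 10, 10]
ms_within = 28.8

weights_a = [1, 0, -1, 0, 0]                  # contrast A: 1 vs. 3
weights_b = [1/2, -1/3, 1/2, -1/3, -1/3]      # contrast B: video vs. rest

def is_orthogonal(c1, c2, ns, tol=1e-12):
    """Two contrasts are orthogonal when sum(c1_j * c2_j / n_j) is zero."""
    return abs(sum(a * b / n for a, b, n in zip(c1, c2, ns))) < tol

L_b = sum(c * m for c, m in zip(weights_b, means))
se_b = math.sqrt(ms_within * sum(c ** 2 / n for c, n in zip(weights_b, ns)))
t_b = L_b / se_b

print(f"orthogonal: {is_orthogonal(weights_a, weights_b, ns)}")
print(f"L_B = {L_b:.2f}, SE = {se_b:.2f}, t = {t_b:.2f}")
```

With unrounded weights t comes out near 5.05 rather than the 5.08 shown on the slide, which uses 0.33 for the weights and rounded intermediate values. The conclusion is unchanged, since both exceed 2.01.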
7
CEP 933 Trend tests
Trend tests are a special case of planned
comparisons. However, to use trend tests you
must have a factor that is "quantitative," such
as amount of time allowed for doing a task, age
of subjects, amount of a drug, etc.
Furthermore, the treatment levels must be "evenly
spaced" for the trend tests to work properly.
We could not use simple trend tests to look at
a trend across three groups given 10, 50 and 100
minutes to do a task, but we could look at a
trend across 3 groups given 10, 20, and 30
minutes. When we do trend tests we have one big
benefit: someone has saved us the trouble of
determining the contrast weights for these tests!
See Appendix Polynomial in our book.
8
CEP 933 Trend tests
Suppose we have 3 groups given 10, 20, and 30
minutes. The contrast weights for the 3-group
case allow us to examine linear and quadratic trends.
The weights ($c_j$) are

Group       10 m   20 m   30 m
Linear       -1      0      1
Quadratic     1     -2      1

If we plot these weights as Y with the minutes
allocated to the groups as our X variable we can
see the trend that is tested.
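A small sketch can make the trend weights concrete. It checks that each set of weights sums to zero and that the two sets are orthogonal, then applies them to a set of made-up group means that rise by a constant amount. The means `[5.0, 10.0, 15.0]` are purely hypothetical and are not data from the course.

```python
minutes = [10, 20, 30]
linear = [-1, 0, 1]
quadratic = [1, -2, 1]

def contrast(weights, means):
    """Value of the contrast L = sum of c_j * mean_j."""
    return sum(c * m for c, m in zip(weights, means))

# Each set of weights sums to zero, and the two sets are orthogonal
sum_linear = sum(linear)
sum_quadratic = sum(quadratic)
cross_product = sum(a * b for a, b in zip(linear, quadratic))

# Hypothetical means that increase by the same amount at each step
straight_line_means = [5.0, 10.0, 15.0]
L_linear = contrast(linear, straight_line_means)        # picks up the slope
L_quadratic = contrast(quadratic, straight_line_means)  # zero: no curvature

print(sum_linear, sum_quadratic, cross_product)
print(L_linear, L_quadratic)
```

For perfectly straight-line means the linear contrast is nonzero and the quadratic contrast is exactly zero, which is the pattern each set of weights is designed to detect.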
9
CEP 933 Trend tests
[Figure: linear (-1, 0, 1) and quadratic (1, -2, 1) contrast weights
plotted on the Y axis against the 10 m, 20 m and 30 m groups on the X axis]
10
CEP 933 Error rates
Planned comparisons have a per-contrast error
rate, and most researchers simply use the rate $\alpha$
(say $\alpha = .05$) for each contrast, because we usually
don't do a lot of planned comparisons. If we
want to do a large number of such tests, we may
want to reduce our error rate by setting the rate
for each contrast to a fraction of the target
rate $\alpha$, say, $\alpha/c$ where c is the number of tests
we are doing. So if we want to keep the error
rate to .05 and we have 3 tests, we would use
$\alpha = .05/3 = .017$ as the significance level for each
test. This is called a Bonferroni correction to
the error rate. Our book does the work for us
and p. 751 has a table of values (called t) that
will give us experiment-wise levels of .05 and
.01 for a variety of numbers of contrasts.
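The Bonferroni arithmetic is simple enough to sketch in a few lines. The function name `bonferroni_alpha` is an illustrative assumption; the inputs (.05 and 3 tests) are the ones used in the slide.

```python
def bonferroni_alpha(target_alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return target_alpha / n_tests

per_test = bonferroni_alpha(0.05, 3)

# Summing the per-test rates gives back the target rate, which is
# the upper bound on the overall error rate that Bonferroni relies on.
overall_bound = 3 * per_test

print(round(per_test, 3))       # 0.017
print(round(overall_bound, 2))  # 0.05
```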
11
CEP 933 Post hoc tests
Tukey and Newman-Keuls tests of all pairs of means
Tukey and Newman-Keuls tests use a
critical value from the Studentized range
statistic $q_r$ to create a "critical mean
difference." The value of $q_r$ itself is similar
to a t test, but we use the critical mean
difference to do our tests so that we can save
the work of computing all of the test statistics
for the k(k-1)/2 pairs of mean differences that
we will test. Also, q depends on r, the number
of means within the set we are testing (or
technically, the number of steps between the
means being compared; we'll hear more on this
later).
12
CEP 933 Post hoc tests Tukey and Newman-Keuls
The difference between the Tukey and Newman-Keuls
procedures is in the values of r for qr that are
used. More specifically, the degrees of freedom
used to select the critical q values differ, as
we will see below. A word about notation:
we will use the term $n_{cell}$ to mean the
size of a group in the anova. If the group
sizes $n_j$ are unequal, we will need to compute an
average value of $n_j$ to use in these techniques.
We usually do NOT use the ordinary arithmetic
mean. Instead we use the harmonic mean (its
formula is given below).
13
CEP 933 Post hoc example
EXAMPLE
This example goes through the steps
needed to conduct these two kinds of post hoc
test. First are some general directions, then
the directions for the Tukey and Newman-Keuls
tests follow.
Steps in the comparison of means
1. First we order the sample means from
smallest to largest and make a table of
differences between all means as follows.

         (Smallest)                  (Largest)
Id (j)     4      5      2      3      1
Mean     36.7   40.3   43.4   47.2   48.7
14
CEP 933 Post hoc example
         (Smallest)                  (Largest)
Id (j)     4      5      2      3      1
Mean     36.7   40.3   43.4   47.2   48.7

Id   Mean      5      2      3      1
4    36.7    3.6    6.7   10.5   12.0
5    40.3           3.1    6.9    8.4
2    43.4                  3.8    5.3
3    47.2                         1.5
1    48.7

Each entry is the column mean minus the row mean; for example, 40.3 - 36.7 = 3.6.
Note that each of
the differences in the table represents a
pairwise hypothesis about population means for
the groups in the study.
15
CEP 933 Post hoc example
Can you fill in the 5 hypotheses that are missing
from the table below?
Table of tested hypotheses

         (Smallest mean)                                  (Largest mean)
Id (j)   4     5               2               3               1
4              $\mu_4 = \mu_5$   $\mu_4 = \mu_2$   $\mu_4 = \mu_3$   $\mu_4 = \mu_1$
5                              ??              ??              ??
2                                              ??              ??
3                                                              $\mu_3 = \mu_1$

Note that
each of the differences in the table represents a
pairwise hypothesis about population means for
the groups in the study.
16
CEP 933 Post hoc example
2. Compute the critical mean difference for the
Tukey test.
Critical mean difference = $q_r \sqrt{MS_W / n_{cell}}$
where the degrees of freedom for
$q_c(r, df_e)$ are r = the number of groups and $df_e$ =
"df within" from the anova table. If the number
of cases differs across cells (or groups) then we
may want to use the harmonic mean. The formula
is
harmonic mean = $k / (1/n_1 + 1/n_2 + \dots + 1/n_k)$
where k is the number of groups and $n_1$ through
$n_k$ are the group (cell) sample sizes. We would
use the harmonic mean instead of $n_{cell}$ in the formula for
the critical mean difference.
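The two formulas on this slide can be sketched as follows. The function name `critical_mean_difference` and the unequal group sizes `[8, 10, 12]` are hypothetical choices made for illustration. The critical q has to be looked up in a Studentized range table; 4.9 is the value these slides quote for r = 5 and 45 df.

```python
import math
from statistics import harmonic_mean

def critical_mean_difference(q_crit, ms_within, n_cell):
    """Critical mean difference = q * sqrt(MSW / n_cell)."""
    return q_crit * math.sqrt(ms_within / n_cell)

# Hypothetical unequal group sizes, only to show the harmonic mean
unequal_ns = [8, 10, 12]
n_harmonic = harmonic_mean(unequal_ns)
n_by_formula = len(unequal_ns) / sum(1 / n for n in unequal_ns)

# Equal group sizes: the harmonic mean is just the common n
n_equal = harmonic_mean([10] * 5)

# Values quoted in the slides: q = 4.9, MSW = 28.8, n_cell = 10
cmd = critical_mean_difference(4.9, 28.8, 10)

print(round(n_harmonic, 2), round(n_by_formula, 2), n_equal)
print(round(cmd, 2))
```

The harmonic mean of unequal sizes is always a little below the arithmetic mean (here about 9.73 versus 10), so it gives a slightly larger, more conservative critical mean difference. Computed without intermediate rounding the critical mean difference is about 8.32; the 8.33 on the slides appears to come from rounding the square root to 1.70.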
17
CEP 933 Post hoc example
2. Compute the critical mean difference. Here we
do it for the Tukey test.
We will use the .01
level of significance for this illustration.
Here $q_c(5, 45)$ is about 4.9 (from table
2 of Appendix q). So the critical mean difference
= $q_r \sqrt{MS_W / n_{cell}}$. Thus
Crit mean diff = $4.9 \times (28.8/10)^{1/2} = 8.33$
(This is where the Tukey and Newman-Keuls tests differ.)
18
CEP 933 Post hoc example
3. Compare each sample mean difference in the
table above to the critical mean difference of
8.33 calculated in 2.

Id (j)     4      5      2      3      1
Mean     36.7   40.3   43.4   47.2   48.7

Id   Mean      5      2      3      1
4    36.7    3.6    6.7   10.5   12.0
5    40.3           3.1    6.9    8.4
2    43.4                  3.8    5.3
3    47.2                         1.5

When the sample mean
difference is larger than 8.33 we reject the
individual null hypothesis represented by the
pair of sample means. We reject three of the
hypotheses.
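The comparison in step 3 can be sketched in Python. The variable names are illustrative assumptions; the means and the critical mean difference of 8.33 are the values from the slides.

```python
from itertools import combinations

# Group id -> sample mean, from the slides
means = {4: 36.7, 5: 40.3, 2: 43.4, 3: 47.2, 1: 48.7}
cmd = 8.33  # Tukey critical mean difference from step 2

# Order groups from smallest to largest mean, then test every pair
ordered = sorted(means.items(), key=lambda item: item[1])
rejected = []
for (id_low, mean_low), (id_high, mean_high) in combinations(ordered, 2):
    diff = round(mean_high - mean_low, 1)
    if diff > cmd:
        rejected.append((id_low, id_high, diff))

for id_low, id_high, diff in rejected:
    print(f"reject H0: mu{id_low} = mu{id_high} (difference {diff})")
```

Three pairs exceed 8.33: groups 4 and 3, groups 4 and 1, and groups 5 and 1.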
19
CEP 933 Post hoc example
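3. The individual rejected null hypotheses are
shown here.

Id (j)   4     5               2               3               1
4              $\mu_4 = \mu_5$   $\mu_4 = \mu_2$   $\mu_4 = \mu_3$   $\mu_4 = \mu_1$
5                              $\mu_5 = \mu_2$   $\mu_5 = \mu_3$   $\mu_5 = \mu_1$
2                                              $\mu_2 = \mu_3$   $\mu_2 = \mu_1$
3                                                              $\mu_3 = \mu_1$

The rejected hypotheses are $\mu_4 = \mu_3$, $\mu_4 = \mu_1$ and $\mu_5 = \mu_1$
(the differences larger than 8.33).
When a difference is not
significant we can draw a line under the two
means to represent the groups of equal means.

m4    m5    m2    m3    m1
--------------
      --------------
            --------------
                  --------

The last two sets fully overlap so they can be combined.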
20
CEP 933 Post hoc example
4. Next we compute the critical mean differences
for the Newman-Keuls tests. Again each critical
mean difference = $q_r \sqrt{MS_W / n_{cell}}$, but the
degrees of freedom for $q_r(r, df_e)$ are r = the
number of means in the "set" being compared (this
will vary from r = k, the number of groups, to r = 2),
and $df_e$ = "df within" from the anova table.
There are always k-1 critical mean differences
using the N-K method. The first is the same as
for Tukey's test:
$q_r(5, 45) = 4.90$   Crit mean diff = $4.90 \sqrt{28.8/10} = 8.33$
21
CEP 933 Post hoc example
Here again is the first one:
$q_r(5, 45) = 4.90$   Crit mean diff = $4.90 \sqrt{28.8/10} = 8.33$
The others are always smaller:
$q_r(4, 45) = 4.68$   Crit mean diff = $4.68 \sqrt{28.8/10} = 7.96$
$q_r(3, 45) = 4.35$   Crit mean diff = $4.35 \sqrt{28.8/10} = 7.40$
$q_r(2, 45) = 3.80$   Crit mean diff = $3.80 \sqrt{28.8/10} = 6.46$
So with N-K
you don't need to get as large a mean difference
to find a significant difference when the means
are closer together than k steps.
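The k-1 Newman-Keuls critical mean differences can be generated in one pass. This is a minimal sketch; the dictionary names are assumptions, and the q values are the tabled values quoted in the slides for 45 df.

```python
import math

ms_within = 28.8
n_cell = 10

# Critical q values quoted in the slides, keyed by r (means in the set)
q_by_r = {5: 4.90, 4: 4.68, 3: 4.35, 2: 3.80}

root = math.sqrt(ms_within / n_cell)  # about 1.697
cmd_by_r = {r: q * root for r, q in q_by_r.items()}

for r in sorted(cmd_by_r, reverse=True):
    print(f"r = {r}: critical mean difference = {cmd_by_r[r]:.2f}")
```

Without intermediate rounding the results are about 8.32, 7.94, 7.38 and 6.45. The slides show 8.33, 7.96, 7.40 and 6.46, which is what you get when the square root is rounded to 1.70 first. Either way the critical mean difference shrinks as r gets smaller.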

22
CEP 933 Post hoc example
5. Compare each sample mean difference in the
table above to the critical mean difference
calculated in step 4.

Id (j)     4      5      2      3      1
Mean     36.7   40.3   43.4   47.2   48.7

Id      5      2      3      1      r    CMD
4     3.6    6.7   10.5   12.0     5   8.33
5            3.1    6.9    8.4     4   7.96
2                   3.8    5.3     3   7.40
3                          1.5     2   6.46

The r and CMD shown on each row apply to that row's largest
difference (its rightmost entry); in general r is the number of
ordered means spanned by the pair being compared.
When the
sample mean difference is larger than the CMD we
reject the null hypothesis represented by the
pair of sample means. This time the same H0 are
rejected, but sometimes N-K shows more
significant pairs than Tukey.
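Step 5 can be sketched by letting r depend on how many ordered means each pair spans. This is a simplified illustration that compares every pair directly with its own critical mean difference; the full Newman-Keuls procedure is stepwise and stops testing inside any range already found non-significant, which makes no difference for these data. Variable names are assumptions; the means and CMD values are from the slides.

```python
ordered_ids = [4, 5, 2, 3, 1]                    # smallest to largest mean
ordered_means = [36.7, 40.3, 43.4, 47.2, 48.7]
cmd_by_r = {5: 8.33, 4: 7.96, 3: 7.40, 2: 6.46}  # from step 4

rejected = []
for i in range(len(ordered_means)):
    for j in range(i + 1, len(ordered_means)):
        r = j - i + 1  # number of ordered means spanned by this pair
        diff = round(ordered_means[j] - ordered_means[i], 1)
        if diff > cmd_by_r[r]:
            rejected.append((ordered_ids[i], ordered_ids[j], r, diff))

for id_low, id_high, r, diff in rejected:
    print(f"reject H0: mu{id_low} = mu{id_high} (r = {r}, difference {diff})")
```

The same three pairs are rejected as with Tukey's test: 4 vs. 3, 4 vs. 1, and 5 vs. 1.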
23
CEP 933 Post hoc example
Here is some SPSS output for Tukey's tests from
the school data set for teacher community
(tchcomm). ALL pairs of means are tested, so there
is redundancy in this table.
24
CEP 933 Post hoc example
Also SPSS shows us which sets of means are equal
in a display of homogeneous subsets of means.
Here we can see that for tchcomm the
regions are all equal to
each other. If we had two subsets we would see
two columns in the table. The value of Sig
under each column is the p for a test of
equality among the means in the subset. These
should be large since the means in each subset
don't differ.

25
CEP 933 Post hoc example
Here are Tukey's tests for teacher angst
(tchangst). This time we see some differences
among the means. Note the region that seems to
always be involved.
26
CEP 933 Post hoc example
There are two homogeneous subsets of means for
the outcome tchangst, a measure of teacher job
satisfaction. Here we can see that the South
region is different
from the rest, with a higher mean (3.49). The
NC, NE and West region means are equal: all of
the means are around 3.20. The Sig values within
both columns are large, as we expect.

27
CEP 933 Post hoc example Scheffe tests
Scheffe tests
Scheffe tests can be used to
examine all possible contrasts among the k means
from an anova. To use the Scheffe test we again
must create a contrast or combination of the
means.
EXAMPLE
Recall our means from above.

Group               n    j   Mean
Interactive video   10   1   48.7
CAI                 10   2   43.4
Standard video      10   3   47.2
Slide tape          10   4   36.7
Lecture             10   5   40.3
28
CEP 933 Post hoc example Scheffe tests
We computed two contrasts before.
A: Interactive versus standard video (1 vs. 3)
$L_A = 1.5$ and $SE(L_A) = 2.4$, so $t_A = 1.5/2.4 = 0.62$
B: Video (I or S) versus all others (1 and 3 vs. rest)
$L_B = 7.82$ and $SE(L_B) = 1.54$, so $t_B = 5.08$
Before, we used a regular t
critical value because we were treating the
contrasts as planned, but Scheffe tests are
post hoc. So our critical value will differ. It
is larger to penalize us for looking at the data.
29
CEP 933 Post hoc example Scheffe tests
Scheffe Critical Values
The critical value for
the Scheffe test is related to the critical F
value ($F_c$) from the anova. The Scheffe
critical value is either $(k-1) F_c$ if we use an F
test, or
$t_c = \sqrt{(k-1) F_c(k-1, n-k)} = \sqrt{(k-1) F_c}$
if we use the contrast test that looks like a t
(i.e., L/SE(L)). We compare values of the
contrast tests described above to the Scheffe
critical values. Because (k-1) is part of these
formulas (as well as in the df of $F_c$), the
Scheffe CV increases as we have more groups.
30
CEP 933 Post hoc example Scheffe tests
Scheffe Critical Values
$t_c = \sqrt{(k-1) F_c(k-1, n-k)} = \sqrt{(k-1) F_c}$
For our example the Scheffe t critical value
would be $\sqrt{4 \times 2.58} = 3.21$. The Scheffe
critical value will always be bigger than the t
value you would use for a planned test of the
same hypothesis. This makes Scheffe less
powerful because we need a bigger mean difference
to get a significant result from Scheffe than
from a planned test.
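The Scheffe critical value and the resulting decisions can be sketched briefly. The variable names are illustrative assumptions; k = 5, the critical F of 2.58, the planned critical t of 2.01, and the two contrast t values are the ones given in the slides.

```python
import math

k = 5            # number of groups
f_crit = 2.58    # critical F(4, 45) quoted in the slides
planned_t = 2.01 # critical t used earlier for the planned contrasts

# Scheffe critical value on the t scale: sqrt((k - 1) * Fc)
scheffe_t = math.sqrt((k - 1) * f_crit)

# Contrast t values computed earlier in the slides
t_values = {"A": 0.62, "B": 5.08}
significant = {name: abs(t) > scheffe_t for name, t in t_values.items()}

print(round(scheffe_t, 2))  # 3.21
print(significant)
```

The Scheffe value (3.21) is well above the planned-comparison value (2.01), yet contrast B still clears it while contrast A does not.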
31
CEP 933 Post hoc example Scheffe tests
Recall that $t_A = 0.62$ and $t_B = 5.08$. Also,
the critical value for the Scheffe test is
$t = \sqrt{(k-1) F(k-1, n-k)} = \sqrt{(k-1) F} = \sqrt{4 \times 2.58} = 3.21$
We can see in this example
that the only contrast that is significant is B,
comparing the video to the rest. From the sample
means it appears that the video groups had, on
average, higher mean achievement than did the
other groups (though we did not test a
directional hypothesis in this example). The
hypothesis that we reject for this contrast is
$H_0: (\mu_1 + \mu_3)/2 = (\mu_2 + \mu_4 + \mu_5)/3$.
32
CEP 933 Post hoc example Scheffe tests
In SPSS we do not have the full flexibility shown
here for Scheffe tests if we use the pull-down
menus. Also, Bonferroni tests are built into the
post hoc test section of the buttons on our ANOVA
windows. Here is SPSS output for the Scheffe
test - it is done on all pairs of tchangst means
(the factor is region).
33
CEP 933 Post hoc example Scheffe tests
One more item in the SPSS output for the Scheffe
test is the display of subsets of homogeneous
(equal) means. Again the
South mean is alone, and the means for the
Northcentral, the Northeast, and West regions
are all equal to each other. As for Tukey's
tests, the Sig under each column is for a test
of equality among the means in the subset. These
should be large since the means in each subset
don't differ.