Two Factor ANOVA and the BACI sampling design - PowerPoint PPT Presentation


Slides: 33

Provided by: llye


Transcript and Presenter's Notes

1

Two Factor ANOVA and the BACI sampling design
Non-parametric two-factor tests
Resampling method for two-factor tests.

2
Two Factor Designs

Consider studying the impact of two factors on
the yield (response).
Note: The "1", "2", etc., mean Level 1, Level
2, etc., NOT metric values.
Here we have R = 3 rows (levels of the row
factor), C = 4 columns (levels of the column
factor), and n = 2 replicates per cell (n_ij for
cell (i, j) if the cell sizes are not all equal).

3
Model

$$Y_{ijk} = \mu + \rho_i + \tau_j + I_{ij} + \varepsilon_{ijk}$$
i = 1, ..., R
j = 1, ..., C
k = 1, ..., n
In general, n observations per cell, R × C cells.

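Where, in terms of sample means (a dot marks a subscript that has been averaged over),
$$Y_{ijk} = \bar{Y}_{\cdot\cdot\cdot} + (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot}) + (Y_{ijk} - \bar{Y}_{ij\cdot})$$

ALL the terms are somewhat intuitive, except
for the interaction term
$$\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot}$$
The term is more intuitively written as
$$(\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot}) - (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) - (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})$$

First term: how a cell differs from the grand mean
Second term: adjustment for row membership
Third term: adjustment for column membership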
6

We can, without loss of generality, assume (for a
moment) that there is no error. Why then might
the above term be non-zero?
Answer: INTERACTION
Two basic ways to look at interaction:

1)
If AHBH = 13, no interaction. If AHBH is above 13
or below 13, interaction.
- When B goes from BL → BH, yield goes up by 3 (5 → 8).
- When A goes from AL → AH, yield goes up by 5 (5 → 10).
- When both changes of level occur, does
yield go up by the sum of 3 and 5?
7

Interaction = degree of difference from the sum of
the separate effects.

2)
Holding BL, what happens as A goes from AL → AH?
+5
Holding BH, what happens as A goes from AL → AH?
+9
If the effect of one factor (i.e., the impact of
changing its level) is DIFFERENT for different
levels of another factor, then INTERACTION exists
between the two factors.

NOTE: - Holding AL, BL → BH has impact +3
- Holding AH, BL → BH has impact +7
(AB) = (BA), or (9 - 5) = (7 - 3)
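
The two ways of looking at interaction can be checked with a few lines of arithmetic. This is only a sketch: the cell values 5, 8 and 10 come from the slides, while AH,BH = 17 is not stated directly and is inferred here from the stated changes (8 + 9 = 10 + 7 = 17). The variable and function names are made up for the illustration.

```python
# Cell yields for the 2x2 example. AH,BH = 17 is inferred from the
# stated effects (assumption: 8 + 9 = 10 + 7 = 17).
cell = {
    ("AL", "BL"): 5,
    ("AL", "BH"): 8,
    ("AH", "BL"): 10,
    ("AH", "BH"): 17,
}

def effect_of_a(b_level):
    """Change in yield as A goes AL -> AH, holding B at b_level."""
    return cell[("AH", b_level)] - cell[("AL", b_level)]

def effect_of_b(a_level):
    """Change in yield as B goes BL -> BH, holding A at a_level."""
    return cell[(a_level, "BH")] - cell[(a_level, "BL")]

# View 1: difference from the sum of the separate effects.
additive_prediction = cell[("AL", "BL")] + effect_of_a("BL") + effect_of_b("AL")
departure = cell[("AH", "BH")] - additive_prediction

# View 2: the effect of one factor differs across levels of the other.
ab = effect_of_a("BH") - effect_of_a("BL")   # 9 - 5
ba = effect_of_b("AH") - effect_of_b("AL")   # 7 - 3

print(additive_prediction, departure, ab, ba)
```

With no interaction, AH,BH would equal the additive prediction of 13; here all three measures of interaction agree.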
8
Means in a 2-factor ANOVA, with various effects
of the factors and the interaction.

a) No effect of factor A, small effect of factor
B.
b) Large effect of factor A, small effect of
factor B, and no interaction
c) No effect of factor A, small effect of factor
B, and no interaction
d) Large effect of factor A, large effect of B,
and no interaction

9
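[Figure: panels (e), (f), (g), (h). Each panel plots the mean response (axis label X) against the levels of factor A (A1, A2), with one line for each level of factor B (B1, B2).]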

e) No effect of A, no effect of B, but
interaction between A and B
f) Large effect of A, but no effect of B, with
slight interaction
g) No effect of A, large effect of B, with large
interaction
h) Effect of A, effect of B, with large
interaction

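Going back to the (model) equation, written in terms of sample means (a dot marks a subscript that has been averaged over), and bringing $\bar{Y}_{\cdot\cdot\cdot}$
to the other side of the equation, we get
$$Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot} = (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot}) + (Y_{ijk} - \bar{Y}_{ij\cdot})$$
If we then square both sides and triple sum both
sides over i, j, and k, we get (after noting
that all cross-product terms cancel)
$$\sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2 = nC \sum_i (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + nR \sum_j (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + n \sum_i \sum_j (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})^2 + \sum_i \sum_j \sum_k (Y_{ijk} - \bar{Y}_{ij\cdot})^2$$

Or,
TSS = SSB(rows) + SSB(cols) + SSI + SSW
And in terms of degrees of freedom,
$$RCn - 1 = (R - 1) + (C - 1) + (R - 1)(C - 1) + RC(n - 1)$$
In our example (R = 3, C = 4, n = 2):
23 = 2 + 3 + 6 + 12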
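
The sum-of-squares identity can be verified numerically. The sketch below assumes a balanced layout stored as `y[i][j][k]`; the function name `two_way_anova_ss` and the data values are hypothetical (the data table from the slides did not survive in this transcript), chosen only to match the R = 3, C = 4, n = 2 layout.

```python
def two_way_anova_ss(y):
    """Sums of squares and degrees of freedom for balanced data y[i][j][k]."""
    R, C, n = len(y), len(y[0]), len(y[0][0])
    cells = [(i, j) for i in range(R) for j in range(C)]

    grand = sum(sum(y[i][j]) for i, j in cells) / (R * C * n)
    row = [sum(sum(y[i][j]) for j in range(C)) / (C * n) for i in range(R)]
    col = [sum(sum(y[i][j]) for i in range(R)) / (R * n) for j in range(C)]
    cell = {(i, j): sum(y[i][j]) / n for i, j in cells}

    ss = {
        "rows": n * C * sum((m - grand) ** 2 for m in row),
        "cols": n * R * sum((m - grand) ** 2 for m in col),
        "interaction": n * sum(
            (cell[i, j] - row[i] - col[j] + grand) ** 2 for i, j in cells
        ),
        "within": sum((v - cell[i, j]) ** 2 for i, j in cells for v in y[i][j]),
        "total": sum((v - grand) ** 2 for i, j in cells for v in y[i][j]),
    }
    df = {
        "rows": R - 1,
        "cols": C - 1,
        "interaction": (R - 1) * (C - 1),
        "within": R * C * (n - 1),
        "total": R * C * n - 1,
    }
    return ss, df

# Hypothetical yields: 3 row levels x 4 column levels x 2 replicates.
y = [
    [[12, 14], [15, 13], [16, 18], [14, 15]],
    [[10, 11], [13, 12], [15, 14], [12, 13]],
    [[17, 15], [18, 19], [20, 22], [16, 18]],
]
ss, df = two_way_anova_ss(y)
parts = ss["rows"] + ss["cols"] + ss["interaction"] + ss["within"]
print(df, abs(ss["total"] - parts) < 1e-9)
```

Each mean square is the sum of squares divided by its degrees of freedom, and each F ratio in the fixed model divides a mean square by the within-cell mean square (MSW).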

13

ANOVA
H0: All row means equal. H1: Not all row means
equal.
FTV(2,12) = 3.88. Reject H0.
H0: All column means equal. H1: Not all column
means equal.
FTV(3,12) = 3.49. Accept H0.
H0: No int'n between factors. H1: There is
int'n between factors.
FTV(6,12) = 3.00. Accept H0.
14

An issue to think about
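Since V(int'n) cannot be negative, and
MSI = 1.83 < MSW, there is strong evidence that V(int'n) is not greater than 0.
If this is true, E(MSI) = σ², and we should
combine (i.e., pool) the MSI and MSW estimates.
This gives a pooled estimate of (SSI + SSW) / (6 + 12), with 18 degrees of freedom.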

We have
15
Another Issue

The table of 4 pages ago assumes what is called a
Fixed Model. There is also what is called a
Random Model (and a Mixed Model).

Mixed: column factor fixed, row factor random.
16

Fixed: Specific levels chosen by the experimenter.
Random: Levels chosen randomly from a large number of
possibilities.

Fixed: All levels about which inferences are to be made
are included in the experiment.
Random: Levels are some of a large number possible.

Fixed: A definite number of qualitatively
distinguishable levels, and we plan to study them
all. Or a continuous set of quantitative
settings, but we choose a suitable, definite
subset in a limited region and confine inferences
to that subset.
Random: Levels are a random sample from an infinite
population.
17

"In a great number of cases the investigator may
argue either way, depending on his mood and his
handling of the subject matter. In other words,
it is more a matter of assumption than of
reality."
Some authors say that if in doubt, assume the fixed
model. Others say things like "I think in most
experimental situations the random model is
applicable." The latter quote is from a person
whose experiments are in the field of biology.

18
Two Factors with No Replication, No Interaction

When there's no replication, there is no "pure"
way to estimate ERROR.
Error is measured by considering more than one
observation (i.e. replication) at the same
treatment combination (i.e. experimental
conditions).

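Our model for analysis is technically
$$Y_{ij} = \mu + \rho_i + \tau_j + I_{ij}$$
(grand mean, row effect, column effect, and interaction; with one observation per cell, interaction and error cannot be separated).
We can write
$$Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) + (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y}_{\cdot\cdot})$$
After bringing $\bar{Y}_{\cdot\cdot}$ to the other side of the
equation, squaring both sides, and double summing
over i and j, we find
$$\sum_i \sum_j (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = C \sum_i (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 + R \sum_j (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2 + \sum_i \sum_j (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y}_{\cdot\cdot})^2$$
that is, TSS = SSB(rows) + SSB(cols) + SSI.

Degrees of freedom:
$$RC - 1 = (R - 1) + (C - 1) + (R - 1)(C - 1)$$
We know,
$$E(\text{MSI}) = \sigma^2 + V_{\text{int'n}}$$
If we assume $V_{\text{int'n}} = 0$, then $E(\text{MSI}) = \sigma^2$,
and we can call SSI "SSW" and MSI "MSW".

And our model may be rewritten
$$Y_{ij} = \mu + \rho_i + \tau_j + \varepsilon_{ij}$$
and the labels would become
TSS = SSB(rows) + SSB(cols) + SSW.
In our problem, RC - 1 = 11 and (R - 1)(C - 1) = 6.

And what if we're wrong about there being no
interaction?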

ANOVA
At α = .01:
FTV(3,6) = 9.78
FTV(2,6) = 10.93
TSS = 62, with 11 d.f.
23

If we think our ratio is, in expectation (say,
for ROWS),
$$\frac{\sigma^2 + V_{\text{rows}}}{\sigma^2}$$
and it really is (because there's interaction)
$$\frac{\sigma^2 + V_{\text{rows}}}{\sigma^2 + V_{\text{int'n}}}$$
being wrong can lead only to giving us an
underestimated Fcalc.
Thus, if we've REJECTED H0, we can feel confident
of our conclusion, even if there's interaction.
If we've ACCEPTED H0, only then could the no
interaction assumption be CRITICAL.

24
Non-parametric 2 Factor ANOVA with replications

If assumptions of normality and constant variance
are not met by the data, rank the data, then use
the usual parametric ANOVA on the ranked data.
Using ranks is more robust than finding a
transformation that works.
If there is no interaction, you can use the
2-factor with replication procedure given in
Conover (1980).
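
The rank-then-ANOVA approach starts by replacing every observation with its rank among all observations. A minimal sketch of that first step follows, assuming the data are stored as `y[i][j][k]` and that ties receive the average of the ranks they span (a common convention, not one stated in the slides). The function name `rank_transform` is hypothetical.

```python
def rank_transform(y):
    """Replace each value in y[i][j][k] by its rank over the whole data set.

    Tied values share the average of the ranks they occupy (assumption).
    """
    flat = sorted(v for row in y for cell in row for v in cell)
    rank_of = {}
    i = 0
    while i < len(flat):
        j = i
        while j < len(flat) and flat[j] == flat[i]:
            j += 1
        # Positions i+1 .. j (1-based) hold this value; store their mean.
        rank_of[flat[i]] = (i + 1 + j) / 2
        i = j
    return [[[rank_of[v] for v in cell] for cell in row] for row in y]

# Tiny illustration: 1 row level, 2 column levels, 2 replicates.
print(rank_transform([[[10, 30], [20, 20]]]))
```

The ranked array is then analysed with the usual two-factor ANOVA in place of the raw data.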

25
Non-parametric alternative to 2-Factor ANOVA
without replication: Friedman's Test

Example: TSS at 9 sites during 4 seasons.

H0: MA = MB = MC = MD. HA: Not all median TSS values are equal
during the 4 seasons.
26
Convert to ranks within each row
27

Test Statistic
$$F_R = \frac{12}{R\,C\,(C + 1)} \sum_{j=1}^{C} T_j^2 - 3R(C + 1)$$
where $T_j$ is the sum of the ranks in column j.
For the given problem, R = 9, C = 4, and FR = 20.03.
Under the null hypothesis, FR may be approximated
by a Chi-Square distribution with (C - 1) d.f.
For our problem with 3 d.f., the critical value of the
Chi-Square distribution at α = 5% is 7.815.
Since 20.03 > 7.815, we reject the null hypothesis
and conclude that there are differences among the
seasons with respect to TSS.
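
The ranking and test-statistic steps can be sketched as follows. The function names are hypothetical, ties within a row get average ranks, and no tie correction is applied (an assumption; the slides do not say how ties were handled). The site-by-season TSS table is not reproduced in this transcript, so the example uses made-up rows in which every site has the same seasonal ordering.

```python
def average_ranks(values):
    """Ranks of values within one row; tied values share their mean rank."""
    order = sorted(values)
    rank_of = {}
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and order[j] == order[i]:
            j += 1
        rank_of[order[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[v] for v in values]

def friedman_statistic(table):
    """Friedman's FR for a table with R rows (blocks) and C columns."""
    R, C = len(table), len(table[0])
    ranks = [average_ranks(row) for row in table]
    col_sums = [sum(ranks[i][j] for i in range(R)) for j in range(C)]
    return 12 / (R * C * (C + 1)) * sum(t * t for t in col_sums) - 3 * R * (C + 1)

# Hypothetical data: 9 sites x 4 seasons, identical ordering at every site.
table = [[site + season for season in (1, 2, 3, 4)] for site in range(9)]
fr = friedman_statistic(table)
print(fr, fr > 7.815)  # compare with the 5% chi-square critical value, 3 d.f.
```

With perfectly consistent rankings the statistic reaches its maximum for this layout, R(C - 1) = 27, which puts the 20.03 from the slides in context.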

28
Stratified Shuffling

Shuffling or randomization in its simplest form
is used to test the generic null hypothesis that
one variable (or groups of variables) is
unrelated to another variable (or groups of
variables).
Significance is assessed by shuffling one
variable (or set of variables) relative to
another variable (or sets of variables).
Shuffling ensures that there is in fact no
relationship between the variables.
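If the variables are related, then the test statistic
for the original unshuffled data should be unusual
relative to the values of the test statistic obtained
by shuffling.
In a blocked design the data cannot be shuffled freely
because of the presence of the blocking factor. Hence
each block must be treated as a stratum, and the data
are shuffled only within strata.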

Consider the case of 2 blocks (nests) and 3
Treatments (Distance) with R.V. (changes in
exposure times in seconds) as given below

To test whether distance has an effect, we can
use the test statistic given by the pairwise sum
of squared differences of the mean exposure times
at each distance. That is,
$$\text{SSD} = (\bar{x}_{0.75} - \bar{x}_{1.25})^2 + (\bar{x}_{0.75} - \bar{x}_{2.5})^2 + (\bar{x}_{1.25} - \bar{x}_{2.5})^2$$
where $\bar{x}_d$ is the mean change in exposure time at distance d.
The observed SSD for the example above is 12.48.
31

The question now is: how likely is it that the
observed value of 12.48 is equaled or exceeded by
chance alone, i.e., if in fact there is no distance
effect? If the probability is very low (less
than 0.05), we say that it is unlikely that there
is no distance effect. Hence the hypothesis of
no distance effect is rejected.
If there is no distance effect, we can in fact
combine the data within each stratum (nest) and
shuffle them.
For example, for nest 1, the combined data are
2, -1, 6, 5, 7, 0, and 8. If we shuffle them
once (randomly rearrange them), we may get 0, 7,
6, 2, 8, -1, 5. Hence, the values 0, 7, 6 could
have been at the 0.75nm distance, 2 and 8 could
have been at the 1.25nm distance, and -1 and 5
could have been at the 2.5nm distance.

Similarly, we do the same for the nest 2 data.
After one cycle of shuffling, we get one
value of SSD. Repeating this, say, 10,000 times gives
10,000 values of SSD, which form a sampling
distribution of SSD.
An estimate of the p-value is obtained by
counting the proportion of SSDs greater than or equal
to the observed SSD.
