Title: Decision-making based on hypothesis testing
1- Decision-making based on hypothesis testing
- 1-sample t-test of means
- 2-sample t-test of means
- Paired t-test of means
- Analysis of Variance (ANOVA)
19 February 2009
2- Analysis of Variance (ANOVA): general info
- Compares the means and variances of multiple samples (> 2 samples)
- ANOVA is a class of sampling or experimental designs in which the predictor variable is categorical and the response variable is continuous
Example using lobster data:
Response variable: number of lobsters per trap
Predictor variable: location (reserves, edges, far)
3- 1-way ANOVA: the scenario
Scenario 4
- Fishermen know that catches vary in space and time, even within prime lobster habitat (rocky reefs) and even over small distances.
- They want to know the effects of reserves on the number (#) of lobsters per trap
- They want to know whether catches change with distance from reserves
- They also want to see empirical evidence of spillover
- Question: Is there a difference in the number of lobsters per trap inside vs. on the edge of vs. far away from reserves in 2003?
4- 1-way ANOVA: sampling design at Gull Island Reserve (2003)
[Map of Santa Cruz Island showing sampling sites: Kinton Pt (Far), Morse Beach (Near), Morse Pt (Inside)]
5- Not the correct design: this is the layout for a randomized block design (discussed later!)
[Map of reserves: Carrington, Scorpion, Anacapa, Harris Pt, Gull, Skunk Pt]
Compare 2003 data from inside vs. edge vs. outside of reserves
6- 1-way ANOVA: correct sampling design
Randomly distribute (place) treatment replicates
[Map of reserves: Carrington, Scorpion, Anacapa, Harris Pt, Gull, Skunk Pt]
Compare 2003 data from inside vs. edge vs. outside of reserves
7- 1-way ANOVA: the hypothesis
2 steps to take:
1. 1-way ANOVA: test for overall differences among levels of treatment
2. Conduct a post-hoc test to determine which level is greater than the others
8- Tabling the data
9- Plotting the data: graph
10- A 1-way ANOVA table (i.e., results)
Source      DF   Sum of Squares   Mean Square   F-ratio   P-value
Locations    2        65.33          32.67        6.70      0.008
Residual    15        73.17           4.88
Total       17       138.50          (8.15)
- Significant difference among groups (reserves, edge, far)
- So the treatment effect, location, has a significant effect on lobsters per trap
11- Analysis of Variance (ANOVA): background
1-way ANOVA
Randomized block
Multi-factor ANOVA (2-way, 3-way, etc.)
Split plot
- Use dependent and independent variables: $Y = a + \beta X$, with Y (dependent) on the ordinate and X (independent) on the abscissa
12- Basic hypotheses for ANOVA & Regression
Basic statistical hypothesis (the null hypothesis, Ho): variation in the response (dependent) variable is unrelated to variation in the predictor (independent) variable and is no greater than expected by chance or sampling error.
Alternative hypothesis (HA): chance cannot entirely account for this variation, and at least some of this variation can be attributed to the predictor variable (how much? In ANOVA: the partial R2 value!)
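For a single-factor design, one common way to put a number on "how much" is the share of the total sum of squares that falls among groups. The following is a minimal sketch, assuming the lobster sums of squares reported in this lecture's results table (SSamong = 65.33, SStotal = 138.50); the function name is illustrative only.

```python
def proportion_explained(ss_among: float, ss_total: float) -> float:
    """Share of the total variation attributed to the predictor (R^2 in a 1-way design)."""
    if ss_total <= 0:
        raise ValueError("ss_total must be positive")
    return ss_among / ss_total


# Sums of squares taken from the lobster results table in this lecture.
r_squared = proportion_explained(65.33, 138.50)
print(f"R^2 = {r_squared:.3f}")
```

Under these numbers, a bit less than half of the variation in lobsters per trap would be attributed to location.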
13- Terminology for ANOVA
- Treatments: different categories (groups) of the predictor variable
  In an experiment, they represent the manipulations that have been performed
- Number of treatments: number of categories (groups) being compared
- Treatment level: each value of the treatment (reserves, edge, far)
- 1-way ANOVA designs: 1 treatment (location), i.e., single-factor
- Replicates: multiple observations made within each treatment
- Each replicate should be independent biologically, economically, chemically, etc.
14- 1-way ANOVA: advantages & disadvantages
Advantages
(1) One of the simplest but most powerful designs
(2) Can use unequal numbers of replicates in treatments
(3) Allows you to test differences among treatments, as well as more specific hypotheses about which particular group means are different and which are similar
Disadvantages
(1) Does not explicitly accommodate environmental (or other) heterogeneity. Complete randomization of the replicates within each treatment level implies that they will sample the entire array of background conditions, all of which may affect the response variable
a. On the one hand, this is a good thing because it means that the results of the experiment can be generalized across all of the environments
b. On the other hand, if the environmental noise is much stronger than the signal of the treatment, the experiment will have low power. The analysis may not reveal treatment differences unless there are many replicates.
15- 1-way ANOVA: advantages & disadvantages (continued)
Disadvantages
(2) Organizes the treatment groups along a single factor
a. If the treatments represent different kinds of factors, then a 2-way design (layout) should be used to tease apart main effects and interactions (to be discussed in the 2-way ANOVA lecture!)
Example: number of lobsters per trap influenced by type of location and amount of reef habitat within a location.
16- All ANOVAs: the basic math
ANOVA is based on calculating the Sum of Squares
- Sum of Squares: $SS = \sum_{i=1}^{n} (Y_i - \bar{Y})^2$
- Total variation in a set of data can be expressed as a sum of squares
- Each term is the squared deviation of an observation from the mean of the observations
- The mean used is the sample mean $\bar{Y}$ (the population mean is never known!)
17- All ANOVAs: basic math (continued)
- Framework for ANOVA: partitioning of the Sum of Squares
- If the assumptions of ANOVA are met:
- Can use Fisher's F-test (name coined by G.W. Snedecor): any analysis that uses an F-distribution
- Compares the observed F-ratio to an F-table value to estimate P-values for the partitioned Sum of Squares
18- All ANOVAs: basic math (continued)
What this means:
Total variation, SSy, can be partitioned into different components. Some components represent random or sampling error: variation that is not attributable to any specific cause, such as observation or counting errors, or other unspecified forces of nature. Other components represent the effects of experimental treatments applied to the replicates, or the differences among sampling categories.
The procedure:
(1) specify an underlying model for how the observations might be affected by different treatments
(2) partition the Sum of Squares among the different components of the model
(3) use the results to statistically test hypotheses about the strength of particular effects
19- So let's do an example with lobster data!
1-way ANOVA
- 3 groups (inside reserves, edge of reserves, far from reserves)
- 6 replicates per treatment (3 x 6 = 18 total observations)
- Step 1: calculate the Total Sum of Squares
$SS_{\text{total}} = \sum_{i=1}^{a} \sum_{j=1}^{n} (Y_{ij} - \bar{Y})^2$
i = 1 to a treatment levels
j = 1 to n replicates per treatment level
- SStotal: deviation of each observation from the grand mean $\bar{Y}$
20- So let's do an example with lobster data! (continued)
1-way ANOVA
- SStotal is then decomposed (partitioned) into two different sources (components of variation)
(1) Variation Among Groups: represents differences among the means (averages) of each of the treatment levels
$SS_{\text{among groups}} = \sum_{i=1}^{a} \sum_{j=1}^{n} (\bar{Y}_i - \bar{Y})^2$
(2) Variation Within Groups: represents the deviation of each observation from its own group mean, summed across all the groups and replicates
$SS_{\text{within groups}} = \sum_{i=1}^{a} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_i)^2$
21- So let's do an example with lobster data! (continued)
1-way ANOVA
$SS_{\text{within groups}} = \sum_{i=1}^{a} \sum_{j=1}^{n} (Y_{ij} - \bar{Y}_i)^2$
Also called the Residual Sum of Squares, residual variation, or error variation: variation not explained by the experimental or controlled (independent) variables
SStotal = SSamong groups + SSwithin groups
138.50 = 65.33 + 73.17
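The partition can be checked numerically. The sketch below uses HYPOTHETICAL trap counts (they are not the Lobster_data_4 values, which are not reproduced in these slides) with the same layout of 3 locations x 6 replicates; the function name is illustrative.

```python
from statistics import mean


def partition_sum_of_squares(groups):
    """Return (SStotal, SSamong, SSwithin) for a list of groups of observations."""
    all_obs = [y for group in groups for y in group]
    grand_mean = mean(all_obs)
    ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
    ss_among = 0.0
    ss_within = 0.0
    for group in groups:
        group_mean = mean(group)
        # Every replicate contributes the squared deviation of its group mean
        # from the grand mean, hence the factor len(group).
        ss_among += len(group) * (group_mean - grand_mean) ** 2
        ss_within += sum((y - group_mean) ** 2 for y in group)
    return ss_total, ss_among, ss_within


# HYPOTHETICAL counts for illustration only: 3 locations x 6 traps.
inside = [8, 10, 9, 11, 7, 9]
edge = [6, 7, 5, 8, 6, 4]
far = [4, 5, 3, 6, 2, 4]

ss_total, ss_among, ss_within = partition_sum_of_squares([inside, edge, far])
print(f"SStotal={ss_total:.2f}  SSamong={ss_among:.2f}  SSwithin={ss_within:.2f}")
```

Whatever data are supplied, SSamong and SSwithin should add up to SStotal, apart from floating-point rounding.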
We're going to use these SSs to compute an F-ratio, which we compare with an F-value from an F-table. But first: assumptions of the test!
22- Assumptions of ANOVA tests
Must meet these:
1. Samples are independent (same for all statistical models)
- Observations within and between groups are independent of one another
2. Variances are homogeneous between groups
- Sample means may vary, but we assume that the variance within each group is about equal to the variance within all other groups. Then each treatment group (level) contributes roughly the same to the within-group sum of squares.
- If the variances are not homogeneous, you can transform the data so that they are (log, square root, etc.).
- Before and after you transform the data, test for homogeneity of variances with Cochran's test.
23- Assumptions of ANOVA tests (continued)
2. Variances are homogeneous between groups (continued)
- Cochran's test uses Cochran's C statistic
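Cochran's C is usually defined as the largest group variance divided by the sum of all group variances. A minimal sketch of that calculation follows, assuming equal numbers of replicates per group. The trap counts are HYPOTHETICAL and the function name is illustrative; the critical value for comparison must still come from a Cochran's C table, which is not reproduced here.

```python
from statistics import variance


def cochran_c(groups):
    """Cochran's C: largest sample variance divided by the sum of the sample variances."""
    variances = [variance(group) for group in groups]  # n - 1 denominator
    return max(variances) / sum(variances)


# HYPOTHETICAL counts for illustration only; the third group is deliberately noisier.
inside = [8, 10, 9, 11, 7, 9]
edge = [6, 7, 5, 8, 6, 4]
far = [1, 5, 3, 9, 2, 4]

c_stat = cochran_c([inside, edge, far])
print(f"Cochran's C = {c_stat:.3f}")
```

With a groups, C ranges from 1/a (all variances equal) up to 1 (one group carries all the variance), so values near 1/a are consistent with homogeneous variances.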
24- Assumptions of ANOVA tests (continued)
3. Residuals are randomly distributed
25- 1-way ANOVA: hypothesis
Ho: $Y_{ij} = \mu + e_{ij}$
HA: $Y_{ij} = \mu + A_i + e_{ij}$
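To make the two models concrete, the sketch below generates data under each one. It assumes normally distributed errors and uses HYPOTHETICAL values for the grand mean, the treatment effects, and the error standard deviation; none of these come from the lobster data, and the function name is illustrative.

```python
import random


def simulate_one_way(mu, effects, n, sigma, seed=0):
    """Draw Y_ij = mu + A_i + e_ij for each treatment level i and replicate j.

    Assumption: errors e_ij are independent normal draws with mean 0 and
    standard deviation sigma.
    """
    rng = random.Random(seed)
    return [
        [mu + a_i + rng.gauss(0.0, sigma) for _ in range(n)]
        for a_i in effects
    ]


# HYPOTHETICAL parameter values, chosen only for illustration.
null_data = simulate_one_way(mu=6.0, effects=[0.0, 0.0, 0.0], n=6, sigma=2.0)
alt_data = simulate_one_way(mu=6.0, effects=[3.0, 0.0, -3.0], n=6, sigma=2.0)

for label, data in (("Ho", null_data), ("HA", alt_data)):
    means = [sum(group) / len(group) for group in data]
    print(label, [round(m, 2) for m in means])
```

Under Ho every treatment effect A_i is zero, so the group means differ only through the error term; under HA at least one A_i is nonzero.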
26- 1-way ANOVA: correct sampling design
Randomly distribute (place) treatment replicates
[Map of reserves: Carrington, Scorpion, Anacapa, Harris Pt, Gull, Skunk Pt]
Compare 2003 data from inside vs. edge vs. outside of reserves
27- A 1-way ANOVA: data
See the Excel sheet on the web: Lobster_data_4
28- A 1-way ANOVA table
Source                             DF       Sum of Squares   Mean Square           F-ratio
Among groups                       a-1      SSamong          SSamong / (a-1)       MSamong / MSwithin
Within groups (residual) (error)   a(n-1)   SSwithin         SSwithin / (a(n-1))
Total                              an-1     SStotal          SStotal / (an-1)
P-value: compare the F-ratio here vs. the F-ratio in the F-table. If it is bigger than the table value: significantly different
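The table formulas can be applied directly to the lobster sums of squares used in this lecture (SSamong = 65.33, SSwithin = 73.17, a = 3 locations, n = 6 replicates). The sketch below is illustrative, and the function names are not from the lecture. The P-value helper relies on a closed form that holds only when the among-groups DF is 2; for any other design an F-table or a statistics library would be needed.

```python
def one_way_anova_table(ss_among, ss_within, a, n):
    """Fill in DF, mean squares, and F-ratio for a balanced 1-way design."""
    df_among = a - 1
    df_within = a * (n - 1)
    ms_among = ss_among / df_among
    ms_within = ss_within / df_within
    return {
        "df_among": df_among,
        "df_within": df_within,
        "df_total": a * n - 1,
        "ms_among": ms_among,
        "ms_within": ms_within,
        "f_ratio": ms_among / ms_within,
    }


def p_value_numerator_df_2(f_ratio, df_within):
    """Upper-tail P for F(2, df_within). Valid ONLY when the numerator DF is 2."""
    return (1.0 + 2.0 * f_ratio / df_within) ** (-df_within / 2.0)


# Sums of squares from the lobster example: 3 locations, 6 replicates each.
table = one_way_anova_table(ss_among=65.33, ss_within=73.17, a=3, n=6)
p_value = p_value_numerator_df_2(table["f_ratio"], table["df_within"])

print(f"MSamong = {table['ms_among']:.2f}, MSwithin = {table['ms_within']:.2f}")
print(f"F = {table['f_ratio']:.2f}, P = {p_value:.3f}")
```

Computing the P-value directly plays the same role as looking up the F-ratio in an F-table: a small P corresponds to an F-ratio larger than the tabulated critical value.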
29- A 1-way ANOVA table (i.e., results)
With lobster data from Excel sheet Lobster_data_4
Source      DF   Sum of Squares   Mean Square   F-ratio   P-value
Locations    2        65.33          32.67        6.70      0.008
Residual    15        73.17           4.88
Total       17       138.50          (8.15)
- Significant difference among groups (reserves, edge, far)
- So the treatment effect, location, has a significant effect on lobsters per trap