Title: Multivariate Statistics for the Environmental Sciences
- Peter J. A. Shaw
- Chapter 1
- Introduction
1.1 What is meant by multivariate stats?
- What do stats do for us?
- Descriptive Stats
- Inferential Stats
- Univariate
- Multivariate
- "Multivariate statistics tell you what you already know, but couldn't quite put your finger on." (Michael Usher)
1.1.1 Why use multivariate stats?
- Saves time by reducing analytical work
- Reduces the danger of misinterpreting random noise
- Can be used to explore and describe data sets with many variables
- Allows for the generation of hypotheses
- Suggests patterns to be found with relatively little work
1.2 Scope of the Book
- Explain application of multivariate techniques
- Will not focus on data collection
- Five approaches will be discussed: diversity indices, multiple regression, ordination, cluster analysis, and canonical correspondence analysis
- Chapter structure
  - Introduce the technique
  - Apply the technique to small, model datasets to explain procedures
  - Show how multivariate stats contribute to environmental sciences
1.3 When to use multivariate stats
- Don't use when there is just one specific dependent variable responding to one defined factor
- Refer to page 7 of Shaw for a list of common environmental research situations in which multivariate stats should or can be used
1.4 Computing Requirements
- Most of the multivariate techniques require the
use of a computer, except diversity indices and
Bray-Curtis ordination
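To illustrate why some of these calculations are feasible by hand, here is a minimal sketch of two standard formulas: the Shannon diversity index and the Bray-Curtis dissimilarity between two sites. It shows only the dissimilarity measure, not a full Bray-Curtis ordination. The species counts and the function names are hypothetical, chosen for illustration, and are not taken from Shaw.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)), skipping species with zero count."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def bray_curtis(site_a, site_b):
    """Bray-Curtis dissimilarity: sum(|a_i - b_i|) / sum(a_i + b_i)."""
    numerator = sum(abs(a - b) for a, b in zip(site_a, site_b))
    denominator = sum(a + b for a, b in zip(site_a, site_b))
    return numerator / denominator

# Hypothetical species counts at two sites (same species order in both lists).
site_1 = [10, 10, 0]
site_2 = [5, 5, 10]

h1 = shannon_index(site_1)        # two equally abundant species, so H' = ln(2)
bc = bray_curtis(site_1, site_2)  # 20 / 40 = 0.5

print(f"Shannon index, site 1: {h1:.4f}")
print(f"Bray-Curtis dissimilarity: {bc:.2f}")
```

Both functions need only sums, absolute differences and logarithms, which is consistent with these techniques being workable without a computer for small data sets.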
1.5 Preparing the data: Points to consider
- Types of data to collect
- Avoiding pseudoreplication
- Organization of the data matrix
- Preliminary inspection of the data
1.5.1 Types of data to collect
- Four different types of data, in order of increasing information content
  - Nominal data
  - Ordinal data
  - Continuous data
    - Interval data
    - Ratio data
1.5.2 Avoiding pseudoreplication
- Defined by Hurlbert (1984) as "the use of inferential statistics to test for treatment effects with data from experiments where either treatments are not replicated (though samples may be) or replicates are not statistically independent."
- Essentially, it is organizing the data in such a way that it appears that more independent observations have been made than is actually the case.
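One common way to avoid inflating the sample size is to collapse repeated samples to a single value per experimental unit before any test is run. The sketch below assumes a hypothetical design in which each stream is one experimental unit and repeated samples within a stream are subsamples; the data, names and the choice of averaging are illustrative assumptions, not a procedure taken from Shaw or Hurlbert.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical measurements: (treatment, stream, value).
# Each stream is one experimental unit; repeated samples within it are subsamples.
measurements = [
    ("control", "stream_A", 4.0), ("control", "stream_A", 6.0),
    ("control", "stream_B", 5.0), ("control", "stream_B", 7.0),
    ("treated", "stream_C", 8.0), ("treated", "stream_C", 10.0),
    ("treated", "stream_D", 9.0), ("treated", "stream_D", 11.0),
]

# Group the subsamples by experimental unit.
by_unit = defaultdict(list)
for treatment, unit, value in measurements:
    by_unit[(treatment, unit)].append(value)

# One mean per experimental unit: these are the independent replicates.
replicates = defaultdict(list)
for (treatment, _unit), values in by_unit.items():
    replicates[treatment].append(mean(values))

n_apparent = len(measurements)                      # 8 rows in the raw data
n_true = sum(len(v) for v in replicates.values())   # 4 independent units

print(f"Apparent n = {n_apparent}, independent replicates = {n_true}")
print(dict(replicates))
```

Treating all eight rows as independent would double the apparent sample size; any inferential test should use the four unit means instead.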
1.5.3 Organization of the data matrix
- Remember, computers are stupid, therefore you must pick up the slack! Properly format your data
- Store data in a matrix (rectangular array of data)
- Metadata
Points to consider in data organization
- Column organization
- Classification variables
- Metadata
- Missing values
- Exclude the variable(s)
- Exclude the observation(s)
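The two options for missing values trade off differently: excluding a variable keeps every observation but loses a column, while excluding an observation keeps every variable but loses a row. A minimal sketch on a hypothetical data matrix (rows are observations, columns are variables, `None` marks a missing value); all names and values are invented for illustration.

```python
# Hypothetical data matrix: one row per observation, one column per variable.
columns = ["site", "pH", "temperature", "nitrate"]
rows = [
    ["S1", 6.8, 11.2, 0.4],
    ["S2", 7.1, None, 0.6],   # temperature was not recorded at S2
    ["S3", 6.5, 10.1, 0.5],
    ["S4", 7.0, 9.8, 0.3],
]

def drop_variables_with_missing(columns, rows):
    """Option 1: exclude every variable (column) that has a missing value."""
    keep = [j for j in range(len(columns))
            if all(row[j] is not None for row in rows)]
    return [columns[j] for j in keep], [[row[j] for j in keep] for row in rows]

def drop_observations_with_missing(rows):
    """Option 2: exclude every observation (row) that has a missing value."""
    return [row for row in rows if all(value is not None for value in row)]

cols_a, rows_a = drop_variables_with_missing(columns, rows)
rows_b = drop_observations_with_missing(rows)

print(cols_a, len(rows_a))   # temperature column removed, all 4 sites kept
print(len(rows_b))           # all 4 columns kept, site S2 removed
```

Which option loses less information depends on how the gaps are distributed: one sparsely recorded variable favours option 1, while one badly sampled site favours option 2.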
Points to consider in data organization cont.
- Format errors
  - Format data so it is compatible with the requirements of the software packages used for analysis
- Recoding data
  - Only do inside the statistical package once the data matrix has been assembled
- Collinear variables
1.5.4 Preliminary inspection of the data
- GIGO: Garbage In, Garbage Out
- If the matrix data is odd or incorrectly entered,
it will likely affect the analysis
- Inspect data for outliers before analysis
- Normal distribution
- Multivariate normal distribution
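As one possible way to screen a single variable for outliers before analysis, the sketch below uses the modified z-score based on the median and the median absolute deviation (MAD). The slides do not name a screening method, so this rule, the 3.5 cut-off, the function name and the conductivity values are all assumptions made for illustration. A flagged value is a prompt to check the data entry, not proof of an error.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold.

    Modified z-score = 0.6745 * (x - median) / MAD, where MAD is the
    median absolute deviation. The 3.5 cut-off is an assumed default.
    """
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # no spread around the median; this rule cannot be applied
    return [i for i, v in enumerate(values)
            if abs(0.6745 * (v - centre) / mad) > threshold]

# Hypothetical conductivity readings; the last looks like a misplaced decimal point.
conductivity = [52.0, 48.0, 50.0, 51.0, 49.0, 500.0]

suspect = flag_outliers(conductivity)
print("Suspect rows:", suspect)
```

A median-based rule is used here rather than the ordinary mean and standard deviation because a single extreme value inflates the standard deviation and can hide itself in a small sample.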
1.6.2 Life in Alaskan Streams sample data set