Title: PCA on raw data
1. PCA on raw data
Importance of components

                           PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8
Standard deviation      20.159  20.118  16.851  11.621  9.4234  8.8082  6.2834  6.0790
Proportion of Variance   0.256   0.255   0.179   0.085  0.0559  0.0488  0.0249  0.0233
Cumulative Proportion    0.256   0.510   0.689   0.774  0.8301  0.8789  0.9037  0.9270

                                    PC1            PC2            PC3
Ichthyomyzon.gagei         0.0017832346  -1.383224e-04  -1.268793e-03
Labidesthes.sicculus      -0.0028727706  -1.881688e-03  -2.055479e-03
Notemigonus.chrysoleucas   0.0002350518   2.011322e-04   2.584251e-04
Nocomis.leptocephalus     -0.0404813269   2.005198e-02   1.965699e-01
Ericymba.buccata          -0.1276139906  -3.517594e-02   1.969177e-01
Pimephales.vigilax        -0.0016238194  -5.924597e-04  -6.249391e-04
Cyprinella.venusta        -0.2071696573  -1.051431e-01   1.097405e-01
Luxilus.chrysocephalus    -0.2058364947   3.661838e-02   8.918647e-01
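A table like the one above can be produced with base R's prcomp(). The following is a minimal sketch on simulated counts, not the fish data from the slides; the matrix name `community` and the Poisson means are assumptions made for illustration. Because the data are centered but not scaled, the most abundant (highest-variance) species dominates the first component.

```r
# Simulated site-by-species abundance matrix (hypothetical data, not the slide data)
set.seed(1)
lambdas <- c(1, 1, 2, 2, 5, 5, 20, 200)          # rare to very abundant species
community <- matrix(rpois(30 * 8, lambda = rep(lambdas, each = 30)),
                    nrow = 30,
                    dimnames = list(paste0("site", 1:30), paste0("sp", 1:8)))

# PCA on raw abundances: centered, not scaled
pca_raw <- prcomp(community)

summary(pca_raw)                      # "Importance of components" table
round(pca_raw$rotation[, 1:3], 4)     # species loadings on PC1 to PC3

# Species with the largest absolute loading on PC1
dominant <- names(which.max(abs(pca_raw$rotation[, 1])))
dominant
```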
2. PCA on raw data
3. PCoA on raw data, Euclidean distance
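PCoA on Euclidean distances recovers the same sample configuration as PCA on the same data, up to the sign of each axis. A small sketch with base R, using simulated counts (the `community` matrix is an assumption, not the slide data):

```r
# Simulated abundance matrix (hypothetical data)
set.seed(1)
community <- matrix(rpois(30 * 8, lambda = 5), nrow = 30)

pca  <- prcomp(community)                   # PCA, centered, not scaled
pcoa <- cmdscale(dist(community), k = 2)    # PCoA on Euclidean distances

# Axis signs are arbitrary, so compare absolute values of the scores
max_diff <- max(abs(abs(pca$x[, 1:2]) - abs(pcoa)))
max_diff
```

With any other distance measure (for example Bray-Curtis) the two methods no longer coincide.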
4. PCA, scaled, no rare species
Importance of components

                           PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8
Standard deviation       2.097   1.904  1.7647  1.4790  1.4664  1.3255  1.2615  1.2194
Proportion of Variance   0.129   0.107  0.0916  0.0643  0.0632  0.0517  0.0468  0.0437
Cumulative Proportion    0.129   0.236  0.3276  0.3919  0.4551  0.5068  0.5536  0.5974

                                 PC1           PC2           PC3           PC4
Ichthyomyzon.gagei      -0.090135288   0.043747687  -0.002877967    0.01744758
Labidesthes.sicculus     0.041107995  -0.038163610   0.124468292   -0.01789019
Nocomis.leptocephalus    0.108923218  -0.108325946  -0.450937632    0.02454138
Ericymba.buccata         0.255164516  -0.036053266  -0.218276882    0.04718731
Cyprinella.venusta       0.162116327  -0.058824879  -0.150635695    0.13533026
Luxilus.chrysocephalus   0.185008668  -0.212133298  -0.358435526   -0.02178550
Lythrurus.roseipinnis   -0.135509539  -0.392224605   0.050916821    0.07785387
Notropis.longirostris    0.257018040  -0.060612841   0.087004613   -0.03929462
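A scaled PCA without rare species might be set up as in the sketch below. The data are simulated, and the rarity rule (keep species present at 5 or more sites) is a hypothetical threshold; the slides do not say how rare species were defined.

```r
# Simulated abundance matrix with two very rare species (hypothetical data)
set.seed(1)
lambdas <- c(0.05, 0.1, 2, 5, 10, 20, 60, 200)
community <- matrix(rpois(30 * 8, lambda = rep(lambdas, each = 30)),
                    nrow = 30, dimnames = list(NULL, paste0("sp", 1:8)))

# Hypothetical rarity rule: keep species present at 5 or more sites
occurrences <- colSums(community > 0)
common <- community[, occurrences >= 5, drop = FALSE]

# scale. = TRUE standardizes each species to unit variance (correlation PCA)
pca_scaled <- prcomp(common, scale. = TRUE)
summary(pca_scaled)

# With unit-variance columns, the component variances sum to the number of species
total_variance <- sum(pca_scaled$sdev^2)
```

Scaling gives every retained species equal weight, which is why the standard deviations in the table above are close to 1 to 2 rather than in the tens.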
5. PCA biplot, scaled, no rare species
6. PCA, scaled, no rare species, log transformed

                           PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8
Standard deviation       2.080   1.778   1.607   1.223  1.1031  0.9896  0.9001  0.7607
Proportion of Variance   0.226   0.165   0.135   0.078  0.0635  0.0511  0.0423  0.0302
Cumulative Proportion    0.226   0.391   0.525   0.603  0.6667  0.7177  0.7600  0.7902

                                  PC1            PC2            PC3            PC4
Ichthyomyzon.gagei       0.0165784997   -0.005150938   0.0055671363   -0.002121706
Labidesthes.sicculus    -0.0236522313   -0.011409750   0.0001065219    0.009332682
Nocomis.leptocephalus   -0.0546988734   -0.084951957  -0.2062884655   -0.201899456
Ericymba.buccata        -0.2946789538    0.031038552  -0.2694025305   -0.167515049
Cyprinella.venusta      -0.3610402907   -0.069026127   0.0066823120    0.306635579
Luxilus.chrysocephalus  -0.2598456140   -0.410318820  -0.3358348999   -0.003684535
Lythrurus.roseipinnis   -0.0983100419   -0.560316356   0.2872093354   -0.065075897
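The slides do not state which log transformation was applied. A common choice for count data with zeros is log(x + 1), used in the sketch below as an assumption, on simulated data with a shared abundance gradient.

```r
# Simulated counts that increase along a gradient (hypothetical data)
set.seed(1)
gradient <- seq(0, 4, length.out = 30)
community <- sapply(1:6, function(j) rpois(30, exp(gradient * runif(1, 0.5, 1.5))))

# Assumed transform: log(x + 1), which keeps zero counts defined
community_log <- log1p(community)

pca_plain <- prcomp(community,     scale. = TRUE)
pca_log   <- prcomp(community_log, scale. = TRUE)

# Proportion of variance on PC1, before and after the transform
prop_pc1 <- c(plain = pca_plain$sdev[1]^2 / sum(pca_plain$sdev^2),
              log   = pca_log$sdev[1]^2   / sum(pca_log$sdev^2))
round(prop_pc1, 3)
```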
7. PCA, scaled, no rare species, log transformed
9. Non-Metric Multidimensional Scaling
- Most robust and common unconstrained ordination
- Distance based
- Similar to metric multidimensional scaling (PCoA), but rank distances are used
- Usually called NMDS, sometimes called MDS
- Goal of analysis: place samples in k-dimensional space so as to minimize differences between the rank order of dissimilarities in the distance matrix and the rank order of Euclidean distances in ordination space.
10. Non-Metric Multidimensional Scaling
- Stress: a measure of the lack of fit between rank-order dissimilarities and rank-order Euclidean distances in ordination space.
- Usually expressed as a scaled percentage.
- 0 = perfect fit; higher values = worse fit
11. Shepard (stress) Plot
[Figure: two panels, "Multidimensional Shepard Plot" and "First 2 axes of the same solution"]
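A Shepard plot, and the stress value it summarizes, can be sketched with MASS as follows. The data are simulated and Euclidean distance is used to keep the example free of extra packages; the slides themselves use Bray-Curtis from vegan. The stress formula in the code is the Kruskal stress that isoMDS reports, in percent.

```r
library(MASS)

# Simulated abundance matrix (hypothetical data)
set.seed(1)
community <- matrix(rpois(20 * 6, lambda = 5), nrow = 20)
d <- dist(community)                      # Euclidean here, for simplicity

nmds <- isoMDS(d, k = 2, trace = FALSE)

# Shepard() returns observed dissimilarities (x), ordination distances (y)
# and the monotone (isotonic) regression fit (yf)
sh <- Shepard(d, nmds$points)
plot(sh$x, sh$y, pch = 20,
     xlab = "Observed dissimilarity", ylab = "Ordination distance")
lines(sh$x, sh$yf, type = "S")

# Stress recomputed from the Shepard plot, in percent
stress_by_hand <- 100 * sqrt(sum((sh$y - sh$yf)^2) / sum(sh$y^2))
c(isoMDS = nmds$stress, by_hand = stress_by_hand)
```

The tighter the points sit around the step line, the lower the stress.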
12. Non-Metric Multidimensional Scaling
- Species information lost
- No variation accounted for by axes; stress is an analog
- Not necessarily any order of importance to axes
- Points and axes can be fully rotated; all that matters is the relative position of points
- Must specify number of axes ahead of time
- Stress is reduced as more axes are used
- 1st dimension of a 2D NMDS is not the same thing as the 1st dimension of a 6D NMDS
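The last two points can be explored by refitting the ordination for several values of k. This is a sketch on simulated data with Euclidean distances (both assumptions), using isoMDS from MASS.

```r
library(MASS)

# Simulated abundance matrix (hypothetical data)
set.seed(1)
community <- matrix(rpois(25 * 10, lambda = 5), nrow = 25)
d <- dist(community)

# Each k is a separate fit: the number of axes is chosen before the analysis
stress_by_k <- sapply(1:5, function(k) isoMDS(d, k = k, trace = FALSE)$stress)
names(stress_by_k) <- paste0("k=", 1:5)
round(stress_by_k, 2)

# Axis 1 of a 2D solution and axis 1 of a 5D solution come from different fits
# and need not match
nmds2 <- isoMDS(d, k = 2, trace = FALSE)$points
nmds5 <- isoMDS(d, k = 5, trace = FALSE)$points
cor(nmds2[, 1], nmds5[, 1])
```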
14. Non-Metric Multidimensional Scaling
- Iterative algorithm, computationally intensive
- Start with an initial configuration, move points to reduce stress
- Susceptible to getting stuck in local optima
[Figure: stress plotted against multidimensional space]
15. Multiple Starting Points
- Use multiple starting points to reduce the local optima problem
- Starting point options
  - Multiple random starting points
  - Perform other ordination first to get a starting configuration
[Figure: stress plotted against multidimensional space]
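Both starting-point options can be combined by hand with isoMDS, as in the sketch below: one run from the default PCoA configuration plus several runs from random configurations, keeping the fit with the lowest stress. The data, the number of random starts (10) and the use of Euclidean distance are assumptions for illustration.

```r
library(MASS)

# Simulated abundance matrix (hypothetical data)
set.seed(1)
community <- matrix(rpois(25 * 10, lambda = 5), nrow = 25)
d <- dist(community)
n <- attr(d, "Size")
k <- 2

# Start 1: the default starting configuration (PCoA via cmdscale)
fits <- list(isoMDS(d, k = k, trace = FALSE))

# Starts 2 to 11: random starting configurations passed through 'y'
for (i in 1:10) {
  start <- matrix(rnorm(n * k), nrow = n, ncol = k)
  fits[[i + 1]] <- isoMDS(d, y = start, k = k, trace = FALSE)
}

stresses <- sapply(fits, function(f) f$stress)
round(stresses, 3)

# Keep the solution with the lowest stress
best <- fits[[which.min(stresses)]]
best$stress
```

If several starts reach nearly the same minimum stress, that is some reassurance that the solution is not a poor local optimum.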
16. Convergence and Procrustes Analysis
- NMDS converges on a configuration that minimizes stress; additional iterations do not improve stress
- Some approaches use Procrustean analysis to assess differences in configurations for each iteration
[Figure: configurations from "Iteration 1" and "Iteration 2"]
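Because NMDS configurations can be freely rotated, two solutions have to be superimposed before they are compared. The following is a minimal base-R sketch of a least-squares Procrustes fit; `procrustes_rotate` is a hypothetical helper written for this example, and vegan's procrustes() provides a fuller implementation.

```r
# Hypothetical helper: rotate/reflect, rescale and center Y to match X
procrustes_rotate <- function(X, Y) {
  Xc <- scale(X, scale = FALSE)            # center both configurations
  Yc <- scale(Y, scale = FALSE)
  s <- svd(crossprod(Xc, Yc))              # t(Xc) %*% Yc = U D V'
  R <- s$v %*% t(s$u)                      # optimal rotation for Yc %*% R
  scale_factor <- sum(s$d) / sum(Yc^2)     # optimal isotropic scaling
  Yrot <- scale_factor * Yc %*% R
  list(Yrot = Yrot, ss = sum((Xc - Yrot)^2))   # residual sum of squares
}

# Example: Y is X rotated by 45 degrees, doubled in size and shifted
set.seed(1)
X <- matrix(rnorm(20), ncol = 2)
theta <- pi / 4
rot <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
Y <- 2 * X %*% rot + 3

fit <- procrustes_rotate(X, Y)
fit$ss    # near zero: the two configurations are the same after rotation
```

A residual sum of squares near zero between successive runs indicates that they have converged on the same configuration.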
17. NMDS Code
- Plain NMDS
- Code (isoMDS is in the MASS package; vegdist is in the vegan package)
  - distance <- vegdist(community, method = "bray")
  - nmds <- isoMDS(distance, k = 2)
  - nmds
  - plot(nmds$points)
- Options
  - k: number of axes
  - tol: convergence tolerance
  - maxit: maximum number of iterations
  - y: starting configuration; if it is not specified, isoMDS performs a PCoA (cmdscale) to obtain a starting position
18. NMDS Code
- metaMDS (vegan package)
- Performs multiple NMDS with multiple different starting positions
- Uses Procrustes to track convergence
- Code
  - metanmds <- metaMDS(community, k = 4, distance = "bray")
- Options
  - plot: plot Procrustes errors along the way
  - k: number of axes
  - distance: distance measure to use on raw data
  - autotransform: use some automatic transformations if the analysis thinks they are necessary
  - wascores, expand: get species scores as weighted averages (wascores) and expand them (expand)
  - noshare: alter similarity if a certain proportion of samples have no species in common
  - Various other options to center/rotate scores (?metaMDS for details)
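A short run of metaMDS might look like the sketch below. It uses `dune`, an example data set shipped with vegan, rather than the data from the slides, and the argument values (k = 2, trymax = 20, autotransform = FALSE) are illustrative choices. The scale on which stress is reported depends on the vegan version and NMDS engine, so it is not necessarily the percentage that isoMDS prints.

```r
library(vegan)

data(dune)       # example community data shipped with vegan (not the slide data)
set.seed(1)

metanmds <- metaMDS(dune, k = 2, distance = "bray",
                    trymax = 20,            # maximum number of random starts
                    autotransform = FALSE,  # leave the data untransformed
                    trace = FALSE)

metanmds$stress                                 # final stress
site_scores    <- scores(metanmds, display = "sites")     # sample scores
species_scores <- scores(metanmds, display = "species")   # weighted averages
head(site_scores)

stressplot(metanmds)                            # Shepard-type diagram
```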
19. NMDS Code
- Output
  - isoMDS
    - Scores for each sample, stress values for the final solution and each iteration
  - metaMDS
    - Sample scores, species scores (weighted averages), Procrustes errors, Procrustes plots, final stress, stress at each iteration.
20. NMDS Example
21. NMDS Example
initial  value 14.250589
iter   5 value 9.263973
iter  10 value 7.920977
iter  15 value 7.734398
iter  15 value 7.726740
final  value 7.701990
converged
> nmds <- isoMDS(distance, k = 3)
initial  value 7.999925
iter   5 value 4.118777
iter  10 value 3.973427
iter  15 value 3.870255
final  value 3.822107
converged
> nmds <- isoMDS(distance, k = 5)
initial  value 1.595118
iter   5 value 1.371944
iter  10 value 1.210396
iter  15 value 1.137483
iter  20 value 1.060814
iter  25 value 1.026373
iter  30 value 1.019002
iter  30 value 1.018266
iter  35 value 1.004853
iter  35 value 1.003870
iter  35 value 1.003204
final  value 1.003204
converged
> nmds <- isoMDS(distance, k = 10)
initial  value 1.110138
iter   5 value 0.765316
iter  10 value 0.664021
iter  15 value 0.610360
iter  20 value 0.552149
iter  25 value 0.490166
iter  30 value 0.453501
iter  35 value 0.408528
iter  40 value 0.366280
iter  45 value 0.340277
iter  50 value 0.324808
final  value 0.324808
stopped after 50 iterations
> nmds <- isoMDS(distance, k = 20)
Error in isoMDS(distance, k = 20) : initial configuration must be complete
In addition: Warning messages:
1: In cmdscale(d, k) : some of the first 20 eigenvalues are < 0

- Stress ranges from 7.7 with k = 2 to 0.3 with k = 10
- Undefined at k = 20: the default PCoA starting configuration could not be built because some of the first 20 eigenvalues are negative
Note: isoMDS uses an unchanging stress value to indicate convergence on a solution. Change the tolerance (tol) value to adjust what is considered converged. The k = 10 run above did not converge; it stopped at the iteration limit (maxit), which defaults to 50.
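The effect of tol and maxit can be seen by refitting the same distances with different settings. The sketch below uses simulated data and Euclidean distances (assumptions); the specific tol and maxit values are arbitrary illustrations.

```r
library(MASS)

# Simulated abundance matrix (hypothetical data)
set.seed(1)
community <- matrix(rpois(25 * 10, lambda = 5), nrow = 25)
d <- dist(community)

# Loose tolerance: stops as soon as stress changes by less than 0.1
loose <- isoMDS(d, k = 2, tol = 1e-1, trace = FALSE)

# Defaults: tol = 1e-3, maxit = 50
default <- isoMDS(d, k = 2, trace = FALSE)

# Strict tolerance with a higher iteration limit
strict <- isoMDS(d, k = 2, tol = 1e-7, maxit = 500, trace = FALSE)

round(c(loose = loose$stress, default = default$stress, strict = strict$stress), 4)
```

A stricter tolerance generally needs a larger maxit, otherwise the run ends with "stopped after ... iterations" instead of "converged".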
22. Stress Plots
[Figure panels: "k = 10"; "k = 2"; "k = 10, first 2 axes"]
If you want a good 2D representation, k = 2 is better than k = 10 even though the stress will be higher.
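One way to check this on a given data set is to compute the stress of three configurations: the 2D solution, the full higher-dimensional solution, and only the first two axes of the higher-dimensional solution. The sketch below uses simulated data, Euclidean distances and k = 6 rather than 10 (all assumptions); `stress_of` is a hypothetical helper built on MASS::Shepard.

```r
library(MASS)

# Simulated abundance matrix (hypothetical data)
set.seed(1)
community <- matrix(rpois(25 * 10, lambda = 5), nrow = 25)
d <- dist(community)

# Hypothetical helper: percent stress of any configuration against d
stress_of <- function(d, points) {
  sh <- Shepard(d, points)
  100 * sqrt(sum((sh$y - sh$yf)^2) / sum(sh$y^2))
}

nmds2 <- isoMDS(d, k = 2, trace = FALSE)
nmds6 <- isoMDS(d, k = 6, trace = FALSE)

comparison <- c(k2           = stress_of(d, nmds2$points),
                k6_all_axes  = stress_of(d, nmds6$points),
                k6_first_two = stress_of(d, nmds6$points[, 1:2]))
round(comparison, 2)
```

The full 6D solution has the lowest stress, but a plot can only show two of its axes at a time; the third value shows how well those two axes alone represent the distances.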
23. Procrustes
24. No indication of percent variation accounted for on the two axes. However, the first axis very nicely captures the pattern in the raw data.