Interpreting Principal Components - PowerPoint PPT Presentation

About This Presentation
Title: Interpreting Principal Components

Description: Interpreting Principal Components. Simon Mason, International Research Institute for Climate Prediction, The Earth Institute of Columbia University – PowerPoint PPT presentation

Slides: 22
Provided by: DrSi63
Learn more at: https://iri.columbia.edu

Transcript and Presenter's Notes

Title: Interpreting Principal Components


1
Interpreting Principal Components
  • Simon Mason
  • International Research Institute for Climate
    Prediction
  • The Earth Institute of Columbia University

Linking Science to Society
2
Retaining Principal Components
Principal components analysis is specifically
designed as a data reduction technique. How many of
the new variables should be retained to represent
the total variability of the original variables
adequately? A stopping rule is required to identify
at which point additional principal components are
no longer required.
3
Retaining Principal Components
There is a range of criteria that could be used to
formulate a stopping rule.
Internal criteria
  1. Total variance explained
  2. Marginal variance explained
  3. Comparison with other deleted/retained
     eigenvalues
External criteria
  4. Usefulness
  5. Physical interpretability
4
Retaining Principal Components
Total variance explained
Ensures a minimum loss of information, but there are
no a priori criteria for defining what proportion of
the total variance is signal.
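
As a minimal sketch of this rule, assuming the eigenvalues of the covariance or correlation matrix are already available, the Python snippet below retains the smallest number of components whose cumulative share of the variance reaches a chosen threshold. The function name, the example eigenvalues and the thresholds are illustrative assumptions, not part of the original slides; nothing in the data fixes the threshold in advance.

```python
import numpy as np

def n_for_total_variance(eigenvalues, threshold=0.9):
    """Smallest number of leading components whose cumulative share of the
    total variance reaches `threshold` (hypothetical helper)."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # largest first
    cumulative = np.cumsum(lam) / lam.sum()
    # First position at which the cumulative proportion reaches the threshold.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, lam.size)   # guard against rounding when threshold is 1.0

eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]   # made-up values for illustration
print(n_for_total_variance(eigenvalues, threshold=0.75))
print(n_for_total_variance(eigenvalues, threshold=0.90))
```

Changing the threshold from 0.75 to 0.90 changes the answer, which is the weakness noted above.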
5
Retaining Principal Components
Marginal variance explained
Retain only those components whose variance exceeds
a threshold, c. This ensures that each component
explains a substantial proportion of the total
variance. But how should c be chosen?
6
Retaining Principal Components
Marginal variance explained
1. Original variables
For the correlation matrix, the Guttman-Kaiser
criterion sets c = 1. For the covariance matrix,
Kaiser's rule sets c to the average variance of the
original variables.
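
The threshold rule can be sketched as follows. This is an illustrative Python example, assuming the data are held in an array with one row per sample and one column per variable; the function name and the synthetic data are assumptions made for the example.

```python
import numpy as np

def kaiser_retained(data, use_correlation=True):
    """Count the components whose eigenvalue exceeds the threshold c.

    data: array of shape (n_samples, n_variables).
    Correlation matrix: c = 1.
    Covariance matrix: c = average variance of the original variables.
    """
    data = np.asarray(data, dtype=float)
    if use_correlation:
        matrix = np.corrcoef(data, rowvar=False)
        c = 1.0
    else:
        matrix = np.cov(data, rowvar=False)
        c = np.trace(matrix) / matrix.shape[0]      # mean of the variances
    eigenvalues = np.linalg.eigvalsh(matrix)[::-1]  # descending order
    return int(np.sum(eigenvalues > c)), eigenvalues, c

# Synthetic example: six variables sharing one common signal plus noise.
rng = np.random.default_rng(0)
signal = rng.standard_normal((200, 1))
data = signal + 0.5 * rng.standard_normal((200, 6))
n_keep, eigenvalues, c = kaiser_retained(data)
print(n_keep, np.round(eigenvalues, 2), c)
```

In both cases c equals the mean eigenvalue, so a component is kept only if it explains more variance than an average original variable.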
7
Retaining Principal Components
Marginal variance explained
2. Significant
  a. The broken stick rule
  b. Rule N: randomization procedures.
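
Both rules can be sketched in a few lines of Python. The version below is illustrative only: the function names, the number of trials, the 95% quantile and the use of Gaussian noise with the correlation matrix are assumptions chosen for the example. The broken stick rule compares each component's share of the variance with the expected length of the k-th longest piece of a stick broken at random into p pieces. The Rule N sketch compares each eigenvalue with eigenvalues obtained from uncorrelated random data of the same size.

```python
import numpy as np

def broken_stick(p):
    """Expected piece lengths, longest first, of a unit stick broken at
    random into p pieces: b_k = (1/p) * sum_{j=k}^{p} 1/j."""
    inverse = 1.0 / np.arange(1, p + 1)
    return np.cumsum(inverse[::-1])[::-1] / p

def n_leading_above(values, thresholds):
    """Number of leading entries of `values` that exceed `thresholds`,
    stopping at the first one that fails."""
    passed = np.asarray(values) > np.asarray(thresholds)
    return int(passed.size if passed.all() else np.argmin(passed))

def rule_n_thresholds(n_samples, n_variables, n_trials=200, quantile=0.95, seed=0):
    """Quantile of each ranked correlation-matrix eigenvalue obtained from
    uncorrelated Gaussian noise with the same dimensions as the data."""
    rng = np.random.default_rng(seed)
    trials = np.empty((n_trials, n_variables))
    for i in range(n_trials):
        noise = rng.standard_normal((n_samples, n_variables))
        trials[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    return np.quantile(trials, quantile, axis=0)

eigenvalues = np.array([3.0, 0.5, 0.3, 0.2])   # made-up correlation-matrix eigenvalues
proportions = eigenvalues / eigenvalues.sum()
print(n_leading_above(proportions, broken_stick(4)))
print(n_leading_above(eigenvalues, rule_n_thresholds(50, 4)))
```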
8
Retaining Principal Components
Similar variance explained
Delete a component if components with similar
variance are deleted.
1. χ² approximations
2. Scree test: delete eigenvalues below the elbow.
9
Retaining Principal Components
Similar variance explained
3. Log-eigenvalue test: a scree test using the
logarithms of the eigenvalues. Based on the
assumption that the eigenvalues should decline
exponentially.
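
The scree and log-eigenvalue tests are usually applied by eye. One hypothetical way to automate them is sketched below in Python: the elbow is taken to be the point where the curve bends upward most sharply, measured by the largest second difference. Both that definition of the elbow and the choice to delete the elbow point itself are assumptions made for this example. The sketch needs at least three eigenvalues, and they must all be positive if logarithms are used.

```python
import numpy as np

def elbow_cutoff(eigenvalues, use_log=False):
    """Number of components above the elbow of the eigenvalue curve
    (or of the log-eigenvalue curve when use_log is True).

    The elbow is located at the largest second difference of the curve;
    the elbow point and everything after it are deleted.
    """
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    curve = np.log(lam) if use_log else lam
    second_difference = np.diff(curve, n=2)
    # second_difference[i] is centred on curve[i + 1], so the elbow sits
    # at index argmax + 1 and that many components lie above it.
    return int(np.argmax(second_difference)) + 1

# Made-up spectrum: two large eigenvalues, then an exponential tail.
eigenvalues = [5.0, 3.0, 0.8, 0.4, 0.2, 0.1]
print(elbow_cutoff(eigenvalues))
print(elbow_cutoff(eigenvalues, use_log=True))
```

On the logarithmic scale an exponentially declining tail becomes a straight line, so its second differences are close to zero and the break from that line stands out.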
10
Retaining Principal Components
Usefulness
If the principal components are to be used in other
applications, retain the number that gives the best
results. Use cross-validation. It may be better to
retain subsets that do not necessarily include the
first few components. The choice may be subject to
sampling errors, especially when selecting subsets.
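
As an illustration of choosing the number of components by usefulness, the Python sketch below cross-validates a regression of a target series on the first k principal components. The regression application, the contiguous folds, the function name and the synthetic data are all assumptions made for the example. The components are recomputed inside each fold so that the held-out data do not influence them.

```python
import numpy as np

def cv_error_by_n_components(X, y, max_components, n_folds=5):
    """Cross-validated mean squared error of a regression of y on the
    first k principal components of X, for k = 1..max_components."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    folds = np.array_split(np.arange(len(y)), n_folds)  # contiguous blocks
    errors = np.zeros(max_components)
    for test_idx in folds:
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        # Centre with training statistics only.
        x_mean, y_mean = X[train].mean(axis=0), y[train].mean()
        Xc = X[train] - x_mean
        _, _, vt = np.linalg.svd(Xc, full_matrices=False)
        for k in range(1, max_components + 1):
            weights = vt[:k].T                    # loadings of the first k PCs
            scores = Xc @ weights
            beta, *_ = np.linalg.lstsq(scores, y[train] - y_mean, rcond=None)
            prediction = (X[test_idx] - x_mean) @ weights @ beta + y_mean
            errors[k - 1] += np.sum((y[test_idx] - prediction) ** 2)
    return errors / len(y)

# Synthetic example in which the target depends on the second-largest
# source of variance rather than the largest.
rng = np.random.default_rng(1)
X = rng.standard_normal((120, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.2])
y = X[:, 1] + 0.1 * rng.standard_normal(120)
errors = cv_error_by_n_components(X, y, max_components=4)
print(np.round(errors, 3), "best k:", int(np.argmin(errors)) + 1)
```

The sketch only compares the first k components. Searching over arbitrary subsets is possible in the same way, but it increases the exposure to the sampling errors mentioned above.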
11
Retaining Principal Components
Physical interpretability
1. Time scores: do the time scores differ from white
noise?
2. Spatial loadings: loadings identify modes of
variability.
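
One simple way to ask whether a series of time scores differs from white noise is to look at its lag-1 autocorrelation. The Python sketch below is an illustrative choice rather than a method named in the slides: it compares the lag-1 autocorrelation with the approximate 95% bounds for white noise, plus or minus 1.96 divided by the square root of n. It checks only one lag, so a series can pass this check and still contain structure at other lags.

```python
import numpy as np

def differs_from_white_noise(scores, z=1.96):
    """Return (flag, r1), where r1 is the lag-1 autocorrelation of the
    series and flag is True when |r1| exceeds z / sqrt(n)."""
    x = np.asarray(scores, dtype=float)
    x = x - x.mean()
    r1 = np.sum(x[:-1] * x[1:]) / np.sum(x ** 2)
    return bool(abs(r1) > z / np.sqrt(x.size)), float(r1)

trend = np.arange(50.0)                     # strongly persistent series
cycle = np.tile([1.0, 0.0, -1.0, 0.0], 10)  # lag-1 autocorrelation of zero
print(differs_from_white_noise(trend))
print(differs_from_white_noise(cycle))
```

The second series shows the limitation: it is perfectly periodic, yet its lag-1 autocorrelation is zero, so this check alone does not flag it.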
12
Interpreting the Principal Components
Principal components are notoriously difficult to
interpret physically. The weights are defined to
maximize the variance, not maximize the
interpretability! With spatial data (including
climate data), the interpretation becomes even more
difficult because there are geometric controls on
the correlations between the data points.
13
Buell patterns
Imagine a rectangular domain in which all the points
are strongly correlated with their neighbours.
14
Buell patterns
The points in the middle of the domain will have the
strongest average correlations with all other
points, simply because their average distance to all
other grid points is a minimum.
The strong correlations between neighbouring grid
points will be represented by PC 1, with the central
grid points dominating.
15
Buell patterns
The points in the corners of the domain will have
the weakest average correlations with all other
points, simply because their average distance to all
other grid points is a maximum.
The weak correlations between distant grid points
will be represented by PC 2. The direction of the
dipole reflects the domain shape.
16
Buell patterns?
Are these real, or are they a function of the domain
shape?
17
Buell patterns
Because of domain shape dependency:
  • the first PC frequently indicates positive
    loadings with strongest values in the centre of
    the domain
  • the second PC frequently indicates negative
    loadings on one side and positive loadings on the
    other side in the direction of the longest
    dimension of the domain.
Similar kinds of problems arise when using:
  • gridded data with converging longitudes, or
    simply with longitude spacing different from
    latitude spacing
  • station data.
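
The domain shape dependency can be reproduced with purely synthetic correlations. The Python sketch below assumes a 5 by 9 grid and a correlation that decays exponentially with distance; the grid size, the decay function and the length scale are illustrative assumptions. No physical signal is built in, yet the leading eigenvectors show the patterns described above.

```python
import numpy as np

def buell_demo(n_rows=5, n_cols=9, length_scale=3.0):
    """Leading two eigenvectors of a correlation matrix that depends only
    on the distance between the points of a rectangular grid."""
    rows, cols = np.meshgrid(np.arange(n_rows), np.arange(n_cols), indexing="ij")
    points = np.column_stack([rows.ravel(), cols.ravel()]).astype(float)
    distance = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    correlation = np.exp(-distance / length_scale)  # assumed decay with distance
    eigenvalues, eigenvectors = np.linalg.eigh(correlation)
    order = np.argsort(eigenvalues)[::-1]
    pc1 = eigenvectors[:, order[0]].reshape(n_rows, n_cols)
    pc2 = eigenvectors[:, order[1]].reshape(n_rows, n_cols)
    if pc1.sum() < 0:   # the sign of an eigenvector is arbitrary
        pc1 = -pc1
    return pc1, pc2

pc1, pc2 = buell_demo()
np.set_printoptions(precision=2, suppress=True)
print(pc1)   # loadings of the first eigenvector on the grid
print(pc2)   # loadings of the second eigenvector on the grid
```

Changing `n_rows` and `n_cols` changes the patterns even though the correlation function stays the same, which is the sense in which they are a function of the domain shape.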

18
Rotation
The principal component weights are defined to
maximize the variance, not maximize the
interpretability! The weights could be redefined to
meet alternative criteria. Rotation is sometimes
performed to maximize the weights of as many metrics
as possible, and to minimize the weights of the
others. An objective of rotation is to attain simple
structure:
1. weights are either close to zero or close to one
2. variables have high weights on only one
component.
20
Rotation
Commonly used rotation procedures include:
  • Varimax: maximises the variance of the squared
    loadings.
  • Quartimin: oblique rotation.
  • Procrustes: maximises the similarity between one
    set of loadings and a target set. Can be
    orthogonal or oblique.
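
Two of these procedures can be sketched briefly in Python. The varimax function below uses a widely used iterative scheme based on the singular value decomposition, and the Procrustes function covers only the orthogonal case. Function names, tolerances and the example loadings are assumptions made for illustration, and quartimin is not shown.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal rotation that seeks to maximise the variance of the
    squared loadings. Returns the rotated loadings and the rotation."""
    A = np.asarray(loadings, dtype=float)
    p, k = A.shape
    R = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        L = A @ R
        gradient = A.T @ (L ** 3 - L * (np.sum(L ** 2, axis=0) / p))
        u, s, vt = np.linalg.svd(gradient)
        R = u @ vt                       # nearest orthogonal matrix
        if s.sum() < previous * (1.0 + tol):
            break                        # no further improvement
        previous = s.sum()
    return A @ R, R

def orthogonal_procrustes(loadings, target):
    """Orthogonal rotation R that minimises ||loadings @ R - target||."""
    u, _, vt = np.linalg.svd(np.asarray(loadings).T @ np.asarray(target))
    return u @ vt

def varimax_criterion(loadings):
    """Sum over components of the variance of the squared loadings."""
    return float(np.sum(np.var(np.asarray(loadings) ** 2, axis=0)))

# Made-up loadings with simple structure, then mixed by a 30 degree rotation.
simple = np.array([[0.9, 0.0], [0.8, 0.0], [0.7, 0.0],
                   [0.0, 0.9], [0.0, 0.8], [0.0, 0.7]])
theta = np.deg2rad(30.0)
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
mixed = simple @ Q
rotated, R = varimax(mixed)
print(np.round(rotated, 2))
print(np.round(mixed @ orthogonal_procrustes(mixed, simple), 2))
```

Because both rotations are orthogonal, the sum of squared loadings for each variable is unchanged; only the way it is shared between the components changes.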

21
Rotation
Rotation does NOT solve Buell pattern problems, nor
the problems with station data and unevenly gridded
data; it only reduces them. What if a mode does not
have simple structure (for example, a general
warming trend)? These problems are only of concern
for interpretation. Rotation may be redundant if the
principal components are used as input into some
other procedure.