Title: Principal Components: An Introduction
1. Principal Components: An Introduction
- Exploratory factoring
- Meaning and application of principal components
- Basic steps in a PC analysis
- PC extraction process
- Determining the number of PCs
  - Statistical approaches
  - Mathematical approaches
  - Nontrivial factors approaches
2. Exploratory vs. Confirmatory Factoring
- Exploratory Factoring: when we do not have research hypotheses (RH) about...
  - the number of factors
  - what variables load on which factors
  - we will explore the factor structure of the variables, consider multiple alternative solutions, and arrive at a post hoc solution
- Weak Confirmatory Factoring: when we have RH about the factors and factor memberships
  - we will test the proposed weak a priori factor structure
- Strong Confirmatory Factoring: when we have RH about the relative strength of contribution to factors by variables
  - we will test the proposed strong a priori factor structure
3. Meaning of Principal Components
- Component analyses are those based on the full correlation matrix
  - 1.00s in the diagonal
  - yep, there are other kinds; more later
- Principal analyses are those for which each successive factor...
  - accounts for the maximum available variance
  - is orthogonal to (uncorrelated with, independent of) all prior factors
  - the full solution (as many factors as variables) accounts for all the variance
4. Applications of PC Analysis
- Components analysis is a kind of data reduction
  - start with an inter-related set of measured variables
  - identify a smaller set of composite variables that can be constructed from the measured variables and that carry as much of their information as possible
- A full components solution...
  - has as many PCs as variables
  - accounts for 100% of the variables' variance
  - each variable has a final communality of 1.00 -- all of its variance is accounted for by the full set of PCs
- A truncated components solution...
  - has fewer PCs than variables
  - accounts for <100% of the variables' variance
  - each variable has a communality < 1.00 -- not all of its variance is accounted for by the PCs (see the sketch below)
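To make the full vs. truncated contrast concrete, here is a minimal numpy sketch (with toy random data standing in for real measured variables) that extracts a full components solution from a correlation matrix and shows what happens to the communalities when the solution is truncated:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))            # 200 cases, 6 measured variables (toy data)
R = np.corrcoef(X, rowvar=False)         # correlation matrix (1.00s in the diagonal)

eigvals, eigvecs = np.linalg.eigh(R)     # full components solution
order = np.argsort(eigvals)[::-1]        # order PCs by variance accounted for
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
loadings = eigvecs * np.sqrt(eigvals)    # structure weights (variable-PC correlations)

# Full solution: each variable's communality (row sum of squared loadings) is 1.00
print(np.round(np.sum(loadings ** 2, axis=1), 3))

# Truncated solution (keep 2 PCs): communalities drop below 1.00,
# and less than 100% of the variables' variance is accounted for
kept = loadings[:, :2]
print(np.round(np.sum(kept ** 2, axis=1), 3))
print(round(eigvals[:2].sum() / eigvals.sum(), 3))
```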
5. The Basic Steps of a PC Analysis
- Compute the correlation matrix
- Extract a full components solution
- Determine the number of components to keep, based on...
  - total variance accounted for
  - variable communalities
  - interpretability
  - replicability
- Rotate the components and interpret (name) them
  - structure weights > .3-.4 define which variables load
- Compute component scores
- Apply the components solution
  - theoretically -- understand the meaning of the data reduction
  - statistically -- use the component scores in other analyses
(The full sequence of steps is sketched in code below.)
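As a rough map of these steps onto code, the sketch below uses scikit-learn (an assumed tool choice; any package that extracts PCs from standardized variables would do). The placeholder data and the λ > 1.00 keep-rule are only there to get started; rotation is sketched on a later slide:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 8))            # placeholder for the measured variables

# Steps 1-2: standardize (so PCA works from the correlation structure) and extract a full solution
Z = StandardScaler().fit_transform(X)
pca = PCA(n_components=X.shape[1]).fit(Z)

# Step 3: decide how many components to keep (here, the lambda > 1.00 default)
eigvals = pca.explained_variance_        # approximately the eigenvalues of R
n_keep = int(np.sum(eigvals > 1.0))

# Step 4 (rotate and name) is covered later; Step 5: compute component scores
scores = pca.transform(Z)[:, :n_keep]    # component scores for use in other analyses
print(n_keep, round(pca.explained_variance_ratio_[:n_keep].sum(), 3))
```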
6. PC Factor Extraction
- Extraction is the process of forming PCs as linear combinations of the measured variables
  - PC1 = b11X1 + b21X2 + ... + bk1Xk
  - PC2 = b12X1 + b22X2 + ... + bk2Xk
  - ...
  - PCf = b1fX1 + b2fX2 + ... + bkfXk
- Here's the thing to remember:
  - We usually perform factor analyses to find out how many groups of related variables there are, however...
  - The mathematical goal of extraction is to reproduce the variables' variance, efficiently
7. PC Factor Extraction, cont.
- Consider the correlation matrix R below
- Obviously there are 2 kinds of information among these 4 variables: X1 & X2 vs. X3 & X4

          X1    X2    X3    X4
    X1   1.0
    X2    .7   1.0
    X3    .3    .3   1.0
    X4    .3    .3    .5   1.0

- Looks like the PCs should be formed as...
  - PC1 = b11X1 + b21X2 -- capturing the information in X1 & X2
  - PC2 = b32X3 + b42X4 -- capturing the information in X3 & X4
- But remember, PC extraction isn't trying to group variables; it is trying to reproduce variance
  - notice that there are cross-correlations between the groups of variables!!
8. PC Factor Extraction, cont.
- So, because of the cross-correlations, in order to maximize the variance reproduced, PC1 will be formed more like...
  - PC1 = .5X1 + .5X2 + .4X3 + .4X4
  - Notice that all the variables contribute to defining PC1
  - Notice the slightly higher loadings for X1 & X2
- Because PC1 didn't focus on the X1 & X2 variable group or the X3 & X4 variable group, there will still be variance to account for in both, and PC2 will be formed, probably something like...
  - PC2 = .3X1 + .3X2 - .4X3 - .4X4
  - Notice that all the variables contribute to defining PC2
  - Notice the slightly higher loadings for X3 & X4 (a worked numeric version of this example follows below)
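Here is a small numpy check of this point, eigendecomposing the R from the previous slide. The values differ a bit from the rounded weights above (and eigenvector signs are arbitrary), but the pattern is the same: both PCs involve all four variables.

```python
import numpy as np

# The correlation matrix from the previous slide
R = np.array([[1.0, 0.7, 0.3, 0.3],
              [0.7, 1.0, 0.3, 0.3],
              [0.3, 0.3, 1.0, 0.5],
              [0.3, 0.3, 0.5, 1.0]])

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
loadings = eigvecs * np.sqrt(eigvals)     # unrotated structure weights

print(np.round(eigvals, 2))               # approx [2.21, 0.99, 0.50, 0.30]
print(np.round(loadings[:, :2], 2))
# PC1 loads on all four variables (a bit higher for X1 and X2);
# PC2 also involves all four, with X3 and X4 opposite in sign to X1 and X2
```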
9. PC Factor Extraction, cont.
- While this set of PCs will account for lots of the variables' variance, it doesn't provide a very satisfactory interpretation
  - PC1 has all 4 variables loading on it
  - PC2 has all 4 variables loading on it, and 2 of them have negative weights, even though all the variables are positively correlated with each other
- The goal here was to point out what extraction does (maximize the variance accounted for) and what it doesn't do (find groups of variables)
10. Determining the Number of PCs
- Determining the number of PCs is arguably the most important decision in the analysis
  - rotation, interpretation, and use of the PCs are all influenced by how many PCs are kept for those processes
  - there are many different procedures available; none are guaranteed to work!!
- Probably the best approach to determining the number of PCs: remember that this is an exploratory factoring -- that means you don't have decent RH about the number of factors
- So explore...
  - consider different reasonable numbers of PCs and try them out
  - rotate, interpret, and/or try out the resulting factor scores from each, and then decide
To get started, we'll use the SPSS standard of λ > 1.00.
11. Statistical Procedures
- PC analyses are extracted from a correlation matrix
- PCs should only be extracted if there is systematic covariation in the correlation matrix
  - this is known as the sphericity question
  - note that the test asks whether the next PC should be extracted
- There are two different sphericity tests
  - whether there is any systematic covariation in the original R
  - whether there is any systematic covariation left in the partial R, after a given number of factors has been extracted
- Both tests are called Bartlett's Sphericity Test
12. Statistical Procedures, cont.
- Applying Bartlett's Sphericity Tests (sketched in code below)
  - retaining H0 means don't extract another factor
  - rejecting H0 means extract the next factor
- Significance tests provide a p-value, and so a known probability that the next factor is 1 too many (a Type I error)
- Like all significance tests, these are influenced by N
  - larger N means more power, so we are more likely to reject H0, more likely to keep the next factor, and more likely to make a Type I error
- Quandary?!? Samples large enough to have a stable R are likely to have excessive power and lead to over-factoring
- Be sure to also consider variance accounted for, replication, and interpretability
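Here is a minimal sketch of Bartlett's sphericity test applied to an original R (the version applied to the partial R after extracting factors works the same way, on the residual matrix). The chi-square approximation is the standard one; the 4-variable R and n = 200 are illustrative assumptions:

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(R, n):
    """Test H0 that R is spherical (an identity matrix, no systematic covariation).
    Rejecting H0 says there is covariation worth factoring."""
    p = R.shape[0]
    stat = -(n - 1 - (2 * p + 5) / 6.0) * np.log(np.linalg.det(R))
    df = p * (p - 1) / 2.0
    return stat, chi2.sf(stat, df)

R = np.array([[1.0, 0.7, 0.3, 0.3],
              [0.7, 1.0, 0.3, 0.3],
              [0.3, 0.3, 1.0, 0.5],
              [0.3, 0.3, 0.5, 1.0]])
stat, p_value = bartlett_sphericity(R, n=200)
print(round(stat, 1), p_value)   # large chi-square, tiny p -> extract at least one PC
# Note how the statistic grows with n: the same R with n = 2000 gives a far larger
# chi-square, which is the over-factoring quandary described above.
```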
13. Mathematical Procedures
- The most commonly applied decision rule (and the default in most stats packages -- chicken & egg?) is the λ > 1.00 rule; here's the logic
- Part 1
  - Imagine a spherical R (of k variables)
    - each variable is independent and carries unique information
    - so, each variable has 1/kth of the information in R
  - For a normal R (of k variables)
    - each variable, on average, has 1/kth of the information in R
14. Mathematical Procedures, cont.
- Part 2
  - The trace of a matrix is the sum of its diagonal elements
  - So, the trace of R (with 1s in the diagonal) = k (the number of variables)
  - λ tells the amount of variance in R accounted for by each extracted PC
  - for a full PC solution, Σλ = k (accounts for all the variance)
- Part 3
  - PC is about data reduction and parsimony
  - trading a few more-complex things (PCs -- linear combinations of variables) for the many simpler things (the original variables)
15. Mathematical Procedures, cont.
- Putting it all together (hold on tight!)
  - Any PC with λ > 1.00 accounts for more variance than the average variable in that R
    - that PC has parsimony -- the more complex composite carries more information than the average variable
  - Any PC with λ < 1.00 accounts for less variance than the average variable in that R
    - that PC doesn't have parsimony -- the more complex composite carries no more information than the average variable
(A quick numeric check of this rule follows below.)
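A quick numeric check of the trace and λ > 1.00 logic, using the 4-variable R from the extraction example. Note that the rule keeps only one PC here, even though there are arguably two groups of variables; that is part of why no single rule is guaranteed to work.

```python
import numpy as np

R = np.array([[1.0, 0.7, 0.3, 0.3],
              [0.7, 1.0, 0.3, 0.3],
              [0.3, 0.3, 1.0, 0.5],
              [0.3, 0.3, 0.5, 1.0]])

eigvals = np.linalg.eigvalsh(R)[::-1]          # variance accounted for by each PC
print(np.trace(R), round(eigvals.sum(), 2))    # both equal k (= 4 here)
print(np.round(eigvals, 2))                    # approx [2.21, 0.99, 0.50, 0.30]
print(int(np.sum(eigvals > 1.0)))              # the lambda > 1.00 rule keeps 1 PC
```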
16. Mathematical Procedures, cont.
- There have been examinations of the accuracy of this criterion
  - The usual procedure is to generate a set of variables from a known number of factors (vk = b1kPC1 + ... + bfkPCf, etc.), while varying N, the number of factors, and the PCs' communalities
  - Then factor those variables and see if the λ > 1.00 rule leads to the correct number of factors
- Results -- the rule works pretty well on average, which really means that it gets the number of factors right sometimes, underestimates sometimes, and overestimates sometimes
  - No one has produced an accurate rule for assessing when each of these will occur
  - But the rule is most accurate with k < 40, f between k/5 and k/3, and N > 300
(A toy version of this kind of simulation is sketched below.)
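A toy version of this kind of simulation (my own simplified setup, not a reproduction of the published studies): generate k variables from f known factors with a fixed loading, then count how often the λ > 1.00 rule recovers f exactly.

```python
import numpy as np

rng = np.random.default_rng(2)

def kaiser_hit_rate(k=12, f=3, n=300, loading=0.6, reps=50):
    """Proportion of simulated samples in which the lambda > 1.00 rule
    recovers the known number of factors f."""
    hits = 0
    for _ in range(reps):
        F = rng.normal(size=(n, f))                    # factor scores
        B = np.zeros((k, f))
        for j in range(k):
            B[j, j % f] = loading                      # each variable loads on one factor
        E = rng.normal(size=(n, k)) * np.sqrt(1 - loading ** 2)   # unique variance
        X = F @ B.T + E
        eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))
        hits += int(np.sum(eigvals > 1.0) == f)
    return hits / reps

print(kaiser_hit_rate())   # how often the rule is exactly right in this toy setup
```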
17. Nontrivial Factors Procedures
- These common-sense approaches became increasingly common as...
  - the limitations of the statistical and mathematical procedures became better known
  - the distinction between exploratory and confirmatory factoring developed, and the crucial role of successful exploring became better known
- These procedures are more like judgement calls and require greater application of content knowledge and persuasion, but they are often the basis of good factorings!!
18. Nontrivial Factors Procedures, cont.
- Scree -- the junk that piles up at the foot of a glacier
  - a diminishing-returns approach
  - plot the λ for each factor and look for the elbow (a plotting sketch appears below)
  - Old rule -- number of factors = number at the elbow (1966; 3 in the plot below)
  - New rule -- number of factors = number at the elbow - 1 (1967; 2 in the plot below)
- Sometimes there isn't a clear elbow -- try another rule
- This approach seems to work best when combined with attention to interpretability!!
[Scree plot: eigenvalue (λ) on the y-axis (0-4) vs. PC number (1-6) on the x-axis, with the elbow at PC 3]
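A minimal matplotlib sketch of a scree plot (the random data here are just a placeholder for your own measured variables):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 6))                  # replace with your measured variables

eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
plt.plot(np.arange(1, len(eigvals) + 1), eigvals, "o-")
plt.axhline(1.0, linestyle="--")               # lambda > 1.00 reference line
plt.xlabel("Principal component")
plt.ylabel("Eigenvalue (lambda)")
plt.title("Scree plot: look for the elbow")
plt.show()
```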
19. An Example
A buddy in graduate school wanted to build a measure of contemporary morality. He started with the 10 Commandments and the 7 Deadly Sins and created a 56-item scale with 8 subscales. His scree plot looked like the one summarized below. How many factors?
[Scree plot: eigenvalue (λ) vs. component number, 1 to 56, with a very large first eigenvalue, a big elbow at component 2, and a smaller elbow at component 8]
- 1? There is a big elbow at 2, so the '67 rule suggests a single factor, which clearly accounts for the biggest portion of variance
- 7? There is a smaller elbow at 8, so the '67 rule suggests 7
- 8? With the smaller elbow at 8, the '66 rule gives the 8 he was looking for; also, the 8th eigenvalue was > 1.0 and the 9th was < 1.0
- Remember that these are subscales of a central construct, so...
  - items will have substantial correlations both within and between subscales
  - to maximize the variance accounted for, the first factor is likely to pull in all these inter-correlated variables, leading to a large λ for the first (general) factor and much smaller λs for subsequent factors
- This is a common scree configuration when factoring items from a multi-subscale scale!
20. Rotation -- Finding Groups in the Variables
- Factor rotations
  - change the "viewing angle" or "head tilt" of the factor space
  - make the groupings visible in the graph apparent in the structure matrix (a varimax sketch follows below)

    Unrotated structure          Rotated structure
          PC1    PC2                   PC1    PC2
    V1    .7     .5              V1    .7    -.1
    V2    .6     .6              V2    .7     .1
    V3    .6    -.5              V3    .1     .5
    V4    .7    -.6              V4    .2     .6

[Plot: the four variables in PC1-PC2 space before and after the axes are rotated; after rotation V1 & V2 lie close to PC1 and V3 & V4 lie close to PC2]
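Here is a small varimax sketch (one common orthogonal rotation; the implementation is a standard textbook version, not SPSS's exact routine), applied to unrotated loadings like those obtained from the 4-variable R earlier:

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a loading (structure) matrix --
    the 'head tilt' that makes variable groupings easier to see."""
    p, k = loadings.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        u, s, vt = np.linalg.svd(
            loadings.T @ (L ** 3 - L @ np.diag(np.sum(L ** 2, axis=0)) / p))
        R = u @ vt
        d_new = np.sum(s)
        if d_new < d * (1 + tol):      # stop when the criterion no longer improves
            break
        d = d_new
    return loadings @ R

# Unrotated loadings of the first two PCs from the 4-variable example
unrotated = np.array([[0.80, 0.46],
                      [0.80, 0.46],
                      [0.68, -0.54],
                      [0.68, -0.54]])
print(np.round(varimax(unrotated), 2))
# After rotation (up to arbitrary sign flips of the columns), X1 & X2 load mainly
# on one PC and X3 & X4 on the other, much like the rotated structure shown above.
```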
21. Interpretation -- Naming Groups in the Variables
- Usually we interpret factors using the rotated solution
- Factors are named for the variables correlated with them
  - usual cutoffs are +/- .3 to .4
  - so a variable that shares at least 9-16% of its variance with a factor is used to name that factor
  - variables may load on none, 1, or 2 factors

    Rotated structure
          PC1    PC2
    V1    .7    -.1
    V2    .7     .1
    V3    .1     .5
    V4    .2     .6

This rotated structure is easy: PC1 is V1 & V2, PC2 is V3 & V4. It is seldom this easy!?!?! (A small cutoff-based naming sketch follows below.)
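A tiny sketch of cutoff-based naming, applying a .40 cutoff to the rotated structure matrix above:

```python
import numpy as np

# Rotated structure matrix from the slide (rows = variables, columns = PCs)
rotated = np.array([[0.7, -0.1],    # V1
                    [0.7,  0.1],    # V2
                    [0.1,  0.5],    # V3
                    [0.2,  0.6]])   # V4

cutoff = 0.40
for i, row in enumerate(rotated, start=1):
    loads_on = [f"PC{j + 1}" for j, w in enumerate(row) if abs(w) >= cutoff]
    print(f"V{i} loads on: {loads_on if loads_on else 'none'}")
# V1 and V2 define PC1; V3 and V4 define PC2 -- each shares at least
# cutoff**2 (16% here) of its variance with the factor it helps to name.
```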
22. Kinds of Factors
- General factor
  - all or almost all variables load
  - there is a dominant underlying theme among the set of variables which can be represented with a single composite variable
- Group factor
  - some subset of the variables load
  - there is an identifiable sub-theme in the variables that must be represented with a specific subset of the variables
  - smaller vs. larger group factors (in terms of the number of variables and the variance accounted for)
- Unique factor
  - a single variable loads
23. Kinds of Variables
- Univocal variable -- loads on a single factor
- Multivocal variable -- loads on 2 factors
- Nonvocal variable -- doesn't load on any factor
- You should notice a pattern here
  - a higher cutoff (e.g., .40) tends to produce...
    - fewer variables loading on a given factor
    - less likelihood of a general factor
    - fewer multivocal variables
    - more nonvocal variables
  - a lower cutoff (e.g., .30) tends to produce...
    - more variables loading on a given factor
    - more likelihood of a general factor
    - more multivocal variables
    - fewer nonvocal variables