Principal Components: An Introduction - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Principal Components: An Introduction

1
Principal Components: An Introduction
  • exploratory factoring
  • meaning & application of principal components
  • Basic steps in a PC analysis
  • PC extraction process
  • determining the # of PCs
  • Statistical approaches
  • Mathematical approaches
  • Nontrivial factors approaches

2
Exploratory vs. Confirmatory Factoring
  • Exploratory Factoring -- when we do not have RH
    (research hypotheses) about . . .
  • the number of factors
  • what variables load on which factors
  • we will explore the factor structure of the
    variables, consider multiple alternative
    solutions, and arrive at a post hoc solution
  • Weak Confirmatory Factoring -- when we have RH
    about the factors and factor memberships
  • we will test the proposed weak a priori factor
    structure
  • Strong Confirmatory Factoring -- when we have RH
    about the relative strength of contribution to
    factors by variables
  • we will test the proposed strong a priori
    factor structure

3
Meaning of Principal Components
  • Component analyses are those that are based on
    the full correlation matrix
  • 1.00s in the diagonal
  • yep, there's other kinds -- more later
  • Principal analyses are those for which each
    successive factor . . .
  • accounts for the maximum available variance
  • is orthogonal (uncorrelated, independent) to
    all prior factors
  • the full solution (as many factors as variables)
    accounts for all the variance

4
Applications of PC analysis
  • Components analysis is a kind of data reduction
  • start with an inter-related set of measured
    variables
  • identify a smaller set of composite variables
    that can be constructed from the measured
    variables and that carry as much of their
    information as possible
  • A Full components solution ...
  • has as many PCs as variables
  • accounts for 100% of the variables' variance
  • each variable has a final communality of 1.00 --
    all of its variance is accounted for by the full
    set of PCs
  • A Truncated components solution ...
  • has fewer PCs than variables
  • accounts for <100% of the variables' variance
  • each variable has a communality < 1.00 -- not all
    of its variance is accounted for by the PCs

5
The basic steps of a PC analysis
  • Compute the correlation matrix
  • Extract a full components solution
  • Determine the number of components to keep
  • total variance accounted for
  • variable communalities
  • Rotate the components and interpret (name)
    them
  • Structure weights > .3-.4 define which
    variables load
  • Compute component scores
  • Apply the components solution
  • theoretically -- understand the meaning of the
    data reduction
  • statistically -- use the component scores in
    other analyses
  • interpretability
  • replicability
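
In code, these steps look something like the following NumPy sketch (the data X and all names here are hypothetical; rotation is deferred to a later slide):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))       # hypothetical data: 300 cases x 6 variables

# 1. Compute the correlation matrix
R = np.corrcoef(X, rowvar=False)

# 2. Extract a full components solution (eigendecomposition of R)
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]           # order PCs by variance accounted for
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
loadings = eigvecs * np.sqrt(eigvals)       # structure weights (variable-PC r's)

# 3. Determine the number of components to keep (SPSS default: lambda > 1.00)
keep = eigvals > 1.0
print("proportion of variance:", eigvals / len(eigvals))
print("communalities:", (loadings[:, keep] ** 2).sum(axis=1))

# 4. (rotation and interpretation are shown on a later slide)

# 5. Compute component scores from the standardized variables
Z = (X - X.mean(axis=0)) / X.std(axis=0)
scores = Z @ eigvecs[:, keep]
```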

6
PC Factor Extraction
  • Extraction is the process of forming PCs as
    linear combinations of the measured variables
  • PC1 = b11X1 + b21X2 + ... + bk1Xk
  • PC2 = b12X1 + b22X2 + ... + bk2Xk
  • PCf = b1fX1 + b2fX2 + ... + bkfXk
  • Here's the thing to remember ...
  • We usually perform factor analyses to find out
    how many groups of related variables there are,
    however ...
  • The mathematical goal of extraction is to
    reproduce the variables' variance, efficiently

7
PC Factor Extraction, cont.
  • Consider the R below
  • Obviously there are 2 kinds of information among
    these 4 variables
  • X1 & X2     X3 & X4

        X1    X2    X3    X4
  X1   1.0
  X2    .7   1.0
  X3    .3    .3   1.0
  X4    .3    .3    .5   1.0

  • Looks like the PCs should be formed as ...
  • PC1 = b11X1 + b21X2 -- capturing the
    information in X1 & X2
  • PC2 = b32X3 + b42X4 -- capturing the
    information in X3 & X4
  • But remember, PC extraction isn't trying to
    group variables, it is trying to reproduce
    variance
  • notice that there are cross correlations
    between the groups of variables !!

8
PC Factor Extraction, cont.
  • So, because of the cross correlations, in order
    to maximize the variance reproduced, PC1 will be
    formed more like ...
  • PC1 = .5X1 + .5X2 + .4X3 + .4X4
  • Notice that all the variables contribute to
    defining PC1
  • Notice the slightly higher loadings for X1 & X2
  • Because PC1 didn't focus on the X1 & X2 variable
    group or the X3 & X4 variable group, there
    will still be variance to account for in both,
    and PC2 will be formed, probably something like
  • PC2 = .3X1 + .3X2 - .4X3 - .4X4
  • Notice that all the variables contribute to
    defining PC2
  • Notice the slightly higher loadings for X3 & X4
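
To see this numerically, here is a sketch that eigendecomposes the example R from the previous slide (the weights above were rounded; an eigen-solver may also flip the signs of a whole column):

```python
import numpy as np

R = np.array([[1.0, 0.7, 0.3, 0.3],      # the example R from the previous slide
              [0.7, 1.0, 0.3, 0.3],
              [0.3, 0.3, 1.0, 0.5],
              [0.3, 0.3, 0.5, 1.0]])

eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print(np.round(eigvecs[:, 0], 2))   # PC1: every variable contributes,
                                    #      slightly more from X1 & X2
print(np.round(eigvecs[:, 1], 2))   # PC2: X1 & X2 vs. X3 & X4 split in sign
```

Up to rounding, PC1's weights come out near (.54, .54, .46, .46) and PC2's near (.46, .46, -.54, -.54) -- all four variables contribute to both, with the group emphasis described above.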

9
PC Factor Extraction, cont.
  • While this set of PCs will account for lots of
    the variables' variance -- it doesn't provide a
    very satisfactory interpretation
  • PC1 has all 4 variables loading on it
  • PC2 has all 4 variables loading on it, and 2 of
    them have negative weights, even though all the
    variables are positively correlated with each
    other
  • The goal here was to point out what extraction
    does (maximize variance accounted for) and what
    it doesn't do (find groups of variables)

10
Determining the Number of PCs
  • Determining the number of PCs is arguably the
    most important decision in the analysis
  • rotation, interpretation and use of the PCs are
    all influenced by how many PCs are kept for
    those processes
  • there are many different procedures available --
    none are guaranteed to work !!
  • probably the best approach to determining the #
    of PCs ...
  • remember that this is an exploratory factoring
    -- that means you don't have decent RH about the
    number of factors
  • So Explore ...
  • consider different reasonable #s of PCs and try
    them out
  • rotate, interpret &/or try out the resulting
    factor scores from each, and then decide

To get started we'll use the SPSS standard of
λ > 1.00
11
Statistical Procedures
  • PC analyses are extracted from a correlation
    matrix
  • PCs should only be extracted if there is
    systematic covariation in the correlation
    matrix
  • This is known as the sphericity question
  • Note the test asks whether the next PC should
    be extracted
  • There are two different sphericity tests
  • Whether there is any systematic covariation in
    the original R
  • Whether there is any systematic covariation left
    in the partial R, after a given number of factors
    has been extracted
  • Both tests are called Bartlett's Sphericity Test

12
Statistical Procedures, cont.
  • Applying Bartlett's Sphericity Tests
  • Retaining H0 means don't extract another
    factor
  • Rejecting H0 means extract the next factor
  • Significance tests provide a p-value, and so a
    known probability that the next factor is 1 too
    many (a Type I error)
  • Like all significance tests, these are influenced
    by N
  • larger N → more power → more likely to reject H0
    → more likely to keep the next factor (& make a
    Type I error)
  • Quandary?!?
  • Samples large enough to have a stable R are
    likely to have excessive power and lead to
    over-factoring
  • Be sure to consider variance, replication &
    interpretability
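
Here is a minimal sketch of Bartlett's test applied to an original R, using the standard chi-square approximation; the example R is the one from slide 7, and both Ns are hypothetical:

```python
import numpy as np
from scipy.stats import chi2

def bartlett_sphericity(R, n):
    """H0: no systematic covariation in R (R is an identity matrix)."""
    k = R.shape[0]
    statistic = -(n - 1 - (2 * k + 5) / 6) * np.log(np.linalg.det(R))
    df = k * (k - 1) / 2
    return statistic, chi2.sf(statistic, df)

R = np.array([[1.0, 0.7, 0.3, 0.3],
              [0.7, 1.0, 0.3, 0.3],
              [0.3, 0.3, 1.0, 0.5],
              [0.3, 0.3, 0.5, 1.0]])

stat, p = bartlett_sphericity(R, n=200)
print(stat, p)                           # rejecting H0 -> extract the next
                                         # (here, the first) factor
print(bartlett_sphericity(R, n=2000))    # same R, larger N: bigger statistic --
                                         # the power problem described above
```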

13
Mathematical Procedures
  • The most commonly applied decision rule (and the
    default in most stats packages -- chicken & egg
    ?) is the λ > 1.00 rule -- here's the logic ...
  • Part 1
  • Imagine a spherical R (of k variables)
  • each variable is independent and carries unique
    information
  • so, each variable has 1/kth of the information in
    R
  • For a normal R (of k variables)
  • each variable, on average, has 1/kth of the
    information in R

14
Mathematical Procedure, cont.
  • Part 2
  • The trace of a matrix is the sum of its
    diagonal
  • So, the trace of R (with 1s in the diagonal) = k
    (# vars)
  • λ tells the amount of variance in R accounted for
    by each extracted PC
  • for a full PC solution, Σλ = k (accounts for all
    the variance)
  • Part 3
  • PC is about data reduction and parsimony
  • trading a few more-complex things (PCs -- linear
    combinations of variables) for many more-simple
    things (the original variables)

15
Mathematical Procedure, cont.
  • Putting it all together (hold on tight !)
  • Any PC with λ > 1.00 accounts for more variance
    than the average variable in that R
  • That PC has parsimony -- the more complex
    composite carries more information than the
    average variable
  • Any PC with λ < 1.00 accounts for less variance
    than the average variable in that R
  • That PC doesn't have parsimony -- the more
    complex composite carries no more information
    than the average variable
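
The bookkeeping behind this logic is easy to verify numerically; a sketch reusing the example R from slide 7:

```python
import numpy as np

R = np.array([[1.0, 0.7, 0.3, 0.3],
              [0.7, 1.0, 0.3, 0.3],
              [0.3, 0.3, 1.0, 0.5],
              [0.3, 0.3, 0.5, 1.0]])

eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]

print(np.trace(R))       # trace = k = 4 (1s in the diagonal)
print(eigvals.sum())     # the lambdas also sum to k: the full solution
                         # accounts for all of the variance
print(np.round(eigvals, 2), eigvals > 1.0)
```

With this R the second λ comes out just under 1.00 (about .99), so the rule keeps a single PC even though the matrix was built from two variable groups -- a small reminder that no decision rule is guaranteed to work.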

16
Mathematical Procedure, cont.
  • There have been examinations of the accuracy of
    this criterion
  • The usual procedure is to generate a set of
    variables from a known number of factors (vk =
    b1kPC1 + ... + bfkPCf, etc.) -- while varying N,
    # factors, # PCs & communalities
  • Then factor those variables and see if λ > 1.00
    leads to the correct number of factors
  • Results -- the rule works pretty well on the
    average, which really means that it gets the #
    of factors right sometimes, underestimates
    sometimes and overestimates sometimes
  • No one has generated an accurate rule for
    assessing when each of these occurs
  • But the rule is most accurate with k < 40, f
    between k/5 and k/3, and N > 300
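
A sketch of that kind of simulation check (the structure, loadings, N and sizes here are all hypothetical choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, f = 300, 12, 3              # cases, variables, true number of factors

factors = rng.normal(size=(n, f))
assign = np.repeat(np.arange(f), k // f)     # each variable built from 1 factor
X = 0.7 * factors[:, assign] + 0.5 * rng.normal(size=(n, k))

R = np.corrcoef(X, rowvar=False)
eigvals = np.linalg.eigvalsh(R)
print("true f =", f, " lambda > 1.00 keeps:", (eigvals > 1.0).sum())
# In this easy case the rule usually recovers f; varying n, k, f and the
# loadings shows the under- and over-estimation described above.
```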

17
Nontrivial Factors Procedures
  • These common sense approaches became increasingly
    common as ...
  • the limitations of statistical and mathematical
    procedures became better known
  • the distinction between exploratory and
    confirmatory factoring developed, and the crucial
    role of successful exploring became better
    known
  • These procedures are more like judgement calls
    and require greater application of content
    knowledge and persuasion, but are often the
    basis of good factorings !!

18
Nontrivial factors Procedures, cont.
  • Scree -- the junk that piles up at the foot of
    a glacier
  • a diminishing returns approach
  • plot the λ for each factor and look for the
    elbow
  • Old rule -- # factors = elbow (1966; 3 below)
  • New rule -- # factors = elbow - 1 (1967; 2
    below)
  • Sometimes there isn't a clear elbow -- try
    another rule
  • This approach seems to work best when combined
    with attention to interpretability !!

[Scree plot: λ (0-4) on the y-axis vs. PC number (1-6) on the x-axis; the elbow is at PC 3]
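
Once the λs are in hand, the scree plot itself is a few lines of matplotlib (a sketch with hypothetical eigenvalues):

```python
import numpy as np
import matplotlib.pyplot as plt

eigvals = np.array([3.1, 1.2, 0.7, 0.5, 0.3, 0.2])   # hypothetical lambdas

plt.plot(np.arange(1, 7), eigvals, "o-")
plt.xlabel("PC")
plt.ylabel("lambda")
plt.axhline(1.0, linestyle="--")     # the lambda > 1.00 reference line
plt.show()
# Elbow at PC 2: the '66 rule keeps 2 factors, the '67 rule keeps 1
```
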
19
An Example
A buddy in graduate school wanted to build a
measure of contemporary morality. He started
with the 10 Commandments and the 7 Deadly
Sins and created a 56-item scale with 8
subscales. His scree plot looked like the one
below. How many factors?

[Scree plot: λ (0-20) vs. PC (1-56), with a dominant first eigenvalue, a big elbow at PC 2 and a smaller elbow at PC 8]

  • 1? The big elbow is at 2, so the '67 rule suggests
    a single factor, which clearly accounts for the
    biggest portion of variance
  • 7? There is a smaller elbow at 8, so the '67 rule
    suggests 7
  • 8? With the smaller elbow at 8, the '66 rule gives
    the 8 he was looking for; also the 8th λ was > 1.0
    and the 9th was < 1.0
  • Remember that these are subscales of a central
    construct, so ...
  • items will have substantial correlations both
    within and between subscales
  • to maximize the variance accounted for, the
    first factor is likely to pull in all these
    inter-correlated variables, leading to a large λ
    for the first (general) factor and much smaller
    λs for subsequent factors
  • This is a common scree configuration when
    factoring items from a multi-subscale scale!

20
Rotation -- finding groups in the variables
  • Factor Rotations
  • changing the viewing angle or head tilt of
    the factor space
  • makes the groupings visible in the graph &
    apparent in the structure matrix

Unrotated Structure
        PC1    PC2
V1      .7     .5
V2      .6     .6
V3      .6    -.5
V4      .7    -.6

Rotated Structure
        PC1    PC2
V1      .7    -.1
V2      .7     .1
V3      .1     .5
V4      .2     .6

[Plot: V1-V4 shown in the factor space, with the unrotated and rotated PC1 & PC2 axes]
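
The rotation most commonly used in practice is varimax (an orthogonal rotation); here is a sketch of Kaiser's varimax algorithm applied to the unrotated structure above (a whole column's signs may come out flipped):

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-6):
    """Orthogonal (varimax) rotation of a loading matrix L (variables x PCs)."""
    p, k = L.shape
    T = np.eye(k)                  # accumulated rotation matrix
    crit = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / p))
        T = u @ vt
        if s.sum() < crit * (1 + tol):
            break
        crit = s.sum()
    return L @ T

L = np.array([[0.7, 0.5],          # the unrotated structure from this slide
              [0.6, 0.6],
              [0.6, -0.5],
              [0.7, -0.6]])
print(np.round(varimax(L), 2))     # V1 & V2 land on one PC, V3 & V4 on the other
```
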
21
Interpretation -- naming groups in the variables
  • Usually we interpret factors using the rotated
    solution
  • Factors are named for the variables correlated
    with them
  • Usual cutoffs are +/- .3 - .4
  • So a variable that shares at least 9-16% of
    its variance with a factor is used to name that
    factor
  • Variables may load on none, 1 or 2+ factors

Rotated Structure
        PC1    PC2
V1      .7    -.1
V2      .7     .1
V3      .1     .5
V4      .2     .6

This rotated structure is easy: PC1 is V1 & V2,
PC2 is V3 & V4. It is seldom this easy !?!?!
22
Kinds of Factors
  • General Factor
  • all or almost all variables load
  • there is a dominant underlying theme among the
    set of variables which can be represented with a
    single composite variable
  • Group Factor
  • some subset of the variables load
  • there is an identifiable sub-theme in the
    variables that must be represented with a
    specific subset of the variables
  • smaller vs. larger group factors (# vars &
    % variance)
  • Unique Factor
  • single variable loads

23
Kinds of Variables
  • Univocal variable -- loads on a single factor
  • Multivocal variable -- loads on 2+ factors
  • Nonvocal variable -- doesn't load on any factor
  • You should notice a pattern here ...
  • a higher cutoff (e.g., .40) tends to produce ...
  • fewer variables loading on a given factor
  • less likely to have a general factor
  • fewer multivocal variables
  • more nonvocal variables
  • a lower cutoff (e.g., .30) tends to produce ...
  • more variables loading on a given factor
  • more likely to have a general factor
  • more multivocal variables
  • fewer nonvocal variables
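
A sketch of this bookkeeping, counting loadings per variable at a chosen cutoff (reusing the rotated structure from slide 21):

```python
import numpy as np

def classify(loadings, cutoff):
    """Label each variable univocal / multivocal / nonvocal at a cutoff."""
    n_loads = (np.abs(loadings) >= cutoff).sum(axis=1)
    return np.where(n_loads == 0, "nonvocal",
                    np.where(n_loads == 1, "univocal", "multivocal"))

L = np.array([[0.7, -0.1],         # rotated structure from slide 21
              [0.7, 0.1],
              [0.1, 0.5],
              [0.2, 0.6]])
print(classify(L, 0.40))    # all univocal at the higher cutoff
print(classify(L, 0.30))    # the same here, but with real data a lower cutoff
                            # adds loadings: more multivocal, fewer nonvocal
```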