Principal Components Analysis with SAS
  • Karl L. Wuensch
  • Dept of Psychology
  • East Carolina University

When to Use PCA
  • You have a set of p continuous variables.
  • You want to repackage their variance into m
    components.
  • You will usually want m to be < p, but not
    always.

Components and Variables
  • Each component is a weighted linear combination
    of the variables
  • Each variable is a weighted linear combination of
    the components.

Factors and Variables
  • In Factor Analysis, we exclude from the solution
    any variance that is unique, not shared by the
    other variables.
  • Uj is the unique variance for Xj

Goals of PCA and FA
  • Data reduction.
  • Discover and summarize pattern of
    intercorrelations among variables.
  • Test theory about the latent variables underlying
    a set of measurement variables.
  • Construct a test instrument.
  • There are many other uses of PCA and FA.

Data Reduction
  • Ossenkopp and Mazmanian (Physiology and Behavior,
    34, 935-941).
  • 19 behavioral and physiological variables.
  • A single criterion variable, physiological
    response to four hours of cold-restraint
  • Extracted five factors.
  • Used multiple regression to develop a model for
    predicting the criterion from the five factors.

Exploratory Factor Analysis
  • Want to discover the pattern of intercorrelations
    among variables.
  • Wilt et al., 2005 (thesis).
  • Variables are items on the SOIS at ECU.
  • Found two factors, one evaluative, one on
    difficulty of course.
  • Compared FTF students to DE students, on
    structure and means.

Confirmatory Factor Analysis
  • Have a theory regarding the factor structure for
    a set of variables.
  • Want to confirm that the theory describes the
    observed intercorrelations well.
  • Thurstone: Intelligence consists of seven
    independent factors rather than one global
    factor.

Construct Test Instrument
  • Write a large set of items designed to test the
    constructs of interest.
  • Administer the survey to a sample of persons from
    the target population.
  • Use FA to help select those items that will be
    used to measure each of the constructs of
    interest.
  • Use Cronbach's alpha to check reliability of
    resulting scales.

An Unusual Use of PCA
  • Poulson, Braithwaite, Brondino, and Wuensch
    (1997, Journal of Social Behavior and
    Personality, 12, 743-758).
  • Simulated jury trial, seemingly insane defendant
    killed a man.
  • Criterion variable = recommended verdict:
      • Guilty
      • Guilty But Mentally Ill
      • Not Guilty By Reason of Insanity

  • Predictor variables = jurors' scores on 8 scales.
  • Discriminant function analysis.
  • Problem with multicollinearity.
  • Used PCA to extract eight orthogonal components.
  • Predicted recommended verdict from these 8
    components.
  • Transformed results back to the original scales.

A Simple, Contrived Example
  • Consumers rate importance of seven
    characteristics of beer.
      • low Cost
      • high Size of bottle
      • high Alcohol content
      • Reputation of brand
      • Color
      • Aroma
      • Taste

  • Download from http//
    c/wuenschk/SAS/SAS-Programs.htm .
  • Bring it into SAS.
  • Run the program. Look at the output.

Checking for Unique Variables 1
  • Check the correlation matrix (page 1 of output).
  • If there are any variables not well correlated
    with some others, might as well delete them.
  • Or add more variables expected to be correlated
    with them.
  • Can still include deleted variables in post-PCA
    analyses.

Checking for Unique Variables 2
  • Correlation Matrix

                cost    size alcohol reputat   color   aroma   taste
    cost        1.00    .832    .767   -.406    .018   -.046   -.064
    size        .832    1.00    .904   -.392    .179    .098    .026
    alcohol     .767    .904    1.00   -.463    .072    .044    .012
    reputat    -.406   -.392   -.463    1.00   -.372   -.443   -.443
    color       .018    .179    .072   -.372    1.00    .909    .903
    aroma      -.046    .098    .044   -.443    .909    1.00    .870
    taste      -.064    .026    .012   -.443    .903    .870    1.00

Checking for Unique Variables 3
  • For each variable, check R² between it and the
    remaining variables. You will see these when we
    cover factor analysis.
  • Look at partial correlations: variables with
    large partial correlations share variance with
    one another but not with the remaining variables;
    this is problematic.
  • See page 2 of the output.
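
A rough sketch of these two checks, written in Python with NumPy rather than SAS so that the arithmetic is visible. It assumes the rounded correlation matrix for the beer data shown earlier, and the variable names (R, smc, partial) are illustrative, not taken from the SAS program. The R² of each variable with the rest and the partial correlations can both be read off the inverse of the correlation matrix.

```python
import numpy as np

names = ["cost", "size", "alcohol", "reputat", "color", "aroma", "taste"]
# Correlation matrix for the beer data (rounded values from the slides).
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

R_inv = np.linalg.inv(R)
d = np.diag(R_inv)

# R^2 between each variable and all of the remaining variables.
smc = 1.0 - 1.0 / d

# Partial correlation of each pair, controlling for the other variables.
partial = -R_inv / np.sqrt(np.outer(d, d))
np.fill_diagonal(partial, 1.0)

for name, r2 in zip(names, smc):
    print(f"{name:8s} R^2 with the other variables = {r2:.3f}")
print(np.round(partial, 3))
```

Because the matrix is rounded to three decimals, the printed values may differ slightly from those on page 2 of the SAS output.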

Checking for Unique Variables 4
  • Kaiser's MSA will tell you, for each variable,
    how much of this problem exists.
  • The smaller the MSA, the greater the problem.
  • An MSA of .9 is marvelous, .5 miserable.
  • See page 2 of the output.
  • Typically we would have more than seven
    variables, and MSA would likely be larger.
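
As an illustration of what the MSA measures, here is a minimal Python/NumPy sketch (not the SAS computation itself). It assumes the rounded beer correlation matrix from the earlier slide and the usual definition of the MSA: the sum of squared correlations divided by the sum of squared correlations plus squared partial correlations. All names in the code are illustrative.

```python
import numpy as np

names = ["cost", "size", "alcohol", "reputat", "color", "aroma", "taste"]
# Correlation matrix for the beer data (rounded values from the slides).
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

# Partial correlations, each pair controlling for the other variables.
R_inv = np.linalg.inv(R)
d = np.diag(R_inv)
partial = -R_inv / np.sqrt(np.outer(d, d))

# Use only the off-diagonal elements.
off = ~np.eye(len(names), dtype=bool)
r2 = np.where(off, R ** 2, 0.0)
a2 = np.where(off, partial ** 2, 0.0)

# MSA per variable: small when partial correlations are large
# relative to the ordinary correlations.
msa = r2.sum(axis=1) / (r2.sum(axis=1) + a2.sum(axis=1))
overall_msa = r2.sum() / (r2.sum() + a2.sum())

for name, value in zip(names, msa):
    print(f"{name:8s} MSA = {value:.3f}")
print(f"overall  MSA = {overall_msa:.3f}")
```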

Extracting Principal Components 1
  • From p variables we can extract p components.
  • Each of p eigenvalues represents the amount of
    standardized variance that has been captured by
    one component.
  • The first component accounts for the largest
    possible amount of variance.
  • The second captures as much as possible of what
    is left over, and so on.
  • Each is orthogonal to the others.

Extracting Principal Components 2
  • Each variable has standardized variance = 1.
  • The total standardized variance in the p
    variables = p.
  • The sum of the m = p eigenvalues = p.
  • All of the variance is extracted.
  • For each component, the proportion of variance
    extracted = eigenvalue / p.
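
These relationships can be checked numerically. The sketch below is Python with NumPy, not SAS, and assumes the rounded beer correlation matrix shown earlier. It extracts all p = 7 eigenvalues and divides each by p. Because the input is rounded, the values may differ a little from the SAS output.

```python
import numpy as np

# Correlation matrix for the beer data (rounded values from the slides).
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])
p = R.shape[0]

# eigvalsh returns eigenvalues in ascending order; reverse to descending.
eigenvalues = np.linalg.eigvalsh(R)[::-1]
proportions = eigenvalues / p

for k, (ev, prop) in enumerate(zip(eigenvalues, proportions), start=1):
    print(f"component {k}: eigenvalue = {ev:.3f}, proportion = {prop:.3f}")
print(f"sum of eigenvalues = {eigenvalues.sum():.3f} (p = {p})")
```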

Extracting Principal Components 3
  • For our beer data, here are the eigenvalues and
    proportions of variance for the seven components

How Many Components to Retain
  • From p variables we can extract p components.
  • We probably want fewer than p.
  • Simple rule: Keep as many as have eigenvalues ≥ 1.
  • A component with eigenvalue < 1 captured less
    than one variable's worth of variance.

  • Visual aid: use a scree plot.
  • Scree is the rubble at the base of a cliff.
  • See page 3 of the output.

  • Only the first two components have eigenvalues
    greater than 1.
  • Big drop in eigenvalue between component 2 and
    component 3.
  • Components 3-7 are scree.
  • By default, SAS will retain all components with
    eigenvalues of 1 or more.
  • Should also look at a solution with one fewer
    component and one with one more component.
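
A small Python/NumPy sketch of the two retention aids just described, assuming the rounded beer correlation matrix from the earlier slide. It counts the eigenvalues of 1 or more and lists the drop between successive eigenvalues, which is what the eye looks for in a scree plot. This illustrates the rule and is not a replacement for inspecting the plot.

```python
import numpy as np

# Correlation matrix for the beer data (rounded values from the slides).
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

eigenvalues = np.linalg.eigvalsh(R)[::-1]  # descending order

# Simple rule: keep components with eigenvalue >= 1.
retained = int((eigenvalues >= 1).sum())

# Drop from each eigenvalue to the next; a big drop followed by
# small ones marks where the scree begins.
drops = eigenvalues[:-1] - eigenvalues[1:]

print(f"components retained by the eigenvalue rule: {retained}")
for k, drop in enumerate(drops, start=1):
    print(f"drop from component {k} to {k + 1}: {drop:.3f}")
```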

Loadings, Unrotated and Rotated
  • Loading matrix = factor pattern matrix =
    component matrix.
  • Each loading is the Pearson r between one
    variable and one component.
  • Since the components are orthogonal, each loading
    is also a β weight from predicting X from the
    components.
  • Here are the unrotated loadings for our 2
    component solution
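
To show where such loadings come from, here is a Python/NumPy sketch that assumes the rounded beer correlation matrix from the earlier slide. Each unrotated loading is an eigenvector element multiplied by the square root of its eigenvalue. The sign of an eigenvector is arbitrary, so a whole column may be reflected relative to the SAS output. The sign rule used here is an arbitrary choice made for this example.

```python
import numpy as np

names = ["cost", "size", "alcohol", "reputat", "color", "aroma", "taste"]
# Correlation matrix for the beer data (rounded values from the slides).
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]          # largest eigenvalue first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Arbitrary sign convention for this sketch: make each column sum positive.
signs = np.where(eigenvectors.sum(axis=0) < 0, -1.0, 1.0)
eigenvectors = eigenvectors * signs

m = 2                                          # components retained
loadings = eigenvectors[:, :m] * np.sqrt(eigenvalues[:m])

# The same numbers, obtained as correlations between each variable and
# each standardized component.
correlations = R @ eigenvectors[:, :m] / np.sqrt(eigenvalues[:m])

for name, row in zip(names, loadings):
    print(f"{name:8s} {row[0]: .3f} {row[1]: .3f}")
```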

Factor Pattern Matrix
Pre-Rotation Loadings
  • All variables load well on first component,
    economy and quality vs. reputation.
  • Second component is more interesting, economy
    versus quality.
  • See page 4 of the output.

  • See the preplot on page 5 of output.

Rotate the Axes
  • Rotate these axes so that the two dimensions pass
    more nearly through the two major clusters (COST,
    SIZE, ALCOHOL and COLOR, AROMA, TASTE).
  • The number of degrees by which I rotate the axes
    is the angle PSI. For these data, rotating the
    axes -40.63 degrees has the desired effect.
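
The mechanics of a rigid rotation can be sketched in Python with NumPy, assuming the rounded beer correlation matrix from the earlier slide. The unrotated loadings are multiplied by a 2 x 2 rotation matrix built from the angle. Whether -40.63 degrees reproduces the rotated loadings in the SAS output depends on sign and direction conventions that this sketch does not try to match. What it does show is that rotation moves variance between components without changing how much of each variable is explained.

```python
import numpy as np

# Correlation matrix for the beer data (rounded values from the slides).
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

# Unrotated loadings for the first two components.
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
loadings = eigenvectors[:, :2] * np.sqrt(eigenvalues[:2])

# Rotation matrix for the angle psi (degrees converted to radians).
psi = np.deg2rad(-40.63)
T = np.array([[np.cos(psi), -np.sin(psi)],
              [np.sin(psi),  np.cos(psi)]])
rotated = loadings @ T

print("variance per component before rotation:",
      np.round((loadings ** 2).sum(axis=0), 3))
print("variance per component after rotation: ",
      np.round((rotated ** 2).sum(axis=0), 3))
```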

Loadings After Rotation
Components After Rotation
  • Component 1: Quality versus reputation.
  • Component 2: Economy (or cheap drunk) versus
    reputation.
  • See page 6 of the output.

  • See the postplot on page 7 of the output.

Number of Components in the Rotated Solution
  • Try extracting one fewer component, try one more
    component.
  • Which produces the more sensible solution?
  • Error = difference in obtained structure and true
    structure.
  • Overextraction (too many components) produces
    less error than underextraction.
  • If there is only one true factor and no unique
    variables, can get factor splitting.

  • In this case, first unrotated factor ≈ true
    factor.
  • But rotation splits the factor, producing an
    imaginary second factor and corrupting the first.
  • Can avoid this problem by including a garbage
    variable that will be removed prior to the final
    solution.
Explained Variance
  • Square the loadings and then sum them across
    variables.
  • Get, for each component, the amount of variance
    explained.
  • Prior to rotation, these are eigenvalues.
  • Our SAS output shows the SSL (sum of squared
    loadings) for each component on page 6, just
    below the rotated factor pattern.

  • After rotation the two components together
    account for (3.02 + 2.91) / 7 = 85% of the total
    variance. If the last component has a small SSL,
    one should consider dropping it.
  • If SSL = 1, the component has extracted one
    variable's worth of variance.
  • If only one variable loads well on a component,
    the component is not well defined.
  • If only two load well, it may be reliable, if the
    two variables are highly correlated with one
    another but not with other variables.
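
The arithmetic above, worked through in a few lines of Python. The two SSL values are the rotated ones quoted on the slide, and p = 7 is the number of beer variables. The variable names are illustrative.

```python
# Rotated sums of squared loadings (SSL) quoted on the slide.
ssl_rotated = [3.02, 2.91]
p = 7  # number of variables, so total standardized variance = 7

# Proportion of total variance for each component and for both together.
proportions = [ssl / p for ssl in ssl_rotated]
total = sum(ssl_rotated) / p

for k, (ssl, prop) in enumerate(zip(ssl_rotated, proportions), start=1):
    print(f"component {k}: SSL = {ssl:.2f}, proportion = {prop:.1%}")
print(f"both components together: {total:.1%}")
```

Both SSLs are well above 1, so each component has extracted several variables' worth of variance.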

Naming Components
  • For each component, look at how it is correlated
    with the variables.
  • Try to name the construct represented by that
    component.
  • If you cannot, perhaps you should try a different
    solution.
  • I have named our components "aesthetic quality"
    and "cheap drunk".

  • For each variable, sum the squared loadings
    across components.
  • This gives you the R² for predicting the variable
    from the components (its communality),
  • which is the proportion of the variable's
    variance which has been extracted by the
    components.
  • See page 4 of the output.
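
A minimal Python/NumPy sketch of this row-wise sum, assuming the rounded beer correlation matrix from the earlier slide. With two components retained, each row sum is the proportion of that variable's variance extracted. With all seven components, every row sum is 1. Values may differ slightly from the SAS output because the input matrix is rounded.

```python
import numpy as np

names = ["cost", "size", "alcohol", "reputat", "color", "aroma", "taste"]
# Correlation matrix for the beer data (rounded values from the slides).
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Loadings on all seven components, then on the two retained.
all_loadings = eigenvectors * np.sqrt(eigenvalues)
loadings = all_loadings[:, :2]

# Sum of squared loadings across components, one value per variable.
communality_2 = (loadings ** 2).sum(axis=1)
communality_all = (all_loadings ** 2).sum(axis=1)

for name, h2 in zip(names, communality_2):
    print(f"{name:8s} variance extracted by two components = {h2:.3f}")
```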

Orthogonal Rotations
  • Varimax -- minimize the complexity of the
    components by making the large loadings larger
    and the small loadings smaller within each
    component.
  • Quartimax -- makes large loadings larger and
    small loadings smaller within each variable.
  • Equamax -- a compromise between these two.
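
For readers curious how a varimax rotation is found, here is a sketch in Python with NumPy of a commonly used iterative algorithm based on the singular value decomposition. It assumes the rounded beer correlation matrix from the earlier slide. It is not the SAS implementation: the function names are hypothetical, and the row (Kaiser) normalization that statistical packages often apply before rotating is omitted, so results may differ somewhat from the SAS output. In this formulation gamma = 1 gives varimax and gamma = 0 gives quartimax.

```python
import numpy as np

def varimax_criterion(L):
    """Sum over components of the variance of the squared loadings."""
    return float(np.var(L ** 2, axis=0).sum())

def rotate(loadings, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonal rotation; gamma=1 is varimax, gamma=0 is quartimax."""
    p, k = loadings.shape
    T = np.eye(k)          # accumulated rotation matrix
    d_old = 0.0
    for _ in range(max_iter):
        L = loadings @ T
        # Gradient of the criterion, projected back onto rotations via SVD.
        grad = loadings.T @ (
            L ** 3 - (gamma / p) * L @ np.diag((L ** 2).sum(axis=0)))
        u, s, vt = np.linalg.svd(grad)
        T = u @ vt
        d_new = s.sum()
        if d_old != 0 and d_new / d_old < 1 + tol:
            break          # criterion has stopped improving
        d_old = d_new
    return loadings @ T, T

# Correlation matrix for the beer data (rounded values from the slides).
R = np.array([
    [1.000,  .832,  .767, -.406,  .018, -.046, -.064],
    [ .832, 1.000,  .904, -.392,  .179,  .098,  .026],
    [ .767,  .904, 1.000, -.463,  .072,  .044,  .012],
    [-.406, -.392, -.463, 1.000, -.372, -.443, -.443],
    [ .018,  .179,  .072, -.372, 1.000,  .909,  .903],
    [-.046,  .098,  .044, -.443,  .909, 1.000,  .870],
    [-.064,  .026,  .012, -.443,  .903,  .870, 1.000],
])

eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
loadings = eigenvectors[:, :2] * np.sqrt(eigenvalues[:2])

rotated, T = rotate(loadings, gamma=1.0)
print("criterion before:", round(varimax_criterion(loadings), 4))
print("criterion after: ", round(varimax_criterion(rotated), 4))
print(np.round(rotated, 3))
```

The rotation matrix T is orthogonal, so the rotated components remain uncorrelated and each variable's explained variance is unchanged.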

Oblique Rotations
  • Axes drawn through the two clusters in the upper
    right quadrant would not be perpendicular.

  • May better fit the data with axes that are not
    perpendicular, but at the cost of having
    components that are correlated with one another.
  • More on this later.
