Title: Exploratory Factor Analysis

1. [Path diagram: a single common factor ξ with arrows to the measures X1, X2, X3, …, Xk; each measure also receives a specific factor δ1, δ2, δ3, …, δk.]
2. - Factor analysis and principal components analysis are often used for the same purposes, but they have different models that sometimes make one more appropriate than the other for certain statistical goals.
- Principal components analysis seeks linear combinations that best capture the variation in the original variables.
- Factor analysis seeks linear combinations that best capture the correlations among the original variables.
3. The two approaches will often arrive at very similar results, but not always, so knowing the underlying models can help resolve differences when they do arise and help guide the most appropriate choice of procedure to begin with.
4. - In factor analysis, the observed variance in each measure is assumed to be attributable to:
- Common factors that influence more than one measure
- A specific factor that is idiosyncratic to each measure. The specific factor explicitly acknowledges that measures are faulty and have a part of their variance that is random.
5. In factor analysis we assume an explicit measurement model that specifies the causes of variation in observed measurements. Some causes are unobservable (latent) variables that affect more than one measure. Other causes are unobservable (latent) variables that are unique to each measure. The specific factors are assumed to be uncorrelated with each other and uncorrelated with the common factors.
7. In a simple one-factor model, each measure is assumed to be a simple linear combination of the common factor and a specific factor. The specific factor is assumed to include both unique systematic influences and random error:

X_i = λ_i ξ + δ_i

For each X, λ represents the extent to which that measure reflects the underlying common factor, ξ.
8. In the simple one-factor model, the variance in a measure is assumed to be captured completely by variance due to the common factor and variance due to the specific factor. The two sources are assumed to be uncorrelated, so their variances are additive:

Var(X) = λ² Var(ξ) + Var(δ)

If the variables are standardized with variance of 1.00, then λ is a correlation coefficient and λ² is the proportion of variance in X due to the common factor. This is the communality of X. The remaining variance is assumed to be due to specific and random sources.
9. The communality in X can be generally defined as the proportion of variance in X that is due to the common factors, no matter how many their number. This means that the communality can also be defined as

h² = 1 − θ²

where θ² is used to indicate the specific factor variance.
10. Measurement models can be more complex . . .

[Path diagram: two common factors ξ1 and ξ2, with loadings λ11, λ21, λ31, …, λk1 and λ12, λ22, λ32, …, λk2 to the measures X1, X2, X3, …, Xk; each measure also receives a specific factor δ1, δ2, δ3, …, δk.]
11. Adding more common factors requires expanding the linear combinations for X to accommodate the additional common sources of variance:

X_i = λ_i1 ξ1 + λ_i2 ξ2 + δ_i
12. Provided the additional common factors are assumed to be independent, each λ is still a correlation coefficient and the variance of X can be viewed as the sum of independent proportions of variance:

1 = λ_i1² + λ_i2² + θ_i²

and the communality is still

h_i² = λ_i1² + λ_i2² = 1 − θ_i²
13. Because factor analysis only seeks to identify the common factors that influence the correlations among measures, it is not the ordinary correlation matrix that is analyzed. The correlation matrix contains ones on the main diagonal, implying an attempt to account for all of the variance in X, which is the goal of principal components analysis. Instead, in factor analysis the main diagonal is replaced by the communalities: the variances in X that are due only to the common factors.
But the communalities are not known in advance of the factor analysis, giving rise to the communality problem and the need to solve for the common factors iteratively.
14. The analysis must begin with initial and perhaps crude estimates of the communalities on the main diagonal of the correlation matrix. These can then be used to derive initial and perhaps crude estimates of the common factors. The initial estimates of the common factors can be used to generate improved (but perhaps still crude) estimates of the communalities, which can then be substituted into the main diagonal of the correlation matrix.
15. The process of substituting better communality estimates to derive better approximations to the common factors continues until little change occurs from one iteration to the next. The procedure converges on the best estimates for the common factors and the communalities.
16. In the principal axes approach to factor analysis, the only difference compared to principal components is that the matrix being analyzed is a correlation matrix (which is also a variance-covariance matrix for standardized variables) in which the main diagonal contains the communalities rather than the variances.
17. One potentially good starting value for the communality of any given measure is the squared multiple correlation of that measure with all of the other measures in the X matrix. These are lower-bound estimates of the true communality because each measure is assumed to have an error component that attenuates the relations between measures (which are presumed to reflect the common factors).
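A minimal sketch of this iteration (not the exact routine of any particular package) is below. The toy correlation matrix is implied by hypothetical one-factor loadings of .9, .8, .7, and .6, so the recovered communalities should approach .81, .64, .49, and .36:

```python
import numpy as np

def principal_axes(R, n_factors, n_iter=500, tol=1e-10):
    """Iterative principal-axis factoring: place communality estimates on the
    diagonal of R, extract factors, update the communalities, and repeat."""
    R = np.asarray(R, dtype=float)
    # Starting values: squared multiple correlations (lower-bound estimates)
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        Rh = R.copy()
        np.fill_diagonal(Rh, h2)                  # the "reduced" correlation matrix
        vals, vecs = np.linalg.eigh(Rh)
        idx = np.argsort(vals)[::-1][:n_factors]  # keep the largest eigenvalues
        L = vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0, None))
        h2_new = np.sum(L**2, axis=1)             # updated communalities
        if np.max(np.abs(h2_new - h2)) < tol:
            break
        h2 = h2_new
    return L, h2

# Toy correlation matrix implied by one factor with loadings .9, .8, .7, .6
lam = np.array([0.9, 0.8, 0.7, 0.6])
R = np.outer(lam, lam)
np.fill_diagonal(R, 1.0)

L, h2 = principal_axes(R, n_factors=1)
print(np.round(h2, 3))   # converges toward [.81, .64, .49, .36]
```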
18. The result of a factor analysis is the identification of linear combinations of measures, but these linear combinations are different from those derived by principal components analysis. In principal components analysis, the linear combinations are optimal in the sense that they have larger variances than any other linear combinations that could have been derived.
19. In factor analysis, the linear combinations merely reflect the minimum number of common variance sources necessary to capture the correlations among the measures. Their location or orientation relative to the original reference system has no particular privileged status. This creates what is known as the rotational indeterminacy of common factors and motivates the search for orientations that have some optimal meaning.
20. The purpose of factor rotation is to provide an orientation that achieves an easier interpretation of the underlying common factors. This is often referred to as the search for simple structure, reflecting the fact that the factor loadings (in the factor structure matrix) are used to infer the meaning of the factors.
21. - Simple structure occurs when:
- Most of the loadings on any given factor are small and a few loadings are large in absolute value.
- Most of the loadings for any given variable are small, with ideally only one loading being large in absolute value.
- Any pair of factors has dissimilar patterns of loadings.
22. The matrix representation of factor analysis reveals its close ties to principal components analysis as well as its key point of departure.
23. The solution to the common factor model begins in the same way that principal components analysis began, with the singular value decomposition of the standardized data matrix, X:

X = Zs D^½ U′

The correlation matrix among the measures in X can be defined in matrix form as

R = X′X / (n − 1)
24. We can substitute in the definition of X in terms of its component matrices:

R = (Zs D^½ U′)′ (Zs D^½ U′) / (n − 1)

25. We can rearrange terms because (AB)′ = B′A′ and (D^½)′ = D^½. We can further simplify because the correlation matrix for Zs is an identity matrix:

R = U D^½ (Zs′Zs / (n − 1)) D^½ U′ = U D^½ I D^½ U′ = U D U′
26. This form is a reminder that the correlation matrix among the original measures, which is a variance-covariance matrix for standardized scores, can be obtained by applying the weights (U, the eigenvectors) for creating the linear combinations to the variance-covariance matrix for those linear combinations (D, with eigenvalues on the diagonal).
27. Another way to represent R is in terms of the factor loadings, which are just rescaled eigenvectors:

F = U D^½,  so that  R = U D U′ = (U D^½)(D^½ U′) = FF′
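These identities can be verified numerically with a small hypothetical correlation matrix: eigendecompose R, rescale the eigenvectors by the square roots of the eigenvalues, and the resulting loadings reproduce R:

```python
import numpy as np

# A small hypothetical correlation matrix
R = np.array([[1.00, 0.72, 0.63],
              [0.72, 1.00, 0.56],
              [0.63, 0.56, 1.00]])

vals, U = np.linalg.eigh(R)            # R = U D U'
D = np.diag(vals)
print(np.allclose(U @ D @ U.T, R))     # True: the eigendecomposition reproduces R

F = U @ np.diag(np.sqrt(vals))         # loadings: eigenvectors rescaled by sqrt(eigenvalues)
print(np.allclose(F @ F.T, R))         # True: R = F F'
```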
28. Factor analysis also tries to approximate the correlation matrix, R, but does not attempt to decompose all of X. Instead, only the common factor variance is of interest. This means that rather than substituting Zs D^½ U′ for X in the formula for R, we must instead substitute X as defined by the common factor model.
29. In matrix form, the common factor model can be represented as

X = ΞΛ′ + Δ

where Ξ contains scores on the common factors, Λ is the matrix of factor loadings, and Δ contains the specific factors.
30. In the common factor model, we make the following assumptions:
- The common factors are uncorrelated.
- The specific factors have a diagonal covariance matrix.
- The common factors and specific factors are uncorrelated.
31. Substituting the common factor definitions for X:

R = (ΞΛ′ + Δ)′(ΞΛ′ + Δ) / (n − 1) = Λ(Ξ′Ξ / (n − 1))Λ′ + Λ(Ξ′Δ / (n − 1)) + (Δ′Ξ / (n − 1))Λ′ + Δ′Δ / (n − 1)

The term Ξ′Ξ / (n − 1) is an identity matrix, and the two cross-product terms are expected to be zero, leaving

R − Θ² = ΛΛ′

The matrix that is reproduced by factor analysis is the correlation matrix less the specific factor variances.
32. We may not find the original location of the factors to provide an easy interpretation. The factors can be rotated to a new position:

Λ_rotated = ΛT

where T is a matrix of direction cosines that indicates the location of the new axes compared to the original axes.
33. The location of the new axes is determined by simple structure, which ideally might look like this for the factor loadings:
34. One way to approach this ideal pattern is to find the rotation that maximizes the variance of the loadings in the columns of the factor structure matrix. This approach was suggested by Kaiser and is called varimax rotation.
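A compact, textbook-style varimax sketch is below (an illustration, not the exact algorithm of any particular package). Because the rotation matrix T is orthogonal, the rotated loadings reproduce the same communalities while concentrating the large squared loadings within columns:

```python
import numpy as np

def varimax(L, gamma=1.0, max_iter=100, tol=1e-6):
    """Orthogonal rotation maximizing the column-wise variance of squared
    loadings (gamma = 1 gives the varimax criterion)."""
    p, k = L.shape
    T = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        u, s, vt = np.linalg.svd(
            L.T @ (Lr**3 - (gamma / p) * Lr @ np.diag(np.sum(Lr**2, axis=0))))
        T = u @ vt
        d_new = np.sum(s)
        if d_new < d * (1 + tol):
            break
        d = d_new
    return L @ T, T

# Hypothetical unrotated loadings for six variables on two factors
L = np.array([[0.70, 0.40], [0.65, 0.45], [0.60, 0.35],
              [0.45, -0.55], [0.50, -0.60], [0.40, -0.50]])
Lr, T = varimax(L)
print(np.round(Lr, 2))
```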
35. A second way to approach this ideal pattern is to find the rotation that maximizes the variance of the loadings in the rows of the factor structure matrix. This approach is called quartimax rotation.
36. For those who want the best of both worlds, equimax rotation attempts to satisfy both goals. Varimax is the most commonly used, and the three rarely produce results that are very discrepant.
37. Huba et al. (1981) collected data on drug use reported by 1634 students (7th to 9th grade) in Los Angeles. Participants rated their use on a 5-point scale: 1 = never tried, 2 = only once, 3 = a few times, 4 = many times, 5 = regularly.
38. The analysis begins in the same way as principal components analysis. It would make little sense to search for common factors in an identity matrix.
39. Unlike principal components analysis, factor analysis will not attempt to explain all of the variance in each variable. Only common factor variance is of interest. This creates the need for some initial estimates of communalities.
40. The number of factors to extract is guided by the size of the eigenvalues, as it was in principal components analysis. But not all of the variance in the variables can be accounted for.
41. To the extent there is random error in the measures, the eigenvalues for factor analysis will be smaller than the corresponding eigenvalues in principal components analysis.
42. The location of the factors might be rotated to a position that allows easier interpretation. This will shift the variance, but preserve the total amount accounted for.
43. Two factors appear to be sufficient. The attenuated eigenvalues will generally tell the same story, but the comparison is against a value of 0. Why?
44. The loadings will be reduced in factor analysis because not all of the variance in X is due to common factor variance.
45. As in principal components analysis, we can judge the quality of the solution by examining the reproduced correlation matrix.
46. Rotating the factors to simple structure makes the interpretation easier. The first factor appears to be minor recreational drug use. The second factor appears to be major abusive drug use.
48. Scores on the underlying common factors can be obtained. The key difference compared to principal components analysis is that variables are assumed to be measured with error in factor analysis. These are often referred to as regression-based factor scores.
49. Nonetheless, there are parallels between the way scores are derived in factor analysis and the way they are derived in principal components analysis.
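For regression-based scores, the weights are W = R⁻¹Λ and the scores are W′Z. The sketch below is illustrative only: it simulates a one-factor model with invented loadings and, for brevity, treats those loadings as known rather than estimated. The estimated scores correlate highly, but not perfectly, with the true factor because the measures contain error:

```python
import numpy as np

rng = np.random.default_rng(1)
lam = np.array([0.9, 0.8, 0.7, 0.6])   # hypothetical loadings, treated as known
n = 100_000

xi = rng.standard_normal(n)            # true factor scores
X = lam[:, None] * xi + rng.standard_normal((4, n)) * np.sqrt(1 - lam**2)[:, None]
Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

R = np.corrcoef(Z)
W = np.linalg.solve(R, lam[:, None])   # regression weights: R^{-1} Lambda
scores = (W.T @ Z)[0]                  # regression-based factor scores

r = np.corrcoef(scores, xi)[0, 1]
print(round(r, 2))                     # high, but below 1: measures contain error
```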
51. A sample of 303 MBA students was asked to evaluate different makes of cars using 16 different adjectives rated on a 5-point agreement scale (1 = strongly disagree, 5 = strongly agree), e.g., "This car is an exciting car."
52. Are all 16 individual ratings required to understand product evaluation, or is there a simpler measurement model?
53. The analysis must begin with some idea of the proportion of variance in each variable that can be attributed to the common factors. The most common initial estimate is the squared multiple correlation between a given measure and all the remaining measures.
55. Three common factors appear to underlie the 16 evaluative ratings.
56. The three common factors account for two-thirds of the common factor variance.
57. The adequacy of the three-factor model can be judged in part by examining the reproduced correlation matrix and the residual correlation matrix. Note the generally small values in the residual matrix.
58. Without rotation, the interpretation of the factors is not readily apparent, especially for the second and third factors.
59. The varimax rotation makes the interpretation a bit easier. What would you name these factors?
Factor 1?
Factor 2?
Factor 3?
60. The transformation matrix (T) moves the original factor axes to a new position defined by simple structure (the varimax criterion in this case). For this orthogonal rotation, T′T = I. These values are the direction cosines that relate the original orientation of the axes to the position of the new axes.
65. It is rare for the different rotational criteria to produce different results.
67. Scores for the three factors can potentially make other analyses easier to interpret. In the same study, 10 different makes of automobile were the targets of the ratings. On a variable-by-variable basis, the information can become overwhelming, with patterns and consistencies getting lost amid the details.
86. All of the measures produce significant differences. An analysis of the factor scores has the potential to bring any patterns in these differences into sharp relief.
90. The 16 overlapping significance tests for the individual measures are reduced to three nonredundant tests.
91. Would we change any of our factor definitions in light of the differences? How might data such as these be used for some practical end?
92. Advocates of factor analysis often claim that it is inappropriate to apply principal components procedures in the search for meaning or latent constructs. But does it really matter all that much? To the extent that the communalities for all variables are high, the two procedures should give very similar results. When the communalities are very low, factor analysis results may depart from those of principal components.
96. - Factor analysis offers a more realistic model of measurement by admitting the presence of random error and specific, systematic sources of variability.
- Another way to make the model more realistic is to relax the restriction that factors be orthogonal. Allowing oblique factors has two potential benefits:
- It allows the model to better match the actual data.
- It allows the possibility of higher-order factors.
97. Oblique rotation relaxes the requirement that factors be independent. This requires the addition of a new matrix: the factor pattern matrix. One way to appreciate why this matrix is necessary is to remember that when factors are orthogonal, the variables can be reconstructed from those factors using a very simple linear combination of the standardized factor scores:

X = ΞΛ′
98. This linear combination resembles a multiple regression equation, but because the predictors are independent, the regression coefficients or weights are simply correlations: just the elements of the factor loading (structure) matrix. When the factors are no longer orthogonal, the correlations are not the appropriate weights, and a separate matrix containing those weights is necessary.
99. The factor pattern matrix contains the regression weights for reconstructing the variables from the factors. These weights appropriately take into account the correlations among the factors (the predictors) and indicate the unique role played by any one of the predictors in reconstructing a given variable.
100. [Diagram: a variable vector v plotted against oblique factor axes ξ1 and ξ2. Perpendicular projections onto the axes give the correlations, the elements of the factor structure matrix; parallel projections onto the axes give the pattern coefficients, the elements of the factor pattern matrix.]
101. Because the factor pattern matrix represents the unique contribution of each factor to the reconstruction of any variable in X, it provides a better basis for judging simple structure. Available techniques for achieving simple structure under oblique rotation (e.g., promax, direct oblimin) confront an additional problem: the specification of the amount of correlation among the factors. Unlike orthogonal rotation, in which these correlations are fixed at 0, in oblique rotation simple structure can be sought for any correlations among the factors. Default values in software can be manipulated to search for the optimal combination.
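The bookkeeping linking the two matrices is a single product: the structure matrix is the pattern matrix postmultiplied by the matrix of factor correlations (S = PΦ). A toy illustration with invented numbers:

```python
import numpy as np

# Hypothetical oblique solution: pattern weights P and factor correlations Phi
P = np.array([[0.80, 0.05],
              [0.75, 0.00],
              [0.10, 0.70],
              [0.05, 0.65]])
Phi = np.array([[1.00, 0.40],
                [0.40, 1.00]])

S = P @ Phi                            # structure matrix: variable-factor correlations
print(np.round(S, 3))

# With orthogonal factors (Phi = I) the two matrices coincide
print(np.allclose(P @ np.eye(2), P))   # True
```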
102. Hypothetical data (N = 500) were created for individuals completing a 12-section test of mental abilities. All variables are in standard form.
103. The correlation matrix is not an identity matrix.
104. The scree test clearly shows the presence of three factors.
105. On average, the three factors extracted can account for about half of the variance in the individual subtests.
106. Factor analysis accounts for less variance than principal components, and rotation shifts the variance accounted for by the factors. Why does the sum of the squared rotated loadings not equal the sum of the squared unrotated loadings?
107. The initial extraction . . .
108. The oblique rotation is much clearer in the pattern matrix than in the structure matrix.
109. The correlations among the factors suggest the presence of a higher-order factor. Why are some of the correlations negative when one would expect all mental abilities to be positively correlated?
110. An alternative oblique rotation, Promax, provides much the same answer. The order of the factors may vary, and one gets reflected, but the essential interpretation is the same.
111. The correlations among the factors are similar for the two procedures.
112. The factor correlations can also be factor analyzed, producing a higher-order factor analysis:

MATRIX DATA VARIABLES=rowtype_ Verbal Math Analytic.
BEGIN DATA
N 500 500 500
CORR 1.000
CORR 0.376 1.000
CORR 0.448 0.402 1.000
END DATA.
113. FACTOR /MATRIX=IN(COR=*)
  /VARIABLES=Verbal Math Analytic
  /MISSING=LISTWISE
  /ANALYSIS=Verbal Math Analytic
  /PRINT=UNIVARIATE INITIAL CORRELATION SIG DET KMO INV REPR AIC EXTRACTION
  /PLOT=EIGEN
  /CRITERIA=KAISER
  /EXTRACTION=PAF
  /ROTATION=VARIMAX
  /METHOD=CORRELATION.
114. The factor intercorrelations are not an identity matrix.
115. A single factor appears appropriate.
116. The single factor can account for about 40% of the variance in the first-order factors.
117. Note the substantial difference in the principal components and factor analysis results.
118. Each first-order factor loads highly on the second-order factor.
119. [Path diagram: a second-order factor, General Mental Abilities, with arrows to the first-order factors Verbal, Math, and Analytic, which in turn point to the twelve subtests with specific factors d1 through d12.]
120. The factor pattern matrix relating the higher-order factor(s) to the original variables can be found by

Pvh = Pvf Pfh
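A toy illustration of this product with invented numbers (six variables, three first-order factors, one higher-order factor):

```python
import numpy as np

# Hypothetical pattern loadings of 6 variables on 3 first-order factors
Pvf = np.array([[0.8, 0.1, 0.0],
                [0.7, 0.0, 0.1],
                [0.1, 0.8, 0.0],
                [0.0, 0.7, 0.1],
                [0.0, 0.1, 0.8],
                [0.1, 0.0, 0.7]])
# Hypothetical loadings of the 3 first-order factors on 1 higher-order factor
Pfh = np.array([[0.6],
                [0.7],
                [0.5]])

Pvh = Pvf @ Pfh      # pattern relating the variables to the higher-order factor
print(np.round(Pvh.ravel(), 2))
```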
121. Factor analysis can easily capitalize on chance, making it highly desirable to cross-validate the results. The stability of factor analysis estimates in a single sample can be gauged through methods such as the jackknife and bootstrap. A more convincing cross-validation occurs when the results are replicated in a new sample. When a second sample is available, an especially convincing approach is to calculate the factor scores in each sample using the data from that sample, but the factor score weight matrix from the other sample. Correlations among corresponding factor scores within samples should be high if the factor solutions are not due to chance.
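A sketch of this check under simplifying assumptions: a one-factor population with invented loadings, which for brevity stand in for each sample's estimated loadings (in practice each sample would supply its own estimates):

```python
import numpy as np

rng = np.random.default_rng(2)
lam = np.array([0.9, 0.8, 0.7, 0.6])   # hypothetical one-factor loadings

def simulate(n):
    """Standardized data drawn from the same one-factor population."""
    xi = rng.standard_normal(n)
    X = lam[:, None] * xi + rng.standard_normal((4, n)) * np.sqrt(1 - lam**2)[:, None]
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

def weights(Z):
    # Regression-based factor score weights from this sample; the population
    # loadings stand in for sample estimates (an assumption, for brevity).
    return np.linalg.solve(np.corrcoef(Z), lam[:, None])

Z1, Z2 = simulate(50_000), simulate(50_000)
W1, W2 = weights(Z1), weights(Z2)

# Within each sample, correlate own-weight scores with other-sample-weight scores
r1 = np.corrcoef((W1.T @ Z1)[0], (W2.T @ Z1)[0])[0, 1]
r2 = np.corrcoef((W2.T @ Z2)[0], (W1.T @ Z2)[0])[0, 1]
print(round(r1, 3), round(r2, 3))      # near 1.0 when the solution replicates
```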
122. The mental abilities data set was split into two subsamples.
Sample 1   Sample 2
123. Sample 1   Sample 2
124. Similar underlying structures are suggested in the two samples.
Sample 1   Sample 2
125. Generally similar patterns for the communalities.
Sample 1   Sample 2
126. Sample 1   Sample 2
127. The same unrotated factors emerge. The difference in order is not important.
Sample 1   Sample 2
128. Simple structure is the same in the two samples. One simple test of stability across the samples would be to correlate the columns of these matrices.
Sample 1   Sample 2
129. The factor score coefficient matrices are similar.
Sample 1   Sample 2
130. Why aren't these identity matrices?
Sample 1   Sample 2
131. In each sample, actual factor scores are computed (using the weight matrix from that sample). In addition, estimated factor scores are calculated using the weight matrix from the other sample.
Sample 1   Sample 2
132. Sample 1
133. Sample 2
134. Factor analysis is also sometimes used as a way to cluster individuals. In fact, there is nothing in the procedure that prevents any data matrix from being analyzed. A very common approach is to transpose a People x Variables matrix and to factor analyze the resulting Variables x People matrix. This approach then finds the linear combinations of people that can best account for the variability in the matrix, resulting in clusters of people that have similar profiles on the variables.
135. A sample of medical students (N = 323) completed a brief questionnaire about their attitudes regarding overweight and obese people, to determine the degree of anti-fat bias that might exist in this sample. The questions were rated on a 7-point agreement scale.
136. - One component of anti-fat bias is a strong dislike of the obese:
- I really don't like fat people much.
- I don't have many friends that are fat.
- I tend to think that people who are overweight are a little untrustworthy.
- Although some fat people are surely smart, in general, I think they tend not to be quite as bright as normal weight people.
- I have a hard time taking fat people too seriously.
- Fat people make me feel somewhat uncomfortable.
- If I were an employer looking to hire, I might avoid hiring a fat person.
137. - A second component of anti-fat bias is a strong fear of becoming fat:
- I feel disgusted with myself when I gain weight.
- One of the worst things that could happen to me would be if I gained 25 pounds.
- I worry about becoming fat.
138. - A third component of anti-fat bias is the belief that the underlying cause of obesity is a lack of willpower:
- People who weigh too much could lose at least some part of their weight through a little exercise.
- Some people are fat because they have no willpower.
- Fat people tend to be fat pretty much through their own fault.
139. We could approach this in the standard way. Three factors appear to underlie the data.
140. The communalities vary quite a bit, but on average about 50% of the item variability can be accounted for by three factors.
141. Note that one of the retained factors has an eigenvalue less than 1.00.
142. The orthogonal rotation suggests that a three-factor solution might be correct.
143. It seems possible that the three facets of anti-fat bias might be correlated. An oblique rotation confirms this.
144. Another way to approach the same data is to search for clusters of people who have similar profiles on the 13 questions. This form of factor analysis is called a Q-type analysis (rather than the usual R-type analysis). For the analysis to make sense, the matrix must be doubly standardized. The variables are standardized to remove any differences in scale. Then the matrix is transposed and the people are standardized as part of the factor analysis (in the formation of a correlation matrix).
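The double standardization can be sketched with placeholder data (20 people by 13 items, random numbers purely to show the mechanics):

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.standard_normal((20, 13))     # 20 people x 13 items (placeholder data)

# Step 1: standardize the variables (columns) to remove scale differences
Zv = (data - data.mean(axis=0)) / data.std(axis=0)

# Step 2: transpose so that people become the entities to be correlated
Q = Zv.T                                 # 13 items x 20 people

# Step 3: forming the correlation matrix standardizes the people in turn
Rpeople = np.corrcoef(Q, rowvar=False)   # 20 x 20 correlations among people
print(Rpeople.shape)
```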
145. GET FILE='C:\Research\Antifat Bias\antifat transposed and standardized.sav'.
FLIP VARIABLES=zq1 zq2 zq3 zq4 zq5 zq6 zq7 zq8 zq9 zq10 zq11 zq12 zq13.

FLIP performed on 323 cases and 13 variables, creating 13 cases and 324 variables. The working file has been replaced.
146. FACTOR /VARIABLES=var001 var002 var003 var004 var005 var006 var007 var008 var009 var010 var011 var012 var013 var014 var015 var016 var017 var018 var019 var020
  /MISSING=LISTWISE
  /ANALYSIS=var001 var002 var003 var004 var005 var006 var007 var008 var009 var010 var011 var012 var013 var014 var015 var016 var017 var018 var019 var020
  /PRINT=INITIAL EXTRACTION ROTATION
  /PLOT=EIGEN
  /CRITERIA=FACTORS(3) ITERATE(25)
  /EXTRACTION=PC
  /CRITERIA=ITERATE(25)
  /ROTATION=VARIMAX
  /METHOD=CORRELATION.
147. Factor analysis always works best when there are many more sampling units than elements being correlated. That usually means many more people than variables. Here it means we need to limit the number of people in the analysis. We'll focus on just the first 20 people (even this is risky, similar to using multiple regression with more predictors than people).
148. Looks like there might be three dimensions or clusters of people, but the scree is not at all clear.
151. The loadings for people can help identify those that have similar profiles on the variables. Ordinarily, the interpretations are assisted by other information that might identify why clusters of people have similar profiles. The clusters might then be used in other analyses.
152. - Other issues in factor analysis:
- How many people? (5, 10, 20, 100, 200, 400)
- The number of variables per factor (4 to 6)
- Factoring items versus factoring scales
- Assumptions and distributions
- Over-factoring versus under-factoring
- PCA versus FA
- Reliability for factor score composites
153. Exploratory factor analytic methods are sometimes used as a crude way to confirm hypotheses about the latent structure underlying a data set. As a first pass, these methods do just fine. But more powerful confirmatory factor analytic procedures exist that can better address questions about data that are strongly informed by theory.