X1 - PowerPoint PPT Presentation
Provided by: michael1175

Transcript and Presenter's Notes

Title: X1

1
Exploratory Factor Analysis
[Path diagram: a single common factor ξ with arrows to measures X1, X2, X3, …, Xk, each with its own specific factor δ1, δ2, δ3, …, δk.]
2
  • Factor analysis and principal components analysis
    are often used for the same purposes but they
    have different models that sometimes make one
    more appropriate than the other for certain
    statistical goals.
  • Principal components analysis seeks linear
    combinations that best capture the variation in
    the original variables.
  • Factor analysis seeks linear combinations that
    best capture the correlations among the original
    variables.

3
The two approaches will often arrive at very
similar results, but not always, so knowing the
underlying models can help resolve differences
when they do arise and help guide the most
appropriate choice of procedure to begin with.
4
  • In factor analysis, the observed variance in each
    measure is assumed to be attributable to
  • Common factors that influence more than one
    measure
  • A specific factor that is idiosyncratic to each
    measure. The specific factor explicitly
    acknowledges that measures are faulty and have a
    part of their variance that is random.

5
In factor analysis we assume an explicit
measurement model that specifies the causes of
variation in observed measurements. Some causes
are unobservable (latent) variables that affect
more than one measure. Other causes are
unobservable (latent) variables that are unique
to each measure. The specific factors are assumed
to be uncorrelated with each other and
uncorrelated with the common factors.
6
(No Transcript)
7
In a simple one-factor model, each measure is
assumed to be a simple linear combination of the
common factor and a specific factor. The specific
factor is assumed to include both unique
systematic influences and random error.
For each X, λ represents the extent to which each
measure reflects the underlying common factor, ξ.
8
In the simple one-factor model, the variance in a
measure is assumed to be captured completely by
variance due to the common factor and variance
due to the specific factor. The two sources are
assumed to be uncorrelated and so their variances
are additive.
If the variables are standardized with variance
of 1.00, then λ is a correlation coefficient and
λ² is the proportion of variance in X due to the
common factor. This is the communality of X. The
remaining variance is assumed to be due to random
sources.
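As a small numeric illustration of this decomposition (the loading value 0.8 is made up for the example), a standardized measure's unit variance splits into communality λ² and specific variance:

```python
# Hypothetical one-factor example: assume a loading lambda = 0.8
# for a standardized measure X (total variance 1.0).
lam = 0.8
communality = lam ** 2        # proportion of variance due to the common factor
specific = 1.0 - communality  # proportion due to the specific factor
print(communality, specific)  # ~0.64 and ~0.36; the two parts sum to 1.0
```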
9
The communality in X can be generally defined as
the proportion of variance in X that is due to
the common factors, no matter how many their
number. This means that the communality can also
be defined as
h² = 1 − θ²
where θ² is used to indicate the specific factor
variance.
10
Measurement models can be more complex . . .
[Path diagram: two common factors ξ1 and ξ2 with loadings λ11, λ21, λ31, …, λk1 and λ12, λ22, λ32, …, λk2 to measures X1, X2, X3, …, Xk, each with its own specific factor δ1, δ2, δ3, …, δk.]
11
Adding more common factors requires expanding the
linear combinations for X to accommodate the
additional common sources of variance.
12
Provided the additional common factors are
assumed to be independent, each λ is still a
correlation coefficient and the variance of X can
be viewed as the sum of independent proportions
of variance
Var(X) = λ₁² + λ₂² + . . . + θ²
and the communality is still
h² = λ₁² + λ₂² + . . .
13
Because factor analysis only seeks to identify
the common factors that influence the
correlations among measures, it is not the
correlation matrix itself that is analyzed. The
correlation matrix contains ones on the main
diagonal, implying an attempt to account for all
of the variance in X, the goal of principal
components analysis. Instead, in factor analysis
the main diagonal is replaced by the
communalities, the variances in X that are due
only to the common factors.
But, the communalities are not known in advance
of the factor analysis, giving rise to the
communality problem and the need to solve for the
common factors iteratively.
14
The analysis must begin with initial and perhaps
crude estimates of the communalities on the main
diagonal of the correlation matrix. These can
then be used to derive initial and perhaps crude
estimates of the common factors. The initial
estimates of the common factors can be used to
generate improved (but perhaps still crude)
estimates of the communalities, which can then be
substituted into the main diagonal of the
correlation matrix.
15
The process of substituting better communality
estimates to derive better approximations to the
common factors continues until little change
occurs from one iteration to the next. The
procedure converges on the best estimates for the
common factors and the communalities.
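The iterative scheme described above can be sketched in NumPy. This is an illustrative sketch of principal axis factoring, not the deck's SPSS procedure; the function name and defaults are my own:

```python
import numpy as np

def principal_axis(R, n_factors, n_iter=200, tol=1e-6):
    """Iterative principal axis factoring (sketch).

    Starts with squared multiple correlations on the diagonal and
    refines the communality estimates until they stabilize."""
    R = np.asarray(R, dtype=float)
    # Initial communality estimates: squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        Rh = R.copy()
        np.fill_diagonal(Rh, h2)                  # "reduced" correlation matrix
        vals, vecs = np.linalg.eigh(Rh)
        idx = np.argsort(vals)[::-1][:n_factors]  # largest eigenvalues first
        L = vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))  # loadings
        h2_new = (L ** 2).sum(axis=1)             # updated communalities
        if np.max(np.abs(h2_new - h2)) < tol:
            h2 = h2_new
            break
        h2 = h2_new
    return L, h2
```

On a correlation matrix generated by an exact one-factor model, the recovered communalities converge to the squared loadings that produced it.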
16
In the principal axes approach to factor
analysis, the only difference compared to
principal components is that the matrix being
analyzed is a correlation matrix (which is also a
variance-covariance matrix for standardized
variables) in which the main diagonal contains
the communalities rather than the variances.
17
One potentially good starting value for the
communality of any given measure is the squared
multiple correlation of that measure with all of
the other measures in the X matrix. These are
lower bound estimates of the true communality
because each measure is assumed to have an error
component that attenuates the relations between
measures (which are presumed to reflect the
common factors).
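The squared multiple correlations mentioned above can be read directly off the inverse of the correlation matrix. A minimal NumPy sketch (the function name is mine):

```python
import numpy as np

def smc(R):
    """Squared multiple correlation of each variable with all the others:
    SMC_i = 1 - 1 / (R^-1)_ii, a lower-bound communality estimate."""
    R = np.asarray(R, dtype=float)
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))
```

For two variables with correlation r, each SMC reduces to r², as expected.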
18
The result of a factor analysis is the
identification of linear combinations of
measures, but these linear combinations are
different from those derived by principal
components analysis. In principal components
analysis, the linear combinations are optimal
in the sense they have larger variances than any
other linear combination that could have been
derived.
19
In factor analysis, the linear combinations
merely reflect the minimum number of common
variance sources necessary to capture the
correlations among the measures. Their location
or orientation relative to the original reference
system has no particular privileged status. This
creates what is known as the rotational
indeterminacy of common factors and motivates the
search for orientations that have some optimal
meaning.
20
The purpose of factor rotation is to provide an
orientation that achieves an easier
interpretation of the underlying common factors.
This is often referred to as the search for
simple structure, reflecting the fact that the
factor loadings (in the factor structure matrix)
are used to infer the meaning of the factors.
21
  • Simple structure occurs when
  • Most of the loadings on any given factor are
    small and a few loadings are large in absolute
    value
  • Most of the loadings for any given variable are
    small, with ideally only one loading being large
    in absolute value.
  • Any pair of factors has dissimilar patterns of
    loadings.

22
The matrix representation of factor analysis
reveals its close ties to principal components
analysis as well as its key point of departure.
23
The solution to the common factor model begins in
the same way that principal components analysis
began, with the singular value decomposition of
the standardized data matrix, X
The correlation matrix among the measures in X
can be defined in matrix form as
24
We can substitute in the definition of X in terms
of its component matrices
25
We can rearrange terms because (AB)′ = B′A′ and
(D½)′ = D½, since a diagonal matrix equals its
transpose. We can further simplify because the
correlation matrix for the Zs is an identity matrix
26
This form is a reminder that the correlation
matrix among the original measures, which is a
variance-covariance matrix for standardized
scores, can be obtained by applying the weights
(U eigenvectors) for creating the linear
combinations to the variance-covariance matrix
for those linear combinations (D, with
eigenvalues on the diagonal).
27
Another way to represent R is in terms of the
factor loadings, which are just rescaled
eigenvectors
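This rescaling can be checked directly in NumPy. An illustrative sketch with random data, not the deck's example:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))   # toy data
R = np.corrcoef(X, rowvar=False)    # correlation matrix

vals, U = np.linalg.eigh(R)         # R = U D U'
F = U * np.sqrt(vals)               # loadings: eigenvectors scaled by sqrt(eigenvalues)
print(np.allclose(F @ F.T, R))      # True: the loadings reproduce R exactly
```

The column sums of squared loadings equal the eigenvalues, the variance captured by each component.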
28
Factor analysis also tries to approximate the
correlation matrix, R, but does not attempt to
decompose all of X. Instead, only the common
factor variance is of interest. This means that
rather than substituting ZsD½U′ for X into the
formula for R, we must instead substitute X as
defined by the common factor model.
29
In matrix form, the common factor model can be
represented as
30
In the common factor model, we make the following
assumptions
The common factors are uncorrelated.
The specific factors have a diagonal covariance
matrix.
The common factors and specific factors are
uncorrelated.
31
Substituting the common factor definitions for X,
the cross-products among the common factors form
an identity matrix, and the cross-products
between the common and specific factors are
expected to be zero.
The matrix that is reproduced by factor analysis
is the correlation matrix less the specific
factor variances.
32
We may not find the original location of the
factors to provide an easy interpretation.
The factors can be rotated to a new position,
where T is a matrix of direction cosines that
indicate the location of the new axes compared to
the original axes.
33
The location of the new axes is determined by
simple structure, which ideally might look like
this for the factor loadings
34
One way to approach this ideal pattern is to find
the rotation that maximizes the variance of the
loadings in the columns of the factor structure
matrix. This approach was suggested by Kaiser and
is called varimax rotation.
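A compact NumPy sketch of the varimax idea, using the standard SVD-based update; this is my own sketch, not the deck's software:

```python
import numpy as np

def varimax(L, n_iter=100, tol=1e-8):
    """Rotate loading matrix L to maximize the variance of the squared
    loadings within each column (Kaiser's varimax criterion)."""
    p, k = L.shape
    T = np.eye(k)          # accumulated rotation (direction cosines)
    d_old = 0.0
    for _ in range(n_iter):
        Lr = L @ T
        # Gradient-like target for the orthogonal Procrustes step.
        B = L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / p)
        U, s, Vt = np.linalg.svd(B)
        T = U @ Vt         # nearest orthogonal matrix to B
        d = s.sum()
        if d_old != 0 and d < d_old * (1 + tol):
            break
        d_old = d
    return L @ T, T
```

Because T is orthogonal, each variable's communality (row sum of squared loadings) is unchanged by the rotation; only the distribution of variance across factors shifts.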
35
A second way to approach this ideal pattern is to
find the rotation that maximizes the variance of
the loadings in the rows of the factor structure
matrix. This approach is called quartimax
rotation.
36
For those who want the best of both worlds,
equimax rotation attempts to satisfy both goals.
Varimax is the most commonly used and the three
rarely produce results that are very discrepant.
37
Huba et al. (1981) collected data on drug use
reported by 1634 students (7th to 9th grade) in
Los Angeles. Participants rated their use on a
5-point scale: 1 = never tried, 2 = only once,
3 = a few times, 4 = many times, 5 = regularly.
38
The analysis begins in the same way as principal
components analysis. It would make little sense
to search for common factors in an identity
matrix
39
Unlike principal components analysis, factor
analysis will not attempt to explain all of the
variance in each variable. Only common factor
variance is of interest. This creates the need
for some initial estimates of communalities.
40
The number of factors to extract is guided by the
size of the eigenvalues, as it was in principal
components analysis. But, not all of the variance
can be accounted for in the variables.
41
To the extent there is random error in the
measures, the eigenvalues for factor analysis
will be smaller than the corresponding
eigenvalues in principal components analysis.
42
The location of the factors might be rotated to a
position that allows easier interpretation. This
will shift the variance, but preserve the total
amount accounted for.
43
Two factors appear to be sufficient. The
attenuated eigenvalues will generally tell the
same story, but the comparison is against a value
of 0. Why?
44
The loadings will be reduced in factor analysis
because not all of the variance in X is due to
common factor variance.
45
As in principal components analysis we can judge
the quality of the solution by examining the
reproduced correlation matrix.
46
Rotating the factors to simple structure makes
the interpretation easier. The first factor
appears to be minor recreational drug use. The
second factor appears to be major abusive drug
use.
47
(No Transcript)
48
Scores on the underlying common factors can be
obtained. The key difference compared to
principal components analysis is that variables
are assumed to be measured with error in factor
analysis.
These are often referred to as regression-based
factor scores.
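The regression-based scores mentioned above can be sketched as follows: with standardized data Z, correlation matrix R, and a structure matrix S of variable-factor correlations, the score coefficient matrix is R⁻¹S. A NumPy sketch; the names are mine:

```python
import numpy as np

def factor_scores(Z, R, S):
    """Regression-based factor scores: F = Z R^-1 S, where S holds the
    variable-factor correlations (structure matrix)."""
    W = np.linalg.solve(R, S)   # R^-1 S without forming the inverse explicitly
    return Z @ W
```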
49
Nonetheless, there are parallels in the way
scores are derived in factor analysis and the way
they are derived in principal components analysis.
50
(No Transcript)
51
A sample of 303 MBA students was asked to
evaluate different makes of cars using 16
different adjectives rated on a 5-point agreement
scale (1 = strongly disagree, 5 = strongly
agree), e.g., This car is an exciting car.
52
Are all 16 individual ratings required to
understand product evaluation, or, is there a
simpler measurement model?
53
The analysis must begin with some idea of the
proportion of variance in each variable that can
be attributed to the common factors. The most
common initial estimate is the squared multiple
correlation between a given measure and all the
remaining measures.
54
(No Transcript)
55
Three common factors appear to underlie the 16
evaluative ratings.
56
The three common factors account for two-thirds
of the common factor variance.
57
The adequacy of the three-factor model can be
judged in part by examining the reproduced
correlation matrix and the residual correlation
matrix. Note the generally small values in the
residual matrix.
58
Without rotation, the interpretation of the
factors is not readily apparent, especially the
second and third factors.
59
The varimax rotation makes the interpretation a
bit easier. What would you name these factors?
Factor 1?
Factor 2?
Factor 3?
60
The transformation matrix (T) moves the original
factor axes to a new position defined by simple
structure (varimax criterion in this case).
Because the rotation is orthogonal, TT′ = I.
These values are the direction cosines that
relate the original orientation of the axes to
the position of the new axes.
61
(No Transcript)
62
(No Transcript)
63
(No Transcript)
64
(No Transcript)
65
It is rare for the different rotational criteria
to produce different results
66
(No Transcript)
67
Scores for the three factors can potentially make
other analyses easier to interpret. In the same
study, 10 different makes of automobile were the
targets of the ratings. On a variable by variable
basis, the information can become overwhelming,
with patterns and consistencies getting lost amid
the details.
68
(No Transcript)
69
(No Transcript)
70
(No Transcript)
71
(No Transcript)
72
(No Transcript)
73
(No Transcript)
74
(No Transcript)
75
(No Transcript)
76
(No Transcript)
77
(No Transcript)
78
(No Transcript)
79
(No Transcript)
80
(No Transcript)
81
(No Transcript)
82
(No Transcript)
83
(No Transcript)
84
(No Transcript)
85
(No Transcript)
86
All of the measures produce significant
differences. An analysis of the factor scores has
the potential to bring any patterns in these
differences into sharp relief.
87
(No Transcript)
88
(No Transcript)
89
(No Transcript)
90
The 16 overlapping significance tests for the
individual measures are reduced to three,
nonredundant tests.
91
Would we change any of our factor definitions in
light of the differences?
How might data such as these be used for some
practical end?
92
Advocates of factor analysis often claim that it
is inappropriate to apply principal components
procedures in the search for meaning or latent
constructs. But, does it really matter all that
much? To the extent that the communalities for
all variables are high, the two procedures should
give very similar results. When the communalities
are very low, then factor analysis results may
depart from principal components.
93
(No Transcript)
94
(No Transcript)
95
(No Transcript)
96
  • Factor analysis offers a more realistic model of
    measurement by admitting the presence of random
    error and specific, systematic sources of
    variability.
  • Another way to make the model more realistic is
    to relax the restriction that factors be
    orthogonal. Allowing oblique factors has two
    potential benefits
  • It allows the model to better match the
    actual data
  • It allows the possibility of higher order
    factors

97
Oblique rotation relaxes the requirement that
factors be independent. This requires the
addition of a new matrix: the factor pattern
matrix. One way to appreciate why this matrix is
necessary is to remember that when factors are
orthogonal, the variables can be reconstructed
from those factors using a very simple linear
combination of the standardized factor scores
98
This linear combination resembles a multiple
regression equation, but because the predictors
are independent, the regression coefficients or
weights are simply correlations, just the
elements of the factor loading (structure)
matrix. When the factors are no longer
orthogonal, the correlations are not the
appropriate weights and a separate matrix
containing those weights is necessary.
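The relation between the two matrices can be made concrete: with factor correlation matrix Phi, the structure matrix S and pattern matrix P satisfy S = P Phi, so the pattern weights are recovered as S Phi⁻¹. The numbers below are hypothetical:

```python
import numpy as np

Phi = np.array([[1.0, 0.4],      # correlations among two oblique factors
                [0.4, 1.0]])
P = np.array([[0.8, 0.0],        # hypothetical pattern (regression) weights
              [0.7, 0.1],
              [0.0, 0.9]])
S = P @ Phi                      # structure matrix: variable-factor correlations
P_back = S @ np.linalg.inv(Phi)  # recover the pattern weights from the structure
print(np.allclose(P_back, P))    # True; with Phi = I the two matrices coincide
```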
99
The factor pattern matrix contains the
regression weights for reconstructing the
variable from the factors. These weights
appropriately take into account the correlations
among the factors (the predictors) and indicate
the unique role played by any one of the
predictors in reconstructing a given variable.
100
[Diagram: a variable v plotted against oblique factor axes ξ1 and ξ2.]
Perpendicular projections onto the axes are the
correlations, the elements of the factor structure
matrix.
Parallel projections onto the axes are the
pattern coefficients, the elements of the factor
pattern matrix.
101
Because the factor pattern matrix represents the
unique contribution of each factor to the
reconstruction of any variable in X, it provides
a better basis for judging simple
structure. Available techniques for achieving
simple structure under oblique rotation (e.g.,
promax, direct oblimin) confront an additional
problem: the specification of the amount of
correlation among the factors. Unlike orthogonal
rotation, in which these correlations are fixed at
0, in oblique rotation simple structure can be
sought for any correlations among the factors.
Default values in software can be manipulated to
search for the optimal combination.
102
Hypothetical data (N = 500) were created for
individuals completing a 12-section test of
mental abilities. All variables are in standard
form.
103
The correlation matrix is not an identity matrix
104
The scree test clearly shows the presence of
three factors
105
On average, the three factors extracted can
account for about half of the variance in the
individual subtests.
106
Factor analysis accounts for less variance than
principal components and rotation shifts the
variance accounted for by the factors.
Why does the sum of the squared rotated loadings
not equal the sum of the squared unrotated
loadings?
107
The initial extraction . . .
108
Oblique rotation is much clearer in the pattern
matrix than in the structure matrix
109
The correlations among the factors suggest the
presence of a higher order factor
Why are some of the correlations negative when
one would expect all mental abilities to be
positively correlated?
110
An alternative oblique rotation, Promax, provides
much the same answer.
The order of the factors may vary, and one gets
reflected, but the essential interpretation is
the same.
111
The correlations among the factors are similar
for the two procedures.
112
The factor correlations can also be factor
analyzed, producing a higher-order factor
analysis
MATRIX DATA VARIABLES=rowtype_ Verbal Math Analytic.
BEGIN DATA
N 500 500 500
CORR 1.000
CORR 0.376 1.000
CORR 0.448 0.402 1.000
END DATA.
113
FACTOR MATRIX=IN(COR=*)
  /VARIABLES=Verbal Math Analytic
  /MISSING=LISTWISE
  /ANALYSIS=Verbal Math Analytic
  /PRINT=UNIVARIATE INITIAL CORRELATION SIG DET KMO INV REPR AIC EXTRACTION
  /PLOT=EIGEN
  /CRITERIA=KAISER
  /EXTRACTION=PAF
  /ROTATION=VARIMAX
  /METHOD=CORRELATION.
114
The factor intercorrelations are not an identity
matrix
115
A single factor appears appropriate
116
The single factor can account for about 40% of
the variance in the first-order factors
117
Note the substantial difference in the principal
components and factor analysis results.
118
Each first-order factor loads highly on the
second-order factor.
119
[Path diagram: a second-order factor, General Mental Abilities, with arrows to the first-order factors Verbal, Math, and Analytic, which in turn load on the twelve subtests with specific factors d1 … d12.]
120
The factor pattern matrix relating the higher
order factor(s) to the original variables can be
found by Pvh = Pvf Pfh
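With made-up loadings, the multiplication looks like this (Pvf: variables on first-order factors; Pfh: first-order factors on the higher-order factor; all values hypothetical):

```python
import numpy as np

# Hypothetical first-order pattern matrix: 6 variables, 3 factors.
Pvf = np.array([[0.8, 0.0, 0.0],
                [0.7, 0.0, 0.0],
                [0.0, 0.8, 0.0],
                [0.0, 0.6, 0.0],
                [0.0, 0.0, 0.9],
                [0.0, 0.0, 0.5]])
# Hypothetical loadings of the 3 first-order factors on 1 higher-order factor.
Pfh = np.array([[0.6], [0.7], [0.5]])
Pvh = Pvf @ Pfh   # loadings of the original variables on the higher-order factor
```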
121
Factor analysis can easily capitalize on chance,
making it highly desirable to cross-validate the
results. The stability of factor analysis
estimates in a single sample can be gauged
through methods such as the jackknife and
bootstrap techniques. A more convincing
cross-validation occurs when the results are
replicated in a new sample. When a second sample
is available, an especially convincing approach
is to calculate the factor scores in each sample
using the data from that sample, but the factor
score weight matrix from the other sample.
Correlations among corresponding factor scores
within samples should be high if the factor
solutions are not due to chance.
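The double cross-validation just described can be sketched as follows (the function name and structure are mine; Z1/Z2 are standardized data, W1/W2 the factor score weight matrices from each sample):

```python
import numpy as np

def crossval_factor_scores(Z1, Z2, W1, W2):
    """Score each sample with its own factor score weight matrix and with
    the other sample's, then correlate corresponding score columns
    within each sample. High correlations suggest a stable solution."""
    def corr_cols(A, B):
        A = (A - A.mean(0)) / A.std(0)
        B = (B - B.mean(0)) / B.std(0)
        return (A * B).mean(0)          # column-wise correlations
    r1 = corr_cols(Z1 @ W1, Z1 @ W2)    # within sample 1
    r2 = corr_cols(Z2 @ W2, Z2 @ W1)    # within sample 2
    return r1, r2
```

As a sanity check, identical weight matrices in both samples yield correlations of 1.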
122
The mental abilities data set was split into two
subsamples
Sample 1 Sample 2
123
Sample 1 Sample 2
124
Similar underlying structures are suggested in
the two samples
Sample 1 Sample 2
125
Generally similar patterns to the communalities
Sample 1 Sample 2
126
Sample 1
Sample 2
127
Same unrotated factors emerge. The difference in
order is not important
Sample 1 Sample 2
128
Simple structure is the same in the two samples.
One simple test of stability across the samples
would be to correlate the columns of these
matrices.
Sample 1 Sample 2
129
The factor score coefficient matrices are similar
Sample 1 Sample 2
130
Why aren't these identity matrices?
Sample 1 Sample 2
131
In each sample, actual factor scores are computed
(using the weight matrix from that sample). In
addition, estimated factor scores are calculated
using the weight matrix from the other sample.
Sample 1 Sample 2
132
Sample 1
133
Sample 2
134
Factor analysis is also sometimes used as a way
to cluster individuals. In fact, there is nothing
in the procedure that prevents any data matrix
from being analyzed. A very common approach is to
transpose a People x Variables matrix and to
factor analyze the resulting Variables x People
matrix. This approach then finds the linear
combinations of people that can best account for
the variability in the matrix, resulting in
clusters of people that have similar profiles on
the variables.
135
A sample of medical students (N = 323) completed
a brief questionnaire about their attitudes
regarding overweight and obese people to
determine the degree of anti-fat bias that might
exist in this sample. The questions were rated on
a 7-point agreement scale.
136
  • One component of anti-fat bias is a strong
    dislike of the obese
  • I really don't like fat people much.
  • I don't have many friends that are fat.
  • I tend to think that people who are overweight
    are a little untrustworthy.
  • Although some fat people are surely smart, in
    general, I think they tend not to be quite as
    bright as normal weight people.
  • I have a hard time taking fat people too
    seriously.
  • Fat people make me feel somewhat
    uncomfortable.
  • If I were an employer looking to hire, I might
    avoid hiring a fat person.

137
  • A second component of anti-fat bias is a strong
    fear of becoming fat
  • I feel disgusted with myself when I gain
    weight.
  • One of the worst things that could happen to
    me would be if I gained 25 pounds.
  • I worry about becoming fat.

138
  • A third component of anti-fat bias is the belief
    that the underlying cause of obesity is a lack of
    willpower
  • People who weigh too much could lose at least
    some part of their weight through a little
    exercise.
  • Some people are fat because they have no
    willpower.
  • Fat people tend to be fat pretty much through
    their own fault.

139
We could approach this in the standard way. Three
factors appear to underlie the data
140
The communalities vary quite a bit, but on
average about 50% of the item variability can be
accounted for by three factors
141
Note that one of the retained factors has an
eigenvalue less than 1.00
142
The orthogonal rotation suggests that a
three-factor solution might be correct.
143
It seems possible that the three facets of
anti-fat bias might be correlated. An oblique
rotation confirms this
144
Another way to approach the same data is to
search for clusters of people who have similar
profiles on the 13 questions. This form of factor
analysis is called a Q-type analysis (rather than
the usual R-type analysis). For the analysis to
make sense, the matrix must be doubly
standardized. The variables are standardized to
remove any differences in scale. Then the matrix
is transposed and the people are standardized as
part of the factor analysis (in the formation of
a correlation matrix).
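A sketch of the double standardization step in NumPy (the helper name is mine):

```python
import numpy as np

def q_type_matrix(X):
    """Prepare a People x Variables matrix for a Q-type analysis:
    standardize the variables (columns), then transpose so that people
    become the columns; correlating columns then correlates people."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # remove differences in scale
    return Z.T                                 # people are now the columns

# Correlations among people: np.corrcoef(q_type_matrix(X), rowvar=False)
```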
145
GET FILE='C:\Research\Antifat Bias\antifat transposed and standardized.sav'.
FLIP VARIABLES=zq1 zq2 zq3 zq4 zq5 zq6 zq7 zq8 zq9 zq10 zq11 zq12 zq13.
FLIP performed on 323 cases and 13 variables,
creating 13 cases and 324 variables. The working
file has been replaced.
146
FACTOR /VARIABLES=var001 var002 var003 var004
  var005 var006 var007 var008 var009 var010
  var011 var012 var013 var014 var015 var016
  var017 var018 var019 var020
  /MISSING=LISTWISE
  /ANALYSIS=var001 var002 var003 var004 var005
  var006 var007 var008 var009 var010 var011 var012
  var013 var014 var015 var016 var017 var018
  var019 var020
  /PRINT=INITIAL EXTRACTION ROTATION
  /PLOT=EIGEN
  /CRITERIA=FACTORS(3) ITERATE(25)
  /EXTRACTION=PC
  /CRITERIA=ITERATE(25)
  /ROTATION=VARIMAX
  /METHOD=CORRELATION.
147
Factor analysis always works best when there are
many more sampling units than elements being
correlated. That usually means many more people
than variables. Here it means we need to limit
the number of people in the analysis. We'll focus
on just the first 20 people (even this is
risky, similar to using multiple regression with
more predictors than people).
148
Looks like there might be three dimensions or
clusters of people, but the scree is not at all
clear.
149
(No Transcript)
150
(No Transcript)
151
The loadings for people can help identify those
that have similar profiles on the variables.
Ordinarily, the interpretations are assisted by
other information that might identify why
clusters of people have similar profiles. The
clusters might then be used in other analyses.
152
  • Other issues in factor analysis
  • How many people? (5, 10, 20, 100, 200, 400)
  • The number of variables per factor (4 to 6)
  • Factoring items versus factoring scales
  • Assumptions and distributions
  • Over-factoring versus under-factoring
  • PCA versus FA
  • Reliability for factor score composites

153
Exploratory factor analytic methods are sometimes
used as a crude way to confirm hypotheses about
the latent structure underlying a data set. As a
first pass, these methods do just fine. But, more
powerful confirmatory factor analytic procedures
exist that can better address questions about
data that are strongly informed by theory.
154
(No Transcript)