Title: Data Analysis with SPSS: Introducing Exploratory Factor Analysis
1 Data Analysis with SPSS: Introducing Exploratory Factor Analysis
- Bidin Yatim
- PhD (2005, Exeter)
- MSc (1984, Aston)
- BSc (1982, Nottingham)
- Department of Statistics
- Faculty of Quantitative Sciences
Topic 9
2 Exploratory Factor Analysis: Introduction
- Factor analysis attempts to identify underlying
variables, or factors, that explain the pattern
of correlations within a set of observed
variables. Factor analysis is often used in data
reduction to identify a small number of factors
that explain most of the variance observed in a
much larger number of manifest variables.
3 Exploratory Factor Analysis: Introduction
- A statistical technique for dealing with interdependencies among multiple variables, i.e. when variables are interrelated without designating some as dependent and others as independent.
- Many variables are reduced (grouped) into a smaller number of factors (a dimension-reduction method).
- Accomplishes the same objective as PCA/MDS.
4 The factor analysis procedure offers a high degree of flexibility
- Seven methods of factor extraction.
- Five methods of rotation.
- Three methods of computing factor scores; scores can be saved as variables for further analysis.
5 Alternative Methods of Factor Extraction
- Principal Component Analysis
- Maximum likelihood method
- Principal axis
- Image
- Alpha
- Generalized least squares
- Unweighted least squares
6 Factor Analysis: Extraction
- Method: specifies the method of factor extraction.
- Principal Components Analysis: forms uncorrelated linear combinations of the observed variables. The first component has maximum variance; successive components explain progressively smaller portions of the variance and are all uncorrelated with each other. It is used to obtain the initial factor solution and can be used even when a correlation matrix is singular.
- Unweighted Least-Squares Method: minimizes the sum of the squared differences between the observed and reproduced correlation matrices, ignoring the diagonals.
- Generalized Least-Squares Method: minimizes the sum of the squared differences between the observed and reproduced correlation matrices. Correlations are weighted by the inverse of their uniqueness, so that variables with high uniqueness are given less weight than those with low uniqueness.
- Maximum-Likelihood Method: produces the parameter estimates that are most likely to have produced the observed correlation matrix if the sample is from a multivariate normal distribution. The correlations are weighted by the inverse of the uniqueness of the variables, and an iterative algorithm is employed.
- Principal Axis Factoring: extracts factors from the original correlation matrix, with squared multiple correlation coefficients placed on the diagonal as initial estimates of the communalities. The resulting factor loadings are used to estimate new communalities that replace the old estimates on the diagonal. Iterations continue until the changes in the communalities from one iteration to the next satisfy the convergence criterion for extraction.
- Alpha Factoring: considers the variables in the analysis to be a sample from the universe of potential variables. It maximizes the alpha reliability of the factors.
- Image Factoring: developed by Guttman and based on image theory. The common part of a variable, called the partial image, is defined as its linear regression on the remaining variables, rather than as a function of hypothetical factors.
- Analyze: specifies either a correlation matrix or a covariance matrix.
- Extract: can either retain all factors whose eigenvalues exceed a specified value or retain a specified number of factors.
- Display: requests the unrotated factor solution and a scree plot of the eigenvalues.
- Scree plot: a plot of the variance associated with each factor, used to determine how many factors should be kept. Typically the plot shows a distinct break between the steep slope of the large factors and the gradual trailing off of the rest (the scree).
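In syntax form, the extraction method is chosen on the /EXTRACTION subcommand of the FACTOR command. A minimal sketch (the variable names var01 to var25 are hypothetical placeholders, not taken from the slides):

  * Principal axis extraction with a scree plot; keep the solution unrotated.
  FACTOR
    /VARIABLES var01 TO var25
    /PRINT INITIAL EXTRACTION
    /PLOT EIGEN
    /CRITERIA MINEIGEN(1) ITERATE(25)
    /EXTRACTION PAF
    /ROTATION NOROTATE.

Swapping PAF for PC, ULS, GLS, ML, ALPHA, or IMAGE selects the other extraction methods listed above.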
7 Factor Analysis: Rotation
- Method: allows you to select the method of factor rotation.
- Orthogonal methods:
  - Varimax: minimizes the number of variables with high loadings on each factor.
  - Quartimax Method: minimizes the number of factors needed to explain each variable. It simplifies the interpretation of the observed variables.
  - Equamax Method: a combination of the varimax method, which simplifies the factors, and the quartimax method, which simplifies the variables. The number of variables that load highly on a factor and the number of factors needed to explain a variable are both minimized.
- Oblique (non-orthogonal) methods:
  - Direct Oblimin Method: when delta equals 0 (the default), solutions are most oblique. As delta becomes more negative, the factors become less oblique. To override the default delta of 0, enter a number less than or equal to 0.8.
  - Promax Rotation: allows factors to be correlated. It can be calculated more quickly than a direct oblimin rotation, so it is useful for large datasets.
- Display: allows you to include output on the rotated solution, as well as loading plots for the first two or three factors.
- Factor Loading Plot: a three-dimensional plot of the loadings on the first three factors. For a two-factor solution, a two-dimensional plot is shown. The plot is not displayed if only one factor is extracted. Plots display rotated solutions if rotation is requested.
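In syntax, the rotation method is chosen on the /ROTATION subcommand, and delta on /CRITERIA. A sketch, with the same hypothetical variable names as before:

  * Orthogonal varimax rotation with a loading plot of the first two factors.
  FACTOR
    /VARIABLES var01 TO var25
    /PLOT ROTATION(1 2)
    /EXTRACTION PAF
    /ROTATION VARIMAX.

  * Oblique alternative: direct oblimin with the default delta of 0.
  FACTOR
    /VARIABLES var01 TO var25
    /CRITERIA DELTA(0)
    /EXTRACTION PAF
    /ROTATION OBLIMIN.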
8 Factor Analysis: Scores
- Save as variables: creates one new variable for each factor in the final solution, using one of three methods:
  - Regression Method: the scores produced have a mean of 0; they may be correlated even when the factors are orthogonal.
  - Bartlett Scores: the scores produced have a mean of 0. The sum of squares of the unique factors over the range of variables is minimized.
  - Anderson-Rubin Method: a modification of the Bartlett method which ensures orthogonality of the estimated factors. The scores produced have a mean of 0, a standard deviation of 1, and are uncorrelated.
- Display factor score coefficient matrix: shows the coefficients by which variables are multiplied to obtain factor scores, as well as the correlations between factor scores.
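For reference, a standard statement of the regression score estimator (textbook notation, not from the slides): with Z the n x p matrix of standardized data, R the observed correlation matrix, and Lambda the factor loading matrix, the estimated scores are

  \[ \hat{F} = Z\,R^{-1}\Lambda \]

and R^{-1}\Lambda is the factor score coefficient matrix that SPSS displays.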
9 Types of Factor Analysis
10 Uses of Factor Analysis
- Instrument Development
- Theory Development
- Data Reduction
- Model Testing
- Comparing Models
11 Example
- What underlying attitudes lead people to respond to the questions on a political survey as they do?
- Examining the correlations among the survey items reveals that there is significant overlap among various subgroups of items: questions about taxes tend to correlate with each other, questions about military issues correlate with each other, and so on.
- With factor analysis, you can investigate the number of underlying factors and, in many cases, identify what the factors represent conceptually. Additionally, you can compute factor scores for each respondent, which can then be used in subsequent analyses. For example, you might build a logistic regression model to predict voting behavior based on the factor scores.
12 Assumptions
- Interval- or ratio-level data.
- Bivariate normal distribution for each pair of variables; observations should be independent.
- Linear relationships.
- Substantial correlations among the variables (can be tested using Bartlett's sphericity test).
- Categorical data (such as religion or country of origin) are not suitable for factor analysis. Data for which Pearson correlation coefficients can sensibly be calculated should be suitable for factor analysis.
13 Assumptions
- The factor analysis model specifies that variables are determined by common factors (the factors estimated by the model) and unique factors (which do not overlap between observed variables). The computed estimates are based on the assumption that all unique factors are uncorrelated with each other and with the common factors.
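In standard notation (a sketch; the symbols are not on the slide), the model for a standardized variable z_j with m common factors F_k, loadings lambda_jk, and unique factor u_j is

  \[ z_j = \sum_{k=1}^{m} \lambda_{jk} F_k + u_j, \qquad \operatorname{Cov}(u_i, u_j) = 0 \;(i \neq j), \qquad \operatorname{Cov}(u_j, F_k) = 0 \]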
14 Sample Size
- 10 subjects per variable; to some, the subject-to-variable ratio (STV) should be at least 5:1.
- Every analysis should have 100 to 200 subjects.
15 Steps
- Obtain the correlation matrix for the data.
- Apply EFA.
- Decide on the number of factors/components to be retained.
- Interpret the factors/components, using rotation if necessary.
- Obtain factor scores for further analysis.
16 Two concepts about variables crucial in understanding EFA
- Common factor:
  - a hypothetical construct that affects at least two of our measurement variables.
  - We want to estimate the common factors that contribute to the variance in our variables.
- Unique variance:
  - a factor that contributes to the variance in only one variable.
  - There is only one unique factor for each variable.
  - Unique factors are unrelated to one another and unrelated to the common factors.
  - We want to exclude these unique factors from our solution.
17 Two concepts about variables crucial in understanding EFA
- Communality (h²) = 1 − uniqueness: the sum, over all factors, of the squared factor loadings for a variable. It indicates the portion of the variance of the variable that is accounted for by the set of factors (i.e. the variance the variable has in common with the other variables in the analysis). Small values indicate a lack of shared variance.
- Uniqueness = specific variance + error variance: the portion of the total variance that is unrelated to the other variables.
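In symbols (standard notation, with lambda_jk the loading of variable j on factor k and m factors retained):

  \[ h_j^2 = \sum_{k=1}^{m} \lambda_{jk}^2, \qquad u_j^2 = 1 - h_j^2 \]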
18 Total Variance
- Total Variance = Common Variance + Specific Variance + Error Variance.
- NOTE: If we have 10 original variables and the variables are standardized, the total variance = 10.
19 Eigenvalue
- Indicates the portion of the total variance of a
correlation matrix that is explained by a factor
20 Iterated Principal Factors Analysis
- The most common type of FA.
- Also known as principal axis FA.
- We eliminate the unique variance by replacing the 1s on the main diagonal of the correlation matrix with estimates of the communalities.
- Initial estimate of a variable's communality: the R² between that variable and all the others.
21 Let's Do It: Analyze > Data Reduction > Factor > Extraction
- Using the CerealFA data, change the extraction
method to principal axis.
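The syntax counterpart, as a sketch (the actual CerealFA variable names are not listed on the slides, so var01 to var25 stand in for the 25 items):

  * Iterated principal axis factoring on the CerealFA data.
  FACTOR
    /VARIABLES var01 TO var25
    /PRINT INITIAL EXTRACTION
    /CRITERIA ITERATE(25)
    /EXTRACTION PAF
    /ROTATION NOROTATE.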
22 SPSS Factor Analysis: Options
- Missing values:
  - Exclude cases listwise
  - Exclude cases pairwise
  - Replace with mean
- Coefficient display format:
  - Sorted by size
  - Suppress absolute values less than .10
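These options map onto the /MISSING and /FORMAT subcommands. A sketch (hypothetical variable names again):

  * Pairwise deletion; sort loadings by size, blank those below .10.
  FACTOR
    /VARIABLES var01 TO var25
    /MISSING PAIRWISE
    /FORMAT SORT BLANK(.10)
    /EXTRACTION PAF.

LISTWISE and MEANSUB are the /MISSING keywords for the other two dialog choices.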
23 Correlation Matrix
- Examine the matrix: correlations should be .30 or higher.
- Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy.
- Bartlett's Test of Sphericity.
24 Correlation Matrix
- Bartlett's Test of Sphericity:
  - tests the hypothesis that the correlation matrix is an identity matrix (diagonals are ones, off-diagonals are zeros).
  - A significant result indicates that the matrix is not an identity matrix, and therefore EFA can be used.
25 Correlation Matrix
- Kaiser-Meyer-Olkin (KMO):
  - a measure of sampling adequacy.
  - an index for comparing the magnitudes of the observed correlation coefficients to the magnitudes of the partial correlation coefficients.
  - Small values indicate that correlations between pairs of variables cannot be explained by the other variables.
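For reference, the standard formulas (not shown on the slides): with r_ij the observed correlations, q_ij the partial correlations, p variables, and n cases,

  \[ \mathrm{KMO} = \frac{\sum_{i \neq j} r_{ij}^2}{\sum_{i \neq j} r_{ij}^2 + \sum_{i \neq j} q_{ij}^2}, \qquad \chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln\lvert R\rvert \]

Bartlett's statistic is referred to a chi-square distribution with p(p − 1)/2 degrees of freedom.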
26 Kaiser-Meyer-Olkin (KMO)
- Marvelous: .90s
- Meritorious: .80s
- Middling: .70s
- Mediocre: .60s
- Miserable: .50s
- Unacceptable: below .50
27 Look at the KMO and Bartlett's test
- Bartlett's test of sphericity is significant, i.e. the null hypothesis that the correlation matrix is an identity is rejected.
- The Kaiser-Meyer-Olkin measure of sampling adequacy is > 0.8: meritorious.
- Factor analysis is appropriate.
28 Look at the Initial Communalities
- They sum to
- We have eliminated 25 units of unique
variance.
29 Iterate!
- Using the estimated communalities, obtain a solution.
- Take the communalities from the first solution and insert them into the main diagonal of the correlation matrix.
- Solve again.
- Take the communalities from this second solution and insert them into the correlation matrix.
30 Solve again
- Repeat this, over and over, until the changes in communalities from one iteration to the next are trivial.
- Our final communalities sum to .
- After excluding units of unique variance, we have extracted units of common variance.
- That is / 25 of the total variance in our 25 variables.
31 How many factors to retain?
32 Criteria For Retention Of Factors
- Eigenvalue greater than 1 (a single standardized variable has variance equal to 1).
- Plot of total variance: the scree plot. The gradual trailing off of the variance accounted for is called the scree.
- Note the cumulative % of variance of the rotated factors.
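These retention rules correspond to the /CRITERIA and /PLOT subcommands. A sketch (hypothetical variable names):

  * Retain factors with eigenvalues above 1 and draw a scree plot.
  FACTOR
    /VARIABLES var01 TO var25
    /PLOT EIGEN
    /CRITERIA MINEIGEN(1)
    /EXTRACTION PAF.

  * Or force a fixed number of factors, e.g. four.
  FACTOR
    /VARIABLES var01 TO var25
    /CRITERIA FACTORS(4)
    /EXTRACTION PAF.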
33 We have packaged those 58.05% into 4 factors
34 Before rotation
35 Rotated factor loadings
36 Rotation produces
- Factor Pattern Matrix:
  - high and low factor loadings are more apparent;
  - generally used for interpretation.
- Factor Structure Matrix:
  - correlations between the factors and the variables.
37 Interpretation of Rotated Matrix
- Loadings of .40 or higher.
- Name each factor based on the 3 or 4 variables with the highest loadings.
- Do not expect a perfect conceptual fit of all variables.
38
- SPSS will not only give you the scoring coefficients but will also compute the estimated factor scores for you.
- In the Factor Analysis window, click Scores and select Save As Variables, Regression, and Display Factor Score Coefficient Matrix (see the syntax sketch below).
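A sketch of the equivalent syntax (hypothetical variable names):

  * Save regression-method factor scores as new variables.
  * Also print the factor score coefficient matrix.
  FACTOR
    /VARIABLES var01 TO var25
    /PRINT FSCORE
    /EXTRACTION PAF
    /ROTATION VARIMAX
    /SAVE REG(ALL).

SPSS names the saved score variables FAC1_1, FAC2_1, and so on.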
39
- Here are the scoring coefficients. Look back at the data sheet and you will see the estimated factor scores.
40 Use the Factor Scores
- in multiple regression;
- in an independent-samples t test to compare groups on mean factor scores;
- or even in ANOVA.
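For instance, once the scores are saved (SPSS names them FAC1_1, FAC2_1, ...), they can be passed to other procedures. A sketch in which outcome and group are hypothetical variables:

  * Multiple regression on the saved factor scores.
  REGRESSION
    /DEPENDENT outcome
    /METHOD=ENTER FAC1_1 FAC2_1 FAC3_1 FAC4_1.

  * Independent-samples t test comparing two groups on the first factor.
  T-TEST GROUPS=group(1 2)
    /VARIABLES=FAC1_1.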
41 Required Number of Subjects and Variables
- Rules of thumb (not very useful):
  - 100 or more subjects.
  - At least 10 times as many subjects as you have variables.
  - As many subjects as you can get; the more the better.
42
- Start out with at least 6 variables per expected factor.
- Each factor should have at least 3 variables that load well.
- If loadings are low, you need at least 10 variables per factor.
- You need at least as many subjects as variables; the more of each, the better.
- When there are overlapping factors (variables loading well on more than one factor), you need more subjects than when the structure is simple.
43
- If communalities are low, you need more subjects.
- If communalities are high (> .6), you can get by with fewer than 100 subjects.
- With moderate communalities (~.5), you need 100-200 subjects.
- With low communalities and only 3-4 high loadings per factor, you need over 300 subjects.
- With low communalities and poorly defined factors, you need over 500 subjects.
44 What I Have Not Covered Today
- LOTS.
- For a general introduction to measurement (reliability and validity), see http://core.ecu.edu/psyc/wuenschk/docs2210/Research-3-Measurement.doc
45 Multivariate Analysis Summary
- Multivariate analysis is hard, but useful when it is important to extract as much information from the data as possible.
- For classification problems, the common methods provide different approximations to the Bayes discriminant.
- There is considerable empirical evidence that, as yet, no uniformly most powerful method exists. Therefore, be wary of claims to the contrary!
46 Further reading
- Hair, Anderson, Tatham & Black (HATB), Multivariate Data Analysis, 5th edn.
47 That's All, Friends
- See you again some other day.
- Have a nice time with SPSS!