What is Factor Analysis? - PowerPoint PPT Presentation

Slides: 37
Provided by: ramalingam6

Transcript and Presenter's Notes


1
What is Factor Analysis?
  • It explains the inter-relationships among a large
    number of variables in terms of their common
    underlying factors.
  • It is a data reduction technique.
  • Use it when you want to obtain the underlying
    structure of the inter-relationships.

2
Seven Stages of Factor Analysis
  • Stage 1: Define the objectives
  • Identify structural relationships among
    variables.
  • Obtain representative variables.
  • Create a new set of variables.

3
Stage 2
  • Designing Factor Analysis
  • R-type factor analysis: compute the correlation
    matrix among variables from the original data
    matrix.
  • Q-type factor analysis: compute the input data
    matrix from the correlations among sample units.
  • The ratio of sampled units to variables should be
    greater than 5 for factor analysis.
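
The difference between the R-type and Q-type input matrices can be sketched as follows. This is a minimal illustration on simulated data, assuming NumPy; the sample size, the variable count, and the names `X`, `R_type` and `Q_type` are hypothetical and not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_vars = 60, 5              # hypothetical sample size and variable count
X = rng.normal(size=(n_units, n_vars))

# R-type input: correlations among the variables (columns of X).
R_type = np.corrcoef(X, rowvar=False)    # shape (n_vars, n_vars)

# Q-type input: correlations among the sample units (rows of X).
Q_type = np.corrcoef(X)                  # shape (n_units, n_units)

# Rule of thumb from the slide: units per variable should exceed 5.
ratio = n_units / n_vars
print(R_type.shape, Q_type.shape, ratio > 5)
```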

4
Stage 3
  • Check the validity of the assumptions
  • Variables should be related to have meaningful
    common factors.
  • Sample units must be homogeneous.
  • Data must be metric. Dummy variables are allowed.
  • Multivariate normality is required only for
    hypothesis testing.
  • A substantial number of the correlations among
    variables should be at least 0.3 in absolute
    value.

5
Stage 4
  • Select one of two extraction methods
  • Principal component analysis: it uses the total
    variance and extracts the components that explain
    the largest share of it.
  • Common factor analysis: it uses only the common
    variance and places communality estimates on the
    diagonal of the correlation matrix.
  • Determine how many factors to retain using
  • Latent root criterion: specify a threshold value
    for the eigenvalues (commonly, keep factors with
    eigenvalues greater than 1).
  • Percentage of variation to be explained.
  • Scree test criterion: plot the eigenvalue of
    each factor against the factor number. The point
    where the plot becomes flat indicates the
    appropriate number of factors.
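
The criteria for the number of factors can be sketched in a few lines. This is an illustration only, assuming NumPy; the function name `n_factors`, the 80% target and the small correlation matrix are hypothetical choices, not values from the slides.

```python
import numpy as np

def n_factors(R, latent_root=1.0, min_explained=0.80):
    """Count factors by the latent root and percentage-of-variance criteria."""
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]       # largest first
    by_latent_root = int(np.sum(eigvals > latent_root))
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # Smallest number of factors whose cumulative share reaches the target.
    by_percentage = int(np.searchsorted(cumulative, min_explained) + 1)
    return eigvals, by_latent_root, by_percentage

# Hypothetical correlation matrix with two blocks of related variables.
R = np.array([[1.0, 0.8, 0.1, 0.1],
              [0.8, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.7],
              [0.1, 0.1, 0.7, 1.0]])
eigvals, k_root, k_pct = n_factors(R)
print(eigvals, k_root, k_pct)
```

Plotting `eigvals` against the factor numbers 1, 2, ... gives the scree graph; the plotting call is left out to keep the sketch free of extra dependencies.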

6
Stage 5
  • Rotation redistributes the variance among the
    factors so that the factor matrix shows a simpler
    structure.
  • There are two kinds of rotation methods
  • Orthogonal rotations: quartimax simplifies the
    rows of the factor matrix (variables), varimax
    simplifies the columns (factors), and equimax is
    a compromise between varimax and quartimax.
  • Oblique rotation methods
  • SAS offers promax
  • SPSS offers direct oblimin
  • Factor loadings should be at least 0.30 in
    absolute value.
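
An orthogonal rotation can be sketched with a widely used SVD-based varimax iteration. This is not necessarily what SAS or SPSS run internally; it assumes NumPy, and the function names and the example loading matrix are hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion (up to a constant factor).
        gradient = rotated**3 - rotated @ np.diag(np.sum(rotated**2, axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ gradient)
        rotation = u @ vt                  # nearest orthogonal matrix
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break                          # no further improvement
        objective = new_objective
    return loadings @ rotation, rotation

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return np.var(loadings**2, axis=0).sum()

# Hypothetical simple structure, deliberately mixed by a 30 degree rotation.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.9, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
theta = np.deg2rad(30)
mix = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ mix
rotated, rotation = varimax(unrotated)
print(np.round(rotated, 2))
```

Because the rotation is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the distribution of variance across factors moves.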

7
Stage 6
  • Validating the results
  • Use confirmatory factor analysis to check whether
    the results replicate
  • The factor structure should be stable
  • Identify the impact of outliers by analyzing the
    data with and without them

8
Stage 7
  • Interpret the results
  • Use the knowledge in future studies
  • Develop surrogate variables as a simple summary
    of several correlated variables.

9
Factor Analysis
  • Principal components
  • An extension of regression ideas to determine the
    relationship among variables
  • Multivariate method
  • Factor analysis details
  • Use when some variables are observed and others
    are latent
  • Multivariate method

10
Some Facts of Principal Components
  • We can rotate the x and y axes so that highest
    variability in the data occurs in the first
    principal axis and the next high variability
    occurs in the second principal axis which is
    orthogonal to the first axis.
  • The new data values are x' and y', which are
    linear combinations of the old x and y values.
  • Find the standard deviation of x' and y' for a
    selected angle of rotation.
  • The graph of such standard deviation curve helps
    to identify the optimal angle.
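
The angle search described on this slide can be sketched directly. Writing x' and y' for the rotated coordinates, the code scans rotation angles and picks the one that maximizes the standard deviation of x'. It assumes NumPy and simulated data; the names `rotate`, `angles` and `best` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 0.6 * x + 0.4 * rng.normal(size=500)   # hypothetical correlated pair

def rotate(x, y, theta):
    """Coordinates of the points after rotating the axes by theta."""
    x_new = x * np.cos(theta) + y * np.sin(theta)
    y_new = -x * np.sin(theta) + y * np.cos(theta)
    return x_new, y_new

# Standard deviation of x' for each candidate angle (0.1 degree steps).
angles = np.linspace(0.0, np.pi, 1801)
sd_x = np.array([np.std(rotate(x, y, t)[0], ddof=1) for t in angles])
best = angles[np.argmax(sd_x)]

# The maximal variance should match the largest eigenvalue of the
# sample covariance matrix, up to the resolution of the angle grid.
largest_eigenvalue = np.linalg.eigvalsh(np.cov(x, y))[-1]
print(np.rad2deg(best), sd_x.max() ** 2, largest_eigenvalue)
```

At the optimal angle x' and y' are (nearly) uncorrelated, which is why the second axis can be taken orthogonal to the first.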

11
Basics
  • First principal component is
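
The equations on this slide and the next few did not survive in the transcript. A standard textbook statement is given here as an assumption about what they covered, not as the presenter's own notation: X = (X_1, ..., X_p)' has covariance matrix Σ with eigenvalues λ_1 ≥ ... ≥ λ_p and unit eigenvectors e_1, ..., e_p.

```latex
\begin{align*}
Y_1 &= e_1' X = e_{11} X_1 + e_{12} X_2 + \cdots + e_{1p} X_p
  && \text{(first principal component)} \\
\operatorname{Var}(Y_k) &= e_k' \Sigma e_k = \lambda_k, \quad k = 1, \dots, p
  && \text{(variability of the $k$-th component)} \\
\sum_{i=1}^{p} \operatorname{Var}(X_i) &= \lambda_1 + \lambda_2 + \cdots + \lambda_p
  && \text{(total variability)} \\
\text{share of component } k &= \frac{\lambda_k}{\lambda_1 + \lambda_2 + \cdots + \lambda_p}
  && \text{(proportion explained)}
\end{align*}
```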

12
Variability explained

13
Variability of k-th Principal Component

14
Percent of variability explained by the k-th Principal Component

15
Distributionality

16
Confidence Interval

17
Correlations

18
Correlation matrix

19
Rationale in examining correlation matrix above
  • By examining the correlations of the variables
    with a principal component, we can find the
    variable which contributes most to a principal
    component as its correlation is the highest.
  • By searching for the highest correlation among
    the correlations of a variable with the principal
    components, we know which variable causes high
    overall variability in the data.
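
The correlations discussed here can be computed from the eigen-decomposition of the sample covariance matrix. The sketch below assumes NumPy and simulated data, and uses the standard identity corr(X_i, Y_k) = e_ik sqrt(λ_k) / sqrt(s_ii); the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
mixing = np.array([[1.0, 0.5, 0.2],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 1.0]])
X = rng.normal(size=(200, 3)) @ mixing     # hypothetical correlated data

S = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# corr[i, k] = e_ik * sqrt(lambda_k) / sqrt(s_ii)
corr = eigvecs * np.sqrt(eigvals) / np.sqrt(np.diag(S))[:, None]

# Variable with the highest absolute correlation for each component.
top_variable_per_pc = np.argmax(np.abs(corr), axis=0)
print(np.round(corr, 3), top_variable_per_pc)
```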

20
Factor Analysis
  • This is also a regression-like technique for
    finding relations between observed variables and
    latent (non-observable) variables using their
    correlation structure.
  • The unobserved variables are called factors.

21
Factor model

22
Assumptions of the Factor Model
  • Factors are standardized to have zero mean and
    unit variance.
  • Factors are uncorrelated with each other and also
    with the noise.
  • The noise terms have zero mean, are uncorrelated
    with each other and with the unobserved factors,
    and may have different variances.
  • It is important that the number of factors, k, is
    less than the number of observed variables, p.
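
The model equation itself is missing from the transcript. The usual orthogonal factor model consistent with the assumptions listed above can be written as follows; the symbols L (the p × k loading matrix), F, ε and Ψ are assumed notation, not necessarily the presenter's.

```latex
\begin{align*}
X - \mu &= L F + \varepsilon \\
E(F) &= 0, \qquad \operatorname{Cov}(F) = I_k \\
E(\varepsilon) &= 0, \qquad
  \operatorname{Cov}(\varepsilon) = \Psi = \operatorname{diag}(\psi_1, \dots, \psi_p) \\
\operatorname{Cov}(\varepsilon, F) &= 0 \\
\operatorname{Cov}(X) &= L L' + \Psi
\end{align*}
```

The last line follows from the others and is what the estimation methods try to match to the observed covariance or correlation matrix.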

23
Assumptions of Noises

24
Factor Loadings

25
Assumptions of Observed

26
Residual Correlation
  • It is the difference between the observed
    correlation and the fitted correlation in the
    factor analysis.
  • Rule of thumb: when the residual correlation is
    less than 0.1 in absolute value, the correlation
    has been well explained.
  • When the residual correlations are large, consider
    including more factors.
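
The residual check can be sketched as follows. The example assumes NumPy, a hypothetical one-factor correlation matrix, and the principal component extraction for the fitted loadings; other extraction methods would give different residuals.

```python
import numpy as np

def pc_loadings(R, k):
    """Loadings from the k largest principal components of R."""
    eigvals, eigvecs = np.linalg.eigh(R)
    top = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, top] * np.sqrt(eigvals[top])

def residual_correlation(R, loadings):
    """Observed minus fitted correlation, fitted = L L' + specific variances."""
    specific = 1.0 - np.sum(loadings**2, axis=1)    # 1 - communality
    fitted = loadings @ loadings.T + np.diag(specific)
    return R - fitted

# Hypothetical correlation matrix generated by a single factor.
true_loadings = np.array([[0.9], [0.8], [0.7], [0.6]])
R = true_loadings @ true_loadings.T
np.fill_diagonal(R, 1.0)

residual_true = residual_correlation(R, true_loadings)
residual_pc = residual_correlation(R, pc_loadings(R, 1))

# Rule of thumb from the slide: residuals below 0.1 in absolute value.
well_explained = bool(np.all(np.abs(residual_pc) < 0.1))
print(np.round(residual_pc, 3), well_explained)
```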

27
An Interpretation
  • The sum of the squares of the loadings for a
    factor, divided by the total observed variance
    (in the data of X), is the proportion of the
    total variance that is explained by that factor.
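
In symbols, with l_ij the loading of variable i on factor j and s_ii the sample variance of variable i (assumed notation), the proportion for factor j is:

```latex
\[
\text{proportion explained by factor } j
  = \frac{l_{1j}^2 + l_{2j}^2 + \cdots + l_{pj}^2}{s_{11} + s_{22} + \cdots + s_{pp}}
\]
```

When the analysis is run on the correlation matrix, every s_ii equals 1 and the denominator is simply p.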

28
Estimation methods
  • Unknowns to be estimated are the factor loadings
    and the specific variances.
  • Methods of estimation
  • MLE gives the best fit
  • Varimax method: not to be used when a general
    factor is present
  • Quartimax method: use when one factor has large
    loadings and there are not many factors of that
    type

29
Oblique factors
  • When two factors are NOT independent, they are
    called oblique factors.

30
How many factors to be considered?
  • There is NO universal rule.
  • When the residual correlations are significant,
    consider more factors
  • Do test of significance as in the next two
    slides.
  • The number of factors to be considered should
    match the number of eigenvalues that are greater
    than or equal to one.
  • Use Scree graph. The number of values above the
    base line in the graph is the number of factors
    to be considered.
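
The test of significance mentioned above is not reproduced in the transcript. One standard large-sample test for the adequacy of a k-factor model, given here as an assumption about what is meant, uses the maximum likelihood estimates of L and Ψ, the sample size n, and the covariance estimate S_n with divisor n:

```latex
\[
\left( n - 1 - \frac{2p + 4k + 5}{6} \right)
\ln \frac{\lvert \hat{L}\hat{L}' + \hat{\Psi} \rvert}{\lvert S_n \rvert}
\;>\; \chi^2_{\nu}(\alpha),
\qquad
\nu = \frac{(p - k)^2 - p - k}{2}
\]
```

Rejecting the hypothesis suggests that k factors are not enough. The degrees of freedom ν must be positive, which limits how large k can be for a given p.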

31
Hypothesis Testing method

32
Idea of finding the number of factors
  • 1. The varimax method maximizes the sum, over
    factors, of the variances of the squared
    loadings.
  • 2. The quartimax method maximizes the sum of the
    fourth powers of the loadings, so that each
    variable loads as highly as possible on a single
    factor.

33
Some Notations

34
Factors in Varimax Method

35
Factors to be considered in Quartimax Method

36
An Example