Common Factors versus Components: Principals and Principles, Errors and Misconceptions, Keith F. Widaman - PowerPoint PPT Presentation

1
Common Factors versus Components: Principals and
Principles, Errors and Misconceptions
Keith F. Widaman, University of California at Davis
  • Presented at the conference "Factor Analysis at 100"
  • L. L. Thurstone Psychometric Lab, University of
    North Carolina at Chapel Hill, May 2004

2
Goal of the Talk
  • Flip rendition
  • (With apologies to Will) I come not to praise
    principal components, but to bury them
  • Thus, we might inter the procedure beside its
    creator
  • More serious
  • To outline several key assumptions, usually
    implicit, of the simpler principal components
    approach
  • Compare and contrast common factor analysis and
    principal component analysis

3
Organization of the Talk
  • Principals
  • Major figures/events
  • Important dimensions: factors/components
  • Principles
  • To organize our thinking
  • Lead to methods to evaluate procedures
  • Errors
  • Structures of residuals
  • Unclear presentations
  • Misconceptions

4
Principal Individuals' Contributions
  • Spearman (1904)
  • First conceptualization of the nature of a common
    factor: the element in common to two or more
    indicators (preferably three or more)
  • Stressed presence of two classes of factors
  • general (with one member) and
  • specific (with a potentially infinite number)
  • Key: Based evaluation of empirical evidence on
    the tetrad difference criterion (i.e., on
    patterns in correlations among manifest
    variables) with no consideration of the diagonal

5
Principal Individuals' Contributions
  • Thomson (1916)
  • Early recognition of elusiveness of the
    theory-data connection
  • Single common factor implies hierarchical pattern
    of correlations, but so does an opposite
    conceptualization
  • Key for this talk: Focus was still on the
    patterns displayed by off-diagonal correlations.
    Diagonal elements were of no interest or
    importance

6
Principal Individuals' Contributions
  • Thurstone (1931)
  • First foray into factor analysis
  • Devised a "center of gravity" method for
    estimation of loadings
  • Led to centroid method
  • Key: Again, diagonal values explicitly
    disregarded

7
Principal Individuals' Contributions
  • Hotelling (1933)
  • Proposed method of principal components
  • Method of estimation
  • Least squares
  • Decomposition of all of the variance of the
    manifest variables into dimensions that are
  • (a) orthogonal
  • (b) conditionally variance maximized
  • Key 1: Left unities on diagonal
  • Key 2: Interpreted unrotated solution

8
Principal Individuals' Contributions
  • Thurstone (1935), The Vectors of Mind
  • "It is a fundamental criterion for a valid method
    of isolating primary abilities that the weights
    of the primary abilities for a test must remain
    invariant when it is moved from one test battery
    to another test battery.
  • If this criterion is not fulfilled, the
    psychological description of a test will
    evidently be as variable as the arbitrarily
    chosen batteries into which the test may be
    placed. Under such conditions no stable
    identification of primary mental abilities can be
    expected."

9
Principal Individuals' Contributions
  • Thurstone (1935)
  • This implies invariant factorial description of a
    test (a) across batteries and (b) across
    populations
  • Again, diagonal values explicitly disregarded
  • Developed rationale for necessity for rotation
  • Contra Hotelling:
  • Unities on diagonal imply manifest variables
    are perfectly reliable
  • Need as many dimensions as manifest variables
  • No rotation! This appears, to me, to be the most
    important criticism of Hotelling by Thurstone.

10
Principal Individuals' Contributions
  • McCloy, Metheny, Knott (1938)
  • Published in Psychometrika
  • Sought to compare Common FA (Thurstone's method)
    vs. Principal Components Analysis (Hotelling's
    method)
  • Perhaps the first comparison of the two methods

11
Principal Individuals' Contributions
  • Thomson (1939)
  • Clear statement of the differing aims of
  • Common factor analysis to explain the
    off-diagonal correlations among manifest
    variables
  • Principal component analysis to re-represent
    the manifest variables in a mathematically
    efficient manner

12
Principal Individuals' Contributions
  • Guttman (1955, 1958)
  • Developed lower bounds for the number of factors
  • Weakest lower bound was number of factors with
    eigenvalues greater than or equal to unity
  • With unities on diagonal
  • With population data
  • Other bounds used other diagonal elements (e.g.,
    strongest lower bound used SMCs), but these did
    not work as well

13
Principal Individuals' Contributions
  • Kaiser (1960, 1971)
  • Described the origin of the Little Jiffy
  • Principal components
  • Retain components with eigenvalues > 1.0
  • Rotate using varimax
  • Later modifications (Little Jiffy Mark IV)
    offered important improvements, but were not
    followed
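As a sketch, the retention step of the Little Jiffy can be written in a few lines of numpy; the function name and the example matrix are mine, not Kaiser's:

```python
import numpy as np

def kaiser_retain(R):
    """Principal components of a correlation matrix R (unities on the
    diagonal), keeping components with eigenvalues > 1.0."""
    eigvals, eigvecs = np.linalg.eigh(R)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # re-sort descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                        # Kaiser criterion
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings

# Six variables, all intercorrelated .64: eigenvalues are 4.2 and
# five of .36, so the rule retains a single component.
R = np.full((6, 6), .64)
np.fill_diagonal(R, 1.0)
ev, A = kaiser_retain(R)
```

Varimax rotation, the Jiffy's second step, would then be applied to the retained loadings.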

14
Principles Mislaid or Forgotten
  • Principle 1: Common factor analysis and principal
    component analysis have different goals, à la
    Thomson (1939)
  • Common factor analysis to explain the
    off-diagonal correlations among manifest
    variables
  • Principal component analysis to re-represent
    the original variables in a mathematically
    efficient manner
  • (a) in reduced dimensionality, or
  • (b) in an orthogonal, conditionally
    variance-maximized way

15
Principles Mislaid or Forgotten
  • Principle 2: Common factor analysis was as much a
    theory of manifest variables as a theory of
    latent variables
  • Spearman: doctrine of "the indifference of the
    indicator," so any manifest variable was a
    more-or-less good indicator of g
  • Thurstone: test one's theory by developing new
    variables as differing mixtures of factors and
    then attempting to verify presumptions
  • Today, focus seems largely on the latent
    variables
  • Forgetting about manifest variables can be
    problematic

16
Principles Mislaid or Forgotten
  • Principle 3: Invariance of the psychological/
    mathematical description of manifest variables is
    a fundamental issue
  • "It is a fundamental criterion for a valid method
    of isolating primary abilities that the weights
    of the primary abilities for a test must remain
    invariant when it is moved from one test battery
    to another test battery"
  • Much work on measurement factorial invariance
  • But, only similarities between common factors and
    principal components are stressed; differences
    are not emphasized

17
Principles Mislaid or Forgotten
  • Principle 4: Know data and model
  • Should know relation between data and model
  • Should know all assumptions (even implicit) of
    model
  • Frequently told
  • information in correlation matrix is difficult to
    discern
  • so, don't look at the data
  • run it through FA or PCA
  • interpret the results
  • This is not justifiable!

18
Common FA and Principal CA Models
  • Common Factor Analysis
  • R = FF′ + U2 = PΦP′ + U2
  • where
  • R is (p x p) correlation matrix among manifest
    vars
  • F is a (p x k) unrotated factor matrix, with
    loadings of p manifest variables on k factors
  • U2 is a (p x p) matrix (diagonal) of unique
    factor variances
  • P is a (p x k) rotated factor matrix, with
    loadings of p manifest variables on k rotated
    factors
  • Φ is a (k x k) matrix of covariances among
    factors (may be I, usually diag = I)
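The identity can be checked numerically. A minimal numpy sketch, using the fully standardized loadings and factor correlations from the apocryphal example later in the talk (slide 28); the variable names are mine:

```python
import numpy as np

# R = P Phi P' + U2: rotated loadings P, factor correlations Phi,
# and diagonal unique variances U2 reproduce the correlation matrix.
P = np.array([[.5, 0, 0], [.5, 0, 0], [.5, 0, 0],
              [0, .7, 0], [0, .5, 0], [0, .3, 0],
              [0, 0, .3], [0, 0, .3]])
Phi = np.array([[1., .5, .5],
                [.5, 1., .5],
                [.5, .5, 1.]])
h2 = np.diag(P @ Phi @ P.T)      # communalities
U2 = np.diag(1 - h2)             # unique variances on the diagonal
R = P @ Phi @ P.T + U2           # model-implied correlation matrix
```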

19
Common FA and Principal CA Models
  • Principal Component Analysis
  • R ≈ FcFc′ = PcΦcPc′
  • R = FcFc′ + GG′ = PcΦcPc′ + GG′
  • R = FcFc′ + Θ = PcΦcPc′ + Θ
  • where
  • Fc, Pc, Φc have same order as like-named
    matrices for CFA, but with c subscript to denote
    PCA
  • G is a (p x p-k) matrix of loadings of p
    manifest variables on the (p-k) discarded
    components
  • Θ (= GG′) is a (p x p) matrix of covariances
    among residuals
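A sketch of the same bookkeeping in numpy: splitting the full component solution of an equicorrelated R into one retained component Fc and the discarded part G, so that R = FcFc′ + GG′ holds exactly and Θ = GG′ is visibly non-diagonal (the helper function is mine):

```python
import numpy as np

def pca_split(R, k):
    """Full component decomposition of R, split into the k retained
    components (Fc) and the p - k discarded ones (G)."""
    ev, V = np.linalg.eigh(R)
    order = np.argsort(ev)[::-1]
    ev, V = ev[order], V[:, order]
    A = V * np.sqrt(ev)           # all p component loadings
    return A[:, :k], A[:, k:]

R = np.full((6, 6), .64)
np.fill_diagonal(R, 1.0)
Fc, G = pca_split(R, 1)
Theta = G @ G.T                   # residual covariances: non-diagonal
```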

20
Present Day Advice to Practicing Scientist
  • Velicer & Jackson (1990): CFA vs. PCA
  • Four practical issues
  • Similarity between solutions
  • Issues related to number of dimensions to retain
  • Improper solutions in CFA
  • Differences in computational efficiency
  • Three theoretical issues
  • Factorial indeterminacy in CFA, not PCA
  • CFA can be used in exploratory and confirmatory
    modes, PCA only exploratory
  • CFA is latent procedure, PCA is manifest

21
Present Day Advice to Practicing Scientist
  • Goldberg & Digman (1994) and Goldberg & Velicer
    (in press): CFA vs. PCA
  • Results from CFA and PCA are so similar that
    differences are unimportant
  • If differences are large, data are not
    well-structured enough for either type of
    analysis
  • Use "factor" to refer to factors and components
  • Aim is to explain correlations among manifest vars

22
Present Day Quantitative Approaches
  • Recent paper in Psychometrika (Ogasawara, 2003)
  • Based on work on oblique factors & components with:
  • Equal number of indicators per dimension
  • Independent cluster solution
  • Sphericity (equal error variances), hence equal
    factor loadings
  • Derived expressions for SEs (standard errors) for
    factor and component loadings and
    intercorrelations
  • SEs for PCA estimates were smaller than those for
    CFA estimates, implying greater stability of
    (i.e., lower variability around) population
    estimates

23
An Apocryphal Example
  • Researcher wanted to develop a new inventory to
    assess three cognitive traits
  • Knew to collect data in at least two initial,
    derivation samples
  • Use exploratory procedures to verify initial, a
    priori hypotheses
  • Then, move on to confirmatory techniques
  • So, Sample 1, N = 1600, and 8 manifest variables
  • 3 components explain 51% of total variance

24
Oblique Components, Sample 1
  • Variable Fac 1 Fac 2 Fac 3 . h2 .
  • V1 .704 .002 .005 .496
  • V2 .704 .002 .005 .496
  • V3 .704 .002 .005 .496
  • N1 .105 .715 .065 .575
  • N2 .002 .725 .014 .538
  • N3 .116 .670 .089 .417
  • S1 .005 .002 .735 .540
  • S2 .005 .002 .735 .540
  • Fac 1 1.0
  • Fac 2 .256 1.0
  • Fac 3 .147 .147 1.0

25
Orthogonal Components, Sample 1
  • Variable Fac 1 Fac 2 Fac 3 . h2 .
  • V1 .698 .079 .044 .496
  • V2 .698 .079 .044 .496
  • V3 .698 .079 .044 .496
  • N1 .211 .717 .127 .575
  • N2 .104 .716 .070 .538
  • N3 .025 .643 .045 .417
  • S1 .050 .046 .732 .540
  • S2 .050 .046 .732 .540
  • Fac 1 1.0
  • Fac 2 .000 1.0
  • Fac 3 .000 .000 1.0

26
An Apocryphal Example
  • After confirming a priori hypotheses in Sample 1,
    the researcher collected data from Sample 2
  • Same manifest variables
  • Sampled from the same general population
  • Same mathematical approach: principal components
    followed by oblique and orthogonal rotation
  • Got same results!
  • Decided to test the theory in Sample 3 using a
    "replicate and extend" approach
  • Major change Switch to Confirmatory Factor
    Analysis

27
Confirmatory Factor Analysis, Sample 3
  • Variable Fac 1 Fac 2 Fac 3 . θ2 .
  • V1 2.50 (.18) .0 .0 18.75
  • V2 3.00 (.21) .0 .0 27.00
  • V3 3.50 (.25) .0 .0 36.75
  • N1 .0 2.10 (.13) .0 4.59
  • N2 .0 2.00 (.14) .0 12.00
  • N3 .0 1.50 (.16) .0 22.75
  • S1 .0 .0 2.40 (.44) 58.24
  • S2 .0 .0 2.70 (.50) 73.71
  • Fac 1 1.0
  • Fac 2 .50 (.04) 1.0
  • Fac 3 .50 (.10) .50 (.10) 1.0

28
Fully Standardized Solution, Sample 3
  • Variable Fac 1 Fac 2 Fac 3 . h2 .
  • V1 .50 .0 .0 .25
  • V2 .50 .0 .0 .25
  • V3 .50 .0 .0 .25
  • N1 .0 .70 .0 .49
  • N2 .0 .50 .0 .25
  • N3 .0 .30 .0 .09
  • S1 .0 .0 .30 .09
  • S2 .0 .0 .30 .09
  • Fac 1 1.0
  • Fac 2 .50 1.0
  • Fac 3 .50 .50 1.0

29
Oblique Component Solution, Sample 3
  • Variable Fac 1 Fac 2 Fac 3 . h2 .
  • V1 .704 .002 .005 .496
  • V2 .704 .002 .005 .496
  • V3 .704 .002 .005 .496
  • N1 .105 .715 .065 .575
  • N2 .002 .725 .014 .538
  • N3 .116 .670 .089 .417
  • S1 .005 .002 .735 .540
  • S2 .005 .002 .735 .540
  • Fac 1 1.0
  • Fac 2 .256 1.0
  • Fac 3 .147 .147 1.0

30
An Early Comparison
  • McCloy, Metheny, Knott (1938)
  • Published in Psychometrika
  • Sought to compare Common FA (Thurstone's method)
    vs. Principal Components Analysis (Hotelling's)
  • Stated that Principal Components can be rotated
  • So, both techniques are different means to same
    end
  • Principal difference:
  • Thurstone inserts largest correlation in row into
    the diagonal of each residual matrix
  • Hotelling begins with unities and stays with
    residual values in each residual matrix

31
Hypothetical Factor Matrix (McCloy et al.)
  • Variable Fac 1 Fac 2 Fac 3 . h2 .
  • 1 .900 .0 .0 .810
  • 2 .800 .0 .0 .640
  • 3 .0 .700 .0 .490
  • 4 .0 .800 .0 .640
  • 5 .0 .0 .900 .810
  • 6 .0 .0 .600 .360
  • 7 .0 .424 .424 .360
  • 8 .566 .566 .0 .640
  • 9 .495 .0 .495 .490
  • 10 .520 .520 .520 .810

32
Rotated Factor Matrix (McCloy et al.)
  • Variable Fac 1 Fac 2 Fac 3 . h2 .
  • 1 .860 .033 .035 .742
  • 2 .819 .025 .023 .672
  • 3 .014 .726 .000 .527
  • 4 .023 .766 .004 .587
  • 5 .010 .004 .808 .653
  • 6 .008 .029 .645 .417
  • 7 .011 .434 .466 .406
  • 8 .587 .548 .014 .645
  • 9 .516 .038 .512 .530
  • 10 .489 .471 .537 .749

33
Rotated Component Matrix (McCloy et al.)
  • Variable Fac 1 Fac 2 Fac 3 . h2 .
  • 1 .906 .055 .063 .828
  • 2 .874 .053 .046 .769
  • 3 .034 .824 .021 .681
  • 4 .050 .859 .006 .740
  • 5 .060 .000 .885 .787
  • 6 .094 .035 .773 .608
  • 7 .054 .519 .525 .548
  • 8 .653 .558 .029 .739
  • 9 .527 .085 .605 .651
  • 10 .527 .477 .552 .810

34
An Early Comparison
  • McCloy, Metheny, Knott (1938)
  • Argued that
  • both CFA and PCA were means to same end
  • both led to similar pattern of loadings, but
  • Thurstone's method was more accurate (Δh2 = .056)
    than Hotelling's (Δh2 = .125), but these were
    average absolute differences
  • I averaged signed differences, and Thurstone's
    method was much more accurate (Δh2 = -.013) than
    Hotelling's (Δh2 = .120)
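The signed-vs-absolute point is easy to recompute from the h2 columns of the three McCloy et al. tables above (values as transcribed; rounding in the printed tables makes the means differ slightly from the quoted figures):

```python
import numpy as np

true_h2 = np.array([.810, .640, .490, .640, .810,
                    .360, .360, .640, .490, .810])   # hypothetical matrix
fa_h2   = np.array([.742, .672, .527, .587, .653,
                    .417, .406, .645, .530, .749])   # Thurstone's method
pc_h2   = np.array([.828, .769, .681, .740, .787,
                    .608, .548, .739, .651, .810])   # Hotelling's method

mean_abs_fa = np.mean(np.abs(fa_h2 - true_h2))       # small
mean_abs_pc = np.mean(np.abs(pc_h2 - true_h2))       # larger
mean_sgn_fa = np.mean(fa_h2 - true_h2)               # near zero
mean_sgn_pc = np.mean(pc_h2 - true_h2)               # clearly positive
```

The signed means show the direction of the bias: PCA overstates communalities, while the factor solution's errors largely cancel.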

35
An Early Comparison
  • McCloy, Metheny, Knott (1938)
  • Found similar pattern of high and low loadings
    from PCA and CFA
  • But, they found (but did not stress) that PCA led
    to decidedly higher loadings
  • Tukey (1969)
  • Amount, as well as direction, is vital
  • For any science to advance, we must pay attention
    to quantitative variation, not just qualitative

36
Regularity Conditions or Phenomena
  • Relations between population values of P and R
  • Features of eigenvalues
  • Covariances among residuals
  • Need a theory of errors
  • Recount my first exposure
  • Should have to acknowledge (predict? live with?)
    the patterns in residuals

37
Practicing Scientists vs. Statisticians
  • Interesting dimension along which researchers
    fall
  • Practicing scientists vs. Statisticians ("Dark
    side")
  • use CFA vs. prefer PCA
  • use regression analysis vs. warn of problems of
    errors in variables

38
Practicing Scientists vs. Statisticians
  • At first seems odd
  • Practicing scientist prefers
  • CFA (which partials out errors of measurement and
    specific variance)
  • Regression analysis despite the implicit
    assumption of perfect measurement
  • Statistician prefers
  • To warn of ill-effects of errors in variables on
    results of regression analysis
  • PCA (despite lack of attention to measurement
    error), perhaps due to elegant, reduced rank
    representation

39
Practicing Scientists vs. Statisticians
  • On second thought, it is rational
  • Practicing scientist prefers
  • Assumptions that residuals (in CFA or regression
    analysis) are independent, uncorrelated, normally
    distributed
  • Statistician prefers
  • To try to circumvent (or solve) problem of errors
    in variables in regression
  • To relegate errors-in-variables problems in PCA
    to that part of solution (GG′) that is orthogonal
    to the retained part, thereby circumventing (or
    solving) this problem

40
Regularity Conditions or Phenomena
  • In Common Factor Analysis,
  • Char. of correlations → Char. of variables: 1-to-1
  • Char. of correlations ← Char. of variables: 1-to-1
  • In Principal Component Analysis,
  • Char. of correlations → Char. of variables: 1-to-1
    (??)
  • Char. of correlations ← Char. of variables:
    many-to-1

41
Manifest Correlations
  • Var V1 V2 V3 V4 V5 V6
  • V1 1.00
  • V2 .64 1.00
  • V3 .64 .64 1.00
  • V4 .64 .64 .64 1.00
  • V5 .64 .64 .64 .64 1.00
  • V6 .64 .64 .64 .64 .64 1.00

42
Eigenvalues, Loadings, and Explained Variance
  • Var EV P1 P2 h2 EVc Pc1 Pc2 hc2
  • V1 1.92 .80 .64 2.28 .87 .76
  • V2 .0 .80 .64 .36 .87 .76
  • V3 .0 .80 .64 .36 .87 .76
  • V4
  • V5
  • V6
  • P1 1.0 1.0
  • P2

43
Residual Covariances CFA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .36 .00 .00
  • V2 .00 .36 .00
  • V3 .00 .00 .36
  • V4
  • V5
  • V6
  • Covs below diag., corrs above diag.

44
Residual Covariances PCA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .24 -.50 -.50
  • V2 -.12 .24 -.50
  • V3 -.12 -.12 .24
  • V4
  • V5
  • V6
  • Covs below diag., corrs above diag.

45
Eigenvalues, Loadings, and Explained Variance
  • Var EV P1 P2 h2 EVc Pc1 Pc2 hc2
  • V1 3.84 .80 .64 4.20 .84 .70
  • V2 .0 .80 .64 .36 .84 .70
  • V3 .0 .80 .64 .36 .84 .70
  • V4 .0 .80 .64 .36 .84 .70
  • V5 .0 .80 .64 .36 .84 .70
  • V6 .0 .80 .64 .36 .84 .70
  • P1 1.0 1.0
  • P2

46
Residual Covariances CFA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .36 .00 .00 .00 .00 .00
  • V2 .00 .36 .00 .00 .00 .00
  • V3 .00 .00 .36 .00 .00 .00
  • V4 .00 .00 .00 .36 .00 .00
  • V5 .00 .00 .00 .00 .36 .00
  • V6 .00 .00 .00 .00 .00 .36
  • Covs below diag., corrs above diag.

47
Residual Covariances PCA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .30 -.20 -.20 -.20 -.20 -.20
  • V2 -.06 .30 -.20 -.20 -.20 -.20
  • V3 -.06 -.06 .30 -.20 -.20 -.20
  • V4 -.06 -.06 -.06 .30 -.20 -.20
  • V5 -.06 -.06 -.06 -.06 .30 -.20
  • V6 -.06 -.06 -.06 -.06 -.06 .30
  • Covs below diag., corrs above diag.
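The pattern in the two PCA residual tables above can be reproduced directly. A numpy sketch (helper name mine) extracting the first principal component of an equicorrelated R and the residual covariances, for p = 3 and p = 6:

```python
import numpy as np

def first_pc_residuals(R):
    """Loadings on the first principal component (reflected positive)
    and the residual covariance matrix after removing it."""
    ev, V = np.linalg.eigh(R)
    i = np.argmax(ev)
    a = np.abs(V[:, i]) * np.sqrt(ev[i])
    return a, R - np.outer(a, a)

results = {}
for p in (3, 6):
    R = np.full((p, p), .64)
    np.fill_diagonal(R, 1.0)
    results[p] = first_pc_residuals(R)
# p = 3: loadings .87, residual diag .24, off-diagonal -.12 (corr -.50)
# p = 6: loadings .84, residual diag .30, off-diagonal -.06 (corr -.20)
```

The factor loading stays at .80 = (.64)^1/2 in both cases; only the component quantities move with p.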

48
Regularity Conditions or Phenomena
  • In Common Factor Analysis,
  • If (a) the model fits in the population, (b)
    there is one factor, and (c) communalities are
    estimated optimally,
  • Single non-zero eigenvalue
  • Factor loadings and residual variances for first
    three variables are unaffected by addition of 3
    identical variables
  • Residuals = specific + error variance
  • Residual matrix is diagonal

49
Regularity Conditions or Phenomena
  • In Principal Component Analysis,
  • If (a) the common factor model fits in the
    population, (b) there is one factor, and (c)
    unities are retained on the main diagonal,
  • Single large eigenvalue, plus (p - 1) identical,
    smaller eigenvalues
  • Residual component matrix G is independent of the
    space defined by Fc
  • But, residual covariance matrix is clearly
    non-diagonal
  • And, (a) population component loadings and (b)
    residual variances and covariances vary as a
    function of number of manifest variables!

50
Manifest Correlations
  • Var V1 V2 V3 V4 V5 V6
  • V1 1.00
  • V2 .36 1.00
  • V3 .36 .36 1.00
  • V4 .36 .36 .36 1.00
  • V5 .36 .36 .36 .36 1.00
  • V6 .36 .36 .36 .36 .36 1.00

51
Eigenvalues, Loadings, and Explained Variance
  • Var EV P1 P2 h2 EVc Pc1 Pc2 hc2
  • V1 1.08 .60 .36 1.72 .76 .57
  • V2 .0 .60 .36 .64 .76 .57
  • V3 .0 .60 .36 .64 .76 .57
  • V4
  • V5
  • V6
  • P1 1.0 1.0
  • P2

52
Residual Covariances CFA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .64 .00 .00
  • V2 .00 .64 .00
  • V3 .00 .00 .64
  • V4
  • V5
  • V6
  • Covs below diag., corrs above diag.

53
Residual Covariances PCA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .43 -.50 -.50
  • V2 -.21 .43 -.50
  • V3 -.21 -.21 .43
  • V4
  • V5
  • V6
  • Covs below diag., corrs above diag.

54
Eigenvalues, Loadings, and Explained Variance
  • Var EV P1 P2 h2 EVc Pc1 Pc2 hc2
  • V1 2.16 .60 .36 2.80 .68 .47
  • V2 .0 .60 .36 .64 .68 .47
  • V3 .0 .60 .36 .64 .68 .47
  • V4 .0 .60 .36 .64 .68 .47
  • V5 .0 .60 .36 .64 .68 .47
  • V6 .0 .60 .36 .64 .68 .47
  • P1 1.0 1.0
  • P2

55
Residual Covariances CFA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .64 .00 .00 .00 .00 .00
  • V2 .00 .64 .00 .00 .00 .00
  • V3 .00 .00 .64 .00 .00 .00
  • V4 .00 .00 .00 .64 .00 .00
  • V5 .00 .00 .00 .00 .64 .00
  • V6 .00 .00 .00 .00 .00 .64
  • Covs below diag., corrs above diag.

56
Residual Covariances PCA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .53 -.20 -.20 -.20 -.20 -.20
  • V2 -.11 .53 -.20 -.20 -.20 -.20
  • V3 -.11 -.11 .53 -.20 -.20 -.20
  • V4 -.11 -.11 -.11 .53 -.20 -.20
  • V5 -.11 -.11 -.11 -.11 .53 -.20
  • V6 -.11 -.11 -.11 -.11 -.11 .53
  • Covs below diag., corrs above diag.

57
Regularity Conditions or Phenomena
  • So, the population parameters from CFA and PCA
    diverge more
  • (a) the fewer the number of indicators per
    dimension, and
  • (b) the lower the true communality
  • But, some regularities still seem to hold
    (although these vary with the number of
    indicators)
  • regular estimates of loadings
  • regular magnitude of residual variances
  • regular magnitude of residual covariances
  • regular form of eigenvalue structure

58
Regularity Conditions or Phenomena
  • But, what if we have variation in loadings?

59
Manifest Correlations
  • Var V1 V2 V3 V4 V5 V6
  • V1 1.00
  • V2 .64 1.00
  • V3 .64 .64 1.00
  • V4 .48 .48 .48 1.00
  • V5 .48 .48 .48 .36 1.00
  • V6 .48 .48 .48 .36 .36 1.00

60
Eigenvalues, Loadings, and Explained Variance
  • Var EV P1 P2 h2 EVc Pc1 Pc2 hc2
  • V1 3.00 .80 .64 3.47 .83 .69
  • V2 .0 .80 .64 .64 .83 .69
  • V3 .0 .80 .64 .64 .83 .69
  • V4 .0 .60 .36 .53 .68 .47
  • V5 .0 .60 .36 .36 .68 .47
  • V6 .0 .60 .36 .36 .68 .47
  • P1 1.0 1.0
  • P2

61
Residual Covariances CFA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .36 .00 .00 .00 .00 .00
  • V2 .00 .36 .00 .00 .00 .00
  • V3 .00 .00 .36 .00 .00 .00
  • V4 .00 .00 .00 .64 .00 .00
  • V5 .00 .00 .00 .00 .64 .00
  • V6 .00 .00 .00 .00 .00 .64
  • Covs below diag., corrs above diag.

62
Residual Covariances PCA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .31 -.15 -.15 -.21 -.21 -.21
  • V2 -.05 .31 -.15 -.21 -.21 -.21
  • V3 -.05 -.05 .31 -.21 -.21 -.21
  • V4 -.09 -.09 -.09 .53 -.20 -.20
  • V5 -.09 -.09 -.09 -.11 .53 -.20
  • V6 -.09 -.09 -.09 -.11 -.11 .53
  • Covs below diag., corrs above diag.

63
Regularity Conditions or Phenomena
  • So, with variation in loadings
  • One piece of approximate stability
  • regular estimates of loadings
  • But, sacrifice
  • regular magnitude of residual variances
  • regular magnitude of residual covariances
  • regular form of eigenvalue structure

64
Regularity Conditions or Phenomena
  • But, what if we have multiple factors?
  • Lets start with
  • (a) equal loadings
  • (b) orthogonal factors

65
Eigenvalues, Loadings, and Explained Variance
  • Var EV P1 P2 u2 EVc Pc1 Pc2 hc2
  • V1 1.08 .60 .0 .64 1.72 .76 .0 .57
  • V2 1.08 .60 .0 .64 1.72 .76 .0 .57
  • V3 .0 .60 .0 .64 .64 .76 .0 .57
  • V4 .0 .0 .60 .64 .64 .0 .76 .57
  • V5 .0 .0 .60 .64 .64 .0 .76 .57
  • V6 .0 .0 .60 .64 .64 .0 .76 .57
  • P1 1.0 1.0
  • P2 .0 1.0 .0 1.0

66
Residual Covariances PCA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .43 -.50 -.50 .00 .00 .00
  • V2 -.21 .43 -.50 .00 .00 .00
  • V3 -.21 -.21 .43 .00 .00 .00
  • V4 .00 .00 .00 .43 -.50 -.50
  • V5 .00 .00 .00 -.21 .43 -.50
  • V6 .00 .00 .00 -.21 -.21 .43
  • Covs below diag., corrs above diag.

67
Regularity Conditions or Phenomena
  • So, strange result
  • Same factor inflation as with 1-factor, 3
    indicators
  • Same within-factor residual covariances as for
    1-factor, 3 indicators
  • But, between-factor residual covariances = 0!
  • Lets go to
  • (a) equal loadings, but
  • (b) oblique factors

68
Eigenvalues, Loadings, and Explained Variance
  • Var EV P1 P2 u2 EVc Pc1 Pc2 hc2
  • V1 1.62 .60 .0 .64 2.26 .76 .0 .57
  • V2 .54 .60 .0 .64 1.18 .76 .0 .57
  • V3 .0 .60 .0 .64 .64 .76 .0 .57
  • V4 .0 .0 .60 .64 .64 .0 .76 .57
  • V5 .0 .0 .60 .64 .64 .0 .76 .57
  • V6 .0 .0 .60 .64 .64 .0 .76 .57
  • P1 1.0 1.0
  • P2 .5 1.0 .31 1.0

69
Residual Covariances PCA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .43 -.50 -.50 .00 .00 .00
  • V2 -.21 .43 -.50 .00 .00 .00
  • V3 -.21 -.21 .43 .00 .00 .00
  • V4 .00 .00 .00 .43 -.50 -.50
  • V5 .00 .00 .00 -.21 .43 -.50
  • V6 .00 .00 .00 -.21 -.21 .43
  • Covs below diag., corrs above diag.

70
Regularity Conditions or Phenomena
  • So, strange result
  • Same factor inflation as with 1-factor, 3
    indicators
  • Reduced correlation between factors
  • But, residual covariances matrix is identical!
  • Lets go to
  • (a) unequal loadings, and
  • (b) orthogonal factors

71
Eigenvalues, Loadings, and Explained Variance
  • Var EV P1 P2 u2 EVc Pc1 Pc2 hc2
  • V1 1.16 .80 .0 .36 1.70 .83 .0 .68
  • V2 1.16 .60 .0 .64 1.70 .78 .0 .61
  • V3 .0 .40 .0 .84 .79 .64 .0 .41
  • V4 .0 .0 .80 .36 .79 .0 .83 .68
  • V5 .0 .0 .60 .64 .51 .0 .78 .61
  • V6 .0 .0 .40 .84 .51 .0 .64 .41
  • P1 1.0 1.0
  • P2 .0 1.0 .0 1.0

72
Residual Covariances PCA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .32 -.47 -.48 .00 .00 .00
  • V2 -.16 .39 -.55 .00 .00 .00
  • V3 -.21 -.26 .59 .00 .00 .00
  • V4 .00 .00 .00 .32 -.47 -.48
  • V5 .00 .00 .00 -.16 .39 -.55
  • V6 .00 .00 .00 -.21 -.26 .59
  • Covs below diag., corrs above diag.

73
Regularity Conditions or Phenomena
  • So, strange result
  • Different factor inflation than with 1-factor, 3
    indicators
  • Reduced correlation between factors
  • But, residual covariance matrix has unequal
    covariances and correlations among residuals,
    though between-factor covariances = 0!
  • Lets go to
  • (a) unequal loadings, and
  • (b) oblique factors

74
Eigenvalues, Loadings, and Explained Variance
  • Var EV P1 P2 u2 EVc Pc1 Pc2 hc2
  • V1 1.74 .80 .0 .36 2.27 .77 .11 .66
  • V2 .58 .60 .0 .64 1.16 .77 .00 .59
  • V3 .0 .40 .0 .84 .79 .71 -.12 .46
  • V4 .0 .0 .80 .36 .77 .11 .77 .66
  • V5 .0 .0 .60 .64 .52 .00 .77 .59
  • V6 .0 .0 .40 .84 .49 -.12 .71 .46
  • P1 1.0 1.0
  • P2 .5 1.0 .32 1.0

75
Residual Covariances PCA
  • Var V1 V2 V3 V4 V5 V6
  • V1 .34 -.38 -.49 -.13 -.11 .01
  • V2 -.14 .41 -.59 -.11 -.04 .07
  • V3 -.21 -.28 .54 .01 .07 .16
  • V4 -.04 -.04 .00 .34 -.38 -.49
  • V5 -.04 -.02 .03 -.14 .41 -.59
  • V6 .00 .03 .08 -.21 -.28 .54
  • Covs below diag., corrs above diag.

76
Regularity Conditions or Phenomena
  • So, strange result
  • Extremely different factor inflation than with
    1-factor, 3 indicators
  • Largest loading is now UNderrepresented
  • Very different population factor loadings (.8,
    .6, .4) have very similar component loadings
  • Now, between-factor covariances are not zero, and
    some are positive!

77
R from Component Parameters
  • All the preceding from a CFA view
  • Develop parameters from a CF model
  • Analyze using CFA and PCA
  • CFA procedures recover parameters
  • PCA procedures exhibit failings or anomalies
  • So What? What else could you expect?
  • Challenge (to me)
  • Generate data from a PC model
  • Analyze using CFA and PCA
  • PCA should recover parameters, CFA should exhibit
    problems and/or anomalies

78
R from Component Parameters
  • Difficult to do
  • Leads to
  • Impractical, unacceptable outcomes, from the
    point of view of the practicing scientist
  • Crucial indeterminacies with the PCA model

79
R from Component Parameters
  • Impractical, unacceptable outcomes, from the
    point of view of the practicing scientist

80
Manifest Correlations
  • Var V1 V2 V3 V4 V5 V6
  • V1 1.00
  • V2 .46 1.00
  • V3 .46 .46 1.00
  • V4
  • V5
  • V6
  • First principal component has 3 loadings of .8
  • First principal factor has 3 loadings of
    (.46)^1/2, or about .67

81
Manifest Correlations
  • Var V1 V2 V3 V4 V5 V6
  • V1 1.00
  • V2 .568 1.00
  • V3 .568 .568 1.00
  • V4 .568 .568 .568 1.00
  • V5 .568 .568 .568 .568 1.00
  • V6 .568 .568 .568 .568 .568 1.00
  • First principal component has 6 loadings of .8
  • First principal factor has 6 loadings of
    (.568)^1/2, or about .75
  • But, one would have to alter the first 3 tests,
    as their population correlations are altered
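For p equicorrelated variables with correlation r, the two loadings in these slides have closed forms: the first principal factor loading is r^1/2, while the first principal component loading is ((1 + (p - 1)r)/p)^1/2, which is always larger and approaches r^1/2 only as p grows. A small sketch (function names mine):

```python
import numpy as np

def pf_loading(r):
    """First principal factor loading for equicorrelated variables."""
    return np.sqrt(r)

def pc_loading(r, p):
    """First principal component loading for p equicorrelated variables."""
    return np.sqrt((1 + (p - 1) * r) / p)

# Slide 80: r = .46, p = 3  -> component .80, factor about .678
# Slide 81: r = .568, p = 6 -> component .80, factor about .754
```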

82
R from Component Parameters
  • Crucial indeterminacies with the PCA model
  • Consider case of well-identified CFA model 6
    manifest variables loading on a single factor
  • One could easily construct the population matrix
    as FF′ + uniquenesses to ensure diag(R) = I
  • With 6 manifest variables, 6(7)/2 = 21 unique
    elements of the covariance matrix
  • 12 parameter estimates
  • therefore 9 df
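The counting argument on this slide is a trivial computation:

```python
# Identification count for the 6-variable, one-factor CFA model:
# p(p+1)/2 unique elements of R vs. p loadings + p uniquenesses.
p = 6
unique_elements = p * (p + 1) // 2   # 21 distinct variances/covariances
parameters = p + p                   # 6 loadings + 6 unique variances
df = unique_elements - parameters    # degrees of freedom
```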

83
R from Component Parameters
  • Crucial indeterminacies with the PCA model
  • Consider now 6 manifest variables with defined
    loadings on first PC
  • To estimate the correlation matrix, must come up
    with the remaining 5 PCs
  • A start: [Fc G]′[Fc G] = diag, so the
    orthogonality constraint yields 6(5)/2 = 15
    equations
  • Sum of squares across rows = 1, so 6 more
    equations
  • In short, 21 equations, but 30 unknowns (loadings
    of 6 variables on the 5 components in G)
  • Therefore, an infinite number of R matrices will
    lead to the stated first PC

84
R from Component Parameters
  • Crucial indeterminacies with the PCA model
  • Related to the Ledermann number, but in reverse
  • For example, with 10 manifest variables, one can
    minimally overdetermine no more than 6 factors
    (so use 6 or fewer factors)
  • But, here, one must specify at least 6 components
    (to ensure more equations than unknowns) to
    ensure a unique R
  • If fewer than 6 components are specified, an
    infinite number of solutions for R can be found

85
Conclusions CFA
  • CFA factor models may not hold in the population
  • But, if they do (in a theoretical population)
  • The notion of a population factor loading is
    realistic
  • The population factor loading is unaffected by
    presence of other variables, as long as the
    battery contains the same factors
  • In one-factor case, loadings can vary from 0 to 1
    (provided reflection of variables is possible)
  • This generalizes to the case of multiple factors

86
Conclusions CFA
  • CFA factor models may not hold in the population
  • But, if they do
  • Residual (i.e., unique) variances are
    uncorrelated
  • Magnitude of unique variance for a given variable
    is unaffected by other variables in the analysis

87
Conclusions PCA
  • PCA factor models cannot hold in the population
    (because all variables have measurement error)
  • Moreover
  • The notion of the population component loading
    for a particular manifest variable is meaningless
  • The population component loading is affected
    strongly by presence of other variables
  • SEs for component loadings have no interpretation
  • In the one-component case, component loadings can
    only vary from (1/m)^1/2 to 1, where m is the
    number of indicators for the dimension
  • Generalizes to multiple component case

88
Conclusions PCA
  • PCA factor models cannot hold in the population
    (because all variables have measurement error)
  • Moreover
  • Residual variables are correlated, often in
    unpredictable and seemingly haphazard fashion
  • Magnitude of unique variance and covariances for
    a given manifest variable are affected by other
    variables in the analysis

89
Conclusions PCA
  • PCA factor models cannot hold in the population
    (because all variables have measurement error)
  • Moreover
  • Finally, generating data from a PC model leads
    either to
  • Impractical, unacceptable outcomes
  • Indeterminacies in the parameter-to-R relations
