Determining the # of PCs

Provided by: Gar68
Learn more at: https://psych.unl.edu
1
Determining the # of PCs
  • Remembering the process
  • Some cautionary comments
  • Statistical approaches
  • Mathematical approaches
  • Nontrivial factors approaches
  • Help that's coming later

2
How the process really works
  • Here's the series of steps we talked about
    earlier.
  • # factors decision
  • rotating the factors
  • interpreting the factors
  • factor scores

These steps aren't made independently and done
in this order!

Considering the interpretations of the factors
can aid the # factors decision!

Considering how the factor scores (representing
the factors) relate to each other and to
variables external to the factoring can aid both
the # factors decision and interpretation.
3
Some cautionary comments
  • Remember that the # factors decision ...
  • is influenced by the particular variables in the
    analysis
  • so, unless you are working with a closed set of
    variables, there probably isn't a "real"
    # factors
  • the whole story includes how the # factors
    changes with which variables are added and
    deleted
  • how do these changes alter your interpretation of
    the factors and variables?
  • isn't independent of interpretability
  • a factor is only "real" if it's meaningful
  • be cautious of both making up and missing
    meaning

4
Some cautionary comments, cont.
  • agreement across decision rules is helpful
  • we'll talk about several decision rules, each of
    which is flawed in known ways
  • replication is convincing
  • split-half and hold-out sampling can help
  • separate-sample replication is more convincing
  • convergence research is more convincing
  • not just replicating, but correctly anticipating
    what the results of adding and deleting
    variables will be across samplings
  • Remember this is Exploratory factoring
  • explore: consider alternative factor solutions
  • Want to be really convincing? Use confirmatory
    factoring!!

5
Statistical Procedures
  • PCs are extracted from a correlation
    matrix
  • PCs should only be extracted if there is
    systematic covariation in the correlation
    matrix
  • This is known as the sphericity question
  • Note: the test asks whether the next PC should
    be extracted
  • There are two different sphericity tests
  • Whether there is any systematic covariation in
    the original R
  • Whether there is any systematic covariation left
    in the partial R, after a given number of factors
    has been extracted
  • Both tests are called Bartlett's Sphericity Test

6
Statistical Procedures, cont.
  • Applying Bartlett's Sphericity Tests
  • Retaining H0 means don't extract another
    factor
  • Rejecting H0 means extract the next factor
  • Significance tests provide a p-value, and so a
    known probability that the next factor is 1 too
    many (a Type I error)
  • Like all significance tests, these are influenced
    by N
  • larger N → more power → more likely to reject H0
    → more likely to keep the next factor (and make a
    Type I error)
  • Quandary?!?
  • Samples large enough to have a stable R are
    likely to have excessive power and lead to
    over-factoring
  • Be sure to consider % variance, replication, and
    interpretability
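
The influence of N on the first sphericity test (systematic covariation in the original R) can be seen in a minimal sketch. It assumes the usual chi-square approximation for Bartlett's test, chi-square = -[(N - 1) - (2k + 5)/6] ln|R| with df = k(k - 1)/2. The 3-variable R and the two sample sizes are made up for illustration, and the function names are hypothetical.

```python
import math

def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]  # work on a copy
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def bartlett_sphericity(R, n_cases):
    """Return (chi_square, df) for H0: R is spherical (an identity matrix)."""
    k = len(R)
    det = determinant(R)
    if det <= 0:
        raise ValueError("R must be positive definite")
    chi_square = -((n_cases - 1) - (2 * k + 5) / 6) * math.log(det)
    df = k * (k - 1) // 2
    return chi_square, df

# Hypothetical correlation matrix with weak correlations (illustration only).
R = [[1.0, 0.3, 0.2],
     [0.3, 1.0, 0.1],
     [0.2, 0.1, 1.0]]

chi_small, df = bartlett_sphericity(R, 50)    # modest sample
chi_large, _ = bartlett_sphericity(R, 500)    # large sample, same R
CRITICAL_05_DF3 = 7.815  # tabled chi-square critical value, df = 3, alpha = .05

for label, chi in (("N = 50", chi_small), ("N = 500", chi_large)):
    decision = "reject H0" if chi > CRITICAL_05_DF3 else "retain H0"
    print(f"{label}: chi-square = {chi:.2f}, df = {df}, {decision}")
```

The same R leads to opposite decisions at the two sample sizes, which is the quandary described above: the statistic grows roughly in proportion to N.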

7
Mathematical Procedures
  • The most commonly applied decision rule (and the
    default in most stats packages -- chicken & egg
    ?) is the λ > 1.00 rule -- here's the logic
  • Part 1
  • Imagine a spherical R (of k variables)
  • each variable is independent and carries unique
    information
  • so, each variable has 1/kth of the information in
    R
  • For a normal R (of k variables)
  • each variable, on average, has 1/kth of the
    information in R

8
Mathematical Procedure, cont.
  • Part 2
  • The trace of a matrix is the sum of its
    diagonal
  • So, the trace of R (with 1s in the diag) = k (#
    vars)
  • λ tells the amount of variance in R accounted for
    by each extracted PC
  • for a full PC solution Σλ = k (accounts for all
    variance)
  • Part 3
  • PC is about data reduction and parsimony
  • trading fewer more-complex things (PCs -- linear
    combinations of variables) for many more-simple
    things (original variables)

9
Mathematical Procedure, cont.
  • Putting it all together (hold on tight !)
  • Any PC with λ > 1.00 accounts for more variance
    than the average variable in that R
  • That PC has parsimony -- the more complex
    composite has more information than the average
    variable
  • Any PC with λ < 1.00 accounts for less variance
    than the average variable in that R
  • That PC doesn't have parsimony -- the more
    complex composite has no more information than
    the average variable
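
The logic above can be sketched in a few lines. This is a minimal illustration, assuming the eigenvalues of a full PC solution have already been obtained from a stats package. The six values below are hypothetical, and `kaiser_rule` is a name chosen here for illustration.

```python
import math

def kaiser_rule(eigenvalues):
    """Count PCs whose eigenvalue exceeds 1.00, the share of the average variable."""
    k = len(eigenvalues)  # a full PC solution has one eigenvalue per variable
    # For a correlation matrix the eigenvalues sum to the trace, which is k.
    if not math.isclose(sum(eigenvalues), k, rel_tol=1e-6):
        raise ValueError("eigenvalues of a full PC solution should sum to k")
    return sum(1 for lam in eigenvalues if lam > 1.0)

# Hypothetical eigenvalues for k = 6 variables (illustration only).
eigenvalues = [2.8, 1.4, 0.7, 0.5, 0.4, 0.2]
n_retained = kaiser_rule(eigenvalues)
print(f"PCs with eigenvalue > 1.00: {n_retained}")
```

With a spherical R every eigenvalue equals exactly 1.00, so the rule keeps no PCs, which matches Part 1 of the argument.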

10
Mathematical Procedure, cont.
  • There have been examinations of the accuracy of
    this criterion
  • The usual procedure is to generate a set of
    variables from a known number of factors (v_k =
    b_1k·PC_1 + ... + b_fk·PC_f, etc.) -- while varying
    N, # factors, # PCs & communalities
  • Then factor those variables and see if λ > 1.00
    leads to the correct number of factors
  • Results -- the rule works pretty well on the
    average, which really means that it gets the
    # factors right sometimes, underestimates
    sometimes and overestimates sometimes
  • No one has generated an accurate rule for
    assessing when which of these occurs
  • But the rule is most accurate with k < 40, f
    between k/5 and k/3, and N > 300

11
Nontrivial Factors Procedures
  • These common-sense approaches became increasingly
    common as
  • the limitations of statistical and mathematical
    procedures became better known
  • the distinction between exploratory and
    confirmatory factoring developed and the crucial
    role of successful exploring became better
    known
  • These procedures are more like judgement calls
    and require greater application of content
    knowledge and persuasion, but are often the
    basis of good factorings !!

12
Nontrivial factors Procedures, cont.
  • Scree -- the "junk" that piles up at the foot of
    a glacier
  • a "diminishing returns" approach
  • plot the λ for each factor and look for the
    elbow
  • Old rule -- # factors = elbow (1966; 3 below)
  • New rule -- # factors = elbow - 1 (1967; 2
    below)
  • Sometimes there isn't a clear elbow -- try
    another rule
  • This approach seems to work best when combined
    with attention to interpretability !!

[Scree plot: λ (vertical axis, 0 to 4) against PC number, 1 to 6 (horizontal axis)]
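
Finding the elbow is normally done by eye, but the two rules can be illustrated with a rough sketch. The heuristic below is an assumption made for this example, not a standard definition: it treats the elbow as the PC where the plotted line bends most sharply (largest second difference of λ). The eigenvalues are hypothetical, chosen so that the bend falls at the 3rd PC.

```python
def scree_elbow(eigenvalues):
    """Return the 1-based PC number where the scree line bends most sharply.

    Heuristic (an assumption for illustration): the elbow is the interior
    point with the largest second difference of the eigenvalues.
    """
    if len(eigenvalues) < 3:
        raise ValueError("need at least three eigenvalues to find a bend")
    best_index, best_bend = None, float("-inf")
    for i in range(1, len(eigenvalues) - 1):
        bend = eigenvalues[i - 1] - 2 * eigenvalues[i] + eigenvalues[i + 1]
        if bend > best_bend:
            best_index, best_bend = i, bend
    return best_index + 1  # convert the 0-based index to a PC number

# Hypothetical eigenvalues for 6 PCs (illustration only).
eigenvalues = [3.2, 1.6, 0.5, 0.3, 0.25, 0.15]
elbow = scree_elbow(eigenvalues)
old_rule = elbow       # 1966 rule: # factors = elbow
new_rule = elbow - 1   # 1967 rule: # factors = elbow - 1
print(f"elbow at PC {elbow}: old rule keeps {old_rule}, new rule keeps {new_rule}")
```

A heuristic like this always returns some PC, even when there is no clear elbow, so it is no substitute for looking at the plot and at interpretability.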
13
An Example
A buddy in graduate school wanted to build a
measure of contemporary morality. He started
with the 10 Commandments and the 7 Deadly
Sins and created a 56-item scale with 8
subscales. His scree plot is sketched below.
How many factors?

[Scree plot: λ (vertical axis) against PC number, 1 to 56 (horizontal axis)]

  • 1? big elbow at 2, so the '67 rule suggests a
    single factor, which clearly accounts
    for the biggest portion of variance
  • 7? smaller elbow at 8, so the '67 rule suggests 7
  • 8? smaller elbow at 8, so the '66 rule gives the
    8 he was looking for; also, the 8th has λ > 1.0
    and the 9th has λ < 1.0

  • Remember that these are subscales of a central
    construct, so...
  • items will have substantial correlations both
    within and between subscales
  • to maximize the variance accounted for, the
    first factor is likely to pull in all these
    inter-correlated variables, leading to a large λ
    for the first (general) factor and much smaller
    λs for subsequent factors
  • This is a common scree configuration when
    factoring items from a multi-subscale scale!
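
A simplified worked case shows why the first λ gets so large. Assume, purely for illustration, that all 56 items correlate equally at r = .30 (ignoring the subscale structure). For such an equicorrelation matrix the eigenvalues are known in closed form: the first is 1 + (k - 1)r and the remaining k - 1 are each 1 - r. The value of r and the function name are hypothetical.

```python
import math

def equicorrelation_eigenvalues(k, r):
    """Eigenvalues of a k x k correlation matrix whose off-diagonals all equal r.

    Valid for 0 <= r < 1; this is a simplification that ignores subscales.
    """
    if not 0 <= r < 1:
        raise ValueError("r must be in [0, 1)")
    first = 1 + (k - 1) * r     # the general factor
    rest = [1 - r] * (k - 1)    # every remaining PC
    return [first] + rest

lams = equicorrelation_eigenvalues(56, 0.3)  # 56 items, hypothetical r = .30
print(f"first eigenvalue = {lams[0]:.1f}, each of the others = {lams[1]:.1f}")
print(f"sum of eigenvalues = {sum(lams):.1f}")
```

Even a moderate common correlation concentrates a large share of the variance in the first PC, so the scree plot shows a big elbow at 2 regardless of how many subscales were written.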

14
Nontrivial factors Procedures, cont.
  • % of variance accounted for
  • keep the # factors necessary to account for
    enough variance -- 75% to 90% are common goals
  • Interpretability -- meaningfulness of resulting
    PCs
  • Depends greatly upon content knowledge
  • Beware factoring illusions
  • We're good at finding patterns, even when
    they're not really there
  • Rotational Survival -- akin to meaningfulness
  • Consider different # factors with different types
    of rotation -- see which factors keep showing
    up
  • Replicability -- split, holdout, or independent
    samples
  • What PCs appear consistently across factorings?
  • Jack-knifing
  • Re-sampling from a single dataset, looking for
    consistency of factors
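
The % of variance criterion at the top of this list is simple arithmetic and can be sketched as follows. The sketch assumes a full PC solution from a correlation matrix, so that total variance equals k, with eigenvalues sorted from largest to smallest. The eigenvalues and the function name are hypothetical.

```python
def factors_for_variance(eigenvalues, target):
    """Smallest # of PCs whose cumulative share of variance reaches target.

    Assumes eigenvalues come from a full PC solution of a correlation
    matrix (so total variance = k) and are sorted largest first.
    """
    k = len(eigenvalues)
    cumulative = 0.0
    for n_factors, lam in enumerate(eigenvalues, start=1):
        cumulative += lam
        if cumulative / k >= target:
            return n_factors
    return k  # target not reached before the last PC

# Hypothetical eigenvalues for k = 6 variables (illustration only).
eigenvalues = [2.8, 1.4, 0.7, 0.5, 0.4, 0.2]
for target in (0.75, 0.95):
    n = factors_for_variance(eigenvalues, target)
    print(f"{target:.0%} of variance needs {n} PCs")
```

Note how sensitive the answer is to the chosen goal, which is one reason this criterion is best combined with the interpretability and replicability checks listed above.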

15
Help that's coming later
  • If you have a reasonably clear factor structure,
    all the different ways of deciding the # factors
    are likely to give the same result (except maybe
    the statistical tests, which are likely to
    over-factor with large N)
  • Remember that what the factors are can be very
    important in deciding how many factors there
    are
  • Consider the different interpretations of the
    factors from the different #-of-factors solutions
  • we can also look at the correlations between the
    factors to help with these decisions
  • Remember that what the factors do can be very
    important in deciding how many factors there
    are
  • you can look at how factors from the different
    #-of-factors solutions correlate with other
    variables that are not in the factor analysis