Principal Components presentation

Title: Principal Components

1
Principal Components
Principal components is a method of dimension
reduction. Suppose that you have a dozen
variables that are correlated. You might use
principal components analysis to reduce your 12
measures to a few principal components. Unlike
factor analysis, principal components analysis is
not usually used to identify underlying latent
variables.
2
Principal Components
Principal components is a technique that requires
a large sample size. Principal components is
based on the correlation matrix of the variables
involved, and correlations usually need a large
sample size before they stabilize.
3
Principal Components
As a rule of thumb, a bare minimum of 10
observations per variable is necessary to avoid
computational difficulties.
Comrey and Lee (1992), A First Course in Factor
Analysis.
4
Principal Components
In this example we have included many options.
You may not wish to use all of them; we include
them here to aid in the explanation of the
analysis.
5
Principal Components
In this example we examine students' assessments
of academic courses. We restrict attention to 12
variables, each scored on a five-point Likert
scale.
7
Principal Components
Analyze > Dimension Reduction > Factor
8
Principal Components
Select variables 13-24, that is, "instructor well
prepared" through "compared to other courses this
course was", using the arrow button.
Use the buttons at the side of the screen to set
additional options.
9
Principal Components
Use the buttons at the side of the screen to set
the Descriptives options, then use the Continue
button to return to the main Factor Analysis
screen.
10
Principal Components
Use the buttons at the side of the screen to set
the Extraction options, then use the Continue
button to return to the main Factor Analysis
screen.
Select the appropriate method and the eigen value
criterion, set at 1. It is essential to obtain a
scree plot.
11
Principal Components
Select the OK button to proceed with the analysis.
12
Principal Components
The descriptive statistics table is output
because we used the univariate option.
13
Principal Components
Mean - These are the means of the variables used
in the factor analysis. Are these appropriate
for a Likert scale?
14
Principal Components
Std. Deviation - These are the standard
deviations of the variables used in the factor
analysis. Are these appropriate for a Likert
scale?
15
Principal Components
Analysis N - This is the number of cases used in
the factor analysis.
16
Principal Components
The correlation matrix table was included in the
output because we included the correlation
option. This table gives the correlations
between the original variables (which were
specified). Before conducting a principal
components analysis, you want to check the
correlations between the variables. If any of the
correlations are too high (say above 0.9), you
may need to remove one of the variables from the
analysis, as the two variables seem to be
measuring the same thing. Another alternative
would be to combine the variables in some way
(perhaps by taking the average).
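The check for overly high correlations can be sketched outside SPSS. This is a minimal illustration, not the presenter's procedure: the data are simulated, the function name `flag_high_correlations` is hypothetical, and using the absolute value of the correlation is an assumption (a strongly negative correlation would signal the same redundancy).

```python
import numpy as np

def flag_high_correlations(data, threshold=0.9):
    """Return (i, j, r) for every pair of columns with |r| above threshold."""
    corr = np.corrcoef(data, rowvar=False)
    n_vars = corr.shape[0]
    pairs = []
    for i in range(n_vars):
        for j in range(i + 1, n_vars):
            if abs(corr[i, j]) > threshold:
                pairs.append((i, j, float(corr[i, j])))
    return pairs

# Simulated (hypothetical) data: column 2 is almost a copy of column 0.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
near_copy = base[:, 0] + 0.05 * rng.normal(size=200)
data = np.column_stack([base, near_copy])

high_pairs = flag_high_correlations(data)
for i, j, r in high_pairs:
    print(f"variables {i} and {j}: r = {r:.3f}")
```

A flagged pair is a candidate for dropping one variable or averaging the two, as described above.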
17
Principal Components
If the correlations are too low, say below 0.1,
then one or more of the variables might load only
onto one principal component (in other words,
make its own principal component). This is not
helpful, as the whole point of the analysis is to
reduce the number of items (variables).
18
Principal Components
The correlation matrix is extremely large.
20
Principal Components
Kaiser-Meyer-Olkin Measure of Sampling Adequacy
This measure varies between 0 and 1, and values
closer to 1 are better. A value of 0.6 is a
suggested minimum.
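For readers who want to see what the measure is made of, here is a rough sketch of the overall KMO statistic, computed from a correlation matrix as the sum of squared correlations divided by the sum of squared correlations plus squared partial correlations. The function name `kmo_overall` and the 6 x 6 correlation matrix (every pair correlated 0.5) are assumptions chosen for illustration, not values from the example data.

```python
import numpy as np

def kmo_overall(corr):
    """Overall Kaiser-Meyer-Olkin measure from a correlation matrix."""
    inv = np.linalg.inv(corr)
    # Partial correlation of i and j given all other variables:
    # p_ij = -inv_ij / sqrt(inv_ii * inv_jj)
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.sum(corr[off_diag] ** 2)
    p2 = np.sum(partial[off_diag] ** 2)
    return r2 / (r2 + p2)

# Hypothetical correlation matrix: six variables, every pair correlated 0.5.
corr = np.full((6, 6), 0.5)
np.fill_diagonal(corr, 1.0)

kmo = kmo_overall(corr)
print(f"KMO = {kmo:.3f}")
```

When the partial correlations are small relative to the ordinary correlations, the ratio approaches 1, which is why values closer to 1 are better.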
21
Principal Components
Bartlett's Test of Sphericity - This tests the
null hypothesis that the correlation matrix is an
identity matrix. An identity matrix is a matrix
in which all of the diagonal elements are 1 and
all off-diagonal elements are 0. You want to
reject this null hypothesis.
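As a sketch of how the test statistic is usually computed: chi-square = -(n - 1 - (2p + 5)/6) * ln|R| with p(p - 1)/2 degrees of freedom, where R is the correlation matrix, p the number of variables and n the number of cases. The function name, the correlation matrix and the sample size of 200 below are assumptions for illustration only.

```python
import numpy as np

def bartlett_sphericity(corr, n_obs):
    """Return (chi-square statistic, degrees of freedom) for Bartlett's test."""
    p = corr.shape[0]
    # slogdet is numerically safer than log(det(corr)).
    _, log_det = np.linalg.slogdet(corr)
    statistic = -(n_obs - 1 - (2 * p + 5) / 6) * log_det
    df = p * (p - 1) // 2
    return statistic, df

# Hypothetical correlation matrix: six variables, every pair correlated 0.5.
corr = np.full((6, 6), 0.5)
np.fill_diagonal(corr, 1.0)

statistic, df = bartlett_sphericity(corr, n_obs=200)
# 24.996 is the 5% critical value of chi-square with 15 degrees of freedom.
reject_null = statistic > 24.996
print(f"chi-square = {statistic:.1f}, df = {df}, reject: {reject_null}")

# An identity matrix has determinant 1, so the statistic is zero.
identity_stat, _ = bartlett_sphericity(np.eye(6), n_obs=200)
```

An exact p-value would need a chi-square survival function, which is left out here to keep the sketch dependent on NumPy only.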
22
Principal Components
Taken together, these tests provide a minimum
standard, which should be passed before a
principal components analysis (or a factor
analysis) should be conducted.
23
Principal Components
Communalities - This is the proportion of each
variable's variance that can be explained by the
principal components (e.g. the underlying latent
continua).
24
Principal Components
Initial - By definition, the initial value of the
communality in a principal components analysis is
1.
25
Principal Components
Extraction - The values in this column indicate
the proportion of each variable's variance that
can be explained by the principal components.
Variables with high values are well represented
in the common factor space, while variables with
low values are not well represented. (In this
example, we don't have any particularly low
values.)
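The Initial and Extraction columns can be illustrated with a small sketch. Loadings are taken as eigenvectors of the correlation matrix scaled by the square root of their eigenvalues, and a communality is the sum of a variable's squared loadings over the retained components. The 4 x 4 correlation matrix and the function name `pca_loadings` are hypothetical.

```python
import numpy as np

def pca_loadings(corr, n_components):
    """Loadings of the first n_components principal components."""
    vals, vecs = np.linalg.eigh(corr)      # eigh returns ascending order
    order = np.argsort(vals)[::-1]         # largest eigenvalue first
    vals, vecs = vals[order], vecs[:, order]
    return vecs[:, :n_components] * np.sqrt(vals[:n_components])

# Hypothetical correlation matrix for four variables.
corr = np.array([
    [1.0, 0.6, 0.5, 0.1],
    [0.6, 1.0, 0.4, 0.2],
    [0.5, 0.4, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
])

# Initial: all four components kept, so every communality is 1.
initial = np.sum(pca_loadings(corr, 4) ** 2, axis=1)
# Extraction: only two components kept, so communalities drop below 1.
extraction = np.sum(pca_loadings(corr, 2) ** 2, axis=1)
print(np.round(initial, 3))
print(np.round(extraction, 3))
```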
26
Principal Components
Component - As many components are extracted
during a principal components analysis as there
are variables put into it. In our example, we
used 12 variables (item13 through item24), so we
have 12 components.
27
Principal Components
Initial eigen values - Eigen values are the
variances of the principal components. Because we
conducted our principal components analysis on
the correlation matrix, the variables are
standardized, which means that each variable has
a variance of 1, and the total variance is equal
to the number of variables used in the analysis,
in this case, 12.
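A quick numerical check of this point, using simulated data rather than the course evaluation items: with 12 standardized variables, the eigen values of the correlation matrix sum to 12. The data-generating step (one shared influence plus noise) is an assumption made only to produce correlated columns.

```python
import numpy as np

# Simulated (hypothetical) data: 300 cases, 12 correlated variables.
rng = np.random.default_rng(1)
common = rng.normal(size=(300, 1))
items = common + rng.normal(size=(300, 12))

# Standardizing gives every variable a variance of 1.
z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)
variances = z.var(axis=0, ddof=1)

# Eigen values of the correlation matrix, largest first.
corr = np.corrcoef(items, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]
total_variance = eigenvalues.sum()

print(np.round(eigenvalues, 3))
print(f"total variance = {total_variance:.3f}")
```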
28
Principal Components
Initial eigen values - Total - This column
contains the eigen values. The first component
will always account for the most variance (and
hence have the highest eigen value), and the next
component will account for as much of the left
over variance as it can, and so on. Hence, each
successive component will account for less and
less variance.
29
Principal Components
Initial eigen values - % of Variance - This
column contains the percent of variance accounted
for by each principal component (for example,
6.249/12 = 0.52, or about 52%).
30
Principal Components
Initial eigen values - Cumulative % - This column
contains the cumulative percentage of variance
accounted for by the current and all preceding
principal components. For example, the second row
shows a value of 62.322. This means that the
first two components together account for 62.322%
of the total variance.
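The arithmetic behind these two columns can be worked through with the figures quoted in the slides (first eigen value 6.249, 12 variables, cumulative 62.322 after two components). The second eigen value is not quoted anywhere in the transcript; the value below is only what those figures imply.

```python
# Figures quoted in the slides.
n_variables = 12
first_eigenvalue = 6.249
cumulative_pct_two = 62.322

# % of variance = eigen value / number of variables * 100
pct_first = 100 * first_eigenvalue / n_variables

# The cumulative % for row two is the sum of the first two percentages,
# so the second component's share is the difference.
pct_second = cumulative_pct_two - pct_first

# Implied (not quoted) eigen value of the second component.
second_eigenvalue = pct_second / 100 * n_variables

print(f"component 1: {pct_first:.3f}% of variance")
print(f"component 2: {pct_second:.3f}% of variance")
print(f"implied second eigen value: {second_eigenvalue:.3f}")
```

The implied second eigen value is above 1, which is consistent with two components being retained.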
31
Principal Components
Extraction Sums of Squared Loadings - The three
columns in this half of the table exactly
reproduce the values given on the same row on the
left side of the table. The number of rows
reproduced on the right side of the table is
determined by the number of principal components
whose eigen values are 1 or greater.
32
Principal Components
The scree plot graphs the eigen value against the
component number.
33
Principal Components
In general, we are interested in keeping only
those principal components whose eigen values are
greater than 1 (we set this value).
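The retention rule, together with a crude text version of a scree plot, might look like the sketch below. The five eigen values are hypothetical (they sum to 5, as they would for five standardized variables), and both function names are made up for this illustration.

```python
import numpy as np

def kaiser_retained(eigenvalues, cutoff=1.0):
    """Count the components whose eigen value exceeds the cutoff."""
    return int(np.sum(np.asarray(eigenvalues, dtype=float) > cutoff))

def text_scree(eigenvalues, width=40):
    """One line per component: number, eigen value, and a bar of stars."""
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    largest = eigenvalues.max()
    lines = []
    for number, value in enumerate(eigenvalues, start=1):
        bar = "*" * int(round(width * value / largest))
        lines.append(f"{number:2d} {value:6.3f} {bar}")
    return lines

# Hypothetical eigen values for five standardized variables.
hypothetical_eigenvalues = [3.0, 1.5, 0.3, 0.1, 0.1]

n_retained = kaiser_retained(hypothetical_eigenvalues)
scree_lines = text_scree(hypothetical_eigenvalues)
print("\n".join(scree_lines))
print(f"components retained: {n_retained}")
```

The cutoff is a parameter because, as noted above, the value of 1 is something we set rather than a fixed property of the method.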
34
Principal Components
Component Matrix - This table contains component
loadings, which are the correlations between the
variable and the component. Because these are
correlations, possible values range from -1 to
1. It is usual not to report any loadings that
are smaller than .3 in absolute value, as shown.
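To see that loadings really are correlations, the sketch below computes them two ways on simulated data: as eigenvectors scaled by the square root of the eigen values, and as direct correlations between each standardized variable and each component. It then blanks out small loadings for display. The data and the .3 display threshold applied to absolute values are assumptions of this illustration.

```python
import numpy as np

# Simulated (hypothetical) data: 300 cases, 4 correlated variables.
rng = np.random.default_rng(2)
common = rng.normal(size=(300, 1))
items = common + rng.normal(size=(300, 4))
z = (items - items.mean(axis=0)) / items.std(axis=0, ddof=1)

corr = np.corrcoef(items, rowvar=False)
vals, vecs = np.linalg.eigh(corr)          # ascending order
order = np.argsort(vals)[::-1]             # largest eigen value first
vals, vecs = vals[order], vecs[:, order]

# Loadings: eigenvector entries scaled by sqrt(eigen value).
loadings = vecs * np.sqrt(vals)

# The same numbers obtained as variable-by-component correlations.
components = z @ vecs
direct = np.array([
    [np.corrcoef(z[:, i], components[:, k])[0, 1] for k in range(4)]
    for i in range(4)
])

# Suppress small loadings for display, as in the table described above.
displayed = np.where(np.abs(loadings) < 0.3, np.nan, loadings)
print(np.round(displayed, 3))
```

The sign of each column is arbitrary, since an eigenvector and its negative describe the same component.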
35
Principal Components
Component - The columns under this heading are
the principal components that have been
extracted. As you can see by the footnote
provided by SPSS, two components were extracted
(the two components that had an eigen value
greater than 1).
36
Principal Components
You usually do not try to interpret the
components in the way that you would factors that
have been extracted from a factor analysis.
Rather, most people are interested in the
component scores, which are used for dimension
reduction (as opposed to factor analysis where
you are looking for underlying latent continua).
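A minimal sketch of computing component scores for dimension reduction follows, assuming simulated data with two groups of related variables. The function name `component_scores` is hypothetical, and the scores here are left unscaled (their variance equals the eigen value); scores saved by a statistics package may be scaled differently.

```python
import numpy as np

def component_scores(data, n_components):
    """Project standardized data onto the first n_components components."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(data, rowvar=False)
    vals, vecs = np.linalg.eigh(corr)      # ascending order
    order = np.argsort(vals)[::-1]         # largest eigen value first
    vals, vecs = vals[order], vecs[:, order]
    scores = z @ vecs[:, :n_components]
    return scores, vals[:n_components]

# Simulated (hypothetical) data: two groups of three related variables.
rng = np.random.default_rng(3)
group_a = rng.normal(size=(300, 1)) + rng.normal(size=(300, 3))
group_b = rng.normal(size=(300, 1)) + rng.normal(size=(300, 3))
data = np.hstack([group_a, group_b])

scores, variances = component_scores(data, n_components=2)
# Six columns have been reduced to two uncorrelated score columns.
score_cov = np.cov(scores, rowvar=False)
print(scores.shape)
print(np.round(score_cov, 3))
```

The two score columns can then stand in for the six original variables in later analyses.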
37
Principal Components
For a component plot, use the Rotation option.
38
Principal Components
It's always wise to plot your results. Note the
clusters.
39
Principal Components
Summary: Principal components is used to help
understand the covariance structure in the
original variables and/or to create a smaller
number of variables using this structure. Factor
analysis, like principal components, is used to
summarise the covariance structure of the data in
a smaller number of dimensions, but its emphasis
is the identification of underlying factors that
might explain the dimensions associated with
large data variability.
40
Similarities
Principal Components Analysis and Factor
Analysis have these assumptions in common:
Measurement scale is interval or ratio level.
Random sample: at least 5 observations per
observed variable and at least 100 observations.
Larger sample sizes are recommended for more
stable estimates: 10-20 observations per
observed variable.
41
Similarities
Principal Components Analysis and Factor
Analysis have these assumptions in common:
Oversample to compensate for missing values.
Linear relationship between observed variables.
Normal distribution for each observed variable.
Each pair of observed variables has a bivariate
normal distribution.
Both are variable reduction techniques. If
communalities are large, close to 1.00, results
could be similar.
42
Similarities
Principal Components Analysis assumes the
absence of outliers in the data. Factor
Analysis assumes a multivariate normal
distribution when using Maximum Likelihood
extraction method.
43
Differences
