Principal Components - PowerPoint PPT Presentation

About This Presentation
Title:

Principal Components

Description:

SM339 Spring 08 - Principal Components

Slides: 17
Provided by: johnt1

Transcript and Presenter's Notes

1
Principal Components
  • As part of a study on football helmets,
    scientists collected head measurements from a
    number of football players
  • They measured 6 different aspects of the players'
    heads

2
Principal Components
  • Dealing with 6-dimensional data is difficult
  • We could plot pairs of variables
  • There are 15 such pairs
  • But it turns out that even this may not give a
    clear picture of the data
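
The count of 15 pairs is "6 choose 2". A minimal sketch, in the same MATLAB/Octave notation used later in these slides, that enumerates the pairs (the variable names are illustrative, not from the original):

```matlab
k = 6;                        % number of head measurements per player
pairs = nchoosek(1:k, 2);     % each row holds the indices of one pair of variables
npairs = size(pairs, 1);      % number of pairwise scatter plots needed
disp(npairs)
```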

3
Principal Components
  • Suppose we have data distributed over the plane
    in 3D that passes through (1,0,0), (0,1,0), (0,0,1)
  • No matter which pair of axes we use, the plots
    will just look like a cloud of points
  • We will never realize that the data actually lies
    on a 2-dimensional surface
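
A rough illustration of the plane example, assuming hypothetical points laid out on a grid (the grid and all variable names are made up for this sketch). The plane through those three points is x + y + z = 1. In this particular layout no pair of coordinates is perfectly correlated, so no pairwise plot collapses to a line, yet the centered data has rank 2:

```matlab
% Hypothetical data: a grid of (x, y) values, with z chosen so that x + y + z = 1
[x, y] = meshgrid(0:0.1:1, 0:0.1:1);
x = x(:);
y = y(:);
z = 1 - x - y;                     % forces every point onto the plane
data = [x y z];

r = corrcoef(data);                % pairwise correlations between the columns
centered = data - ones(size(data)) * diag(mean(data));
dims = rank(centered);             % number of dimensions the data really spans

disp(r)
disp(dims)
```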

4
Principal Components
  • Recall the formula for correlation in simple
    linear regression
  • Corr = Sxy / √(Sxx Syy)
  • The numerator, Sxy, is proportional to covariance
  • It measures the extent to which larger values of
    X are associated with larger (or smaller) values
    of Y
  • If Cov = 0, then X and Y are uncorrelated (they
    have no linear relationship)
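
The correlation formula can be checked numerically. This is a small sketch on made-up x and y values; it computes Sxy, Sxx, and Syy directly and compares the result with the built-in corrcoef and cov:

```matlab
% Hypothetical data, for illustration only
x = [1 2 3 4 5 6]';
y = [2.1 3.9 6.2 7.8 10.1 12.2]';
n = numel(x);

sxy = sum((x - mean(x)) .* (y - mean(y)));
sxx = sum((x - mean(x)).^2);
syy = sum((y - mean(y)).^2);

r = sxy / sqrt(sxx * syy);     % correlation from the formula
rr = corrcoef([x y]);          % rr(1,2) is the built-in correlation
c = cov([x y]);                % c(1,2) is the covariance, which is Sxy/(n-1)
```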

5
Principal Components
  • In terms of a plot, this means that the plot of Y
    vs X is just a cloud of points
  • The plot does not tilt in either direction

6
Principal Components
  • The problem with the data in the plane example is
    that the values are correlated
  • If we could look down the edge of the plane, then
    we would see that there is not a third dimension
    to the data
  • All the data lies in only the two dimensions of
    the plane

7
Principal Components
  • In reality, data rarely lies exactly on a plane
  • But the cloud of points can extend much more in
    some directions than in others
  • Typically, the data forms a cloud that resembles
    an ellipsoid

8
Principal Components
  • If we can find the axes of the ellipsoid, we can
    view the data in terms of these components
  • The longest axis is the most interesting
  • The shortest axis does not have much information

9
Principal Components
  • NOTE: There are two ways to proceed
  • We can work with the Covariance matrix
  • Or we can work with the Correlation matrix
  • The theory is developed for the Cov matrix
  • The Corr matrix makes more sense when the
    variables are in different units or on very
    different scales
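
One way to see the difference between the two choices: the correlation matrix is the covariance matrix of the standardized data, so it does not change when a variable is rescaled. A sketch on a made-up data matrix (the numbers and names are hypothetical):

```matlab
% Hypothetical data: 8 observations of 3 variables
data = [2.5 2.4 1.2; 0.5 0.7 0.3; 2.2 2.9 1.9; 1.9 2.2 0.8; ...
        3.1 3.0 2.1; 2.3 2.7 1.1; 2.0 1.6 1.4; 1.0 1.1 0.6];

% Express the third variable in units 1000 times smaller
scaled = data;
scaled(:, 3) = 1000 * scaled(:, 3);

r1 = corrcoef(data);
r2 = corrcoef(scaled);        % same as r1: correlation ignores units
c2 = cov(scaled);             % the rescaled variable dominates c2

% Standardize each column, then take the covariance
centered = bsxfun(@minus, data, mean(data));
z = bsxfun(@rdivide, centered, std(data));
cz = cov(z);                  % matches r1
```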

10
Principal Components
  • For variables x1, x2, ..., xk, define the
    covariance matrix so that the (i,j) element is
    the covariance of xi and xj (proportional to
    Sxixj)
  • This means that the matrix will be symmetric
  • If two variables are uncorrelated, then the
    corresponding element of Cov will be zero (or
    nearly so)
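
A short sketch of how the covariance matrix is assembled from the Sxixj sums, using a made-up data matrix. It assumes the usual n-1 divisor, which is what cov uses by default:

```matlab
% Hypothetical data: 8 observations of 3 variables
data = [2.5 2.4 1.2; 0.5 0.7 0.3; 2.2 2.9 1.9; 1.9 2.2 0.8; ...
        3.1 3.0 2.1; 2.3 2.7 1.1; 2.0 1.6 1.4; 1.0 1.1 0.6];
n = size(data, 1);

centered = bsxfun(@minus, data, mean(data));   % subtract each column mean
S = centered' * centered;     % (i,j) element is Sxixj
c = S / (n - 1);              % covariance matrix

issym = isequal(S, S');       % symmetric because Sxixj equals Sxjxi
```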

11
Principal Components
  • If we find the eigenvectors and eigenvalues of
    Cov, this will diagonalize the Cov matrix
  • c = cov(data)
  • [v, d] = eig(c)
  • d is a diagonal matrix of eigenvalues
  • diag(d) returns a vector of the eigenvalues
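
To make "diagonalize" concrete: with v holding the eigenvectors as columns, v' * c * v should reproduce d. A sketch on made-up data; the symmetrizing step is a precaution added here, not something the slides call for:

```matlab
% Hypothetical data: 8 observations of 3 variables
data = [2.5 2.4 1.2; 0.5 0.7 0.3; 2.2 2.9 1.9; 1.9 2.2 0.8; ...
        3.1 3.0 2.1; 2.3 2.7 1.1; 2.0 1.6 1.4; 1.0 1.1 0.6];

c = cov(data);
c = (c + c') / 2;             % remove any round-off asymmetry before eig
[v, d] = eig(c);              % columns of v are eigenvectors, d is diagonal
evals = diag(d);              % eigenvalues as a vector

resid = v' * c * v - d;       % close to zero if v diagonalizes c
orth = v' * v;                % close to the identity: v is orthogonal
```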

12
Principal Components
  • If we transform our data by v, then Cov of the
    transformed data will be d
  • data2 = (data - ones(size(data)) * diag(mean(data))) * v
  • Then cov(data2) = d
  • This means that all the variables in data2 are
    uncorrelated
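
The transformation can be tried end to end on a small made-up data set. This sketch centers the data, rotates it by v, and checks that the covariance of the result is the diagonal matrix d:

```matlab
% Hypothetical data: 8 observations of 3 variables
data = [2.5 2.4 1.2; 0.5 0.7 0.3; 2.2 2.9 1.9; 1.9 2.2 0.8; ...
        3.1 3.0 2.1; 2.3 2.7 1.1; 2.0 1.6 1.4; 1.0 1.1 0.6];

c = cov(data);
c = (c + c') / 2;             % precaution against round-off asymmetry
[v, d] = eig(c);

% Subtract the column means, then rotate onto the eigenvectors
data2 = (data - ones(size(data)) * diag(mean(data))) * v;

c2 = cov(data2);              % off-diagonal entries should be near zero
disp(c2)
```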

13
Principal Components
  • data2 is called the principal components of data
  • Furthermore, the variances of data2 are the
    eigenvalues of cov(data)
  • This means that the PC with the largest
    eigenvalue is the most variable
  • This corresponds to the longest axis of the
    original ellipsoid of data
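
A sketch, again on made-up data, that compares the variance of each principal component with the eigenvalues and picks out the eigenvector belonging to the largest one, which is the direction of the longest axis:

```matlab
% Hypothetical data: 8 observations of 3 variables
data = [2.5 2.4 1.2; 0.5 0.7 0.3; 2.2 2.9 1.9; 1.9 2.2 0.8; ...
        3.1 3.0 2.1; 2.3 2.7 1.1; 2.0 1.6 1.4; 1.0 1.1 0.6];

c = cov(data);
c = (c + c') / 2;
[v, d] = eig(c);
data2 = (data - ones(size(data)) * diag(mean(data))) * v;

pcvar = var(data2);           % variance of each principal component (row vector)
evals = diag(d)';             % eigenvalues, as a row vector for comparison

[~, imax] = max(evals);       % index of the most variable PC
longest = v(:, imax);         % unit vector along the longest axis of the cloud
```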

14
Principal Components
  • The sum of the e-values equals the sum of the
    variances of the original variables, so it is
    the overall variance
  • We can think of the individual e-values in terms
    of what percent of the total they are
  • eig() tends to return the e-values in ascending
    order
  • We want them in descending order
  • dsort = -sort(-diag(d))
  • Then cumsum(dsort)/sum(dsort) tells us what
    fraction of the total is contained in the
    first k PCs
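
The sorting and cumulative-sum steps, sketched on made-up data. The check against trace(c) is added here to show that the eigenvalues sum to the total of the original variances:

```matlab
% Hypothetical data: 8 observations of 3 variables
data = [2.5 2.4 1.2; 0.5 0.7 0.3; 2.2 2.9 1.9; 1.9 2.2 0.8; ...
        3.1 3.0 2.1; 2.3 2.7 1.1; 2.0 1.6 1.4; 1.0 1.1 0.6];

c = cov(data);
c = (c + c') / 2;
[v, d] = eig(c);

dsort = -sort(-diag(d));              % eigenvalues in descending order
total = sum(dsort);                   % overall variance, equal to trace(c)
frac = cumsum(dsort) / sum(dsort);    % fraction of the total in the first k PCs

disp(frac')
```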

15
Principal Components
  • General rule: use enough PCs to contain 80-90% of
    the total
  • Balance this against how many PCs that takes
  • If only 2-3 PCs contain most of the total, then
    our problem is a lot simpler than we thought
  • Besides plots, we can use the PCs to detect
    groupings of the data or outliers, say

16
Principal Components
  • Other multivariate methods:
  • MANOVA: ANOVA based on several variables rather
    than just one
  • MV discriminant analysis: what distinguishes one
    group from another?
  • Canonical correlation: what components of two
    sets of data are most correlated?
  • All of these involve ideas similar to PCA