PCA and SVD

Slides: 21
Provided by: pdraigcu
Transcript and Presenter's Notes

1
PCA and SVD
  • PCA: Principal Component Analysis
  • SVD: Singular Value Decomposition
  • Resources
  • The Matrix Cookbook
  • http://www.imm.dtu.dk/pubdb/views/edoc_download.php/3274/pdf/imm3274.pdf

2
PCA and SVD
  • A popular technique in information/data
    processing is to transform the data into a
    different format.
  • Examples represented by one set of features are
    transformed to another set of features.
  • Fewer features
  • Less noise
  • Typically a linear transformation

3
Principal Component Analysis (PCA)
  • Example: 2D data projected onto a single PC (1D)
  • Most of the variability in the data can often be
    described using only a small number of dimensions
  • Works well when input features are correlated.
  • New dimensions (PCs) are uncorrelated.
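
A minimal sketch of the 2D-to-1D example above, assuming NumPy and synthetic data (the seed, sample size, and coefficients are arbitrary illustrative choices, not taken from the slides). It checks how much variance the first PC captures and that the PC scores are uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, arbitrary choice

# Synthetic 2D data with strongly correlated features (illustrative only).
x1 = rng.normal(size=500)
x2 = 0.8 * x1 + 0.2 * rng.normal(size=500)
X = np.column_stack([x1, x2])

# Centre the data, then eigendecompose the covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order

# Reorder so the first PC is the direction of largest variance.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Xc @ eigvecs   # data expressed in PC coordinates
z = scores[:, 0]        # 1D projection onto the first PC

explained = eigvals / eigvals.sum()
pc_cov = np.cov(scores, rowvar=False)
print("fraction of variance per PC:", explained)
print("covariance between PC scores:", pc_cov[0, 1])
```

With features this correlated, nearly all of the variance falls on the first PC, and the off-diagonal covariance of the scores is zero up to rounding error.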

4
Linear Discriminant Analysis (LDA)
  • PCs are not necessarily good for discrimination
    in classification.
  • Linear Discriminant Analysis (LDA) seeks a linear
    transformation that maximises the between-class
    variance while minimising the within-class
    variance.
  • i.e. discriminating features.
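
A hedged sketch of the two-class case, assuming NumPy and synthetic data. It uses the standard Fisher direction, w proportional to Sw^{-1}(mB - mA), which may differ in detail from whatever formulation the lecture used. The class shapes and the helper `fisher_ratio` are hypothetical, chosen so that the direction of largest variance is not the discriminating one.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, arbitrary choice

# Two synthetic classes: large spread along x, small offset along y.
n = 200
spread = np.array([5.0, 0.5])  # standard deviation per axis (illustrative)
A = rng.normal(size=(n, 2)) * spread + np.array([0.0, -1.5])
B = rng.normal(size=(n, 2)) * spread + np.array([0.0, 1.5])

mA, mB = A.mean(axis=0), B.mean(axis=0)

# Within-class scatter: sum of the per-class scatter matrices.
Sw = (A - mA).T @ (A - mA) + (B - mB).T @ (B - mB)

# Fisher direction for two classes: w proportional to Sw^{-1} (mB - mA).
w_lda = np.linalg.solve(Sw, mB - mA)
w_lda /= np.linalg.norm(w_lda)

# First PC of the pooled data, for comparison.
X = np.vstack([A, B])
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
w_pca = Vt[0]

def fisher_ratio(w):
    """Squared gap between projected class means over summed projected variances."""
    a, b = A @ w, B @ w
    return (a.mean() - b.mean()) ** 2 / (a.var() + b.var())

print("Fisher ratio along PC 1:", fisher_ratio(w_pca))
print("Fisher ratio along LDA :", fisher_ratio(w_lda))
```

Here the first PC follows the long x axis, where the classes overlap, while the LDA direction lies mostly along y, where they separate.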

5
Linear Discriminant Analysis
  • Projecting a 2D space to 1 PC

PC not discriminating
(from slides by Shaoqun Wu)
6
Linear Discriminant Analysis
LDA discovers a discriminating projection
PCA
This issue will make more sense later.
7
PCA and SVD
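  • Performing PCA is equivalent to performing
    Singular Value Decomposition (SVD) on the
    mean-centred data.
  • Any n×m matrix X can be rewritten as
  • $X = T S V^T$
  • T holds the eigenvectors of $X X^T$ (n×n)
  • S holds the singular values of X on its diagonal,
    i.e. the square roots of the eigenvalues of
    $X X^T$ (n×m)
  • V holds the eigenvectors of $X^T X$ (m×m)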
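
The decomposition can be checked numerically. The sketch below assumes NumPy, a random hypothetical data matrix with examples as rows, and the thin SVD (so T is n×m rather than n×n). It verifies that the factors reproduce the centred matrix, that the squared singular values match the eigenvalues of X^T X, and that the PC scores are T scaled by the singular values.

```python
import numpy as np

rng = np.random.default_rng(2)  # fixed seed, arbitrary choice

# Hypothetical data matrix: n = 6 examples (rows), m = 4 features (columns).
X = rng.normal(size=(6, 4))
Xc = X - X.mean(axis=0)  # centre each feature before PCA

# Thin SVD: Xc = T @ diag(s) @ Vt
T, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# The three factors reproduce the centred matrix.
assert np.allclose(Xc, T @ np.diag(s) @ Vt)

# Squared singular values equal the eigenvalues of Xc^T Xc.
eigvals = np.linalg.eigvalsh(Xc.T @ Xc)[::-1]  # reversed to descending order
assert np.allclose(s ** 2, eigvals)

# Rows of Vt are the principal directions; T scaled by s gives the PC scores.
scores = Xc @ Vt.T
assert np.allclose(scores, T * s)
```

Taking the SVD of the centred data avoids forming the covariance matrix explicitly, which is one common reason PCA is computed this way.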

8
Latent Semantic Indexing
  • Latent Semantic Indexing is a method for
    selecting informative subspaces of feature
    spaces.
  • It was developed for information retrieval to
    reveal semantic information from document
    co-occurrences.
  • Terms that do not appear in a document may still
    be associated with it.
  • LSI derives uncorrelated index factors that might
    be considered artificial concepts.

9
Latent Semantic Indexing

10
Deciding relevance based on terms...
(Deerwester et al. 1990)
11
LSI Paper example
Index terms in italics
12
Term-document matrix
13
Latent Semantic Indexing

14
T0
15
S0
16
D0
17
SVD with minor terms dropped
TS define coordinates for documents in latent
space
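
A small sketch of dropping the minor singular values, assuming NumPy and a made-up count matrix (not the data from the Deerwester et al. example). It also assumes documents are rows, matching the examples-as-rows layout of the earlier X = T S V^T slide; with a term-by-document layout the roles of the two outer factors swap.

```python
import numpy as np

# Hypothetical toy counts: 5 documents (rows) x 6 terms (columns).
X = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
], dtype=float)

T, s, Vt = np.linalg.svd(X, full_matrices=False)

k = 2  # number of latent dimensions kept (illustrative choice)
Tk, sk, Vtk = T[:, :k], s[:k], Vt[:k, :]

Xk = Tk @ np.diag(sk) @ Vtk   # rank-k approximation of X
doc_coords = Tk * sk          # one row per document in latent space
term_coords = Vtk.T * sk      # one row per term in latent space

# The Frobenius error equals the energy in the dropped singular values.
err = np.linalg.norm(X - Xk)
assert np.isclose(err, np.sqrt(np.sum(s[k:] ** 2)))

# Document 1 does not contain term 2, yet the approximation links them.
print("original count:", X[1, 2], " rank-2 value:", Xk[1, 2])
```

In this toy matrix the second document never uses the third term, but the rank-2 reconstruction assigns that pair a positive weight because the term co-occurs with the document's other terms elsewhere.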
18
Terms Graphed in Two Dimensions
19
Documents and Terms
20
Change in Text Correlation