Dimensionality reduction - PowerPoint PPT Presentation

1
Dimensionality reduction
  • DFT
  • Wavelets
  • Space-filling curves
  • Fastmap
  • SVD
  • Embedding of metric spaces
  • Random projections

2
SVD
  • Intuition: find the axis that shows the greatest
    variation, and project all points onto this axis
    (a small numerical sketch follows the figure below)

[figure: data points in the (f1, f2) plane, with new axes e1
(the direction of greatest variation) and e2]
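A minimal numerical sketch of this intuition, assuming Python with
NumPy (the point cloud and all variable names are illustrative):

import numpy as np

# small 2-D point cloud, centered so the axes pass through the origin
rng = np.random.default_rng(0)
points = rng.normal(size=(100, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])
points = points - points.mean(axis=0)

# the first right singular vector is the axis of greatest variation (e1)
_, _, Vt = np.linalg.svd(points, full_matrices=False)
e1 = Vt[0]

# project all points onto that axis (their coordinates along e1)
coords_on_e1 = points @ e1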
3
SVD: The mathematical formulation
  • Let A be an m x n real matrix of m n-dimensional
    points
  • SVD decomposition: A = U Λ V^T
  • U (m x m) is orthogonal: U^T U = I
  • V (n x n) is orthogonal: V^T V = I
  • Λ (m x n) has the r positive singular values in
    descending order on its diagonal
  • Columns of U are the orthonormal eigenvectors of
    A A^T (called the left singular vectors of A)
  • A A^T = (U Λ V^T)(U Λ V^T)^T = U Λ Λ^T U^T = U Λ² U^T
  • Columns of V are the orthonormal eigenvectors of
    A^T A (called the right singular vectors of A)
  • A^T A = (U Λ V^T)^T (U Λ V^T) = V Λ^T Λ V^T = V Λ² V^T
  • Λ contains the square roots of the eigenvalues of
    A A^T (or A^T A); these are called the singular
    values (positive reals)
  • r is the rank of A, A A^T, and A^T A
  • U defines the column space of A, V the row space
    (a numerical check of these identities follows below)
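A quick numerical check of these identities, sketched in Python with
NumPy (the matrix A below is an arbitrary example):

import numpy as np

A = np.random.default_rng(1).normal(size=(5, 3))      # m x n, m >= n

U, s, Vt = np.linalg.svd(A)          # full SVD: U is 5x5, Vt is 3x3
L = np.zeros_like(A)                 # Λ is m x n with the singular values
L[:len(s), :len(s)] = np.diag(s)     # on its diagonal, in descending order

assert np.allclose(A, U @ L @ Vt)                       # A = U Λ V^T
assert np.allclose(U.T @ U, np.eye(5))                  # U^T U = I
assert np.allclose(Vt @ Vt.T, np.eye(3))                # V^T V = I
assert np.allclose(np.linalg.eigvalsh(A.T @ A)[::-1], s**2)  # Λ² = eigenvalues of A^T A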

4
The symmetric case
  • Let A be a real symmetric m x m matrix
  • A = A^T, orthonormal eigenvectors, real
    eigenvalues
  • A = Q Λ Q^T = Σ λ_i q_i q_i^T (spectral
    decomposition)
  • Q (m x m)
  • Q is orthogonal and consists of the eigenvectors
    of A
  • Q^T Q = I
  • Q^T = Q^-1
  • Λ (m x m)
  • diagonal matrix consisting of the eigenvalues of
    A
  • guaranteed to be non-negative for positive
    semi-definite A
  • Λ = Q^T A Q (follows from A = Q Λ Q^T)
  • A A^T = A^T A = A²
  • eigenvectors of A² = eigenvectors of A
  • eigenvalues of A² are non-negative (a covariance
    matrix is positive semi-definite)
  • eigenvalues of A² = squares of the eigenvalues of
    A (a numerical sketch follows below)
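A minimal sketch of the symmetric case in Python with NumPy (the
symmetric matrix S is an arbitrary example):

import numpy as np

B = np.random.default_rng(2).normal(size=(4, 4))
S = B + B.T                                   # real symmetric m x m

lam, Q = np.linalg.eigh(S)                    # real eigenvalues, orthonormal eigenvectors
assert np.allclose(Q.T @ Q, np.eye(4))        # Q^T Q = I
assert np.allclose(S, Q @ np.diag(lam) @ Q.T) # A = Q Λ Q^T
assert np.allclose(np.diag(lam), Q.T @ S @ Q) # Λ = Q^T A Q

# spectral decomposition as a sum of rank-1 terms  Σ λ_i q_i q_i^T
S_sum = sum(lam[i] * np.outer(Q[:, i], Q[:, i]) for i in range(4))
assert np.allclose(S, S_sum)

# eigenvalues of A² are the squares of the eigenvalues of A
assert np.allclose(np.sort(np.linalg.eigvalsh(S @ S)), np.sort(lam**2))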

5
Proof of SVD (I)
  • Let λ_1 be the maximal length of Ax over unit-length
    vectors x, and let y be the unit-length vector such
    that Ax = λ_1 y. Extend y and x to orthonormal bases
    of R^m and R^n, forming the columns of U and V
    respectively.
  • Let U = [y U_1] and V = [x V_1].
  • ||A||_2² = λ_1² (||A||_2 = maximum stretch of a unit
    vector; a norm)
  • Let w^T = y^T A V_1 and Y = U^T A V.
  • Y = [y^T; U_1^T] A [x V_1] = [y^T A x, y^T A V_1;
    U_1^T A x, U_1^T A V_1] = [λ_1, w^T; 0, A_1]
  • A_1 = U_1^T A V_1
  • Y [λ_1; w] = [λ_1² + ||w||²; A_1 w]
  • ||Y||_2² ≥ λ_1² + ||w||² (since ||Y [λ_1; w]|| ≥
    λ_1² + ||w||² while ||[λ_1; w]|| = √(λ_1² + ||w||²))
  • λ_1² = ||A||_2² = ||Y||_2² ≥ λ_1² + ||w||²
  • w = 0, so Y = [λ_1, 0; 0, A_1]
  • Apply the argument inductively to A_1.

check the dimensions
6
Proof of SVD (II)
  • Let (σ_i, v_i) be the n (eigenvalue, eigenvector)
    pairs of A^T A. The eigenvectors are orthonormal
    and go into the columns of V. There are r non-zero
    eigenvalues.
  • A^T A v_i = σ_i v_i, v_i^T v_i = 1, v_i^T v_j = 0 for i ≠ j.
  • (A v_i)^T (A v_i) = v_i^T A^T A v_i = v_i^T (σ_i v_i)
    = σ_i v_i^T v_i = σ_i
  • ||A v_i||² = σ_i (which implies σ_i ≥ 0)
  • For the positive σ_i, let λ_i = √σ_i and u_i = A v_i / λ_i.
    The rest are 0.
  • u_i^T u_j = v_i^T A^T A v_j / (λ_i λ_j) = σ_j v_i^T v_j / (λ_i λ_j)
  • = 0 for i ≠ j, since the eigenvectors are orthogonal
  • = 1 for i = j
  • Apply Gram-Schmidt to obtain m orthonormal vectors
    forming a basis u_1, u_2, ..., u_m. These go into the
    columns of U.
  • Entry (i, j) of U^T A V = u_i^T A v_j
  • = 0 if j > r, since A v_j = 0
  • = u_i^T λ_j u_j if j ≤ r, since A v_j = λ_j u_j
  • = 0 when i ≠ j, and λ_j otherwise.
  • Therefore the only non-zeros in the product U^T A V
    are the first r diagonal entries (which are λ_1,
    λ_2, ..., λ_r).
  • U^T A V = Λ, or A = U Λ V^T (a numerical sketch of
    this construction follows below)
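The construction in this proof can be carried out numerically; a
sketch in Python with NumPy (arbitrary full-column-rank example, so
r = n and no Gram-Schmidt extension is needed):

import numpy as np

A = np.random.default_rng(3).normal(size=(6, 3))        # m x n

# (eigenvalue, eigenvector) pairs of A^T A, sorted by decreasing eigenvalue
sigma, V = np.linalg.eigh(A.T @ A)
order = np.argsort(sigma)[::-1]
sigma, V = sigma[order], V[:, order]

lam = np.sqrt(np.clip(sigma, 0.0, None))                # λ_i = √σ_i ≥ 0
U_thin = (A @ V) / lam                                  # u_i = A v_i / λ_i

assert np.allclose(U_thin.T @ U_thin, np.eye(3))        # the u_i are orthonormal
assert np.allclose(U_thin.T @ A @ V, np.diag(lam))      # U^T A V = Λ
assert np.allclose(A, U_thin @ np.diag(lam) @ V.T)      # A = U Λ V^T (thin form)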

7
SVD - Interpretation
8
SVD - Interpretation
  • A = U Λ V^T - example

9
SVD - Interpretation
  • A = U Λ V^T - example

[figure: the numerical product U Λ V^T, annotated with the
variance (spread) on the v_1 axis]

10
Dimensionality reduction
11
Dimensionality reduction
  • set the smallest eigenvalues to zero
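Setting the smallest values on the diagonal of Λ to zero and
multiplying the factors back together gives the reduced matrix; a
sketch in Python with NumPy (A and the cut-off k are illustrative):

import numpy as np

A = np.random.default_rng(4).normal(size=(8, 5))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
s_trunc = s.copy()
s_trunc[k:] = 0.0                     # zero out the smallest singular values

A_k = U @ np.diag(s_trunc) @ Vt       # rank-k approximation of A
print(np.linalg.matrix_rank(A_k))     # 2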


12
Dimensionality reduction

13
Dimensionality reduction

14
Dimensionality reduction

15
Dimensionality reduction

16
Dimensionality reduction
  • spectral decomposition of the matrix


17
Dimensionality reduction
  • spectral decomposition of the matrix

[figure: A = λ_1 u_1 v_1^T + λ_2 u_2 v_2^T + ...]
18
Dimensionality reduction
  • spectral decomposition of the matrix

[figure: the m x n matrix A written as a sum of rank-1 matrices]
19
Dimensionality reduction
  • spectral decomposition of the matrix

[figure: the m x n matrix A written as a sum of r terms,
each an (m x 1) column vector times a (1 x n) row vector]
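Each term in this expansion is an (m x 1) column times a (1 x n) row;
summing the r terms reproduces A exactly. A sketch in Python with
NumPy (A is an arbitrary example):

import numpy as np

A = np.random.default_rng(5).normal(size=(6, 4))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-12))                    # rank of A

# A = Σ λ_i u_i v_i^T : each np.outer(u_i, v_i) is an m x n rank-1 matrix
A_sum = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(r))
assert np.allclose(A, A_sum)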
20
Dimensionality reduction
  • approximation / dim. reduction
  • by keeping the first few terms (Q: how many?)

[figure: A approximated by the first few rank-1 terms;
assume λ_1 > λ_2 > ...]
21
Dimensionality reduction
  • A heuristic: keep 80-90% of the energy (= the sum
    of squares of the λ_i's), as sketched below

[figure: A approximated by the first k rank-1 terms;
assume λ_1 > λ_2 > ...]
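The energy heuristic can be applied directly to the singular values;
a sketch in Python with NumPy (the 0.90 threshold and the data are
illustrative):

import numpy as np

A = np.random.default_rng(6).normal(size=(100, 20))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

energy = np.cumsum(s**2) / np.sum(s**2)       # fraction of Σ λ_i² retained
k = int(np.searchsorted(energy, 0.90)) + 1    # smallest k keeping ≥ 90% of the energy

A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]      # keep only the first k terms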
22
Dimensionality reduction
  • Matrix V in the SVD decomposition
    (A = U Λ V^T) is used to transform the data.
  • AV (= U Λ) defines the transformed dataset.
  • For a new data element x, xV defines the
    transformed data.
  • Keeping the first k (k < n) dimensions amounts
    to keeping only the first k columns of V
    (see the sketch below).
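A sketch of this transformation in Python with NumPy (the dataset and
the choice k = 2 are illustrative):

import numpy as np

A = np.random.default_rng(7).normal(size=(50, 6))       # m points in n dimensions
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

k = 2
transformed = A @ V[:, :k]            # transformed dataset = (U Λ) restricted to k columns
assert np.allclose(transformed, U[:, :k] * s[:k])

x_new = np.random.default_rng(8).normal(size=6)         # a new data element
x_transformed = x_new @ V[:, :k]      # mapped into the same k-dimensional space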

23
Optimality of SVD
  • Let A = U Λ V^T
  • A = Σ λ_i u_i v_i^T
  • The Frobenius norm of an m x n matrix M is
    ||M||_F = √(Σ_i Σ_j M_ij²)
  • Let A_k be the above summation using the k largest
    eigenvalues.
  • Theorem [Eckart and Young]: Among all m x n
    matrices B of rank at most k, we have that
    ||A - A_k||_F ≤ ||A - B||_F
    (a numerical illustration follows below)
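A numerical illustration of the theorem in Python with NumPy: the
rank-k truncation A_k beats an arbitrary rank-k competitor B in
Frobenius norm (A, B and k are illustrative):

import numpy as np

rng = np.random.default_rng(9)
A = rng.normal(size=(10, 7))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 3
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]                # best rank-k approximation

B = rng.normal(size=(10, k)) @ rng.normal(size=(k, 7))  # some matrix of rank at most k

err_Ak = np.linalg.norm(A - A_k, 'fro')
err_B = np.linalg.norm(A - B, 'fro')
assert err_Ak <= err_B                                  # Eckart-Young
assert np.isclose(err_Ak, np.sqrt(np.sum(s[k:]**2)))    # error = √(Σ_{i>k} λ_i²)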
24
Proof for the 2-norm
  • Need to show that ||A - B||_2 ≥ λ_{k+1}
  • Or, that there exists a unit vector z such that
    ||(A - B) z|| ≥ λ_{k+1}.
  • Let x_1, x_2, ..., x_{n-k} be a basis for null(B).
  • Choose z to be a unit vector in the intersection
    of span(x_1, x_2, ..., x_{n-k}) and span(v_1, v_2,
    ..., v_{k+1}). (The intersection is non-trivial
    because the dimensions of the two subspaces sum to
    n + 1 > n.)
  • z = Σ c_i v_i, a linear combination of v_1, v_2,
    ..., v_{k+1}, with Σ c_i² = 1
  • Az = (U Λ V^T) z = U Λ (V^T z) = U Λ [c_1, c_2,
    ..., c_{k+1}, 0, ..., 0]^T
  • = U [c_1 λ_1, c_2 λ_2, ..., c_{k+1} λ_{k+1}, 0, ..., 0]^T
    = Σ c_i λ_i u_i
  • Therefore ||Az||² = Σ c_i² λ_i² ≥ (Σ c_i²) λ_{k+1}²
    = λ_{k+1}²
  • Since Bz = 0, ||(A - B) z||_2 ≥ λ_{k+1}

25
SVD - Complexity
  • O(m²n + mn² + n³)
  • O(mn² + n³) if we only need V and Λ
  • Complexity can be improved with random sampling
  • Incremental techniques can be used for dynamic
    data
  • Implemented in any linear algebra package
    (LINPACK, MATLAB, S-Plus, Mathematica, ...)

26
Principal Components Analysis (PCA)
  • Translate the dataset to the center by subtracting
    the means; let matrix A be the result.
  • Compute the covariance matrix A^T A.
  • Project the dataset along a subset of the
    eigenvectors of A^T A.
  • Matrix V in the SVD decomposition
    (A = U Λ V^T) contains the eigenvectors of A^T A.
  • Also known as the K-L (Karhunen-Loève) transform
    (a compact sketch of these steps follows below).
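A compact sketch of these steps in Python with NumPy (dataset and k
are illustrative):

import numpy as np

X = np.random.default_rng(10).normal(size=(200, 5))     # raw dataset

A = X - X.mean(axis=0)                                  # center by subtracting the means
U, s, Vt = np.linalg.svd(A, full_matrices=False)        # columns of V = eigenvectors of A^T A

k = 2
projected = A @ Vt[:k].T                                # project onto the top-k eigenvectors

# equivalent route through the covariance matrix A^T A
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
assert np.allclose(np.abs(A @ top), np.abs(projected))  # same subspace, up to sign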