Dimensionality reduction - PowerPoint PPT Presentation

1
Dimensionality reduction
  • DFT
  • Wavelets
  • Space-filling curves
  • Fastmap
  • SVD
  • Embedding of metric spaces
  • Random projections

2
SVD
  • Intuition: find the axis that shows the greatest
    variation, and project all points onto this axis
    (a small numerical sketch follows the figure below)

[figure: data points in the (f1, f2) plane, with new axes e1
(the direction of greatest variation) and e2]
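A minimal numerical sketch of this intuition, assuming Python with
NumPy (the point cloud and all variable names are illustrative):

import numpy as np

# small 2-D point cloud, centered so the axes pass through the origin
rng = np.random.default_rng(0)
points = rng.normal(size=(100, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])
points = points - points.mean(axis=0)

# the first right singular vector is the axis of greatest variation (e1)
_, _, Vt = np.linalg.svd(points, full_matrices=False)
e1 = Vt[0]

# project all points onto that axis (their coordinates along e1)
coords_on_e1 = points @ e1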
3
SVD: The mathematical formulation
  • Let A be an m x n real matrix of m n-dimensional
    points
  • SVD decomposition: A = U Λ V^T
  • U (m x m) is orthogonal: U^T U = I
  • V (n x n) is orthogonal: V^T V = I
  • Λ (m x n) has the r positive singular values in
    descending order on its diagonal
  • Columns of U are the orthonormal eigenvectors of
    A A^T (called the left singular vectors of A)
  • A A^T = (U Λ V^T)(U Λ V^T)^T = U Λ Λ^T U^T = U Λ² U^T
  • Columns of V are the orthonormal eigenvectors of
    A^T A (called the right singular vectors of A)
  • A^T A = (U Λ V^T)^T (U Λ V^T) = V Λ^T Λ V^T = V Λ² V^T
  • Λ contains the square roots of the eigenvalues of
    A A^T (or A^T A); these are called the singular
    values (positive reals)
  • r is the rank of A, A A^T, and A^T A
  • U defines the column space of A, V the row space
    (a numerical check of these identities follows below)
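A quick numerical check of these identities, sketched in Python with
NumPy (the matrix A below is an arbitrary example):

import numpy as np

A = np.random.default_rng(1).normal(size=(5, 3))      # m x n, m >= n

U, s, Vt = np.linalg.svd(A)          # full SVD: U is 5x5, Vt is 3x3
L = np.zeros_like(A)                 # Λ is m x n with the singular values
L[:len(s), :len(s)] = np.diag(s)     # on its diagonal, in descending order

assert np.allclose(A, U @ L @ Vt)                       # A = U Λ V^T
assert np.allclose(U.T @ U, np.eye(5))                  # U^T U = I
assert np.allclose(Vt @ Vt.T, np.eye(3))                # V^T V = I
assert np.allclose(np.linalg.eigvalsh(A.T @ A)[::-1], s**2)  # Λ² = eigenvalues of A^T A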

4
The symmetric case
  • Let A be a real symmetric m x m matrix
  • A = A^T, orthonormal eigenvectors, real
    eigenvalues
  • A = Q Λ Q^T = Σ λ_i q_i q_i^T (spectral
    decomposition)
  • Q (m x m)
  • Q is orthogonal and consists of the eigenvectors
    of A
  • Q^T Q = I
  • Q^T = Q^-1
  • Λ (m x m)
  • diagonal matrix consisting of the eigenvalues of
    A
  • guaranteed to be non-negative for positive
    semi-definite A
  • Λ = Q^T A Q (follows from A = Q Λ Q^T)
  • A A^T = A^T A = A²
  • eigenvectors of A² = eigenvectors of A
  • eigenvalues of A² are non-negative (a covariance
    matrix is positive semi-definite)
  • eigenvalues of A² = squares of the eigenvalues of
    A (a numerical sketch follows below)
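A minimal sketch of the symmetric case in Python with NumPy (the
symmetric matrix S is an arbitrary example):

import numpy as np

B = np.random.default_rng(2).normal(size=(4, 4))
S = B + B.T                                   # real symmetric m x m

lam, Q = np.linalg.eigh(S)                    # real eigenvalues, orthonormal eigenvectors
assert np.allclose(Q.T @ Q, np.eye(4))        # Q^T Q = I
assert np.allclose(S, Q @ np.diag(lam) @ Q.T) # A = Q Λ Q^T
assert np.allclose(np.diag(lam), Q.T @ S @ Q) # Λ = Q^T A Q

# spectral decomposition as a sum of rank-1 terms  Σ λ_i q_i q_i^T
S_sum = sum(lam[i] * np.outer(Q[:, i], Q[:, i]) for i in range(4))
assert np.allclose(S, S_sum)

# eigenvalues of A² are the squares of the eigenvalues of A
assert np.allclose(np.sort(np.linalg.eigvalsh(S @ S)), np.sort(lam**2))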

5
Proof of SVD (I)
  • Let λ_1 be the maximal length of Ax over unit-length
    vectors x, and let y be the unit-length vector such
    that Ax = λ_1 y. Extend y and x to orthonormal bases
    of R^m and R^n, forming the columns of U and V
    respectively.
  • Let U = [y U_1] and V = [x V_1].
  • ||A||_2² = λ_1² (||A||_2 = maximum stretch of a unit
    vector; a norm)
  • Let w^T = y^T A V_1 and Y = U^T A V.
  • Y = [y^T; U_1^T] A [x V_1] = [y^T A x, y^T A V_1;
    U_1^T A x, U_1^T A V_1] = [λ_1, w^T; 0, A_1]
  • A_1 = U_1^T A V_1
  • Y [λ_1; w] = [λ_1² + ||w||²; A_1 w]
  • ||Y||_2² ≥ λ_1² + ||w||² (since ||Y [λ_1; w]|| ≥
    λ_1² + ||w||² while ||[λ_1; w]|| = √(λ_1² + ||w||²))
  • λ_1² = ||A||_2² = ||Y||_2² ≥ λ_1² + ||w||²
  • w = 0, so Y = [λ_1, 0; 0, A_1]
  • Apply the argument inductively to A_1.

check the dimensions
6
Proof of SVD (II)
  • Let (σ_i, v_i) be the n (eigenvalue, eigenvector)
    pairs of A^T A. The eigenvectors are orthonormal
    and go into the columns of V. There are r non-zero
    eigenvalues.
  • A^T A v_i = σ_i v_i, v_i^T v_i = 1, v_i^T v_j = 0 for i ≠ j.
  • (A v_i)^T (A v_i) = v_i^T A^T A v_i = v_i^T (σ_i v_i)
    = σ_i v_i^T v_i = σ_i
  • ||A v_i||² = σ_i (which implies σ_i ≥ 0)
  • For the positive σ_i, let λ_i = √σ_i and u_i = A v_i / λ_i.
    The rest are 0.
  • u_i^T u_j = v_i^T A^T A v_j / (λ_i λ_j) = σ_j v_i^T v_j / (λ_i λ_j)
  • = 0 for i ≠ j, since the eigenvectors are orthogonal
  • = 1 for i = j
  • Apply Gram-Schmidt to obtain m orthonormal vectors
    forming a basis u_1, u_2, ..., u_m. These go into the
    columns of U.
  • Entry (i, j) of U^T A V = u_i^T A v_j
  • = 0 if j > r, since A v_j = 0
  • = u_i^T λ_j u_j if j ≤ r, since A v_j = λ_j u_j
  • = 0 when i ≠ j, and λ_j otherwise.
  • Therefore the only non-zeros in the product U^T A V
    are the first r diagonal entries (which are λ_1,
    λ_2, ..., λ_r).
  • U^T A V = Λ, or A = U Λ V^T (a numerical sketch of
    this construction follows below)
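The construction in this proof can be carried out numerically; a
sketch in Python with NumPy (arbitrary full-column-rank example, so
r = n and no Gram-Schmidt extension is needed):

import numpy as np

A = np.random.default_rng(3).normal(size=(6, 3))        # m x n

# (eigenvalue, eigenvector) pairs of A^T A, sorted by decreasing eigenvalue
sigma, V = np.linalg.eigh(A.T @ A)
order = np.argsort(sigma)[::-1]
sigma, V = sigma[order], V[:, order]

lam = np.sqrt(np.clip(sigma, 0.0, None))                # λ_i = √σ_i ≥ 0
U_thin = (A @ V) / lam                                  # u_i = A v_i / λ_i

assert np.allclose(U_thin.T @ U_thin, np.eye(3))        # the u_i are orthonormal
assert np.allclose(U_thin.T @ A @ V, np.diag(lam))      # U^T A V = Λ
assert np.allclose(A, U_thin @ np.diag(lam) @ V.T)      # A = U Λ V^T (thin form)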

7
SVD - Interpretation
8
SVD - Interpretation
  • A = U Λ V^T - example

9
SVD - Interpretation
  • A = U Λ V^T - example

[figure: the numerical product U Λ V^T, annotated with the
variance (spread) on the v_1 axis]

10
Dimensionality reduction
11
Dimensionality reduction
  • set the smallest eigenvalues to zero
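Setting the smallest values on the diagonal of Λ to zero and
multiplying the factors back together gives the reduced matrix; a
sketch in Python with NumPy (A and the cut-off k are illustrative):

import numpy as np

A = np.random.default_rng(4).normal(size=(8, 5))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
s_trunc = s.copy()
s_trunc[k:] = 0.0                     # zero out the smallest singular values

A_k = U @ np.diag(s_trunc) @ Vt       # rank-k approximation of A
print(np.linalg.matrix_rank(A_k))     # 2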


12
Dimensionality reduction

13
Dimensionality reduction

14
Dimensionality reduction

15
Dimensionality reduction

16
Dimensionality reduction
  • spectral decomposition of the matrix


17
Dimensionality reduction
  • spectral decomposition of the matrix

[figure: A = λ_1 u_1 v_1^T + λ_2 u_2 v_2^T + ...]
18
Dimensionality reduction
  • spectral decomposition of the matrix

[figure: the m x n matrix A written as a sum of rank-1 matrices]
19
Dimensionality reduction
  • spectral decomposition of the matrix

[figure: the m x n matrix A written as a sum of r terms,
each an (m x 1) column vector times a (1 x n) row vector]
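Each term in this expansion is an (m x 1) column times a (1 x n) row;
summing the r terms reproduces A exactly. A sketch in Python with
NumPy (A is an arbitrary example):

import numpy as np

A = np.random.default_rng(5).normal(size=(6, 4))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-12))                    # rank of A

# A = Σ λ_i u_i v_i^T : each np.outer(u_i, v_i) is an m x n rank-1 matrix
A_sum = sum(s[i] * np.outer(U[:, i], Vt[i]) for i in range(r))
assert np.allclose(A, A_sum)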
20
Dimensionality reduction
  • approximation / dim. reduction
  • by keeping the first few terms (Q: how many?)

[figure: A approximated by the first few rank-1 terms;
assume λ_1 > λ_2 > ...]
21
Dimensionality reduction
  • A heuristic: keep 80-90% of the energy (= the sum
    of squares of the λ_i's), as sketched below

[figure: A approximated by the first k rank-1 terms;
assume λ_1 > λ_2 > ...]
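The energy heuristic can be applied directly to the singular values;
a sketch in Python with NumPy (the 0.90 threshold and the data are
illustrative):

import numpy as np

A = np.random.default_rng(6).normal(size=(100, 20))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

energy = np.cumsum(s**2) / np.sum(s**2)       # fraction of Σ λ_i² retained
k = int(np.searchsorted(energy, 0.90)) + 1    # smallest k keeping ≥ 90% of the energy

A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]      # keep only the first k terms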
22
Dimensionality reduction
  • Matrix V in the SVD decomposition
    (A = U Λ V^T) is used to transform the data.
  • AV (= U Λ) defines the transformed dataset.
  • For a new data element x, xV defines the
    transformed data.
  • Keeping the first k (k < n) dimensions amounts
    to keeping only the first k columns of V
    (see the sketch below).
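A sketch of this transformation in Python with NumPy (the dataset and
the choice k = 2 are illustrative):

import numpy as np

A = np.random.default_rng(7).normal(size=(50, 6))       # m points in n dimensions
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

k = 2
transformed = A @ V[:, :k]            # transformed dataset = (U Λ) restricted to k columns
assert np.allclose(transformed, U[:, :k] * s[:k])

x_new = np.random.default_rng(8).normal(size=6)         # a new data element
x_transformed = x_new @ V[:, :k]      # mapped into the same k-dimensional space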

23
Optimality of SVD
  • Let A = U Λ V^T
  • A = Σ λ_i u_i v_i^T
  • The Frobenius norm of an m x n matrix M is
    ||M||_F = √(Σ_i Σ_j M_ij²)
  • Let A_k be the above summation using the k largest
    eigenvalues.
  • Theorem [Eckart and Young]: Among all m x n
    matrices B of rank at most k, we have that
    ||A - A_k||_F ≤ ||A - B||_F
    (a numerical illustration follows below)
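A numerical illustration of the theorem in Python with NumPy: the
rank-k truncation A_k beats an arbitrary rank-k competitor B in
Frobenius norm (A, B and k are illustrative):

import numpy as np

rng = np.random.default_rng(9)
A = rng.normal(size=(10, 7))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 3
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]                # best rank-k approximation

B = rng.normal(size=(10, k)) @ rng.normal(size=(k, 7))  # some matrix of rank at most k

err_Ak = np.linalg.norm(A - A_k, 'fro')
err_B = np.linalg.norm(A - B, 'fro')
assert err_Ak <= err_B                                  # Eckart-Young
assert np.isclose(err_Ak, np.sqrt(np.sum(s[k:]**2)))    # error = √(Σ_{i>k} λ_i²)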
24
Proof for the 2-norm
  • Need to show that ||A - B||_2 ≥ λ_{k+1}
  • Or, that there exists a unit vector z such that
    ||(A - B) z|| ≥ λ_{k+1}.
  • Let x_1, x_2, ..., x_{n-k} be a basis for null(B).
  • Choose z to be a unit vector in the intersection
    of span(x_1, x_2, ..., x_{n-k}) and span(v_1, v_2,
    ..., v_{k+1}). (The intersection is non-trivial
    because the dimensions of the two subspaces sum to
    n + 1 > n.)
  • z = Σ c_i v_i, a linear combination of v_1, v_2,
    ..., v_{k+1}, with Σ c_i² = 1
  • Az = (U Λ V^T) z = U Λ (V^T z) = U Λ [c_1, c_2,
    ..., c_{k+1}, 0, ..., 0]^T
  • = U [c_1 λ_1, c_2 λ_2, ..., c_{k+1} λ_{k+1}, 0, ..., 0]^T
    = Σ c_i λ_i u_i
  • Therefore ||Az||² = Σ c_i² λ_i² ≥ (Σ c_i²) λ_{k+1}²
    = λ_{k+1}²
  • Since Bz = 0, ||(A - B) z||_2 ≥ λ_{k+1}

25
SVD - Complexity
  • O(m²n + mn² + n³)
  • O(mn² + n³) if we only need V and Λ
  • Complexity can be improved with random sampling
  • Incremental techniques can be used for dynamic
    data
  • Implemented in any linear algebra package
    (LINPACK, MATLAB, S-Plus, Mathematica, ...)

26
Principal Components Analysis (PCA)
  • Translate the dataset to the center by subtracting
    the means; let matrix A be the result.
  • Compute the covariance matrix A^T A.
  • Project the dataset along a subset of the
    eigenvectors of A^T A.
  • Matrix V in the SVD decomposition
    (A = U Λ V^T) contains the eigenvectors of A^T A.
  • Also known as the K-L (Karhunen-Loève) transform
    (a compact sketch of these steps follows below).
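A compact sketch of these steps in Python with NumPy (dataset and k
are illustrative):

import numpy as np

X = np.random.default_rng(10).normal(size=(200, 5))     # raw dataset

A = X - X.mean(axis=0)                                  # center by subtracting the means
U, s, Vt = np.linalg.svd(A, full_matrices=False)        # columns of V = eigenvectors of A^T A

k = 2
projected = A @ Vt[:k].T                                # project onto the top-k eigenvectors

# equivalent route through the covariance matrix A^T A
eigvals, eigvecs = np.linalg.eigh(A.T @ A)
top = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
assert np.allclose(np.abs(A @ top), np.abs(projected))  # same subspace, up to sign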