Title: Manifold Learning: Nyström's Method and a Unified View
1. Manifold Learning: Nyström's Method and a Unified View
Jieping Ye, Department of Computer Science and Engineering, Arizona State University
http://www.public.asu.edu/jye02
2. Overview
- Isomap
  - Global approach
  - Preserves all pairwise geodesic distances
- LLE and Laplacian Eigenmaps
  - Local approach
  - Preserve local geometry derived from k-nearest neighbors
- All of them involve an eigen-decomposition
3. Outline of lecture
- A common framework
  - MDS
  - Isomap
  - LLE
  - Laplacian Eigenmaps
- Nyström's method
- Approximate Nyström method
- Landmark MDS
4. An example
5. A common framework for manifold learning
6. Flowchart of the framework
1. Construct the neighborhood graph (kNN)
2. Form the similarity matrix M
3. Normalize M (optional)
4. Compute the eigenvectors of the (normalized) similarity matrix
5. Construct the embedding from the eigenvectors
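As a minimal sketch of steps 4-5 (numpy only; embed_from_similarity is a hypothetical helper name, not from the slides), the eigenvector-to-embedding step might look like:

import numpy as np

def embed_from_similarity(M, d):
    # Steps 4-5 of the flowchart: eigen-decompose the (normalized)
    # similarity matrix M and read off a d-dimensional embedding.
    vals, vecs = np.linalg.eigh(M)               # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:d]             # keep the top-d eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))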
7. MDS
Kernel PCA with a linear kernel: double-centering the squared distance matrix recovers the centered Gram matrix, $K = -\frac{1}{2} J D^{(2)} J$ with centering matrix $J = I - \frac{1}{m}\mathbf{1}\mathbf{1}^\top$.
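A minimal sketch of classical MDS under this view (numpy only; classical_mds is my name for the helper), assuming D is an m × m matrix of pairwise Euclidean distances:

import numpy as np

def classical_mds(D, d):
    # Double-center the squared distances to recover the centered Gram
    # matrix K = -1/2 J D^(2) J, i.e. kernel PCA with a linear kernel.
    m = D.shape[0]
    J = np.eye(m) - np.ones((m, m)) / m          # centering matrix
    K = -0.5 * J @ (D ** 2) @ J
    vals, vecs = np.linalg.eigh(K)
    idx = np.argsort(vals)[::-1][:d]             # top-d eigenpairs
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))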
8. Isomap
Key difference: Euclidean distance → geodesic distance.
(figure: geodesic distance measured along the manifold vs. straight-line Euclidean distance)
Kernel PCA with a kernel constructed from the geodesic distances.
What is the difference? The geodesic-distance kernel may not be positive semi-definite.
Solutions:
(1) Add a large constant to its diagonal.
(2) Remove the negative eigenvalues.
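A rough Isomap sketch along these lines (assuming scikit-learn and scipy are available, and reusing the classical_mds helper from the MDS sketch above). Note that clipping negative eigenvalues inside classical_mds corresponds to solution (2):

import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

def isomap(X, n_neighbors=10, d=2):
    # Geodesic distances: shortest paths over the kNN graph.
    G = kneighbors_graph(X, n_neighbors, mode='distance')
    D_geo = shortest_path(G, method='D', directed=False)
    # Classical MDS on the geodesic distance matrix.
    return classical_mds(D_geo, d)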
9. LLE
Least squares problem: find the weights $W$ minimizing $\sum_i \| x_i - \sum_j W_{ij} x_j \|^2$, where $j$ ranges over the k nearest neighbors of $x_i$ and each row of $W$ sums to one.
Meaning of W: a linear representation of every data point by its neighbors. This is an intrinsic geometric property of the manifold.
Low-dimensional embedding: compute the bottom eigenvectors of $(I - W)^\top (I - W)$.
This is equivalent to computing the principal (top) eigenvectors of $\lambda_{\max} I - (I - W)^\top (I - W)$.
Kernel PCA with a data-dependent kernel.
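A compact LLE sketch under the above formulation (scikit-learn for the neighbor search; the regularizer reg is my addition for numerical stability, not part of the slide):

import numpy as np
from sklearn.neighbors import NearestNeighbors

def lle(X, k=10, d=2, reg=1e-3):
    m = X.shape[0]
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    W = np.zeros((m, m))
    for i in range(m):
        nb = idx[i, 1:]                          # skip the point itself
        Z = X[nb] - X[i]                         # shifted neighbors
        C = Z @ Z.T                              # local Gram matrix
        C += reg * np.trace(C) * np.eye(k)       # regularize for stability
        w = np.linalg.solve(C, np.ones(k))
        W[i, nb] = w / w.sum()                   # rows of W sum to one
    M = (np.eye(m) - W).T @ (np.eye(m) - W)
    vals, vecs = np.linalg.eigh(M)               # bottom eigenvectors
    return vecs[:, 1:d + 1]                      # drop the constant vector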
10. Laplacian Eigenmaps
Let S be the degree matrix of the affinity matrix M, and let $L = S - M$ be the graph Laplacian. The embedding is given by the bottom eigenvectors of the generalized problem $L v = \lambda S v$.
Kernel PCA with the pseudo-inverse $L^{+}$ as the kernel.
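As a sketch (scipy for the generalized eigenproblem; assumes every node of the graph has positive degree so that S is invertible):

import numpy as np
from scipy.linalg import eigh

def laplacian_eigenmaps(M, d=2):
    # S: degree matrix of the affinity matrix M; L = S - M is the Laplacian.
    S = np.diag(M.sum(axis=1))
    L = S - M
    vals, vecs = eigh(L, S)                      # solve L v = lambda S v
    return vecs[:, 1:d + 1]                      # skip the trivial constant vector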
11. A summary
- A unified framework of manifold learning algorithms
- They are all kernel PCA with data-dependent kernels; however, no explicit embedding function is given
  - How do we compute the embedding of a test point?
- All methods involve the eigen-decomposition of an n × n matrix (n is the number of data points)
  - They do not scale to large datasets
- For MDS and Isomap, the remedy is the Nyström method
12. Nyström Method
- Originally proposed to approximate the solution of Fredholm integral equations of the form $\int k(x, y)\,\phi(y)\,dy = \lambda\,\phi(x)$
- The integral can be approximated by a quadrature rule over m sample points: $\frac{1}{m}\sum_{j=1}^{m} k(x, x_j)\,\phi(x_j) \approx \lambda\,\phi(x)$
- It can be used to approximate the eigenvectors and eigenvalues of K using those of a small submatrix A
13. Nyström Method
- Applying the above approximation at the m sample points themselves gives the matrix eigenvalue problem $K u = m\lambda\, u$, where $K_{ij} = k(x_i, x_j)$
14. Nyström Method
- For a test point x, the value of the integral is given by $\phi(x) \approx \frac{1}{m\lambda} \sum_{j=1}^{m} k(x, x_j)\,\phi(x_j)$
- If we only use a subset of points for the approximation, the matrix for the eigen-decomposition is small
  - The cost scales with the number of points chosen
- It can be used to approximate the eigenvectors and eigenvalues of K using those of a small submatrix A
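In code, the out-of-sample extension might look like this sketch (numpy; nystrom_extension is my name for it), where U and lam come from eigen-decomposing the sample kernel matrix:

import numpy as np

def nystrom_extension(k_x, U, lam):
    # k_x[j] = k(x, x_j) for the m sample points; U, lam satisfy
    # K = U diag(lam) U^T on the samples. The extended eigenfunction is
    # phi_i(x) ~ (1 / lam_i) * sum_j k(x, x_j) * U[j, i],
    # which reproduces U exactly when x is one of the sample points.
    return (k_x @ U) / lam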
15. Exact Nyström Approximation
- Suppose that K of size m × m has rank r < m
- K is positive semi-definite (a kernel Gram matrix)
- Order the rows and columns of K so that the leading r × r block is nonsingular: $K = \begin{pmatrix} A & B \\ B^\top & C \end{pmatrix}$, with A of size r × r and B of size r × (m − r)
- Since rank(K) = r, the Schur complement vanishes, so $C = B^\top A^{-1} B$ exactly
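A quick numerical check of this fact (numpy sketch; the function name is mine):

import numpy as np

def exact_nystrom_reconstruction(K, r):
    # Split K into blocks [[A, B], [B^T, C]] with A the leading r x r block.
    A, B = K[:r, :r], K[:r, r:]
    # For a PSD matrix of rank r with nonsingular A, the Schur complement
    # vanishes, so C is exactly B^T A^{-1} B.
    C = B.T @ np.linalg.solve(A, B)
    return np.block([[A, B], [B.T, C]])

Here np.allclose(K, exact_nystrom_reconstruction(K, r)) should hold whenever rank(K) = r and the leading block is nonsingular.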
16. Nyström Method
17. Nyström Method
18. Nyström Method
19. Approximate Nyström Method
Approximation: eigen-decompose only a q × q submatrix A instead of the full K.
Evaluate the approximation on all m data points (in matrix form).
20. Approximate Nyström Method
Evaluate the approximation on all m data points (in matrix form): the columns of the extended matrix give the approximate eigenvectors, and the rescaled eigenvalues of A give the approximate eigenvalues.
21. Approximate Nyström Method
Compared to the exact case, the approximate eigenvectors may not be orthonormal.
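A sketch of the approximate procedure (numpy; the scaling follows the standard Nyström convention, which I am assuming matches the equations dropped from these slides):

import numpy as np

def approx_nystrom_eig(K, q):
    # Eigen-decompose only the q x q block A, then extend to all m points.
    m = K.shape[0]
    A = K[:q, :q]
    C = K[:, :q]                                 # first q columns of K
    lam, U = np.linalg.eigh(A)
    keep = lam > 1e-10                           # drop near-null directions
    lam, U = lam[keep], U[:, keep]
    lam_approx = (m / q) * lam                   # approximate eigenvalues of K
    U_approx = np.sqrt(q / m) * (C @ U) / lam    # approximate eigenvectors
    # Unlike the exact case, the columns of U_approx need not be orthonormal.
    return lam_approx, U_approx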
22. Landmark MDS
- MDS is expensive for large datasets
  - The distance matrix is dense
  - The complexity of computing the eigenvectors is $O(m^3)$
- Landmark MDS solves this problem:
  - Choose a subset of q points, called landmarks ($q \ll m$)
  - Perform MDS on these q points, mapping them to d-dimensional space
  - Map the remaining points using only their distances to the landmarks
Reference: J. Platt, "FastMap, MetricMap, and Landmark MDS are all Nyström Algorithms"
23. Landmark MDS
For a test point, compute its squared distance to each of the q landmarks and form the vector $\delta$.
The eigenvectors of the full kernel are approximated (via the Nyström extension) from the landmark eigenvectors $V$ and eigenvalues $\Lambda$.
The embedding of the test point is given by the first d elements of $-\frac{1}{2}\Lambda^{-1/2} V^\top (\delta - \bar{\delta})$, where $\bar{\delta}$ is the mean column of the landmarks' squared-distance matrix.
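Putting the whole Landmark MDS recipe into one sketch (numpy; the variable names are mine): D2_ll holds squared distances among the q landmarks, and d2_x holds a new point's squared distances to them:

import numpy as np

def landmark_mds(D2_ll, d=2):
    # Classical MDS on the landmarks only.
    q = D2_ll.shape[0]
    J = np.eye(q) - np.ones((q, q)) / q
    B = -0.5 * J @ D2_ll @ J                     # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    idx = np.argsort(vals)[::-1][:d]
    lam, V = vals[idx], vecs[:, idx]
    Y_land = V * np.sqrt(lam)                    # landmark embedding (q x d)
    # Also return the projection matrix and mean squared distances,
    # which are all that is needed to embed new points.
    return Y_land, V / np.sqrt(lam), D2_ll.mean(axis=0)

def embed_test_point(d2_x, proj, mean_d2):
    # Project the centered squared-distance vector onto the landmark axes.
    return -0.5 * (d2_x - mean_d2) @ proj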
24. Next class
- Topics
  - Kernel methods
- Readings
  - A Primer on Kernel Methods
  - http://www.kyb.mpg.de/publications/pdfs/pdf2549.pdf