Title: Nonlinear Dimension Reduction
1. Nonlinear Dimension Reduction
- Presenter: Xingwei Yang
- This presentation is organized from material by:
  1. Ronald R. Coifman et al. (Yale University)
  2. Jieping Ye (Arizona State University)
2. Motivation
- Linear projections will not detect nonlinear patterns in the data.
3. Nonlinear PCA using Kernels
- Traditional PCA applies a linear transformation
- May not be effective for nonlinear data
- Solution: apply a nonlinear transformation φ into a potentially very high-dimensional feature space
- Computational efficiency: apply the kernel trick
- Requires that PCA can be rewritten in terms of dot products
More on kernels later
4. Nonlinear PCA using Kernels
- Rewrite PCA in terms of dot products
Assuming centered data, the covariance matrix S can be written as S = (1/n) Σᵢ xᵢxᵢᵀ.
Let v be an eigenvector of S corresponding to a nonzero eigenvalue λ; then v is a linear combination of the data points.
Eigenvectors of S lie in the space spanned by all data points, as spelled out below.
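One standard way to make this step explicit, assuming centered data:

    S = \frac{1}{n}\sum_{i=1}^{n} x_i x_i^{\top}, \qquad
    S v = \lambda v \;\Rightarrow\;
    v = \frac{1}{n\lambda}\sum_{i=1}^{n} (x_i^{\top} v)\, x_i
      = \sum_{i=1}^{n} \alpha_i x_i,
    \quad \alpha_i = \frac{x_i^{\top} v}{n\lambda}.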
5. Nonlinear PCA using Kernels
In matrix form, the covariance matrix can be written as S = (1/n) X Xᵀ, where X = [x₁, ..., xₙ].
Any benefits?
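The benefit, sketched under the same centering assumption: writing v = Xα and substituting into Sv = λv yields an eigenproblem in the n×n matrix XᵀX, which depends on the data only through dot products:

    (X^{\top} X)\,\alpha = n\lambda\,\alpha
    \;\Rightarrow\;
    S(X\alpha) = \frac{1}{n} X X^{\top} X \alpha = \lambda\, (X\alpha).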
6. Nonlinear PCA using Kernels
- Next, consider the feature space: map each point xᵢ to φ(xᵢ) and let Φ = [φ(x₁), ..., φ(xₙ)].
The (i,j)-th entry of ΦᵀΦ is φ(xᵢ)ᵀφ(xⱼ).
Apply the kernel trick: φ(xᵢ)ᵀφ(xⱼ) = k(xᵢ, xⱼ).
K is called the kernel matrix.
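In the feature space, the Gram-matrix eigenproblem from the previous slide can be written entirely in terms of K; this is the point of the trick, since K is computable without ever forming φ:

    K_{ij} = \phi(x_i)^{\top}\phi(x_j) = k(x_i, x_j), \qquad
    K\alpha = n\lambda\,\alpha, \qquad
    v = \sum_i \alpha_i\,\phi(x_i).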
7. Nonlinear PCA using Kernels
- Projection of a test point x onto v: vᵀφ(x) = Σᵢ αᵢ k(xᵢ, x).
The explicit mapping φ is not required here; see the sketch below.
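A minimal NumPy sketch of the whole procedure. The function names and the Gaussian kernel choice are mine; the feature-space centering step is standard kernel PCA, though the slides leave the data centered implicitly:

    import numpy as np

    def rbf_kernel(A, B, gamma=1.0):
        # Gaussian kernel k(a, b) = exp(-gamma * ||a - b||^2)
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * sq)

    def kernel_pca_fit(X, n_components=2, gamma=1.0):
        n = X.shape[0]
        K = rbf_kernel(X, X, gamma)
        # Center in feature space: Kc = (I - J) K (I - J), J = 11^T / n.
        J = np.ones((n, n)) / n
        Kc = K - J @ K - K @ J + J @ K @ J
        # Top eigenvectors of Kc are the expansion coefficients alpha.
        eigvals, eigvecs = np.linalg.eigh(Kc)
        idx = np.argsort(eigvals)[::-1][:n_components]
        lam, alpha = eigvals[idx], eigvecs[:, idx]
        # Rescale so v = sum_i alpha_i phi(x_i) has unit norm:
        # ||v||^2 = alpha^T Kc alpha = lam, hence divide by sqrt(lam).
        return alpha / np.sqrt(lam)

    def kernel_pca_project(x_new, X_train, alpha, gamma=1.0):
        # Projection onto v: v^T phi(x) = sum_i alpha_i k(x_i, x)
        # (centering of the test kernel is omitted for brevity).
        return rbf_kernel(np.atleast_2d(x_new), X_train, gamma) @ alpha

    X = np.random.default_rng(0).normal(size=(50, 3))
    alpha = kernel_pca_fit(X, n_components=2, gamma=0.5)
    print(kernel_pca_project(X[:5], X, alpha, gamma=0.5).shape)  # (5, 2)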
8. Diffusion distance and Diffusion map
- A symmetric matrix M_s can be derived from M as M_s = D^{1/2} M D^{-1/2}.
- M and M_s have the same N eigenvalues.
- Under the random-walk representation of the graph M:
  φ: left eigenvector of M
  ψ: right eigenvector of M
  ε: time step
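A sketch of the construction these slides assume, with M = D⁻¹K built from a Gaussian kernel of width ε following Coifman et al. (the kernel choice here is an assumption):

    import numpy as np

    def markov_matrices(X, eps=1.0):
        # Affinity matrix from a Gaussian kernel: K_ij = exp(-||x_i - x_j||^2 / eps)
        sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        K = np.exp(-sq / eps)
        d = K.sum(axis=1)                  # degrees; D = diag(d)
        M = K / d[:, None]                 # M = D^{-1} K, each row sums to 1
        Ms = np.sqrt(d)[:, None] * M / np.sqrt(d)[None, :]  # Ms = D^{1/2} M D^{-1/2}
        return K, d, M, Ms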
9. Diffusion distance and Diffusion map
- ε has a dual representation (time step and kernel width).
- If one starts a random walk from location xᵢ, the probability of landing in location y after r time steps is given by p(r, y | xᵢ) = eᵢ Mʳ, where eᵢ is a row vector of all zeros except a 1 in the i-th position.
- For large ε, all points in the graph are connected (M_{i,j} > 0) and the eigenvalues of M satisfy 1 = λ₀ ≥ |λ₁| ≥ ... ≥ |λ_{N−1}|.
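Expanding eᵢMʳ in the biorthogonal eigenbasis of M makes the role of the spectrum explicit:

    p(r, y \mid x_i) = (e_i M^r)_y
      = \sum_{k \ge 0} \lambda_k^{\,r}\, \psi_k(x_i)\, \phi_k(y).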
10. Diffusion distance and Diffusion map
- One can show that, regardless of the starting point xᵢ, the walk converges to φ₀(y), the left eigenvector of M with eigenvalue λ₀ = 1, with φ₀(y) = D_{yy} / Σ_z D_{zz}.
- The eigenvector φ₀(x) has a dual representation:
  1. It is the stationary probability distribution on the curve, i.e., the probability of landing at location x after taking infinitely many steps of the random walk (independent of the start location).
  2. It is the density estimate at location x.
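The stationarity claim can be verified directly from M = D⁻¹K and the symmetry of K:

    \phi_0(y) = \frac{D_{yy}}{\sum_z D_{zz}}, \qquad
    (\phi_0 M)_y = \sum_x \frac{D_{xx}}{\sum_z D_{zz}} \cdot \frac{K_{xy}}{D_{xx}}
                 = \frac{\sum_x K_{xy}}{\sum_z D_{zz}}
                 = \frac{D_{yy}}{\sum_z D_{zz}} = \phi_0(y).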
11. Diffusion distance
- ψ_k and φ_k are the right and left eigenvectors of the random-walk matrix M.
- λ_k^t is the k-th eigenvalue of Mᵗ (eigenvalues arranged in descending order).
- Given the definition of the random walk, we define the diffusion distance at time t between two pmfs as
  D_t²(x₀, x₁) = Σ_y (p(t, y | x₀) − p(t, y | x₁))² w(y),
  with the empirical choice w(y) = 1/φ₀(y).
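A direct NumPy sketch of this definition, reusing markov_matrices from the earlier sketch (t is a free parameter):

    import numpy as np

    def diffusion_distance_sq(M, d, i, j, t=2):
        # D_t^2(x_i, x_j) = sum_y (p(t,y|x_i) - p(t,y|x_j))^2 / phi_0(y)
        Mt = np.linalg.matrix_power(M, t)  # row i of M^t is p(t, . | x_i)
        phi0 = d / d.sum()                 # stationary distribution phi_0
        diff = Mt[i] - Mt[j]
        return float((diff ** 2 / phi0).sum())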
12. Diffusion Map
- Diffusion map: mapping between the original space and the first k eigenvectors as
  Ψ_t(x) = (λ₁ᵗ ψ₁(x), λ₂ᵗ ψ₂(x), ..., λ_kᵗ ψ_k(x)).
- Relationship: D_t²(x₀, x₁) = Σ_{j≥1} λ_j^{2t} (ψ_j(x₀) − ψ_j(x₁))² = ||Ψ_t(x₀) − Ψ_t(x₁)||² when all nontrivial eigenvectors are kept.
- This relationship justifies using Euclidean distance in diffusion-map space for spectral clustering.
- Since the eigenvalues decay, it is justified to stop at an appropriate k with a negligible error of order O((λ_{k+1}/λ_k)ᵗ).
13. Example: Hourglass
14. Example: Image embedding
15. Example: Lip images
16. Shape description
17. Dimension Reduction of Shape space
18. Dimension Reduction of Shape space (continued)
19. Dimension Reduction of Shape space (continued)
20. References
- Unsupervised Learning of Shape Manifolds (BMVC 2007)
- Diffusion Maps (Appl. Comput. Harmon. Anal. 21 (2006))
- Geometric diffusions for the analysis of data from sensor networks (Current Opinion in Neurobiology 2005)