Title: Nonlinear Dimensionality Reduction Frameworks
Outline
- Intuition of Nonlinear Dimensionality Reduction (NLDR)
- ISOMAP
- LLE
- NLDR in Gait Analysis
Intuition: how does your brain store these pictures?
Brain Representation
- Every pixel?
- Or perceptually meaningful structure?
  - Up-down pose
  - Left-right pose
  - Lighting direction
- So, your brain successfully reduced the high-dimensional inputs to an intrinsically 3-dimensional manifold!
Data for Faces
Data for Handwritten 2s
Data for Hands
Manifold Learning
- A manifold is a topological space that is locally Euclidean.
- An example of a nonlinear manifold:
Manifold Learning
- Discover low-dimensional representations (smooth manifolds) for data in high dimensions.
- Linear approaches (PCA, MDS) vs. nonlinear approaches (ISOMAP, LLE)
[Figure: a latent space Y mapped to the observed space X]
Linear Approach: PCA
- PCA finds linear projections of the input data onto a subspace.
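As a minimal sketch, PCA's linear projection can be written in a few lines of NumPy (the function name and random data below are illustrative, not from the slides):

```python
import numpy as np

def pca(X, d):
    """Project N x D data X onto its top-d principal subspace."""
    Xc = X - X.mean(axis=0)                # center the data
    cov = Xc.T @ Xc / (len(X) - 1)         # sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)       # eigenvalues in ascending order
    W = vecs[:, ::-1][:, :d]               # top-d principal directions
    return Xc @ W                          # linear projection

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Y = pca(X, 2)
print(Y.shape)  # (100, 2)
```

Because the data are centered before projection, the embedded coordinates also have zero mean.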
Linear Approach: MDS
- MDS takes a matrix of pairwise distances and gives a mapping to R^d. It finds an embedding that preserves the interpoint distances, and is equivalent to PCA when those distances are Euclidean.
- BUT! PCA and MDS both fail to embed nonlinear data, like the Swiss roll.
Nonlinear Approaches: ISOMAP
Joshua Tenenbaum, Vin de Silva, John Langford (2000)
- Construct the neighborhood graph G.
- For each pair of points in G, compute the shortest-path distances, i.e., the geodesic distances.
- Use classical MDS with the geodesic distances.
[Figure: Euclidean distance vs. geodesic distance along the manifold]
Sample Points from the Swiss Roll
- Altogether there are 20,000 points in the Swiss roll data set. We sample 1,000 out of the 20,000.
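The sampling step can be sketched as follows; the particular roll parametrization below is a common illustrative choice, not necessarily the one used for the slides' figures:

```python
import numpy as np

rng = np.random.default_rng(0)
n_total, n_sample = 20000, 1000

# Swiss roll: a 2-D sheet (parameters t, h) rolled up in 3-D.
t = 1.5 * np.pi * (1 + 2 * rng.random(n_total))   # angle along the roll
h = 21.0 * rng.random(n_total)                    # height along the roll axis
swiss = np.column_stack([t * np.cos(t), h, t * np.sin(t)])

# Subsample 1,000 of the 20,000 points without replacement.
idx = rng.choice(n_total, size=n_sample, replace=False)
sample = swiss[idx]
print(sample.shape)  # (1000, 3)
```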
Construct the Neighborhood Graph G
- K-nearest neighbors (K = 7)
- D_G is the 1000 x 1000 (Euclidean) distance matrix between neighboring points (Figure A).
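A minimal NumPy sketch of this step, assuming a smaller point set for illustration (function name and data are hypothetical):

```python
import numpy as np

def knn_graph(X, k=7):
    """Return an N x N matrix holding the Euclidean distance for each
    point's k nearest neighbors and np.inf for non-neighbor pairs."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))           # all pairwise distances
    G = np.full_like(dist, np.inf)
    nn = np.argsort(dist, axis=1)[:, 1:k + 1]     # skip column 0 (self)
    rows = np.arange(len(X))[:, None]
    G[rows, nn] = dist[rows, nn]                  # keep only k-NN edges
    np.fill_diagonal(G, 0.0)
    return np.minimum(G, G.T)                     # symmetrize the graph

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
G = knn_graph(X, k=7)
```

Symmetrizing with `np.minimum` makes the graph undirected: an edge exists if either endpoint counts the other among its K nearest neighbors.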
Compute All-Pairs Shortest Paths in G
- Now D_G is the 1000 x 1000 geodesic distance matrix between arbitrary pairs of points along the manifold (Figure B).
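This step can be sketched with the Floyd-Warshall algorithm (one common choice; the paper also allows Dijkstra). It is O(N^3), so it is feasible for N = 1000 but slow beyond that. The code below is illustrative, not the authors' implementation; non-neighbor pairs enter as np.inf:

```python
import numpy as np

def geodesic_distances(G):
    """Floyd-Warshall all-pairs shortest paths on a dense graph matrix."""
    D = G.copy()
    for k in range(len(D)):
        # Allow paths through intermediate node k: D[i,j] vs D[i,k] + D[k,j]
        D = np.minimum(D, D[:, k:k + 1] + D[k:k + 1, :])
    return D

# Tiny chain graph 0-1-2 with unit edges; 0 and 2 are not directly linked,
# so their geodesic distance is the two-hop path through node 1.
G = np.array([[0.0,    1.0, np.inf],
              [1.0,    0.0, 1.0   ],
              [np.inf, 1.0, 0.0   ]])
D = geodesic_distances(G)
print(D[0, 2])  # 2.0
```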
Use MDS to Embed the Graph in R^d
Find a d-dimensional Euclidean space Y (Figure C) that minimizes the cost function
E = || tau(D_G) - tau(D_Y) ||_{L2}
where tau converts a distance matrix to an inner-product matrix (double centering), D_G is the geodesic distance matrix, and D_Y is the Euclidean distance matrix of the embedding Y.
Linear Approach: Classical MDS
Theorem: For any (Euclidean) squared-distance matrix D, there exist points x_i such that D_ij = ||x_i - x_j||^2, and their centered inner-product (Gram) matrix is recovered by double centering: B = -(1/2) H D H, with H = I - (1/N) 1 1^T. So the embedding can be read off from the top eigenvectors and eigenvalues of B.
Solution
Y lies in R^d and consists of N points corresponding to the N original points in the input space.
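Putting the theorem and solution together, classical MDS is a double centering followed by an eigendecomposition. A minimal NumPy sketch (illustrative names; the sanity check verifies that MDS on exact Euclidean distances reproduces the geometry):

```python
import numpy as np

def classical_mds(D, d=2):
    """D: N x N distance matrix; returns an N x d embedding Y."""
    N = len(D)
    H = np.eye(N) - np.ones((N, N)) / N      # centering matrix
    B = -0.5 * H @ (D ** 2) @ H              # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)           # eigenvalues ascending
    vals = vals[::-1][:d]                    # top-d eigenvalues
    vecs = vecs[:, ::-1][:, :d]              # matching eigenvectors
    return vecs * np.sqrt(np.maximum(vals, 0.0))

# Sanity check: for Euclidean distances, the embedding preserves them
# exactly (up to rotation and translation).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
D = np.sqrt(((X[:, None] - X[None, :]) ** 2).sum(-1))
Y = classical_mds(D, d=2)
D2 = np.sqrt(((Y[:, None] - Y[None, :]) ** 2).sum(-1))
print(np.allclose(D, D2))  # True
```

In ISOMAP, the same routine is simply applied to the geodesic distance matrix D_G instead of a Euclidean one.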
PCA, MDS vs. ISOMAP
Isomap Advantages
- Nonlinear
- Globally optimal
- Still produces a globally optimal low-dimensional Euclidean representation even when the input space is highly folded, twisted, or curved.
- Guaranteed asymptotically to recover the true dimensionality.
Isomap Disadvantages
- May not be stable; depends on the topology of the data.
- Guaranteed only asymptotically to recover the geometric structure of nonlinear manifolds:
  - As N increases, pairwise distances provide better approximations to geodesics, but cost more computation.
  - If N is small, geodesic distances will be very inaccurate.