Advanced Machine Learning

Transcript and Presenter's Notes

1
Advanced Machine Learning & Perception
Instructor: Tony Jebara
2
Topic 12
  • Manifold Learning (Unsupervised)
  • Beyond Principal Components Analysis (PCA)
  • Multidimensional Scaling (MDS)
  • Generative Topographic Map (GTM)
  • Locally Linear Embedding (LLE)
  • Convex Invariance Learning (CoIL)
  • Kernel PCA (KPCA)

3
Manifolds
  • Data is often embedded in a lower dimensional space
  • Consider an image of a face being translated from left to right
  • How do we capture the true coordinates of the data on the
    manifold or embedding space and represent them compactly?
  • An open problem with many possible approaches:
  • PCA: linear manifold
  • MDS: get inter-point distances, find 2D data with the same
    distances
  • LLE: mimic neighborhoods using low dimensional vectors
  • GTM: fit a grid of Gaussians to data via a nonlinear warp
  • CoIL: linear after nonlinear normalization/invariance of data
  • KPCA: linear in Hilbert space (kernels)

4
Principal Components Analysis
  • If we have the eigenvectors, mean, and coefficients, we can
    reconstruct the data
  • Getting eigenvectors (i.e. approximating the covariance)
  • Eigenvectors are orthonormal
  • In the coordinates of V, the Gaussian is diagonal with
    covariance Λ
  • All eigenvalues are non-negative
  • Higher eigenvalues mean higher variance; use those directions
    first
  • To compute the coefficients, project the centered data onto the
    eigenvectors (see the sketch below)
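
A minimal NumPy sketch of these steps (function names like pca and
coefficients are illustrative, not from the slides): eigendecompose
the sample covariance, keep the highest-variance directions first,
and compute coefficients by projecting centered data onto the
eigenvectors.

    import numpy as np

    def pca(X, d):
        """Fit PCA: return the mean, top-d eigenvectors, eigenvalues."""
        mu = X.mean(axis=0)
        Xc = X - mu                          # center the data
        C = Xc.T @ Xc / len(X)               # sample covariance
        evals, evecs = np.linalg.eigh(C)     # symmetric eigendecomposition
        order = np.argsort(evals)[::-1][:d]  # highest variance first
        return mu, evecs[:, order], evals[order]

    def coefficients(X, mu, V):
        """Coefficients c = V^T (x - mu): project centered data onto V."""
        return (X - mu) @ V

    def reconstruct(C, mu, V):
        """Approximate reconstruction x ~ mu + V c."""
        return mu + C @ V.T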

5
Multidimensional Scaling (MDS)
  • Idea: capture only the distances between points X in the
    original space
  • Construct another set of low dimensional or 2D points Y having
    the same distances
  • A dissimilarity d(x,y) is a function of two objects x and y
    such that d(x,y) ≥ 0, d(x,y) = d(y,x), and d(x,x) = 0
  • A metric also has to satisfy the triangle inequality:
    d(x,z) ≤ d(x,y) + d(y,z)
  • Standard example: the Euclidean l2 metric
  • Assume that for N objects we compute an N×N dissimilarity
    matrix D which tells us how far apart they are

6
Multidimensional Scaling
  • Given the dissimilarities D between the original X points under
    the original metric d(), find Y points whose dissimilarities D'
    under another metric d'() are similar to D
  • Want to find Ys that minimize some measure of the difference
    between D' and D
  • E.g. Least Squares Stress
  • E.g. Invariant Stress
  • E.g. Sammon Mapping
  • E.g. Strain

Some of these objectives are global, some are local; they can be
minimized by gradient descent (a sketch follows).
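
A minimal sketch of metric MDS by gradient descent on the least
squares stress Σ (D'_ij - D_ij)², assuming a Euclidean metric in the
embedding (the function name mds is illustrative):

    import numpy as np

    def mds(D, d=2, steps=1000, lr=0.05, seed=0):
        """Embed an (N,N) dissimilarity matrix D into d dimensions by
        gradient descent on the least squares stress."""
        rng = np.random.default_rng(seed)
        N = D.shape[0]
        Y = rng.normal(scale=1e-2, size=(N, d))
        for _ in range(steps):
            diff = Y[:, None, :] - Y[None, :, :]   # y_i - y_j
            Dp = np.linalg.norm(diff, axis=-1)     # current distances D'
            np.fill_diagonal(Dp, 1.0)              # avoid division by zero
            W = (Dp - D) / Dp                      # per-pair error terms
            np.fill_diagonal(W, 0.0)
            grad = 2.0 * (W[:, :, None] * diff).sum(axis=1)
            Y -= lr * grad                         # descend the stress
        return Y
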
7
MDS Example 3D to 2D
  • Have distances from cities to cities; these lie on the surface
    of a sphere (the Earth) in 3D space
  • The reconstructed 2D points on a plane capture the essential
    properties (poles?)

8
MDS Example Multi-D to 2D
  • A more elaborate example: have a correlation matrix between
    crimes, which are of arbitrary dimensionality
  • Hack: convert correlation to dissimilarity and show the
    reconstructed Y (one possible conversion is sketched below)
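
The slide does not say which conversion it uses; one common choice
(an assumption here) maps high correlation to small distance via
d_ij = sqrt(2 (1 - r_ij)), the Euclidean distance between unit-norm
standardized variables:

    import numpy as np

    def corr_to_dissimilarity(R):
        """Convert a correlation matrix R to dissimilarities;
        high correlation -> small distance (assumed conversion)."""
        return np.sqrt(np.clip(2.0 * (1.0 - R), 0.0, None))

    # e.g. feed the result into the mds() sketch above:
    # Y = mds(corr_to_dissimilarity(R), d=2)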

9
Locally Linear Embedding
  • Instead of distances, look at the neighborhood of each point
  • Preserve the reconstruction of each point from its neighbors
    in the low dimensional space
  • Find the K nearest neighbors for each point
  • Describe the neighborhood as the best weights on the neighbors
    to reconstruct the point
  • Find the best low dimensional vectors that still have the
    same weights

Why?
10
Locally Linear Embedding
  • Finding the Ws (a convex combination of weights on the
    neighbors), via the steps below (a code sketch follows):

1) Take the derivative and set it to 0
2) Solve the linear system
3) Find λ
4) Find w
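
A minimal NumPy sketch of the weight step (names are illustrative;
the sum-to-one constraint is enforced by normalizing, which absorbs
the Lagrange multiplier λ):

    import numpy as np

    def lle_weights(X, K=5, reg=1e-3):
        """LLE step 1: reconstruction weights for each point.

        Solves min ||x_i - sum_j w_ij x_j||^2 s.t. sum_j w_ij = 1
        over the K nearest neighbors of each point.
        """
        N = X.shape[0]
        W = np.zeros((N, N))
        for i in range(N):
            dist = np.linalg.norm(X - X[i], axis=1)
            nbrs = np.argsort(dist)[1:K + 1]     # K nearest, excluding self
            Z = X[nbrs] - X[i]                   # shift neighborhood to origin
            C = Z @ Z.T                          # local Gram matrix
            C += reg * np.trace(C) * np.eye(K)   # regularize if K > input dim
            w = np.linalg.solve(C, np.ones(K))   # from setting the derivative to 0
            W[i, nbrs] = w / w.sum()             # enforce sum-to-one constraint
        return W
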
11
Locally Linear Embedding
  • Finding the Ys (new low-D points that agree with the Ws)
  • Solve for Y as the bottom d+1 eigenvectors of
    M = (I - W)ᵀ(I - W), discarding the constant eigenvector
  • Plot the Y values (see the sketch below)
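
Continuing the sketch above, the embedding step: the smallest
eigenvector of M is the constant vector implied by the sum-to-one
weights, so it is dropped.

    import numpy as np

    def lle_embed(W, d=2):
        """LLE step 2: bottom d+1 eigenvectors of M = (I-W)^T (I-W),
        discarding the very bottom (constant) eigenvector."""
        N = W.shape[0]
        I_W = np.eye(N) - W
        M = I_W.T @ I_W
        evals, evecs = np.linalg.eigh(M)   # eigenvalues in ascending order
        return evecs[:, 1:d + 1]           # skip the constant eigenvector

    # Y = lle_embed(lle_weights(X, K=10), d=2); plot the Y values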

12
LLE Examples
  • The original X data are raw images
  • Dots are the reconstructed two-dimensional Y points

13
LLEs
  • Top: PCA
  • Bottom: LLE

14
Generative Topographic Map
  • A principled alternative to the Kohonen map
  • Forms a generative model of the manifold; we can sample from
    it, etc.
  • Find a nonlinear mapping y() from a 2D grid of Gaussians
  • Pick the parameters W of the mapping such that the mapped
    Gaussians in data space maximize the likelihood of the
    observed data
  • Have two spaces: the data space t (in the old notation, the Xs)
    and the hidden latent space x (in the old notation, the Ys)
  • The mapping goes from the latent space to the observed space

15
GTM as a Grid of Gaussians
  • We choose our priors and conditionals for all variables of
    interest
  • Assume Gaussian noise on the y() mapping
  • Assume our prior latent variables are a grid model, equally
    spaced in latent space
  • Can now write out the full likelihood (a standard form is
    given below)
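
In the standard GTM formulation (Bishop, Svensén and Williams),
which these slides follow, the conditional is a spherical Gaussian
around the mapped latent point and the prior is a uniform grid of
delta functions; the exact constants below are from that
formulation, not read off the slide:

    \[
    p(\mathbf{t}\mid \mathbf{x},\mathbf{W},\beta)
      = \Big(\tfrac{\beta}{2\pi}\Big)^{D/2}
        \exp\Big(-\tfrac{\beta}{2}\,
        \big\|\mathbf{t}-\mathbf{y}(\mathbf{x};\mathbf{W})\big\|^2\Big),
    \qquad
    p(\mathbf{x}) = \frac{1}{K}\sum_{k=1}^{K}
        \delta(\mathbf{x}-\mathbf{x}_k)
    \]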

16
GTM Distribution Model
  • Integrating over the delta functions turns the integral into a
    summation
  • Note the log-sum; we need to apply EM to maximize
  • Also, use the following parametric (linear in the basis) form
    of the mapping (see the equations below)
  • Examples of manifolds for randomly chosen W mappings
  • Typically, we are given the data and want to find the maximum
    likelihood mapping W for it
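
In the standard formulation, integrating out the grid prior gives a
mixture likelihood whose log contains a sum (hence EM), and the
mapping is linear in a fixed set of basis functions φ (radial basis
functions are the usual choice; the specific basis is an assumption
here):

    \[
    \mathcal{L}(\mathbf{W},\beta)
      = \sum_{n=1}^{N}\ln\Big[\frac{1}{K}\sum_{k=1}^{K}
        p(\mathbf{t}_n\mid \mathbf{x}_k,\mathbf{W},\beta)\Big],
    \qquad
    \mathbf{y}(\mathbf{x};\mathbf{W})
      = \mathbf{W}\,\boldsymbol{\phi}(\mathbf{x})
    \]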

17
GTM Examples
  • Recover the non-linear manifold by warping the grid with the
    W parameters
  • Synthetic example: left, initialized; right, converged
  • Real example: the oil data (3 classes); left, GTM; right, PCA