Diffusion Maps and Spectral Clustering - PowerPoint PPT Presentation

About This Presentation
Title: Diffusion Maps and Spectral Clustering
Created Date: 6/1/2006 6:13:16 PM
Document presentation format: On-screen Show
Company: Signal Innovations

Slides: 15
Provided by: dukeEdu7

Transcript and Presenter's Notes


1
Diffusion Maps and Spectral Clustering
1/14
Machine Learning Seminar Series
  • Author: Ronald R. Coifman et al. (Yale University)
  • Presenter: Nilanjan Dasgupta (SIG Inc.)

2
Motivation
2/14
[Figure: data points ("Datum") lying on a low-dimensional manifold]
  • Data lie on a low-dimensional manifold. The shape of the manifold is not
    known a priori.
  • PCA would fail to give a compact representation since the manifold is
    not linear.
  • Spectral clustering as a non-linear dimensionality reduction scheme.

3
Outline
3/14
  • Non-linear dimensionality reduction and spectral clustering.
  • Diffusion-based probabilistic interpretation of spectral methods.
  • Eigenvectors of the normalized graph Laplacian are a discrete
    approximation of the eigenfunctions of the continuous Fokker-Planck
    operator.
  • Justification of the success of spectral clustering.
  • Conclusions.

4
Spectral clustering
4/14
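  • Normalized graph Laplacian
  • Given N data points $\{x_i\}_{i=1}^{N}$ where each $x_i \in \mathbb{R}^p$,
    the distance (similarity) between any two points $x_i$ and $x_j$ is
    given by
      $L_{i,j} = k(x_i, x_j) = \exp\left(-\|x_i - x_j\|^2 / 2\varepsilon\right)$
    with Gaussian kernel of width $\varepsilon$, and a diagonal
    normalization matrix $D_{i,i} = \sum_j L_{i,j}$.
  • Solve the normalized eigenvalue problem $L\psi = \lambda D\psi$, or
    equivalently $M\psi = \lambda\psi$ with $M = D^{-1}L$.
  • Use the first few eigenvectors of M as a low-dimensional representation
    of the data, or as good coordinates for clustering.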
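The construction on this slide can be sketched in NumPy. This is only an illustration under stated assumptions: the kernel is taken to be $L_{i,j} = \exp(-\|x_i - x_j\|^2 / 2\varepsilon)$ with $M = D^{-1}L$, and the function name `random_walk_matrix`, the toy two-cluster data, and the choice `eps=0.5` are hypothetical and not from the slides.

```python
import numpy as np

def random_walk_matrix(X, eps):
    """Return the Gaussian similarity L, the degrees d (diagonal of D), and M = D^{-1} L."""
    # Pairwise squared Euclidean distances, shape (N, N).
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    L = np.exp(-sq_dists / (2.0 * eps))   # Gaussian kernel of width eps
    d = L.sum(axis=1)                     # D_ii = sum_j L_ij
    M = L / d[:, None]                    # row-normalize so each row sums to 1
    return L, d, M

# Hypothetical toy data: two well-separated 2-D clusters of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(20, 2)),
               rng.normal(2.0, 0.3, size=(20, 2))])
L, d, M = random_walk_matrix(X, eps=0.5)

# Eigen-decomposition of M, sorted by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eig(M)
order = np.argsort(-eigvals.real)
eigvals = eigvals.real[order]
eigvecs = eigvecs.real[:, order]

# The sign of the second eigenvector serves as a crude two-way clustering.
labels = (eigvecs[:, 1] > 0).astype(int)
```

Thresholding the second eigenvector at zero is the simplest possible use of the spectral coordinates; running k-means on the first few eigenvectors is a common alternative.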

5
Spectral Clustering previous work
5/14
  • Non-linear dimensionality reduction by S. Roweis and L. Saul
    (published in Science, 2000).
  • Belkin & Niyogi (NIPS'02) show that if data are sampled uniformly from
    the low-dimensional manifold, the first few eigenvectors of
    $M = D^{-1}L$ are discrete approximations of the eigenfunctions of the
    Laplace-Beltrami operator on the manifold.
  • Meila & Shi (AISTATS'01) interpret M as a stochastic matrix
    representing a random walk on the graph.

6
Diffusion distance and Diffusion map
6/14
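  • A symmetric matrix $M_s$ can be derived from M as
      $M_s = D^{1/2} M D^{-1/2}$.
  • M and $M_s$ have the same N eigenvalues $\{\lambda_j\}_{j=0}^{N-1}$.
    If $v_j$ are the eigenvectors of $M_s$, then
      $\phi_j = D^{1/2} v_j$,  $\psi_j = D^{-1/2} v_j$.
  • Under the random walk representation of the graph,
      $M_{i,j} = \Pr\{x(t+\varepsilon) = x_j \mid x(t) = x_i\}$.

$\phi$: left eigenvector of M; $\psi$: right eigenvector of M;
$\varepsilon$: time step.
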
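A short numerical check of the symmetric form may help here. The sketch below assumes $M = D^{-1}L$ with a Gaussian kernel and $M_s = D^{1/2} M D^{-1/2}$; the random data, `eps = 1.0`, and all variable names are illustrative assumptions.

```python
import numpy as np

# Hypothetical data: 30 random points in the plane.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 2))
eps = 1.0

sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
L = np.exp(-sq / (2.0 * eps))
d = L.sum(axis=1)
M = L / d[:, None]                      # M = D^{-1} L

# M_s = D^{1/2} M D^{-1/2}, which equals D^{-1/2} L D^{-1/2} and is symmetric.
sqrt_d = np.sqrt(d)
Ms = sqrt_d[:, None] * M / sqrt_d[None, :]

# Symmetric eigen-solver; reorder so eigenvalues are descending.
lam, V = np.linalg.eigh(Ms)
lam, V = lam[::-1], V[:, ::-1]

# Columns are eigenvectors: psi_j = D^{-1/2} v_j (right), phi_j = D^{1/2} v_j (left).
psi = V / sqrt_d[:, None]
phi = V * sqrt_d[:, None]
```

Working with `Ms` lets one use `np.linalg.eigh`, which returns real eigenvalues and orthonormal eigenvectors; the left and right eigenvectors of M then follow by rescaling with the degrees.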
7
Diffusion distance and Diffusion map
7/14
  • $\varepsilon$ has a dual interpretation (time step and kernel width).
  • If one starts a random walk from location $x_i$, the probability of
    landing at location y after r time steps is given by
      $p(t = r\varepsilon, y \mid x_i) = e_i M^r$,
    where $e_i$ is a row vector with all zeros except a 1 in the ith
    position.
  • For large $\varepsilon$, all points in the graph are connected
    ($M_{i,j} > 0$) and the eigenvalues of M satisfy
      $\lambda_0 = 1 > \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_{N-1} \ge 0$.

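The r-step probability $e_i M^r$ is straightforward to compute directly. The following is a minimal sketch, assuming points on a 1-D grid and a Gaussian kernel; the helper name `walk_distribution` and the values `eps = 0.05`, `r = 3` are hypothetical.

```python
import numpy as np

# Hypothetical data: 25 equally spaced points on [0, 1].
X = np.linspace(0.0, 1.0, 25)[:, None]
eps = 0.05
L = np.exp(-((X - X.T) ** 2) / (2.0 * eps))
M = L / L.sum(axis=1, keepdims=True)    # row-stochastic random walk matrix

def walk_distribution(M, i, r):
    """Distribution over nodes after r steps of the walk started at node i."""
    e_i = np.zeros(M.shape[0])
    e_i[i] = 1.0                        # row vector with a single 1 at position i
    return e_i @ np.linalg.matrix_power(M, r)

p = walk_distribution(M, i=0, r=3)

# Eigenvalues of M in descending order.
lam = np.sort(np.linalg.eigvals(M).real)[::-1]
```

Here `p` is simply row 0 of $M^3$, and the sorted eigenvalues start at 1 and stay non-negative up to round-off.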
8
Diffusion distance and Diffusion map
8/14
  • One can show that, regardless of the starting point $x_i$,
      $\lim_{t \to \infty} p(t, y \mid x_i) = \phi_0(y)$,
    where $\phi_0$ is the left eigenvector of M with eigenvalue
    $\lambda_0 = 1$, with
      $\phi_0(x_i) = D_{i,i} / \sum_j D_{j,j}$.
  • Eigenvector $\phi_0(x)$ has a dual interpretation:
    1. It is the stationary probability distribution on the curve, i.e.,
       the probability of landing at location x after taking infinitely
       many steps of the random walk (independent of the start location).
    2. It is the density estimate at location x.
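The stationary distribution can be checked numerically. This sketch assumes $\phi_0(x_i) = D_{i,i} / \sum_j D_{j,j}$ for $M = D^{-1}L$; the grid data, `eps = 0.05`, and the step count 500 are illustrative choices, not values from the slides.

```python
import numpy as np

# Hypothetical data: 25 equally spaced points on [0, 1].
X = np.linspace(0.0, 1.0, 25)[:, None]
eps = 0.05
L = np.exp(-((X - X.T) ** 2) / (2.0 * eps))
d = L.sum(axis=1)
M = L / d[:, None]

# Candidate stationary distribution: normalized degrees.
phi0 = d / d.sum()

# Distribution after many steps from two different starting points.
M_many = np.linalg.matrix_power(M, 500)
p_from_first = M_many[0]
p_from_last = M_many[-1]
```

Both rows of `M_many` agree with `phi0`, which illustrates the independence from the start location. Since `d` is a kernel sum around each point, `phi0` also behaves like a kernel density estimate up to normalization.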

9
Diffusion distance
9/14
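  • For any finite time $t = r\varepsilon$,
      $p(t, y \mid x) = \phi_0(y) + \sum_{k \ge 1} \psi_k(x)\, \lambda_k^r\, \phi_k(y)$.
  • $\psi_k$ and $\phi_k$ are the right and left eigenvectors of the graph
    Laplacian M.
  • $\lambda_k^r$ is the kth eigenvalue of $M^r$ (arranged in descending
    order).
  • Given the definition of the random walk, we define the diffusion
    distance as a distance measure at time t between two pmfs:
      $D_t^2(x_0, x_1) = \|p(t, y \mid x_0) - p(t, y \mid x_1)\|_w^2 = \sum_y \left(p(t, y \mid x_0) - p(t, y \mid x_1)\right)^2 w(y)$
    with empirical choice $w(y) = 1/\phi_0(y)$.
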
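The diffusion distance can be evaluated straight from its definition as a weighted squared difference of two rows of $M^t$. The sketch below assumes $w(y) = 1/\phi_0(y)$ with $\phi_0 = d / \sum d$ and counts time in integer steps; the function name `diffusion_distance_sq` and the grid data are hypothetical.

```python
import numpy as np

# Hypothetical data: 25 equally spaced points on [0, 1].
X = np.linspace(0.0, 1.0, 25)[:, None]
eps = 0.05
L = np.exp(-((X - X.T) ** 2) / (2.0 * eps))
d = L.sum(axis=1)
M = L / d[:, None]

def diffusion_distance_sq(M, d, i, j, t):
    """Squared diffusion distance between nodes i and j after t steps."""
    Mt = np.linalg.matrix_power(M, t)   # row i holds p(t, . | x_i)
    phi0 = d / d.sum()                  # stationary distribution
    diff = Mt[i] - Mt[j]
    return float(np.sum(diff ** 2 / phi0))   # weight w(y) = 1 / phi0(y)

near = diffusion_distance_sq(M, d, 0, 1, t=1)    # neighboring points
far = diffusion_distance_sq(M, d, 0, 24, t=1)    # opposite ends of the interval
far_later = diffusion_distance_sq(M, d, 0, 24, t=5)
```

Neighboring points are close in diffusion distance, distant points are far, and all distances shrink as t grows because every walk approaches the same stationary distribution.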
10
Diffusion Map
10/14
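  • Diffusion distance:
      $D_t^2(x_0, x_1) = \sum_y \left(p(t, y \mid x_0) - p(t, y \mid x_1)\right)^2 w(y)$
  • Diffusion map: mapping between the original space and the first k
    eigenvectors as
      $\Psi_t(x) = \left(\lambda_1^t \psi_1(x), \lambda_2^t \psi_2(x), \dots, \lambda_k^t \psi_k(x)\right)$

Relationship:
      $D_t^2(x_0, x_1) = \sum_{j \ge 1} \lambda_j^{2t} \left(\psi_j(x_0) - \psi_j(x_1)\right)^2 = \|\Psi_t(x_0) - \Psi_t(x_1)\|^2$
    (with all $N-1$ non-trivial eigenvectors retained).
  • This relationship justifies using Euclidean distance in diffusion map
    space for spectral clustering.
  • Since $1 > \lambda_1 \ge \lambda_2 \ge \dots \ge 0$, it is justified to
    stop at an appropriate k with a negligible error of order
    $O\left((\lambda_{k+1}/\lambda_k)^t\right)$.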
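The relationship between diffusion distance and Euclidean distance in the mapped space can be verified on a small example. This is a sketch under assumptions: the right eigenvectors are scaled so that $\psi_0$ is constant with magnitude 1 (which matches the weight $w = 1/\phi_0$), time is counted in integer steps, and `diffusion_map`, the grid data, `eps = 0.05` and `t = 8` are hypothetical choices.

```python
import numpy as np

def diffusion_map(X, eps, t, k):
    """Return the k-dimensional diffusion map at time t, plus M and the degrees d."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    L = np.exp(-sq / (2.0 * eps))
    d = L.sum(axis=1)
    Ms = L / np.sqrt(np.outer(d, d))            # symmetric form D^{-1/2} L D^{-1/2}
    lam, V = np.linalg.eigh(Ms)
    lam, V = lam[::-1], V[:, ::-1]              # descending eigenvalues
    # Right eigenvectors of M, scaled so that psi_0 is constant (+1 or -1).
    psi = V * np.sqrt(d.sum() / d)[:, None]
    # Skip the trivial pair (lambda_0, psi_0); weight each coordinate by lambda_j^t.
    Psi = (lam[1:k + 1] ** t) * psi[:, 1:k + 1]
    return Psi, L / d[:, None], d

X = np.linspace(0.0, 1.0, 25)[:, None]          # hypothetical 1-D data
t = 8
Psi_full, M, d = diffusion_map(X, eps=0.05, t=t, k=24)   # all N-1 coordinates
Psi_3, _, _ = diffusion_map(X, eps=0.05, t=t, k=3)       # truncated map

# Diffusion distance between the two end points, from the definition.
Mt = np.linalg.matrix_power(M, t)
direct = np.sum((Mt[0] - Mt[24]) ** 2 * (d.sum() / d))

# Squared Euclidean distance in diffusion-map space.
full = np.sum((Psi_full[0] - Psi_full[24]) ** 2)
trunc = np.sum((Psi_3[0] - Psi_3[24]) ** 2)
```

With all 24 coordinates the two quantities agree to round-off, and keeping only three coordinates changes the result very little at this t, which is the practical argument for truncating the map.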

11
Asymptotics of Diffusion Map
11/14
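  • Suppose $\{x_i\}$ are sampled i.i.d. from a probability density p(x)
    defined over a manifold $\Omega$.
  • Suppose $p(x) = e^{-U(x)}$, where U(x) is the potential (energy) at
    location x.
  • As $N \to \infty$, the random walk on a discrete graph converges to a
    random walk on the continuous manifold $\Omega$.
  • The forward and backward operators are given by
      $T_f[\phi](x) = \int_\Omega M(x \mid y)\, \phi(y)\, p(y)\, dy$,
      $T_b[\psi](x) = \int_\Omega M(y \mid x)\, \psi(y)\, p(y)\, dy$,
    where $M(y \mid x)$ is the transition probability from x to y in one
    time step $\varepsilon$.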

12
Asymptotics of Diffusion Map
12/14
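  • $T_f[\phi]$ is the probability distribution after one time step
    $\varepsilon$, where $\phi(x)$ is the probability distribution on the
    graph at $t = 0$.
  • $T_b[\psi](x)$ is the mean of the function $\psi$ after one time step
    $\varepsilon$, for a random walk that started at location x at time
    $t = 0$.
  • Consider the limit $N \to \infty$, $\varepsilon \to 0$, i.e., when each
    data point has infinitely many nearby neighbors. In that limit, the
    random walk converges to a diffusion process with probability density
    evolving continuously in time as
      $\frac{\partial p(x, t)}{\partial t} = \lim_{\varepsilon \to 0} \frac{p(x, t+\varepsilon) - p(x, t)}{\varepsilon} = \lim_{\varepsilon \to 0} \frac{T_f - I}{\varepsilon}\, p(x, t)$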

13
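Fokker-Planck operator
13/14
  • Infinitesimal generators (propagators):
      $H_f = \lim_{\varepsilon \to 0} \frac{T_f - I}{\varepsilon}$,  $H_b = \lim_{\varepsilon \to 0} \frac{T_b - I}{\varepsilon}$
  • The eigenfunctions of $T_f$ and $T_b$ converge to those of $H_f$ and
    $H_b$, respectively.
  • The backward generator is given by the Fokker-Planck operator
      $H_b \psi = \Delta\psi - 2\nabla\psi \cdot \nabla U$,
    which corresponds to a diffusion process in a potential field 2U(x).
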
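A one-dimensional worked case may make the backward generator more concrete. The example below is an illustration, not taken from the slides: it assumes the generator has the form $H_b\psi = \Delta\psi - 2\nabla\psi \cdot \nabla U$ and that the data density is a standard Gaussian, so that $U(x) = x^2/2$ up to an additive constant.

```latex
% Assumption: 1-D standard Gaussian density, U(x) = x^2/2 + const, so U'(x) = x.
\begin{align*}
H_b \psi &= \psi'' - 2\,U'(x)\,\psi' = \psi'' - 2x\,\psi' .
\intertext{The (physicists') Hermite polynomials satisfy $H_n'' - 2x H_n' = -2n H_n$, so they are eigenfunctions:}
\psi_0(x) &= 1,        & H_b \psi_0 &= 0, \\
\psi_1(x) &= 2x,       & H_b \psi_1 &= 0 - 2x \cdot 2 = -2\,\psi_1, \\
\psi_2(x) &= 4x^2 - 2, & H_b \psi_2 &= 8 - 2x \cdot 8x = -4\,\psi_2 .
\end{align*}
```

Under these assumptions the first non-trivial eigenfunction is monotone in x, so it orders the data along the line, which is the continuous analogue of using the leading eigenvectors of M as coordinates.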
14
Spectral clustering and Fokker-Planck operator
14/14
  • The term $-2\nabla\psi \cdot \nabla U$ is interpreted as the drift term
    towards low potential (higher data density).
  • The left and right eigenvectors of M can be viewed as discrete
    approximations of the eigenfunctions of $T_f$ and $T_b$, respectively.
  • $T_f$ and $T_b$ can be viewed as approximations to $H_f$ and $H_b$,
    which in the asymptotic case ($\varepsilon \to 0$) can be viewed as a
    diffusion process with potential 2U(x), where $p(x) = \exp(-U(x))$.