Title: Using Manifold Structure for Partially Labeled Classification
Using Manifold Structure for Partially Labeled Classification
by Belkin and Niyogi, NIPS 2002
Presented by Chunping Wang
Machine Learning Group, Duke University
November 16, 2007
Outline
- Motivations
- Algorithm Description
- Theoretical Interpretation
- Experimental Results
- Comments
Motivations (1)
- Why is manifold structure useful?
  - Data lies on a lower-dimensional manifold, so dimension reduction is preferable.
  - An example: a handwritten digit 0.
    - Usually, the dimensionality is the number of pixels, which is typically very high (256).
    - Ideally, 5-dimensional features would suffice.
    - Actually, the intrinsic dimensionality is higher, but perhaps no more than several dozen.
[Figure: a handwritten digit 0 annotated with measurements d1, d2 and features f1, f2]
Motivations (2)
- Why is manifold structure useful?
  - Data representation in the original space can be unsatisfactory: labeled and unlabeled points are hard to separate there, while a 2-d representation with Laplacian Eigenmaps reveals the class structure (see the sketch below).
[Figure: labeled and unlabeled points, shown in the original space and in the 2-d Laplacian Eigenmaps representation]
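As an illustration of what such a 2-d Laplacian Eigenmaps representation involves, here is a minimal Python sketch; the synthetic two-circles data, the parameter choices, and all names are my own assumptions rather than the paper's, and this is the simple unnormalized variant of the embedding.

```python
# Minimal (unnormalized) Laplacian Eigenmaps sketch on synthetic data.
import numpy as np
from scipy.sparse.csgraph import laplacian
from scipy.linalg import eigh

rng = np.random.default_rng(0)

# Hypothetical toy data: two noisy concentric circles.
t = rng.uniform(0, 2 * np.pi, 200)
X = np.vstack([np.c_[np.cos(t[:100]), np.sin(t[:100])],
               3 * np.c_[np.cos(t[100:]), np.sin(t[100:])]])
X += 0.05 * rng.standard_normal(X.shape)

# Symmetric n-nearest-neighbor adjacency matrix with 0/1 weights.
n = 5
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
np.fill_diagonal(d2, np.inf)
idx = np.argsort(d2, axis=1)[:, :n]
W = np.zeros_like(d2)
W[np.repeat(np.arange(len(X)), n), idx.ravel()] = 1.0
W = np.maximum(W, W.T)              # i ~ j if either is a neighbor of the other

# Graph Laplacian L = D - W; eigenvectors 2 and 3 give the 2-d embedding
# (the very first one, with eigenvalue 0, is the uninformative constant vector).
L = laplacian(W)
vals, vecs = eigh(L)
embedding = vecs[:, 1:3]
print(embedding.shape)              # (200, 2)
```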
Algorithm Description (1)
Semi-supervised classification: $k$ points $x_1, \dots, x_k$, of which the first $s$ are labeled ($s < k$), with labels $c_i \in \{-1, 1\}$ for binary cases.
- Constructing the adjacency graph
  - $W_{ij} = 1$ if $i$ is among the $n$ nearest neighbors of $j$ or $j$ is among the $n$ nearest neighbors of $i$; $W_{ij} = 0$ otherwise.
- Eigenfunctions
  - For the graph Laplacian $L = D - W$, where $D$ is diagonal with $D_{ii} = \sum_j W_{ij}$, compute the eigenvectors $e_1, \dots, e_p$ corresponding to the $p$ smallest eigenvalues (a sketch of both steps follows below).
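A minimal sketch of these two steps in Python, assuming the 0/1 nearest-neighbor weights stated above; the function name and default parameter values are mine:

```python
# Sketch of the graph-building and eigen-decomposition steps.
import numpy as np
from scipy.linalg import eigh

def smoothest_eigenvectors(X, n=8, p=20):
    """Eigenvectors e_1..e_p of L = D - W for the symmetric n-NN graph on rows of X."""
    k = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    # W_ij = 1 if i is among the n nearest neighbors of j, or vice versa.
    W = np.zeros((k, k))
    idx = np.argsort(d2, axis=1)[:, :n]
    W[np.repeat(np.arange(k), n), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)
    L = np.diag(W.sum(1)) - W      # graph Laplacian L = D - W
    vals, vecs = eigh(L)           # eigenvalues returned in ascending order
    return vecs[:, :p]             # columns are the p smoothest eigenvectors
```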
Algorithm Description (2)
Semi-supervised classification: $k$ points, first $s$ labeled ($s < k$), $c_i \in \{-1, 1\}$ for binary cases.
- Building the classifier
  - Minimize the error function $Err(a) = \sum_{i=1}^{s} \big( c_i - \sum_{j=1}^{p} a_j e_{ji} \big)^2$ over the space of coefficients $a = (a_1, \dots, a_p)^T$.
  - The solution is the least-squares estimate $\tilde{a} = (E^T E)^{-1} E^T c$, where $E$ is the $s \times p$ matrix of eigenvector values at the labeled points and $c = (c_1, \dots, c_s)^T$.
- Classifying unlabeled points ($i > s$)
  - $c_i = 1$ if $\sum_{j=1}^{p} \tilde{a}_j e_{ji} \ge 0$, and $c_i = -1$ otherwise (a sketch follows below).
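In code this is one call to an ordinary least-squares solver plus a sign rule; a minimal sketch, reusing the hypothetical eigenvector matrix from the previous snippet:

```python
# Sketch of the fit-and-classify steps: least squares on labeled rows, sign rule on all.
import numpy as np

def fit_and_classify(E, c_labeled):
    """E: k-by-p eigenvector matrix; c_labeled: labels in {-1, +1} for the first s points."""
    s = len(c_labeled)
    # a~ = argmin_a sum_{i<=s} (c_i - sum_j a_j e_ji)^2, computed stably via lstsq
    # rather than forming (E^T E)^{-1} E^T c explicitly.
    a, *_ = np.linalg.lstsq(E[:s], np.asarray(c_labeled, dtype=float), rcond=None)
    scores = E @ a                          # sum_j a_j e_ji for every point i
    return np.where(scores >= 0, 1, -1)     # predicted labels; rows i > s are the unlabeled ones
```

On the labeled rows this simply reproduces the fit; the useful output is the sign on the rows with $i > s$.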
Theoretical Interpretation (1)
For a manifold $M$, the eigenfunctions of its Laplacian $\Delta$ form a basis for the Hilbert space $L^2(M)$, i.e., any function $f \in L^2(M)$ can be written as $f = \sum_i a_i e_i$, with eigenfunctions satisfying $\Delta e_i = \lambda_i e_i$.
The simplest nontrivial example: the manifold is a unit circle $S^1$, where $\Delta f = -\frac{d^2 f}{d\phi^2}$, the eigenfunctions are $\sin(n\phi)$ and $\cos(n\phi)$, and the expansion $f(\phi) = \sum_n a_n \sin(n\phi) + b_n \cos(n\phi)$ is exactly the Fourier series.
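This picture can be checked numerically: the Laplacian of a ring graph, a discretized $S^1$, has sampled sines and cosines as eigenvectors. A small sanity check; the graph size and all names are my own choices:

```python
# The Laplacian eigenvectors of a ring graph (a discrete S^1) are discrete Fourier modes.
import numpy as np
from scipy.linalg import eigh

k = 64
i = np.arange(k)
W = np.zeros((k, k))
W[i, (i + 1) % k] = W[(i + 1) % k, i] = 1.0   # each point linked to its two ring neighbors
L = np.diag(W.sum(1)) - W
vals, vecs = eigh(L)

# Eigenvalues are 2 - 2*cos(2*pi*m/k): a zero for the constant function, then
# degenerate (sin, cos) pairs of increasing frequency, mirroring the Fourier basis.
print(np.round(vals[:5], 4))

# cos(phi) lies exactly in the eigenspace spanned by eigenvectors 1 and 2:
phi = 2 * np.pi * i / k
c = np.cos(phi) / np.linalg.norm(np.cos(phi))
proj = vecs[:, 1:3] @ (vecs[:, 1:3].T @ c)
print(np.allclose(proj, c))                    # True
```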
Theoretical Interpretation (2)
Smoothness measure: $S(f) = \int_M |\nabla f|^2$; a small $S(f)$ means $f$ is smooth.
For the unit circle $S^1$: $S(\sin(n\phi)) = \int_0^{2\pi} n^2 \cos^2(n\phi)\, d\phi = \pi n^2$.
Generally: $S(e_i) = \int_M |\nabla e_i|^2 = \int_M e_i \, \Delta e_i = \lambda_i$ for normalized eigenfunctions.
Smaller eigenvalues therefore correspond to smoother eigenfunctions (lower frequency); the first eigenfunction, with eigenvalue 0, is a constant function.
In terms of the smoothest $p$ eigenfunctions, the approximation of an arbitrary function $f$ is $f_p = \sum_{i=1}^{p} a_i e_i$ (a discrete check follows below).
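On a graph, the discrete analogue of $S(f)$ is the quadratic form $f^T L f = \frac{1}{2}\sum_{ij} W_{ij}(f_i - f_j)^2$, which for a unit-norm eigenvector equals its eigenvalue. A short check, reusing the ring graph above (my construction):

```python
# Discrete smoothness: f^T L f equals lambda_m for a unit-norm eigenvector,
# so small eigenvalues correspond to smooth (slowly varying) functions.
import numpy as np
from scipy.linalg import eigh

k = 64
i = np.arange(k)
W = np.zeros((k, k))
W[i, (i + 1) % k] = W[(i + 1) % k, i] = 1.0   # the ring graph again
L = np.diag(W.sum(1)) - W
vals, vecs = eigh(L)

for m in (0, 1, 5, 20):
    e = vecs[:, m]
    print(m, round(vals[m], 4), round(e @ L @ e, 4))   # eigenvalue == smoothness
# m = 0 gives 0.0: the constant vector is the smoothest function of all.
```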
Theoretical Interpretation (3)
Back to our problem with a finite number of points: the graph Laplacian plays the role of the manifold Laplacian, and the classifier solves a discrete version of the same problem, approximating the label function by the $p$ smoothest eigenvectors of $L$.
For binary classification, the alphabet of the function $f$ contains only two possible values. For M-ary cases, the only difference is that the number of possible values is more than two (one possible implementation is sketched below).
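The slide does not spell out the M-ary construction; one plausible reading, which I am assuming here rather than quoting from the paper, is to run the same least-squares fit once per class against $\pm 1$ indicator targets and pick the class with the largest score:

```python
# Hypothetical M-ary extension: one least-squares fit per class, argmax to decide.
import numpy as np

def fit_and_classify_multiclass(E, y_labeled, num_classes):
    """E: k-by-p eigenvector matrix; y_labeled: integer labels 0..M-1 for the first s points."""
    s = len(y_labeled)
    C = -np.ones((s, num_classes))
    C[np.arange(s), y_labeled] = 1.0              # +/-1 indicator target per class column
    A, *_ = np.linalg.lstsq(E[:s], C, rcond=None) # p-by-M coefficient matrix, all classes at once
    return np.argmax(E @ A, axis=1)               # predicted class for every point
```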
Results (1)
Handwritten Digit Recognition (MNIST data set)
60,000 28-by-28 gray images (the first 100
principal components are used)
[Figure: classification results with p = 20% k]
Results (2)
Text Classification (20 Newsgroups data set)
19,935 vectors with dimensionality of 6000
[Figure: classification results with p = 20% k]
Comments
- This semi-supervised algorithm essentially converts the original problem into a linear regression problem in a new space with lower dimensionality.
- The approach used to solve this linear regression problem is standard least-squares estimation.
- Only the n nearest neighbors are considered for each data point, so the graph Laplacian is sparse and the computation of the eigen-decomposition is reduced.
- Little additional computation is required after the dimensionality reduction (see the end-to-end sketch below).
- More comments
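To make the comments concrete, here is a toy end-to-end run; the synthetic two-blob data, the parameter values, and every name are my own illustrative choices, and the sparse eigensolver is what exploits the n-NN sparsity mentioned above:

```python
# Toy end-to-end run: 400 points in two Gaussian blobs, only 20 labels revealed.
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import laplacian
from scipy.sparse.linalg import eigsh

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (200, 2)), rng.normal(3, 0.5, (200, 2))])
y = np.r_[np.full(200, -1), np.full(200, 1)]
perm = rng.permutation(400)            # shuffle so the labeled points hit both classes
X, y = X[perm], y[perm]
s, p, n = 20, 10, 8                    # labeled points, eigenvectors, nearest neighbors

# Sparse symmetric n-NN graph; sparsity is what keeps the eigen-decomposition cheap.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
np.fill_diagonal(d2, np.inf)
idx = np.argsort(d2, axis=1)[:, :n]
W = csr_matrix((np.ones(400 * n), (np.repeat(np.arange(400), n), idx.ravel())),
               shape=(400, 400))
W = W.maximum(W.T)
L = laplacian(W)

# p smoothest eigenvectors via shift-invert (the tiny negative shift avoids the singular L).
vals, E = eigsh(L, k=p, sigma=-1e-5, which='LM')

# Least-squares fit on the s labeled rows, then the sign rule on everything else.
a, *_ = np.linalg.lstsq(E[:s], y[:s].astype(float), rcond=None)
pred = np.where(E @ a >= 0, 1, -1)
print('error rate on unlabeled points:', np.mean(pred[s:] != y[s:]))
```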