Title: A Geometric Perspective on Machine Learning
1 A Geometric Perspective on Machine Learning
2 Machine Learning: the problem
- Given information (training data), learn an unknown function f: X → Y.
- X and Y are usually considered to be Euclidean spaces.
3 Manifold Learning: geometric perspective
- The data space may not be a Euclidean space, but
a nonlinear manifold.
4 Manifold Learning: the challenges
- The manifold M is unknown! We have only samples!
- How do we know whether M is a sphere, a torus, or something else?
- How do we compute the distance on M?
- The samples (this is what we have) versus the manifold (this is unknown).
- Relevant fields: topology, geometry, functional analysis.
5 Manifold Learning: current solution
- Find a Euclidean embedding, and then perform
traditional learning algorithms in the Euclidean
space.
6 Simplicity
7 Simplicity
8 Simplicity is relative
9 Manifold-based Dimensionality Reduction
- Given high-dimensional data sampled from a low-dimensional manifold, how do we compute a faithful embedding?
- How do we find the mapping function f?
- How do we find the projective function efficiently?
10 A Good Mapping Function
- If xi and xj are close to each other, we hope f(xi) and f(xj) preserve the local structure (distance, similarity, ...)
- k-nearest neighbor graph
- Objective function
- Different algorithms have different concerns
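A minimal sketch of the two ingredients named above, under stated assumptions: a symmetric k-nearest-neighbor graph with heat-kernel weights, and one common locality objective, the weighted sum of squared differences between mapped points. The function names, the heat-kernel weighting, and the parameters `k` and `t` are illustrative choices, not taken from the slides.

```python
import numpy as np

def knn_heat_kernel_graph(X, k=2, t=1.0):
    """Symmetric k-NN weight matrix W for the rows of X (one sample per row)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    rank = d2.copy()
    np.fill_diagonal(rank, np.inf)      # a point is never its own neighbor
    W = np.zeros((n, n))
    for i in range(n):
        neighbors = np.argsort(rank[i])[:k]
        W[i, neighbors] = np.exp(-d2[i, neighbors] / t)
    # Keep an edge if either endpoint lists the other as a neighbor.
    return np.maximum(W, W.T)

def locality_objective(W, y):
    """Sum over i, j of W_ij * (y_i - y_j)^2 for a one-dimensional map y."""
    return float(np.sum(W * (y[:, None] - y[None, :]) ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
W = knn_heat_kernel_graph(X, k=2, t=1.0)
L = np.diag(W.sum(axis=1)) - W          # graph Laplacian L = D - W
y = rng.normal(size=8)
objective = locality_objective(W, y)
```

For a symmetric W the objective equals 2 * y^T L y, which is why the graph Laplacian appears in methods built on this kind of criterion.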
14 Locality Preserving Projections
Principle: if xi and xj are close, then their maps yi and yj are also close.
Mathematical formulation: minimize the integral of the squared norm of the gradient of f over the manifold.
Stokes' theorem relates this integral to the Laplace-Beltrami operator on the manifold.
LPP finds a linear approximation to the nonlinear manifold while preserving the local geometric structure.
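A hedged sketch of how a linear projection of this kind is commonly computed in the discrete setting: with a weight matrix W over the samples, degree matrix D and graph Laplacian L = D - W, the directions solving the generalized eigenproblem X^T L X a = λ X^T D X a with the smallest eigenvalues are kept. The row-per-sample layout, the small regularizer `reg`, and the dense heat-kernel weights in the usage lines are assumptions made for brevity.

```python
import numpy as np

def lpp(X, W, n_components=1, reg=1e-9):
    """Rows of X are samples; W is a symmetric nonnegative weight matrix."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    A = X.T @ L @ X                                 # locality term
    B = X.T @ D @ X + reg * np.eye(X.shape[1])      # scale constraint, regularized
    # Reduce A a = lambda B a to a standard symmetric problem via B = C C^T.
    C = np.linalg.cholesky(B)
    C_inv = np.linalg.inv(C)
    M = C_inv @ A @ C_inv.T
    eigvals, U = np.linalg.eigh((M + M.T) / 2)      # eigenvalues in ascending order
    directions = C_inv.T @ U[:, :n_components]      # back-transform: a = C^{-T} u
    return eigvals[:n_components], directions

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
W = np.exp(-d2)                 # dense heat-kernel weights, for brevity
np.fill_diagonal(W, 0.0)
eigvals, A_proj = lpp(X, W, n_components=2)
Y = X @ A_proj                  # embedded samples, one row per sample
```

Because the map is a matrix product, new points can be embedded with the same `A_proj` without recomputing the graph.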
15 Manifold of Face Images
Pose (Right >>> Left)
Expression (Sad >>> Happy)
16 Manifold of Handwritten Digits
Slant
Thickness
17 Active and Semi-Supervised Learning: A Geometric Perspective
- Learning target
- Training Examples
- Linear Regression Model
18 Generalization Error
- Goal of regression: obtain a learned function that minimizes the generalization error (the expected error for unseen test input points).
- Maximum Likelihood Estimate
20 Gauss-Markov Theorem
For a given x, the expected prediction error is
Good!
Bad!
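For reference, a standard form of this quantity, assuming the linear model y = w^T x + ε with independent zero-mean noise of variance σ² and an unbiased least-squares estimate ŵ (these symbols are assumed here, not taken from the slide):

```latex
\mathbb{E}\big[(y - \hat{w}^{\top} x)^2\big]
  = \sigma^2 + x^{\top} \operatorname{Cov}(\hat{w})\, x
```

The first term is noise that no estimator can remove. The second depends on the covariance of the parameter estimate, which is the part that the choice of measured points can influence.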
21 Experimental Design Methods
- Three most common scalar measures of the size of the parameter (w) covariance matrix:
- A-optimal Design: trace of Cov(w).
- D-optimal Design: determinant of Cov(w).
- E-optimal Design: maximum eigenvalue of Cov(w).
- Disadvantage: these methods fail to take into account unmeasured (unlabeled) data points.
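The three criteria can be sketched for ordinary least squares, assuming Cov(w) = σ² (Z^T Z)^{-1} where the rows of Z are the measured points. The name `design_criteria`, the matrix Z, and the default σ² = 1 are illustrative assumptions.

```python
import numpy as np

def design_criteria(Z, sigma2=1.0):
    """Z has one measured point per row; returns the A-, D-, E-criteria of Cov(w)."""
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    eigvals = np.linalg.eigvalsh(cov)
    return {
        "A": float(np.trace(cov)),          # trace
        "D": float(np.linalg.det(cov)),     # determinant
        "E": float(eigvals.max()),          # maximum eigenvalue
    }

Z = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
crit = design_criteria(Z)
```

All three values depend only on the measured points in Z, which illustrates the disadvantage noted above: unmeasured points play no role.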
24 Manifold Regularization: Semi-Supervised Setting
- Measured (labeled) points: discriminant structure
- Unmeasured (unlabeled) points: geometrical structure
Figure labels: random labeling; active learning; active learning and semi-supervised learning.
27 Unlabeled Data to Estimate Geometry
- Measured (labeled) points: discriminant structure
- Unmeasured (unlabeled) points: geometrical structure
Compute nearest neighbor graph G
32 Laplacian Regularized Least Squares (Belkin and Niyogi, 2006)
- Linear objective function
- Solution
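A hedged sketch of a linear Laplacian-regularized least-squares fit. It assumes the objective is the squared error on the labeled points, plus λ1 times the graph smoothness term w^T X^T L X w, plus λ2 times the squared norm of w. Setting the gradient to zero gives the closed form used below. The names `laprls_linear`, `lam1`, `lam2`, the row-per-sample layout, and the dense heat-kernel weights are assumptions, not taken from the slide.

```python
import numpy as np

def laprls_linear(X_all, W, labeled_idx, y, lam1=0.1, lam2=0.01):
    """Rows of X_all are all points; labeled_idx selects the measured ones."""
    L = np.diag(W.sum(axis=1)) - W                  # graph Laplacian
    Z = X_all[labeled_idx]                          # measured (labeled) points
    H = Z.T @ Z + lam1 * X_all.T @ L @ X_all + lam2 * np.eye(X_all.shape[1])
    return np.linalg.solve(H, Z.T @ y)              # w = H^{-1} Z^T y

rng = np.random.default_rng(0)
X_all = rng.normal(size=(20, 3))
d2 = np.sum((X_all[:, None, :] - X_all[None, :, :]) ** 2, axis=-1)
W = np.exp(-d2)                                     # dense weights, for brevity
np.fill_diagonal(W, 0.0)
labeled_idx = np.array([0, 5, 10, 15])
y = X_all[labeled_idx] @ np.array([1.0, -2.0, 0.5])
w = laprls_linear(X_all, W, labeled_idx, y, lam1=0.1, lam2=0.01)
```

With `lam1=0` the same function reduces to ridge regression on the labeled points alone. The unlabeled points enter only through the Laplacian term.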
33 Active Learning
How to find the most representative points on the
manifold?
34 Active Learning
- Objective: guide the selection of the subset of data points that gives the most information.
- Experimental design: select samples to label
- Manifold Regularized Experimental Design
- It shares the same objective function as Laplacian Regularized Least Squares: simultaneously minimize the least-squares error on the measured samples and preserve the local geometrical structure of the data space.
35 Analysis of Bias and Variance
- In order to make the estimator as stable as possible, the size of the covariance matrix should be as small as possible.
- D-optimality: minimize the determinant of the covariance matrix
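A hedged sketch of the bias and covariance of a regularized least-squares estimator of this kind. It assumes labels y = Z^T w + ε with Cov(ε) = σ² I, the labeled points as the columns of Z, all points as the columns of X, graph Laplacian L, and regularization weights λ1 and λ2. All of this notation is assumed here.

```latex
\begin{aligned}
H &= Z Z^{\top} + \Lambda, \qquad
\Lambda = \lambda_1 X L X^{\top} + \lambda_2 I, \qquad
\hat{w} = H^{-1} Z y \\
\mathbb{E}[\hat{w}] - w &= H^{-1} Z Z^{\top} w - w = -H^{-1} \Lambda\, w \\
\operatorname{Cov}(\hat{w}) &= \sigma^2 H^{-1} Z Z^{\top} H^{-1}
  = \sigma^2 \left( H^{-1} - H^{-1} \Lambda H^{-1} \right)
\end{aligned}
```

When λ1 and λ2 are small, Cov(ŵ) is approximately σ² H^{-1}, so minimizing its determinant amounts to maximizing det(H).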
36 The algorithm
Manifold Regularized Experimental Design (the points are selected from the available data points):
- Select the first data point such that the selection criterion is maximized.
- Suppose k points have been selected; choose the (k+1)th point by the same rule.
- Update the quantities that the rule depends on.
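One plausible reading of these steps, offered as a sketch rather than as the method on the slide: greedily maximize det(M) with M = Σ z z^T + λ1 X^T L X + λ2 I. Since det(M + z z^T) = det(M)(1 + z^T M^{-1} z), each step picks the candidate with the largest z^T M^{-1} z and then updates M^{-1} with the Sherman-Morrison formula. The function name, `lam1`, `lam2`, and the dense weights are assumptions.

```python
import numpy as np

def greedy_regularized_design(X, W, n_select, lam1=0.1, lam2=0.01):
    """Rows of X are candidate points; returns selected row indices and M^{-1}."""
    L = np.diag(W.sum(axis=1)) - W
    M_inv = np.linalg.inv(lam1 * X.T @ L @ X + lam2 * np.eye(X.shape[1]))
    selected = []
    for _ in range(n_select):
        # Score of each candidate x: x^T M^{-1} x.
        scores = np.einsum("ij,jk,ik->i", X, M_inv, X)
        if selected:
            scores[selected] = -np.inf      # never pick a point twice
        best = int(np.argmax(scores))
        selected.append(best)
        # Sherman-Morrison update of (M + x x^T)^{-1}; M_inv is symmetric.
        v = M_inv @ X[best]
        M_inv = M_inv - np.outer(v, v) / (1.0 + X[best] @ v)
    return selected, M_inv

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
W = np.exp(-d2)                             # dense weights, for brevity
np.fill_diagonal(W, 0.0)
selected, M_inv = greedy_regularized_design(X, W, n_select=5)
```

The rank-one update avoids re-inverting a matrix at every step, so each selection costs one pass over the candidates.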
37 Nonlinear Generalization in RKHS
- Consider the feature space F induced by some nonlinear mapping φ, with <φ(xi), φ(xj)> = K(xi, xj).
- K(·, ·): positive semi-definite kernel function
- Regression model in RKHS
- Objective function in RKHS
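As a small illustration of the kernel property stated above, the sketch below builds a Gram matrix K with entries K(xi, xj) and checks numerically that it is symmetric with no negative eigenvalues. The Gaussian kernel and the parameter `gamma` are assumed choices, since the slides do not name a specific kernel.

```python
import numpy as np

def gaussian_kernel_matrix(X, gamma=0.5):
    """Gram matrix with entries exp(-gamma * ||xi - xj||^2); rows of X are samples."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))     # clip tiny negative round-off

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 4))
K = gaussian_kernel_matrix(X)
min_eig = np.linalg.eigvalsh((K + K.T) / 2).min()   # smallest eigenvalue of K
```

Algorithms in the RKHS setting work with K alone, so the mapping φ never has to be computed explicitly.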
38 Nonlinear Generalization in RKHS
Kernel Graph Regularized Experimental Design (the points are selected from the available data points):
- Select the first data point such that the selection criterion is maximized.
- Suppose k points have been selected; choose the (k+1)th point by the same rule.
- Update the quantities that the rule depends on.
39 A Synthetic Example
Laplacian Regularized Optimal Design
A-optimal Design
41 Application to image/video compression
42 Video compression
43 Topology
Can we always map a manifold to a Euclidean space without changing its topology?
44 Topology
Homotopy
Simplicial Complex
Good Cover
Sample Points
Homology Group
Betti Numbers
Euler Characteristic
Number of components, dimension, ...
45 Topology
The Euler characteristic is a topological invariant, a number that describes one aspect of a topological space's shape or structure.
Figure: example shapes labeled with their Euler characteristics (1, 0, 1, 2, 0, 0, -2).
The Euler characteristic of Euclidean space is 1!
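For a finite simplicial complex the Euler characteristic is the alternating sum of the number of simplices in each dimension (vertices minus edges plus triangles, and so on). A minimal sketch, assuming the complex is given by its maximal simplices as tuples of vertex labels. The function name and the three example complexes are illustrative.

```python
from itertools import combinations

def euler_characteristic(maximal_simplices):
    """Alternating sum of simplex counts for the complex generated by the input."""
    simplices = set()
    for simplex in maximal_simplices:
        vertices = tuple(sorted(simplex))
        # Close under taking faces: every nonempty subset is also a simplex.
        for size in range(1, len(vertices) + 1):
            simplices.update(combinations(vertices, size))
    # A simplex with `size` vertices has dimension size - 1.
    return sum((-1) ** (len(s) - 1) for s in simplices)

filled_triangle = [(0, 1, 2)]                                   # a disk
hollow_triangle = [(0, 1), (1, 2), (0, 2)]                      # a circle
tetra_boundary = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]   # a sphere

print(euler_characteristic(filled_triangle))   # 1
print(euler_characteristic(hollow_triangle))   # 0
print(euler_characteristic(tetra_boundary))    # 2
```

The disk is contractible, like Euclidean space, and has Euler characteristic 1. The circle and the sphere differ from it, so neither can be deformed into a disk without changing topology.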
46 Challenges
- Insufficient sample points
- Choose suitable radius
- How to identify noisy holes (user interaction?)
Noisy hole
homotopy
homeomorphism
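The sensitivity to the radius can be illustrated with the simplest topological quantity, the number of connected components. The sketch below links two sample points whenever they lie within distance `radius` of each other and counts components with a union-find structure. The linking rule, the function name, and the sample points are assumptions made for illustration.

```python
def count_components(points, radius):
    """Number of connected components when points within `radius` are linked."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dist2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            if dist2 <= radius ** 2:
                parent[find(i)] = find(j)   # merge the two components
    return len({find(i) for i in range(len(points))})

points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
print(count_components(points, 0.5))   # 5: radius too small, every point is isolated
print(count_components(points, 1.0))   # 2: the two clusters are recovered
print(count_components(points, 8.5))   # 1: radius too large, the clusters merge
```

Too small a radius reports spurious components and too large a radius hides real ones, which is the same trade-off that makes noisy holes hard to tell apart from genuine ones.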