Title: Distance Metric Learning with Spectral Clustering
1Distance Metric Learning with Spectral Clustering
2Spectral Clustering
- Based on the MinCut Problem
- Cuts deal with pairwise similarity measures, and
thus can capture non-linear relationships.
3Spectral Clustering Cont.
sigma = 0.1
4Spectral Clustering Cont.
sigma = 0.1
5Spectral Clustering Cont.
6Motivation
- Finding the sigma parameter automatically and
optimally will give us better clustering of the
data.
- It is hard to formulate an RBF distance metric
such that sigma is easily isolated.
- Mahalanobis Distance?
7Defining a Distance Metric
1) The distance metric must represent
similarities between data points.
2) RBF kernels are commonly used as distance
metrics in SC.
3) The Mahalanobis distance must yield positive
values representing similarity, NOT
dissimilarity.
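The three requirements above can be illustrated with a small sketch. It assumes the common Gaussian form of the RBF kernel, exp(-d^2 / (2 sigma^2)), which the slides do not spell out, and plugs in a squared Mahalanobis distance defined by a positive semidefinite matrix. The function name `rbf_affinity`, the matrix `A`, and the toy points are hypothetical, chosen only for illustration.

```python
import numpy as np

def rbf_affinity(X, A=None, sigma=0.1):
    """Similarity matrix W[i, j] = exp(-d_A(x_i, x_j)^2 / (2 * sigma^2)).

    d_A^2 = (x_i - x_j)^T A (x_i - x_j) is a squared Mahalanobis distance
    for a positive semidefinite A (assumed form); A = I gives Euclidean.
    """
    X = np.asarray(X, dtype=float)
    if A is None:
        A = np.eye(X.shape[1])
    diff = X[:, None, :] - X[None, :, :]            # pairwise differences, shape (n, n, d)
    sq_dist = np.einsum("ijk,kl,ijl->ij", diff, A, diff)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))    # large value = similar points

# Toy data: two nearby points and one far away.
X = np.array([[0.0, 0.0], [0.05, 0.0], [1.0, 1.0]])
W = rbf_affinity(X, sigma=0.1)
```

Because the exponential maps small distances to values near 1 and large distances to values near 0, the resulting entries are positive and represent similarity rather than dissimilarity.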
8The MinCut Problem
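The normalized-cut objective, $\min_y \frac{y^T (D - W) y}{y^T D y}$,
is subject to two constraints: each $y_i$ must
take on discrete binary values, and $y^T D \mathbf{1} = 0$. If
$y$ is relaxed to take on real values, this
minimization is equivalent to the eigenvalue
system
$(D - W) y = \lambda D y$
Substituting $z = D^{1/2} y$ turns this into
$D^{-1/2} (D - W) D^{-1/2} z = \lambda z$
$z_0 = D^{1/2} \mathbf{1}$ is an eigenvector, with
eigenvalue 0.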
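As a quick numerical check of the claim that D^{1/2} 1 is an eigenvector with eigenvalue 0, here is a minimal sketch. The 5-node similarity matrix `W` is made up for illustration and does not come from the slides.

```python
import numpy as np

# Hypothetical symmetric similarity matrix: nodes 0-2 and nodes 3-4 form two groups.
W = np.array([[0.0, 0.9, 0.8, 0.1, 0.0],
              [0.9, 0.0, 0.7, 0.0, 0.1],
              [0.8, 0.7, 0.0, 0.1, 0.0],
              [0.1, 0.0, 0.1, 0.0, 0.9],
              [0.0, 0.1, 0.0, 0.9, 0.0]])

d = W.sum(axis=1)                          # node degrees
D = np.diag(d)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_sym = D_inv_sqrt @ (D - W) @ D_inv_sqrt  # D^{-1/2} (D - W) D^{-1/2}

z0 = np.sqrt(d)                            # z0 = D^{1/2} 1
residual = L_sym @ z0                      # equals 0 * z0 if the eigenvalue is 0
smallest_eigenvalue = np.linalg.eigvalsh(L_sym)[0]
```

The residual vanishes (up to rounding) because (D - W) 1 = 0: each row of D - W sums to zero.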
9Minimizing Eigenvectors
$D^{-1/2} (D - W) D^{-1/2}$ is a symmetric positive
semidefinite matrix because $D - W$ (also known as the
Laplacian matrix) is known to be symmetric
positive semidefinite. $z_0$ is the eigenvector of
$D^{-1/2} (D - W) D^{-1/2}$ with the smallest eigenvalue, and all other
eigenvectors are perpendicular to it. $z_1$, the
eigenvector with the second-smallest eigenvalue, has the property
$z_1^T z_0 = 0 = y_1^T D \mathbf{1}$
10Minimizing Eigenvectors Cont.
- Thus we obtain
- $z_1 = \arg\min_{z^T z_0 = 0} \frac{z^T D^{-1/2} (D - W) D^{-1/2} z}{z^T z}$
- and equivalently
- $y_1 = \arg\min_{y^T D \mathbf{1} = 0} \frac{y^T (D - W) y}{y^T D y}$
- The eigenvector with the second-smallest
eigenvalue of this system is guaranteed to give
us the real-valued (relaxed) normalized cut solution with the second
constraint, $y^T D \mathbf{1} = 0$, satisfied.
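The argument above can be sketched numerically. This is only an illustration under stated assumptions: the similarity matrix `W` is a made-up 5-node example, and splitting nodes by the sign of the relaxed solution is one simple discretization choice that the slides do not specify.

```python
import numpy as np

# Hypothetical similarity matrix: nodes 0-2 are tightly linked, as are nodes 3-4.
W = np.array([[0.0, 0.9, 0.8, 0.1, 0.0],
              [0.9, 0.0, 0.7, 0.0, 0.1],
              [0.8, 0.7, 0.0, 0.1, 0.0],
              [0.1, 0.0, 0.1, 0.0, 0.9],
              [0.0, 0.1, 0.0, 0.9, 0.0]])

d = W.sum(axis=1)
D = np.diag(d)
D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
L_sym = D_inv_sqrt @ (D - W) @ D_inv_sqrt

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
vals, vecs = np.linalg.eigh(L_sym)
z1 = vecs[:, 1]                    # eigenvector with the second-smallest eigenvalue
y1 = D_inv_sqrt @ z1               # undo the substitution z = D^{1/2} y

constraint = y1 @ d                # y1^T D 1, expected to be ~0
rayleigh = (y1 @ (D - W) @ y1) / (y1 @ D @ y1)   # expected to equal vals[1]

# Assumed discretization: split nodes by the sign of the relaxed solution.
labels = (y1 > 0).astype(int)
```

On this toy graph the sign split separates nodes 0-2 from nodes 3-4; which group receives label 1 depends on the arbitrary sign of the eigenvector.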
11Trace SDP
Given that the lambdas are the eigenvalues
of matrix K, we see that minimizing over the
lambdas is equivalent to minimizing
tr(KB). Minimizing for our second eigenvector can be
rewritten as a Procrustes problem. B is a weighted
outer product of the eigenvectors, $B = \sum_i \alpha_i v_i v_i^T$, such that the
eigenvectors $v_i$ are normalized and the weights $\alpha_i$ are
in strictly increasing order, so that
$\mathrm{tr}(KB) = \sum_i \alpha_i \lambda_i$.
12Trace SDP Cont.
- Because we know that we want to minimize only our
smallest and second-smallest eigenvalues, we can
set $\alpha_n$ and $\alpha_{n-1}$ to 1, and the rest to 0.
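The effect of this choice of weights can be checked with a small sketch. It assumes the eigenvalues are indexed in decreasing order, so that indices n and n-1 refer to the two smallest; the random matrix `K` is only a stand-in for the K matrix of the slides, whose definition is not recoverable here.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
K = M @ M.T                            # arbitrary symmetric PSD stand-in for K

vals, vecs = np.linalg.eigh(K)         # ascending eigenvalues, orthonormal eigenvectors
vals_desc = vals[::-1]                 # reindex so the last entries are the smallest
vecs_desc = vecs[:, ::-1]

alpha = np.zeros(5)
alpha[-2:] = 1.0                       # alpha_n = alpha_{n-1} = 1, the rest 0

# B = sum_i alpha_i v_i v_i^T, built by scaling each eigenvector column by its weight.
B = (vecs_desc * alpha) @ vecs_desc.T
trace_KB = np.trace(K @ B)             # equals sum_i alpha_i * lambda_i
two_smallest = vals_desc[-2:].sum()
```

With these weights, tr(KB) reduces to the sum of the two smallest eigenvalues of K, which is the quantity the slide says we want to minimize.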
13Solving the SDP
14The K Matrix
15Solving the SDP Cont.
16Solving the SDP
17Some Results (more coming)
18More results
19Conclusions
- Unclear as of right now whether linear
transformations help clustering.
- More interesting distance metrics