Title: Discriminant Adaptive Nearest Neighbor Classification
1. Discriminant Adaptive Nearest Neighbor Classification
Distance metric learning, with application to clustering with side-information
2. k-NN classification
- Given n training pairs (x_i, y_i), with x_i the feature vector and y_i denoting class membership.
- Given a new point x0, predict its class y0.
- Find the K training points x_(i) closest in distance to x0.
- Classify x0 by a majority vote among these K neighbors (sketch below).
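To make the voting rule concrete, here is a minimal k-NN sketch (illustrative, not taken from the slides; the function name knn_predict and the toy data are assumptions):

```python
import numpy as np
from collections import Counter

# Minimal k-NN sketch: predict the class of x0 by majority vote over the K
# training points closest to x0 in Euclidean distance.
def knn_predict(X, y, x0, K=5):
    dists = np.linalg.norm(X - x0, axis=1)   # distance from x0 to every training point
    nearest = np.argsort(dists)[:K]          # indices of the K closest points
    return Counter(y[nearest]).most_common(1)[0][0]

# Toy usage
X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([0.2, 0.1]), K=3))   # -> 0
```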
3. Radius of NN neighborhood
- N data points uniformly distributed in the unit cube [-1/2, 1/2]^d; let R be the distance from the origin to its nearest neighbor.
- v_d r^d is the volume of a sphere of radius r in d dimensions, so P(R > r) = (1 - v_d r^d)^N and the median of R satisfies v_d R^d = 1 - (1/2)^{1/N}.
- As d grows, this median radius moves out toward the edge of the cube: the "nearest" neighbor is no longer local (simulated below).
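A quick Monte Carlo check of this effect (a sketch, not from the slides; the choices of N, the dimensions tried, and the number of trials are arbitrary):

```python
import numpy as np

# Estimate the median 1-NN radius at the origin for N uniform points
# in the unit cube [-1/2, 1/2]^d; the radius grows rapidly with d.
rng = np.random.default_rng(0)
N, trials = 500, 200
for d in (1, 2, 5, 10, 20):
    radii = [np.linalg.norm(rng.uniform(-0.5, 0.5, size=(N, d)), axis=1).min()
             for _ in range(trials)]
    print(f"d={d:>2}  median 1-NN radius ~ {np.median(radii):.3f}")
```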
4. Solution
- Nearest neighbor techniques are based on the assumption that locally the class posterior probabilities P(j|x) are approximately constant.
- In high dimensions, nearest neighbors are far away, causing bias and degrading performance.
- Adapt the metric used in k-NN so that the resulting neighborhoods stretch out in directions in which the class probabilities change the least.
5. Discriminant Adaptive NN
- Two classes in two dimensions; Class 1 almost completely surrounds Class 2.
- The modified neighborhood extends further parallel to the decision boundary and shrinks in the direction orthogonal to it.
6. DANN metric
- The metric Σ is defined by Σ = W^{-1/2} [ W^{-1/2} B W^{-1/2} + ε I ] W^{-1/2}, where
- W is the (pooled) within-class covariance matrix,
- B is the between-class covariance matrix,
- and ε rounds the neighborhood, preventing it from stretching infinitely in directions with no between-class variation.
7. DANN neighborhoods
8. DANN Classifier
1. Initialize the metric Σ = I.
2. Spread out a nearest neighborhood of K_M points around the test point x0, in the metric Σ.
3. Calculate the weighted between- and within-class sum-of-squares matrices B and W using the points in the neighborhood.
4. Define a new metric Σ = W^{-1/2} [ W^{-1/2} B W^{-1/2} + ε I ] W^{-1/2}.
5. Iterate steps 2, 3 and 4.
6. At completion, use the metric Σ for K-nearest-neighbor classification at the test point x0 (see the sketch below).
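A sketch of the procedure above, under simplifying assumptions (equal-weight points in the adaptation neighborhood, a fixed number of iterations); the names dann_metric and dann_predict are made up for illustration:

```python
import numpy as np

def dann_metric(X, y, x0, K_M=50, eps=1.0, n_iter=3):
    """Locally adapted metric Sigma at the test point x0 (steps 1-5 above)."""
    p = X.shape[1]
    Sigma = np.eye(p)                                   # step 1: Sigma = I
    for _ in range(n_iter):
        # step 2: K_M nearest points to x0 in the current metric Sigma
        diffs = X - x0
        d2 = np.einsum('ij,jk,ik->i', diffs, Sigma, diffs)
        nbrs = np.argsort(d2)[:K_M]
        Xn, yn = X[nbrs], y[nbrs]
        # step 3: within- and between-class sum-of-squares matrices W and B
        mean = Xn.mean(axis=0)
        W, B = np.zeros((p, p)), np.zeros((p, p))
        for j in np.unique(yn):
            Xj = Xn[yn == j]
            mj = Xj.mean(axis=0)
            W += (Xj - mj).T @ (Xj - mj) / len(Xn)
            B += len(Xj) / len(Xn) * np.outer(mj - mean, mj - mean)
        # step 4: Sigma = W^{-1/2} [W^{-1/2} B W^{-1/2} + eps*I] W^{-1/2}
        evals, evecs = np.linalg.eigh(W)
        W_inv_sqrt = evecs @ np.diag(1.0 / np.sqrt(np.maximum(evals, 1e-8))) @ evecs.T
        Sigma = W_inv_sqrt @ (W_inv_sqrt @ B @ W_inv_sqrt + eps * np.eye(p)) @ W_inv_sqrt
    return Sigma

def dann_predict(X, y, x0, K=5, **kwargs):
    """K-NN vote at x0 using the adapted metric (step 6 above)."""
    Sigma = dann_metric(X, y, x0, **kwargs)
    diffs = X - x0
    d2 = np.einsum('ij,jk,ik->i', diffs, Sigma, diffs)
    labels, counts = np.unique(y[np.argsort(d2)[:K]], return_counts=True)
    return labels[np.argmax(counts)]
```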
9. Parameters
- K_M, the size of the adaptation neighborhood, should be relatively large: min(50, n/5).
- K, the number of neighbors in the final vote, should be around 5.
- ε = 1 gave good results.
10. Global Dimension Reduction
- For the local neighborhood N(i) of x_i, the local class centroids are contained in a subspace useful for classification.
- At each training point x_i, the between-centroids sum-of-squares matrix B_i is computed, and these matrices are averaged over all training points.
- The leading eigenvectors e_1, e_2, ..., e_p of the averaged matrix span the optimal subspaces for global subspace reduction (sketch below).
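A hedged sketch of this averaging step (the helper names are invented, and plain Euclidean neighborhoods are used as a simplification in place of the locally adapted DANN neighborhoods):

```python
import numpy as np

def local_between_matrix(Xn, yn):
    """Between-centroids sum-of-squares matrix B_i for one neighborhood."""
    mean = Xn.mean(axis=0)
    p = Xn.shape[1]
    B = np.zeros((p, p))
    for j in np.unique(yn):
        Xj = Xn[yn == j]
        B += len(Xj) / len(Xn) * np.outer(Xj.mean(axis=0) - mean, Xj.mean(axis=0) - mean)
    return B

def global_subspace(X, y, K_M=50, n_components=2):
    """Average the local B_i over all training points; keep the leading eigenvectors."""
    p = X.shape[1]
    B_bar = np.zeros((p, p))
    for i in range(len(X)):
        nbrs = np.argsort(np.linalg.norm(X - X[i], axis=1))[:K_M]
        B_bar += local_between_matrix(X[nbrs], y[nbrs])
    B_bar /= len(X)
    evals, evecs = np.linalg.eigh(B_bar)
    order = np.argsort(evals)[::-1]          # largest eigenvalues first
    return evecs[:, order[:n_components]]    # columns e_1, ..., e_k span the subspace
```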
11. Global Dimension Reduction
- Eigenvalues of the averaged between-centroids matrix for a two-class, 4-dimensional sphere model with 6 noise dimensions.
- The decision boundary is a 4-dimensional sphere.
12. Global Dimension Reduction
- Two-dimensional Gaussian data with two classes (substantial within-class covariance).
13. Distance metric learning
- Data mining algorithms require good metrics that reflect the important relationships in the data.
- If a user indicates that certain points in input space are similar, can we learn a metric that assigns small distances to such pairs?
- The learned metric can be used as a preprocessing step to help unsupervised algorithms find better solutions.
14. Distance metric learning, with application to clustering with side-information
E.P. Xing, A.Y. Ng, M.I. Jordan and S. Russell
15. Learning Distance Metrics
- Given a set S of pairs of points known to be similar.
- Consider the distance metric d_A(x, y) = ||x - y||_A = sqrt((x - y)^T A (x - y)).
- A must be positive semi-definite so that d_A is non-negative and satisfies the triangle inequality.
- d_A(x, y) = 0 does not imply x = y, so d_A is strictly a pseudo-metric.
- If A = I, d_A is the Euclidean distance.
- If A is diagonal, d_A is a Mahalanobis distance that rescales the individual axes.
- Using d_A is equivalent to rescaling each data point x to A^{1/2} x and applying the standard Euclidean metric to the rescaled data, as illustrated below.
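A small numerical check of that equivalence (illustrative only; the matrix A below is an arbitrary positive semi-definite example):

```python
import numpy as np

def d_A(x, y, A):
    """d_A(x, y) = sqrt((x - y)^T A (x - y))."""
    diff = x - y
    return np.sqrt(diff @ A @ diff)

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])                            # positive semi-definite example
evals, evecs = np.linalg.eigh(A)
A_sqrt = evecs @ np.diag(np.sqrt(evals)) @ evecs.T    # symmetric square root A^{1/2}

x, y = np.array([1.0, 2.0]), np.array([0.0, -1.0])
print(d_A(x, y, A))                                   # distance under the metric A
print(np.linalg.norm(A_sqrt @ x - A_sqrt @ y))        # same value after rescaling x -> A^{1/2} x
```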
16. Learning the Metric
- S: set of similar pairs; D: set of dissimilar pairs.
- To learn a diagonal A, use the Newton-Raphson method to minimize g(A) = sum over S of ||x_i - x_j||_A^2 minus log of the sum over D of ||x_i - x_j||_A (a sketch follows).
- To learn a full A, use gradient ascent combined with iterative projections onto the constraint sets.
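A sketch of the diagonal case under stated assumptions: a generic bound-constrained optimizer stands in for Newton-Raphson, and the pairs are passed as lists of (x_i, x_j) tuples; the function name learn_diagonal_metric is invented.

```python
import numpy as np
from scipy.optimize import minimize

def learn_diagonal_metric(S_pairs, D_pairs, dim):
    """Learn A = diag(a) minimizing sum_S ||xi - xj||_A^2 - log(sum_D ||xi - xj||_A)."""
    S_sq = np.array([(x - y) ** 2 for x, y in S_pairs])  # element-wise squared diffs
    D_sq = np.array([(x - y) ** 2 for x, y in D_pairs])

    def g(a):
        similar_term = np.sum(S_sq @ a)                   # sum over S of d_A^2
        dissimilar_term = np.sum(np.sqrt(D_sq @ a))       # sum over D of d_A
        return similar_term - np.log(dissimilar_term + 1e-12)

    res = minimize(g, x0=np.ones(dim), bounds=[(1e-6, None)] * dim)
    return np.diag(res.x)
```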
17. Experiments
- The center and left panels show the rescaled data (diagonal A and full A): x -> A^{1/2} x.
18. K-means Clustering
- Learn the metric from the side-information, then use it to cluster the data.
- K-means: standard K-means using the Euclidean metric.
- Constrained K-means: K-means subject to the similar pairs always being assigned to the same cluster.
- K-means + metric: K-means with distortion defined using the learned distance metric.
- Constrained K-means + metric: constrained K-means using the learned distance metric (sketch below).
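A sketch of "K-means + metric", assuming a learned A is available from the previous step: rescale the data by A^{1/2}, so that Euclidean distortion in the rescaled space equals distortion in the learned metric, then run ordinary K-means (sklearn is used for brevity; the constrained variants need a modified assignment step not shown here).

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_with_metric(X, A, n_clusters=2, seed=0):
    """K-means whose distortion is measured in the learned metric A."""
    evals, evecs = np.linalg.eigh(A)
    A_sqrt = evecs @ np.diag(np.sqrt(np.maximum(evals, 0.0))) @ evecs.T
    X_rescaled = X @ A_sqrt                  # x -> A^{1/2} x (A_sqrt is symmetric)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(X_rescaled)
```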
19. Clustering Results (accuracy)
- K-means: 0.4975
- Constrained K-means: 0.5060
- K-means + metric: 1.0
- Constrained K-means + metric: 1.0
20. Clustering Results (accuracy)
- K-means: 0.4993
- Constrained K-means: 0.5701
- K-means + metric: 1.0
- Constrained K-means + metric: 1.0
21. Clustering Results
- Accuracy vs. amount of side-information for the UCI protein and wine data sets.