Title: Learning Instance Specific Distance Using Metric Propagation
Slide 1: Learning Instance Specific Distance Using Metric Propagation
- De-Chuan Zhan, Ming Li, Yu-Feng Li, Zhi-Hua Zhou
- LAMDA Group
- National Key Lab for Novel Software Technology
- Nanjing University, China
- {zhandc, lim, liyf, zhouzh}@lamda.nju.edu.cn
Slide 2: Distance-based classification
- K-nearest neighbor classification
- SVM with Gaussian kernels
Is the distance reliable?
Are there any more natural measurements?
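Both methods hinge on the distance between instances: kNN votes among the closest training points, and the Gaussian (RBF) kernel used by the SVM is a direct function of the pairwise distance, so an unreliable distance hurts both. As a reminder of the standard kernel form (not taken from the slides):

$k(x_i, x_j) = \exp\!\left( -\frac{\|x_i - x_j\|^2}{2\sigma^2} \right)$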
Slide 3: Any more natural measurements?
- When the sky is compared to other pictures: color, and probably texture, features
- When Phelps II is compared to other athletes: swimming speed, shape of feet
Can we assign a specific distance measurement to each instance, both labeled and unlabeled?
(This is our work.)
Slide 4: Outline
- Introduction
- Our Methods
- Experiments
- Conclusion
Slide 5: Introduction - Distance Metric Learning
- Many machine learning algorithms rely on a distance metric over the input data patterns:
  - Classification
  - Clustering
  - Retrieval
- Many metric learning algorithms have been developed [Yang, 2006].
Problem: they focus on learning a uniform Mahalanobis distance for ALL instances.
Slide 6: Introduction - Other distance functions
- Instead of applying a uniform distance metric to every example, it is more natural to measure distances according to the specific properties of the data.
- Some researchers define distance from the sample's own perspective:
  - QSim [Zhou and Dai, ICDM'06]; [Athitsos et al., TODS'07]
  - Local distance functions [Frome et al., NIPS'06, ICCV'07]
Slide 7: Introduction - Query-sensitive similarity
Instance-specific or query-specific similarities have in fact been studied in other fields before.
In content-based image retrieval, there has been a study that computes query-sensitive similarities: the similarities among different images are decided only after a query image is received [Zhou and Dai, ICDM'06].
The problem: the query-sensitive similarity is based on pure heuristics.
Slide 8: Introduction - Local distance functions
The distance from the j-th instance to the i-th instance is larger than that from the j-th instance to the k-th: $D_{ji} > D_{jk}$ (similarly, $D_{ij} > D_{kj}$).
1. The learned local distance functions cannot generalize directly.
2. The local distances defined are not directly comparable.
All constraints can be tied together, but this requires more heuristics for testing.
The problem: local distance functions are not available for unlabeled data.
Slide 9: Introduction - Our Work
Can we assign a specific distance measurement to each instance, both labeled and unlabeled?
Yes: we learn an Instance Specific Distance (ISD) via Metric Propagation.
Slide 10: Outline
- Introduction
- Our Methods
- Experiments
- Conclusion
Slide 11: Our Methods - Intuition
- Focus on learning an instance-specific distance for both labeled and unlabeled data.
- For labeled data:
  - a pair of examples from the same class should be closer to each other
- For unlabeled data:
  - metric propagation on a relationship graph
Slide 12: Our Methods - The ISD Framework
- Instead of directly conducting metric propagation after learning the distances for the labeled examples, we formulate the metric propagation within a regularized framework.
- Constraints are imposed when the j-th instance belongs to a class other than that of the i-th instance, or when the j-th instance is a neighbor of the i-th instance; i.e., all cannot-links and some of the must-links are considered.
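To make the shape of such a regularized framework concrete, a schematic objective of this kind could be written as follows (the notation here is illustrative and is not copied from the slides):

$\min_{\{w_i \succeq 0\}} \ \sum_{i,j} a_{ij}\, \|w_i - w_j\|^2 \ + \ C \sum_{i \in \mathrm{labeled}} \ \sum_{(j,k)} \big[\, 1 + w_i^{\top} d_{ij} - w_i^{\top} d_{ik} \,\big]_+$

where $w_i$ parameterizes the distance specific to instance $i$ (the distance from $x_j$ to $x_i$ being $w_i^{\top} d_{ij}$ for a per-dimension difference vector $d_{ij}$), $(j,k)$ ranges over same-class neighbors $x_j$ and differently labeled instances $x_k$ of $x_i$ (the must-links and cannot-links above), $[\cdot]_+ = \max(0, \cdot)$ is the hinge, and $a_{ij}$ are the edge weights of the relationship graph over all labeled and unlabeled instances, so the first term propagates the learned metrics along the graph.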
Slide 13: Our Methods - The ISD Framework: relationship to FSM
Although only pairwise side information is investigated in our work, the ISD framework is a general one:
FSM [Frome et al., NIPS'06] is a special case of ISD.
Slide 14: Our Methods - The ISD Framework: update graph
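The relationship graph used for propagation connects labeled and unlabeled instances. As a rough illustration only, one plausible construction is a symmetric kNN affinity graph with Gaussian edge weights (the construction actually used in ISD may differ):

import numpy as np

def knn_affinity_graph(X, k=5, sigma=1.0):
    """Build a symmetric kNN affinity matrix over all (labeled and unlabeled) instances."""
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    D = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(D, np.inf)            # exclude self-loops

    n = X.shape[0]
    A = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(D[i])[:k]        # the k nearest neighbors of instance i
        A[i, nbrs] = np.exp(-D[i, nbrs] / (2.0 * sigma ** 2))   # Gaussian edge weights
    return np.maximum(A, A.T)              # symmetrize: keep an edge if either endpoint selects it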
Slide 15: Our Methods - ISD with L1-loss
This is a convex problem, so we employ an alternating descent method to solve it: sequentially solve for the w of one instance at a time while fixing the other w's, until convergence or until the maximum number of iterations is reached.
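A minimal sketch of that alternating-descent loop, with the per-instance subproblem left as a placeholder (solve_single_w is an assumed callback, not part of the slides; it would solve for one w_i, e.g. via an off-the-shelf LP/QP solver, with the other rows of W held fixed):

import numpy as np

def isd_alternating_descent(solve_single_w, n_instances, dim, max_iters=50, tol=1e-4):
    # Start every instance from the Euclidean metric (uniform weights over dimensions).
    W = np.ones((n_instances, dim))
    for _ in range(max_iters):
        max_change = 0.0
        for i in range(n_instances):
            # Solve the subproblem for w_i with all other rows of W fixed.
            w_new = solve_single_w(i, W)
            max_change = max(max_change, float(np.max(np.abs(w_new - W[i]))))
            W[i] = w_new
        if max_change < tol:               # stop once no w_i moves appreciably
            break
    return W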
Slide 16: Our Methods - ISD with L1-loss (cont.)
The primal formulation
Slide 17: Our Methods - Acceleration: ISD with L2-loss
However, the number of inequality constraints may be large.
For acceleration:
- the alternating descent method is used to solve the problem
- the number of constraints is reduced by considering only some of the must-links
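For reference (this is the standard distinction between the two losses, not a statement taken from the slides): the L1 version penalizes each constraint violation $z$ with the hinge $[z]_+ = \max(0, z)$, whereas the L2 version uses the squared hinge $[z]_+^2$, which is differentiable and typically makes the per-instance subproblems smoother to optimize.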
Slide 18: Our Methods - Acceleration: ISD with L2-loss (cont.)
Slide 19: Outline
- Introduction
- Our Methods
- Experiments
- Conclusion
Slide 20: Experiments - Configurations
- Data sets
- 15 UCI data sets
- COREL image dataset (20 classes, 100 images/class)
- 2/3 labeled for training, 1/3 unlabeled for testing; 30 runs
- Compared methods
- ISD-L1/L2
- FSM/FSSM (Frome et al., 2006, 2007)
- LMNN (Weinberger et al. 2005)
- DNE (Zhang et al., 2007)
- Parameters are selected via cross validation
Slide 21: Experiments - Classification Performance
Comparison of test error rates (mean ± std.)
Slide 22: Experiments - Influence of the number of iteration rounds
Updating rounds, starting from the Euclidean distance:
- The error rates of ISD-L1 are reduced on most datasets as the number of updates increases.
- The error rates of ISD-L2 are reduced on some datasets; on others, however, the performance degenerates. Overfitting: the L2-loss is more sensitive to noise.
Slide 23: Experiments - Influence of the amount of labeled data
- ISD is less sensitive to the amount of labeled data.
- When the amount of labeled samples is limited, the superiority of ISD is more apparent.
Slide 24: Conclusion
- Main contribution
  - A method for learning instance-specific distances for labeled as well as unlabeled instances.
- Future work
  - The construction of the initial graph
  - Label propagation, metric propagation: are there any more properties to propagate?
Thanks!