Online and Batch Learning of Pseudo-Metrics


1
Online and Batch Learning of Pseudo-Metrics

Shai Shalev-Shwartz
Hebrew University, Jerusalem
Joint work with
Yoram Singer, Google Inc.
Andrew Y. Ng, Stanford University

2
Motivating Example
3
Our Technique

Map instances into a space in which distances
correspond to labels

4
Outline

Distance learning setting
Large margin for distances
An online learning algorithm
Online loss analysis
A dual version
Experiments
Online - document filtering
Batch - handwritten digit recognition

5
Problem Setting

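Training examples (x, x', y):
two instances x, x'
similarity label y in {+1, -1}
Hypothesis class: pseudo-metrics

d_A(x, x') = sqrt((x - x')^T A (x - x'))
parameterized by a matrix A
A is a symmetric positive semi-definite matrix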
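
As a rough illustration of this hypothesis class, the sketch below evaluates a pseudo-metric of the form sqrt((x - x')^T A (x - x')) and recovers a linear map W with W^T W = A, so that the same distance is a plain Euclidean distance after mapping. The function names and the toy matrix are assumptions made for this example, not taken from the slides.

```python
import numpy as np

def pseudo_metric(A, x, x_prime):
    """Distance sqrt((x - x')^T A (x - x')) for a symmetric PSD matrix A."""
    v = x - x_prime
    # Clamp at zero to absorb tiny negative values from floating-point error.
    return float(np.sqrt(max(v @ A @ v, 0.0)))

def linear_map_from(A):
    """Return W with W^T W = A, so the distance equals ||W x - W x'||."""
    eigvals, eigvecs = np.linalg.eigh(A)      # A is assumed symmetric
    eigvals = np.clip(eigvals, 0.0, None)     # guard against round-off
    return np.sqrt(eigvals)[:, None] * eigvecs.T

# Hypothetical toy matrix that ignores the second coordinate entirely.
A = np.array([[1.0, 0.0],
              [0.0, 0.0]])
x = np.array([0.0, 0.0])
x_prime = np.array([0.0, 5.0])   # differs from x only in the ignored coordinate
x_far = np.array([3.0, 0.0])

W = linear_map_from(A)
d_same = pseudo_metric(A, x, x_prime)
d_far = pseudo_metric(A, x, x_far)
d_far_mapped = float(np.linalg.norm(W @ x - W @ x_far))
```

Two distinct points can be at distance zero here, which is what separates a pseudo-metric from a metric.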
6
Large Margin for Pseudo-Metrics

Sample S is γ-separated w.r.t. a metric

7
Batch Formulation
s.t.
s.t.
8
Pseudo-metric Online Learning Algorithm (POLA)

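For t = 1, 2, ...
Get two instances x_t, x'_t
Calculate distance under the current matrix A_t
Predict similar iff the distance is below the current threshold b_t
Get true label y_t and suffer hinge-loss
Update matrix and threshold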
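
The round structure above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: labels are +1 (similar) and -1 (dissimilar), the hinge loss is taken to be max(0, y * (v^T A v - b) + 1) with v = x - x', the squared distance is compared against the threshold, the start point is A = 0 and b = 1, and the threshold is kept at or above 1. The names pola_step and hinge_loss are hypothetical.

```python
import numpy as np

def hinge_loss(A, b, v, y):
    """Assumed hinge loss max(0, y * (v^T A v - b) + 1) for difference v."""
    return max(0.0, y * (v @ A @ v - b) + 1.0)

def pola_step(A, b, x, x_prime, y):
    """One round: predict, suffer the loss, then apply two projections."""
    v = x - x_prime
    sq_dist = v @ A @ v
    y_hat = 1 if sq_dist <= b else -1         # "similar" when below threshold
    loss = hinge_loss(A, b, v, y)
    if loss > 0.0:
        # Step 1: project (A, b) onto the half-space where this loss is zero.
        alpha = loss / (np.dot(v, v) ** 2 + 1.0)
        A = A - y * alpha * np.outer(v, v)
        b = b + y * alpha
        # Step 2: project A onto the PSD cone by zeroing negative eigenvalues.
        eigvals, eigvecs = np.linalg.eigh(A)
        A = (eigvecs * np.clip(eigvals, 0.0, None)) @ eigvecs.T
        b = max(b, 1.0)                       # assumed constraint b >= 1
    return A, b, y_hat, loss

# Hypothetical two-example stream: only the first coordinate matters.
stream = [
    (np.array([0.0, 0.0]), np.array([1.0, 0.0]), -1),   # dissimilar pair
    (np.array([0.0, 0.0]), np.array([0.0, 2.0]), +1),   # similar pair
]
A, b = np.zeros((2, 2)), 1.0                  # assumed starting point
cumulative_loss = 0.0
for x, x_prime, y in stream:
    A, b, y_hat, loss = pola_step(A, b, x, x_prime, y)
    cumulative_loss += loss
```

On this toy stream the first example triggers an update that puts all weight on the first coordinate, and the second example then incurs no loss.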

9
Core Update: Two Projections

Projection of vector v on closed convex set C:
P_C(v) = argmin_{u in C} ||u - v||
Two-step update:
1) Project onto a half-space
2) Project onto the PSD cone
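
Both projections have simple closed forms, which the sketch below illustrates on generic inputs. It assumes the Euclidean norm for vectors and the Frobenius norm for symmetric matrices; the function names and toy values are made up for this example.

```python
import numpy as np

def project_halfspace(w, a, c):
    """Project w onto the half-space {u : <a, u> >= c}."""
    violation = c - np.dot(a, w)
    if violation <= 0.0:
        return w.copy()                       # already inside the set
    return w + (violation / np.dot(a, a)) * a

def project_psd(M):
    """Project a symmetric matrix onto the PSD cone (Frobenius norm)."""
    eigvals, eigvecs = np.linalg.eigh(M)
    return (eigvecs * np.clip(eigvals, 0.0, None)) @ eigvecs.T

w = np.array([0.0, 0.0])
a = np.array([3.0, 4.0])
w_proj = project_halfspace(w, a, 5.0)         # lands on the boundary <a, u> = 5

M = np.array([[1.0, 2.0],
              [2.0, 1.0]])                    # eigenvalues 3 and -1
M_psd = project_psd(M)                        # keeps only the eigenvalue 3
```

Projecting a point that already lies in the set leaves it unchanged, so applying either function twice gives the same result as applying it once.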

10
Core Update: Two Projections

Start with the current matrix A_t
An example defines a half-space C_t
A_t' is the projection of A_t onto this half-space
A_{t+1} is the projection of A_t' onto the PSD cone

Figure labels: PSD cone; all zero-loss matrices
11
Online Learning

Goal: minimize cumulative loss
Why Online?
Online processing tasks (e.g. Text Filtering)
Simple to implement
Memory and run-time efficient
Worst-case bounds on the performance
Online to batch conversions

12
Online Loss Bound

sequence of
examples s.t.
any fixed matrix and threshold
Then,
Loss bound does not depend on dimension

13
Incorporating Kernels

Matrix A can be written as
, where
Therefore
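
The formulas on this slide did not survive in the transcript, so the following is only a sketch under an assumed form: A = sum_{i,j} M_ij v_i v_j^T, where each v_i is the difference of a stored pair of (feature-mapped) instances and M is a coefficient matrix. Under that assumption the squared distance needs only inner products, which a kernel can supply. All names here (diff_inner, kernel_sq_distance, M, pairs) are hypothetical.

```python
import numpy as np

def linear_kernel(x, z):
    return float(np.dot(x, z))

def rbf_kernel(x, z, gamma=0.5):
    return float(np.exp(-gamma * np.sum((x - z) ** 2)))

def diff_inner(kernel, p, p_prime, x, x_prime):
    """<phi(p) - phi(p'), phi(x) - phi(x')> using kernel evaluations only."""
    return (kernel(p, x) - kernel(p, x_prime)
            - kernel(p_prime, x) + kernel(p_prime, x_prime))

def kernel_sq_distance(kernel, pairs, M, x, x_prime):
    """Squared distance k^T M k, where k_i = <v_i, phi(x) - phi(x')>."""
    k = np.array([diff_inner(kernel, p, pp, x, x_prime) for p, pp in pairs])
    return float(k @ M @ k)

# Hypothetical stored pairs and PSD coefficient matrix.
pairs = [(np.array([1.0, 0.0]), np.array([0.0, 0.0])),
         (np.array([0.0, 2.0]), np.array([0.0, 1.0]))]
M = np.array([[2.0, 0.5],
              [0.5, 1.0]])
x, x_prime = np.array([1.0, 1.0]), np.array([0.0, 3.0])

sq_lin = kernel_sq_distance(linear_kernel, pairs, M, x, x_prime)
sq_rbf = kernel_sq_distance(rbf_kernel, pairs, M, x, x_prime)
```

With the linear kernel this reproduces the explicit quadratic form v^T A v; swapping in another kernel changes the feature space without ever forming A.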

14
Online Experiments

Task: Document filtering according to topics
Dataset: Reuters-21578
10,000 documents
Documents labeled as Relevant and Irrelevant
A few relevant documents (1% - 10% of entire set)
Algorithms
POLA
1 Nearest Neighbor (1-NN)
Perceptron Algorithm
Perceptron Algorithm with Uneven Margins (PAUM)
(Li, Zaragoza, Herbrich, Shawe-Taylor, Kandola)

15
POLA for Document Filtering

Get a document
Calculate distance to relevant documents observed
so far using current matrix
Predict: the document is relevant iff the distance to
the closest relevant document is smaller than the
current threshold
Get true label
Update matrix and threshold
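
The prediction rule in this loop can be sketched on its own. To keep the example short, the matrix and threshold are held fixed and the update step is omitted; the code compares the squared distance with the threshold, which is an assumption about the threshold's scale. The function name and the toy documents are hypothetical.

```python
import numpy as np

def predict_relevant(A, b, doc, relevant_docs):
    """True iff the closest relevant document seen so far is within b."""
    if not relevant_docs:
        return False                          # nothing to compare against yet
    diffs = np.asarray(relevant_docs) - doc
    # Row-wise quadratic form diffs[i]^T A diffs[i] (squared distances).
    sq_dists = np.einsum("ij,jk,ik->i", diffs, A, diffs)
    return bool(sq_dists.min() < b)

# Hypothetical 2-D "documents" with relevance labels.
docs = [np.array([0.0, 0.0]), np.array([0.5, 0.0]),
        np.array([3.0, 3.0]), np.array([0.2, 0.1])]
labels = [True, True, False, True]

A, b = np.eye(2), 1.0                         # fixed here; POLA would update both
relevant_docs, predictions = [], []
for doc, is_relevant in zip(docs, labels):
    predictions.append(predict_relevant(A, b, doc, relevant_docs))
    if is_relevant:                           # true label revealed after predicting
        relevant_docs.append(doc)
```

The first document is always predicted irrelevant in this sketch because no relevant document has been observed yet.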

16
Document Filtering Results

Each blue point corresponds to one topic
Y-axis designates the error of POLA
Points beneath the black diagonal line mean that
POLA wins

17
Batch Experiments

Task: Handwritten digit recognition
Dataset: MNIST
45 binary classification problems (all pairs)
10,000 training examples
10,000 test examples
Algorithms: k-NN with various metrics
Pseudo-metric learned by POLA
Euclidean distance
Metric induced by Fisher Discriminant Analysis
(FDA)
Metric learned by Relevant Component Analysis
(RCA)
(Bar-Hillel, Hertz, Shental, and Weinshall)

18
MNIST Results

Each blue point corresponds to one binary
classification problem
Y-axis designates the error of POLA
Points beneath the black diagonal line mean that
POLA wins

X-axis of each panel: Euclidean distance error, FDA error, RCA error
RCA was applied after using PCA as a
pre-processing step
19
Experiments: Dimensionality Reduction
Panels: PCA, POLA
20
Toy problem
A color-coded matrix of Euclidean distances
between pairs of images
21
Metric found by POLA
22
Mapping found by POLA

Our Pseudo-metrics

23
Mapping found by POLA
24
Summary and Extensions

An online algorithm for learning pseudo-metrics
Formal properties, good experimental results
Extensions
Alternative regularization schemes to the
Frobenius norm
Learning to learn
Learning a metric from one set of classes and
applying it to another set of related classes