Transcript and Presenter's Notes

Title: Minimal Loss Hashing for Compact Binary Codes


1
Minimal Loss Hashing for Compact Binary Codes
  • Mohammad Norouzi
  • David Fleet
  • University of Toronto

2
Near Neighbor Search
3
Near Neighbor Search
4
Near Neighbor Search
5
Similarity-Preserving Binary Hashing
  • Why binary codes?
  • Sub-linear search using hash indexing (even
    exhaustive linear search is fast; see the sketch below)
  • Binary codes are storage-efficient
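To make the speed claim concrete, here is a minimal sketch (in Python, with illustrative names) of why even exhaustive search over binary codes is fast: comparing two codes is a single XOR followed by a popcount.

```python
def hamming_distance(a, b):
    """Hamming distance between two codes stored as unsigned ints:
    one XOR plus a popcount."""
    return bin(a ^ b).count("1")

def linear_scan(query, database):
    """Exhaustive search: rank all database codes by Hamming
    distance to the query."""
    return sorted(range(len(database)),
                  key=lambda i: hamming_distance(query, database[i]))

# e.g. three 8-bit codes
db = [0b10110100, 0b10110101, 0b01001011]
print(linear_scan(0b10110100, db))  # [0, 1, 2]
```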

6
Similarity-Preserving Binary Hashing
Hash function: b(x; W) = thr(Wx), where bit k of the code
thresholds the inner product of the k-th row of W with x.
Random projections (a random W) are used by locality-sensitive
hashing (LSH) and related techniques [Indyk & Motwani '98;
Charikar '02; Raginsky & Lazebnik '09].
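A minimal sketch of this hash function, assuming {0,1} codes and no bias term (both illustrative choices): bit k thresholds the inner product of the k-th row of W with the input. Drawing W at random recovers LSH-style random projections; MLH learns W instead.

```python
import numpy as np

def hash_bits(W, x):
    """b(x; W) = thr(Wx): bit k is 1 iff row k of W has a
    positive inner product with x."""
    return (W @ x > 0).astype(np.uint8)

rng = np.random.default_rng(0)
W = rng.standard_normal((32, 512))   # 32-bit codes for 512-D descriptors
x = rng.standard_normal(512)
print(hash_bits(W, x))               # e.g. array([1, 0, 1, ...], dtype=uint8)
```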
7
Learning Binary Hash Functions
  • Reasons to learn hash functions
  • to find more compact binary codes
  • to preserve general similarity measures
  • Previous work
  • boosting [Shakhnarovich et al. '03]
  • neural nets [Salakhutdinov & Hinton '07; Torralba
    et al. '07]
  • spectral methods [Weiss et al. '08]
  • loss-based methods [Kulis & Darrell '09]

8
Formulation
9
Loss Function
Similar items should map to nearby hash codes.
Dissimilar items should map to very different codes.
10
Hinge Loss
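A sketch of the hinge-like loss on this slide, written as a function of the Hamming distance m between two codes and the pair label s; the threshold rho and the ratio lam are hyperparameters, and the default values here are placeholders.

```python
def hinge_loss(m, s, rho=3, lam=0.5):
    """Hinge-like loss for a pair with Hamming distance m.
    s = 1 (similar):    zero loss while m <= rho - 1, then linear.
    s = 0 (dissimilar): zero loss while m >= rho + 1, then linear,
                        scaled by lam."""
    if s == 1:
        return max(m - rho + 1, 0)
    return lam * max(rho - m + 1, 0)
```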
11
Empirical Loss
  • Good
  • incorporates quantization and Hamming distance
  • Not so good
  • discontinuous, non-convex objective function
    (see the sketch below)
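Putting the two previous sketches together, the empirical loss sums the hinge loss over labeled training pairs; the hard threshold inside hash_bits is exactly what makes this objective discontinuous and non-convex in W.

```python
import numpy as np

def empirical_loss(W, pairs, labels, rho=3, lam=0.5):
    """Sum of hinge losses over labeled pairs (x, x'), using the
    hash_bits and hinge_loss sketches above."""
    total = 0.0
    for (x, xp), s in zip(pairs, labels):
        b, bp = hash_bits(W, x), hash_bits(W, xp)
        m = int(np.sum(b != bp))              # Hamming distance
        total += hinge_loss(m, s, rho, lam)
    return total
```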

12
We minimize an upper bound on the empirical loss,
inspired by structural SVM formulations
[Taskar et al. '03; Tsochantaridis et al. '04;
Yu & Joachims '09].
13
Bound on loss
For a pair (x, x') with label s, the loss on the quantized codes
is bounded above by a difference of maxima:

  loss(b(x; W), b(x'; W), s)
      <=  max over (g, h) of [ loss(g, h, s) + gᵀWx + hᵀWx' ]
          -  max over g of gᵀWx  -  max over h of hᵀWx'

(LHS <= RHS), since b(x; W) itself maximizes gᵀWx.
14
Bound on loss
  • Remarks
  • piecewise linear in W
  • convex-concave in W
  • relates to structural SVMs with latent variables
    [Yu & Joachims '09]

15
Bound on Empirical Loss
  • Loss-adjusted inference
  • Exact
  • Efficient (see the sketch below)
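Why it can be exact and efficient: the loss depends on the codes (g, h) only through their Hamming distance m, so for each bit one can precompute the best "agree" and "disagree" contributions, sort the gains from disagreeing, and sweep over m. A sketch under those assumptions ({0,1} bits, the hinge loss above):

```python
import numpy as np

def loss_adjusted_inference(W, x, xp, s, rho=3, lam=0.5):
    """Solve max over (g, h) of  loss(g, h, s) + g.Wx + h.Wxp
    exactly, in O(q log q) for q-bit codes."""
    u, v = W @ x, W @ xp                    # per-bit scores
    agree = np.maximum(u + v, 0.0)          # best of g_k = h_k = 0 or 1
    disagree = np.maximum(u, v)             # best of (1, 0) or (0, 1)
    gain = disagree - agree                 # benefit of letting bit k disagree
    order = np.argsort(-gain)               # most beneficial first
    q = len(u)
    best_val, best_m, prefix = -np.inf, 0, 0.0
    for m in range(q + 1):                  # m = number of disagreeing bits
        val = agree.sum() + prefix + hinge_loss(m, s, rho, lam)
        if val > best_val:
            best_val, best_m = val, m
        if m < q:
            prefix += gain[order[m]]
    # reconstruct the maximizing codes
    g = np.zeros(q, dtype=np.uint8)
    h = np.zeros(q, dtype=np.uint8)
    flip = set(order[:best_m].tolist())     # bits chosen to disagree
    for k in range(q):
        if k in flip:
            g[k], h[k] = (1, 0) if u[k] >= v[k] else (0, 1)
        elif u[k] + v[k] > 0:
            g[k] = h[k] = 1
    return g, h, best_val
```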

16
Perceptron-like Learning
[McAllester et al. '10]
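A sketch of the perceptron-like step on a single pair, under the same assumptions as the sketches above: the difference between the loss-adjusted codes and the current codes is a (sub)gradient of the bound, so W takes a small step against it. The step size eta is a placeholder.

```python
import numpy as np

def perceptron_step(W, x, xp, s, eta=1e-3, rho=3, lam=0.5):
    """One stochastic update descending the upper bound on the pair loss."""
    b, bp = hash_bits(W, x), hash_bits(W, xp)   # current codes maximize the score
    g, h, _ = loss_adjusted_inference(W, x, xp, s, rho, lam)
    # subgradient of the bound: (loss-adjusted codes) - (current codes)
    grad = (np.outer(g.astype(float) - b, x)
            + np.outer(h.astype(float) - bp, xp))
    return W - eta * grad
```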
17
Experiment: Euclidean ANN
Similarity based on Euclidean distance
  • Datasets
  • LabelMe (GIST)
  • MNIST (pixels)
  • PhotoTourism (SIFT)
  • Peekaboom (GIST)
  • Nursery (8D attributes)
  • 10D Uniform

18
Experiment: Euclidean ANN
  • 22K LabelMe
  • 512D GIST
  • 20K training
  • 2K testing
  • 1% of pairs are similar
  • Evaluation
  • Precision = hits / number of items retrieved
  • Recall = hits / number of similar items
    (see the sketch below)
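The two measures written out as code, with sets of item indices; "retrieved" would typically be the items within some Hamming radius of the query code (the radius and protocol here are illustrative).

```python
def precision_recall(retrieved, similar):
    """Precision = hits / number of items retrieved.
    Recall    = hits / number of similar items."""
    hits = len(retrieved & similar)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(similar) if similar else 0.0
    return precision, recall

# e.g. 4 items retrieved, 3 true neighbors, 2 of them found
print(precision_recall({1, 2, 3, 4}, {2, 3, 9}))  # (0.5, 0.666...)
```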

19
Techniques of interest
  • MLH: minimal loss hashing (this work)
  • LSH: locality-sensitive hashing [Charikar '02]
  • SH: spectral hashing [Weiss, Torralba &
    Fergus '09]
  • SIKH: shift-invariant kernel hashing [Raginsky &
    Lazebnik '09]
  • BRE: binary reconstructive embedding [Kulis &
    Darrell '09]

20
Euclidean LabelMe, 32 bits
21
Euclidean LabelMe, 32 bits
22
Euclidean LabelMe, 32 bits
23
Euclidean LabelMe, 64 bits
24
Euclidean LabelMe, 64 bits
25
Euclidean LabelMe, 128 bits
26
Euclidean LabelMe, 256 bits
27
Experiment: Semantic ANN
  • Semantic similarity measure based on
    annotations (object labels) from the LabelMe
    database
  • 512D GIST, 20K training, 2K testing
  • Techniques of interest
  • MLH: minimal loss hashing
  • NN: nearest neighbor in GIST space
  • NNCA: multilayer network with RBM pre-training
    and nonlinear NCA fine-tuning [Torralba et al.
    '09; Salakhutdinov & Hinton '07]

28
Semantic LabelMe
29
Semantic LabelMe
30-32
(Figure-only slides; no transcript)
33
Summary
  • A formulation for learning binary hash
    functions based on
  • structured prediction with latent variables
  • a hinge-like loss function for similarity search
  • Experiments show that with minimal loss hashing
  • binary codes can be made more compact
  • semantic similarity based on human labels can be
    preserved

34
  • Thank you!
  • Questions?