Evaluation of Distance Metrics for Recognition Based on presentation

1
Evaluation of Distance Metrics for Recognition
Based on Non-Negative Matrix Factorization

David Guillamet, Jordi Vitrià
Pattern Recognition Letters
24:1599-1605, June 2003
John Galeotti
Advanced Perception
March 23, 2004

2
Actually, Two ICPR'02 Papers

Analyzing Non-Negative Matrix Factorization for
Image Classification
David Guillamet, Bernt Schiele, Jordi Vitrià
Determining a Suitable Metric When using
Non-negative Matrix Factorization
David Guillamet, Jordi Vitrià

3
Non-Negative Matrix Factorization

TLA: NMF
Used for dimensionality reduction
$V_{n \times m} \approx W_{n \times r} H_{r \times m}$, with $r < nm/(n+m)$
V has non-negative training samples as its
columns
W contains the non-negative basis vectors
H contains the non-negative coefficients to
approximate each column of V using W
Results similar in concept to PCA, but with
non-negative basis vectors

4
NMF Distinguishing Properties

Requires positive data
Computationally expensive
Part-based decomposition
Because only additive combinations of original
data are allowed
Not an orthonormal basis

5
Different Decomposition Types

[Figure: PCA vs. NMF decompositions of numeric digits, 20 dimensions]
[Figure: PCA vs. NMF decompositions of numeric digits, 50 dimensions]

6
Why not just use PCA?

PCA is optimal for reconstruction
PCA is not optimal for separation and recognition
of classes

7
NMF Issues Addressed

If/when is NMF better at dimensionality reduction
than PCA for classification?
Can combining PCA and NMF lead to better
performance?
What is the best distance metric to use with the
nonorthonormal basis of NMF?

8
How NMF Works

$V_{n \times m} \approx W_{n \times r} H_{r \times m}$, with $r < nm/(n+m)$
Begin with an $n \times m$ matrix of training data V
Each column is a vectorized data point
Randomly initialize W and H with positive values
Iterate according to update rules
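
The iteration on this slide can be sketched as follows. The update rules themselves are not part of this transcript, so the sketch assumes the multiplicative updates of Lee and Seung for the divergence objective. The names `matmul`, `divergence`, and `nmf` are hypothetical, and plain Python lists are used only to keep the example self-contained.

```python
import math
import random


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH); lower means a better fit."""
    total = 0.0
    for v_row, wh_row in zip(V, WH):
        for v, wh in zip(v_row, wh_row):
            if v > 0:
                total += v * math.log(v / (wh + eps))
            total += wh - v
    return total


def ratio(V, WH, eps):
    """Element-wise V / (WH), guarded against division by zero."""
    return [[v / (wh + eps) for v, wh in zip(v_row, wh_row)]
            for v_row, wh_row in zip(V, WH)]


def nmf(V, r, iters=200, seed=0, eps=1e-12):
    """Factor a non-negative n x m matrix V into W (n x r) and H (r x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Random, strictly positive initialization of both factors.
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(r)]
    for _ in range(iters):
        # H[a][j] *= sum_i W[i][a] * V[i][j] / (WH)[i][j], over sum_i W[i][a]
        R = ratio(V, matmul(W, H), eps)
        col_sums = [sum(W[i][a] for i in range(n)) + eps for a in range(r)]
        H = [[H[a][j] * sum(W[i][a] * R[i][j] for i in range(n)) / col_sums[a]
              for j in range(m)] for a in range(r)]
        # W[i][a] *= sum_j V[i][j] / (WH)[i][j] * H[a][j], over sum_j H[a][j]
        R = ratio(V, matmul(W, H), eps)
        row_sums = [sum(H[a]) + eps for a in range(r)]
        W = [[W[i][a] * sum(R[i][j] * H[a][j] for j in range(m)) / row_sums[a]
              for a in range(r)] for i in range(n)]
    return W, H
```

Because every update only multiplies by non-negative factors, W and H stay non-negative as long as V and the initial values are.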

9
How NMF Works

In general, NMF requires the non-linear
optimization of an objective function
The update rules just given correspond to a
popular objective function, and are guaranteed to
converge.
That objective function relates to the
probability of generating the images in V from
the bases W and encodings H

10
NMF vs. PCA Experiments

Dataset: 10 classes of natural textures
Clouds, grass, ice, trees, sand, sky, etc.
932 color images total
Each image tessellated into 10x10 patches
1000 patches for training, 1000 for testing
Each patch classified as a single texture
Raw feature vectors: color histograms
Each region histogrammed into 8 bins per color,
16 colors → 512-dimensional vectors
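
To make the feature construction concrete, here is a minimal sketch. It assumes a joint RGB histogram with 8 bins per channel, which gives 8 x 8 x 8 = 512 dimensions; the exact binning used in the original work is not stated here, and `color_histogram` is a hypothetical helper.

```python
def color_histogram(pixels, bins_per_channel=8):
    """Joint RGB histogram of a patch given as (r, g, b) tuples in 0..255."""
    width = 256 // bins_per_channel        # 32 intensity levels per bin
    hist = [0] * bins_per_channel ** 3     # 8 * 8 * 8 = 512 entries
    for r, g, b in pixels:
        index = ((r // width) * bins_per_channel + g // width) * bins_per_channel
        index += b // width
        hist[index] += 1
    return hist


# A toy patch of 100 identical pixels falls into exactly one bin.
patch = [(100, 150, 250)] * 100
features = color_histogram(patch)
```

Histogram counts are non-negative by construction, so such vectors can be fed to NMF directly.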

11
NMF vs. PCA Experiments

Learn both NMF and PCA subspaces for each class
of histogram
For both NMF and PCA
Project queries onto the learned subspaces of
each class
Label each query by the subspace that best
reconstructs the query
This seems like a poor scheme for NMF
(Other experiments allow better schemes)
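
The labeling rule on this slide can be sketched as follows. For brevity the sketch assumes each class subspace is given by an orthonormal basis plus a class mean (the PCA case); for NMF the projection step would instead need a non-negative coefficient fit. All names are hypothetical.

```python
def reconstruction_error(x, basis, mean):
    """Squared error left after projecting x onto span(basis) around mean.

    Assumes the basis vectors are orthonormal, as PCA produces.
    """
    centered = [xi - mi for xi, mi in zip(x, mean)]
    residual = centered[:]
    for b in basis:
        coeff = sum(ci * bi for ci, bi in zip(centered, b))
        residual = [ri - coeff * bi for ri, bi in zip(residual, b)]
    return sum(ri * ri for ri in residual)


def classify(x, subspaces):
    """Return the label whose subspace reconstructs x with the least error."""
    return min(subspaces,
               key=lambda label: reconstruction_error(x, *subspaces[label]))


# Two toy classes in 3-D, each spanned by one axis through the origin.
subspaces = {
    "grass": ([[1.0, 0.0, 0.0]], [0.0, 0.0, 0.0]),
    "sky": ([[0.0, 0.0, 1.0]], [0.0, 0.0, 0.0]),
}
label = classify([0.2, 0.1, 3.0], subspaces)
```

Here the query lies almost on the "sky" axis, so that subspace leaves the smaller residual.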

12
NMF vs. PCA Results

NMF works best for dispersed classes
PCA works best for compact classes
Both seem useful: try combining them
But why are fewer than half of the sky vectors
best reconstructed by PCA when, for sky, PCA has a
mean reconstruction error less than 1/4 that of
NMF? Mistakes?

13
NMF+PCA Experiments

During training, we learned whether NMF or PCA
worked best for each class
Project a query to a class using only the method
that works best for that class
Result: 2.3% improvement in the recognition rate
over NMF alone (PCA: 5.8%), but is this
significant at 60%?

14
Hierarchy Experiments

At level k of the hierarchy, project the query
onto each original class NMF or PCA subspace
But, to choose the direction to descend the
hierarchy, we only care about the level k
super-class containing the matching class
Furthermore, for each class the choice of PCA vs.
NMF can be independently set at each level of the
hierarchy

15
Hierarchy Results

2% improvement in recognition rate
I really suspect that this is insignificant, and
resulting only from the additional degrees of
freedom
They employ various additional neighborhood-based
hacks to increase their accuracy further, but I
don't see any relevance to NMF specifically

16
Need for a better metric

Want to classify based on nearest neighbor,
rather than reprojection error
Unfortunately, NMF generates a nonorthonormal
basis, and so the relative distance to a base
depends on the uniqueness of that base
Bases will share a lot of pixels in common areas

17
Earth Mover's Distance (EMD)

Defined as the minimal amount of work that must
be performed to transform one feature
distribution into the other
A special case of the transportation problem
from linear optimization
Let I = set of suppliers, J = set of consumers,
c_ij = cost to ship from i to j, f_ij = amount shipped
from i to j
Distance = cost to make datasets equal
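
A minimal sketch of this transportation problem follows. It is not the authors' or Rubner's code: it assumes a small successive-shortest-path min-cost-flow solver written in plain Python, and the function name `emd` and its arguments are hypothetical.

```python
INF = float("inf")


def emd(supply, demand, cost, eps=1e-12):
    """Earth Mover's Distance: minimum shipping cost per unit of flow."""
    n, m = len(supply), len(demand)
    # Nodes: 0 = source, 1..n = suppliers, n+1..n+m = consumers, last = sink.
    source, sink, size = 0, n + m + 1, n + m + 2
    edges = []                       # [to, residual capacity, unit cost]
    graph = [[] for _ in range(size)]

    def add_edge(u, v, cap, c):
        # Edge k and edge k ^ 1 form a forward/backward residual pair.
        graph[u].append(len(edges)); edges.append([v, cap, c])
        graph[v].append(len(edges)); edges.append([u, 0.0, -c])

    for i in range(n):
        add_edge(source, 1 + i, supply[i], 0.0)
        for j in range(m):
            add_edge(1 + i, 1 + n + j, INF, cost[i][j])
    for j in range(m):
        add_edge(1 + n + j, sink, demand[j], 0.0)

    target = min(sum(supply), sum(demand))
    flow = total = 0.0
    while flow < target - eps:
        # Bellman-Ford: cheapest residual path from source to sink.
        dist = [INF] * size
        prev = [-1] * size
        dist[source] = 0.0
        for _ in range(size - 1):
            changed = False
            for u in range(size):
                if dist[u] == INF:
                    continue
                for k in graph[u]:
                    v, cap, c = edges[k]
                    if cap > eps and dist[u] + c < dist[v] - 1e-15:
                        dist[v], prev[v], changed = dist[u] + c, k, True
            if not changed:
                break
        if prev[sink] == -1:
            break
        # Find the bottleneck along the path, then push that much flow.
        push, v = target - flow, sink
        while v != source:
            push = min(push, edges[prev[v]][1])
            v = edges[prev[v] ^ 1][0]
        v = sink
        while v != source:
            edges[prev[v]][1] -= push
            edges[prev[v] ^ 1][1] += push
            v = edges[prev[v] ^ 1][0]
        flow += push
        total += push * dist[sink]
    return total / flow if flow > 0 else 0.0
```

For example, `emd([0.5, 0.5], [0.25, 0.75], [[0, 1], [1, 0]])` has to move a quarter of the mass across at unit cost.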

18
Earth Mover's Distance (EMD)

Based on finding a measure of correlation between
bases to define its cost matrix
The cost matrix weights the transition of one
basis (bi) to another (bj)
$c_{ij} = \mathrm{dist}_{\mathrm{angle}}(b_i, b_j) = -\dfrac{x \cdot y}{\|x\| \, \|y\|}$, with $x = b_i$, $y = b_j$
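
As an illustration, the cost matrix on this slide could be computed as in the sketch below. It assumes the bases are available as plain lists of numbers; `dist_angle` and `cost_matrix` are hypothetical names.

```python
import math


def dist_angle(x, y):
    """Negative cosine of the angle between vectors x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return -dot / (norm_x * norm_y)


def cost_matrix(bases):
    """Pairwise dist_angle between all basis vectors b_i and b_j."""
    return [[dist_angle(bi, bj) for bj in bases] for bi in bases]


# Three toy bases: two with disjoint support and one overlapping both.
costs = cost_matrix([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

For non-negative bases the dot product is never negative, so each cost lies between -1 (identical direction) and 0 (no shared pixels).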

19
EMD Transportation Problem

f_ij = quantity shipped from i → j
Constraints:
Consumers don't ship
Don't exceed demand
Don't exceed supply
Demand must equal supply for EMD to be a metric
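
The constraint labels on this slide annotate a linear program whose equations are not part of this transcript. A sketch of the standard formulation, assuming the notation of Rubner et al. with supplies $x_i$ and demands $y_j$ (both symbols are assumptions, as is the final "ship as much as possible" constraint):

```latex
\begin{aligned}
\min_{f}\quad & \sum_{i \in I} \sum_{j \in J} c_{ij} f_{ij} \\
\text{subject to}\quad
& f_{ij} \ge 0 && \text{(consumers don't ship)} \\
& \sum_{i \in I} f_{ij} \le y_j && \text{(don't exceed demand)} \\
& \sum_{j \in J} f_{ij} \le x_i && \text{(don't exceed supply)} \\
& \sum_{i \in I} \sum_{j \in J} f_{ij} = \min\Bigl(\sum_{i \in I} x_i,\ \sum_{j \in J} y_j\Bigr) && \text{(ship as much as possible)} \\[4pt]
\mathrm{EMD} \;=\; & \frac{\sum_{i \in I} \sum_{j \in J} c_{ij} f_{ij}}{\sum_{i \in I} \sum_{j \in J} f_{ij}}
\end{aligned}
```

When total demand equals total supply, the two "don't exceed" inequalities hold with equality at the optimum.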

20
EMD vs. Other Experiments

Digit recognition from MNIST digit database
60,000 training images, 10,000 for test
Classify by NN and 5NN in the subspace
Result: EMD works best in low-dimensional
subspaces, but in high-dimensional subspaces EMD
does not work well
More specifically, EMD works well when the bases
contain some intersecting pixels
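
The NN and 5NN classification mentioned on this slide can be sketched with a pluggable distance, so the same routine could be used with Euclidean distance, an angle-based distance, or EMD. The names `knn_classify` and `euclidean` and the toy data are hypothetical.

```python
from collections import Counter


def euclidean(a, b):
    """Ordinary Euclidean distance between two coefficient vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def knn_classify(query, train, labels, dist=euclidean, k=5):
    """Label a query by majority vote among its k nearest training vectors."""
    ranked = sorted(range(len(train)), key=lambda i: dist(query, train[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy subspace coefficients for two digit classes.
train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
labels = [3, 3, 3, 8, 8, 8]
nn_label = knn_classify([0.2, 0.3], train, labels, k=1)
five_nn_label = knn_classify([0.2, 0.3], train, labels, k=5)
```

Only the ranking induced by `dist` matters, so any dissimilarity that orders neighbors sensibly can be passed in.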

21
Occlusion Experiments

Randomly occlude either 1 or 2 of the 4 quadrants
of an image (25% and 50% occlusion)
Why does dist_angle do so well?
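
A minimal sketch of the occlusion procedure, assuming occluded pixels are simply set to zero (the fill value used in the experiments is not stated here); `occlude_quadrants` is a hypothetical helper.

```python
import random


def occlude_quadrants(image, count, rng=random):
    """Return a copy of image with `count` randomly chosen quadrants zeroed."""
    rows, cols = len(image), len(image[0])
    half_r, half_c = rows // 2, cols // 2
    quadrants = [
        (0, half_r, 0, half_c), (0, half_r, half_c, cols),
        (half_r, rows, 0, half_c), (half_r, rows, half_c, cols),
    ]
    occluded = [row[:] for row in image]
    # sample() picks distinct quadrants, so count=2 always hides half the image.
    for r0, r1, c0, c1 in rng.sample(quadrants, count):
        for r in range(r0, r1):
            for c in range(c0, c1):
                occluded[r][c] = 0
    return occluded


# A 28 x 28 all-ones image, the size of an MNIST digit.
image = [[1] * 28 for _ in range(28)]
quarter_hidden = occlude_quadrants(image, 1, random.Random(0))
half_hidden = occlude_quadrants(image, 2, random.Random(0))
```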

Best subspace distance with occlusions

                 Low dim.                 High dim.
25% occlusion    NMF+dist_angle           PCA sometimes better
50% occlusion    NMF+dist_angle OR EMD    NMF+dist_angle

22
Demo

NMF difficulties
EMD experiments instead
Demonstrate using existing code within the
desired framework of a cost matrix
Their code: http://robotics.stanford.edu/~rubner/emd/default.htm
My code: http://www.vialab.org/john/Pres9-code/

23
Conclusion

NMF is a parts-based alternative to PCA
NMF and PCA should be combined for
minimum-reprojection-error classification
For nearest-neighbor classification, NMF needs a
better metric
When the subspace dimensionality is chosen
appropriately for good bases, NMF+EMD or
NMF+dist_angle have the highest recognition rates
