Evaluation of Distance Metrics for Recognition Based on - PowerPoint PPT Presentation

1
Evaluation of Distance Metrics for Recognition
Based on  Non-Negative Matrix Factorization
  • David Guillamet, Jordi Vitrià
  • Pattern Recognition Letters
  • 24:1599-1605, June 2003
  • John Galeotti
  • Advanced Perception
  • March 23, 2004

2
Actually, Two ICPR'02 Papers
  • Analyzing Non-Negative Matrix Factorization for
    Image Classification
  • David Guillamet, Bernt Schiele, Jordi Vitrià
  • Determining a Suitable Metric When using
    Non-negative Matrix Factorization
  • David Guillamet, Jordi Vitrià

3
Non-Negative Matrix Factorization
  • TLA: NMF
  • Used for dimensionality reduction
  • $V_{n \times m} \approx W_{n \times r} H_{r \times m}$, with $r < nm/(n+m)$
  • V has non-negative training samples as its
    columns
  • W contains the non-negative basis vectors
  • H contains the non-negative coefficients to
    approximate each column of V using W
  • Results similar in concept to PCA, but with
    non-negative basis vectors

4
NMF Distinguishing Properties
  • Requires positive data
  • Computationally expensive
  • Part-based decomposition
  • Because only additive combinations of original
    data are allowed
  • Not an orthonormal basis

5
Different Decomposition Types
  • 20 Dimensions of Numeric Digits: PCA vs. NMF (figure not shown)
  • 50 Dimensions of Numeric Digits: PCA vs. NMF (figure not shown)

6
Why not just use PCA?
  • PCA is optimal for reconstruction
  • PCA is not optimal for separation and recognition
    of classes

7
NMF Issues Addressed
  • If/when is NMF better at dimensionality reduction
    than PCA for classification?
  • Can combining PCA and NMF lead to better
    performance?
  • What is the best distance metric to use with the
    nonorthonormal basis of NMF?

8
How NMF Works
  • $V_{n \times m} \approx W_{n \times r} H_{r \times m}$, with $r < nm/(n+m)$
  • Begin with a nxm matrix of training data V
  • Each column is a vectorized data point
  • Randomly initialize W and H with positive values
  • Iterate according to update rules

9
How NMF Works
  • In general, NMF requires the non-linear
    optimization of an objective function
  • The update rules just given correspond to a
    popular objective function, and are guaranteed to
    converge.
  • That objective function relates to the
    probability of generating the images in V from
    the bases W and encodings H
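
The update-rule equations themselves did not survive in this transcript. The following is a minimal sketch, assuming they are the Lee and Seung multiplicative updates for the generalized KL divergence, an objective that fits the "probability of generating the images" description. The names `nmf_kl` and `kl_divergence` and all parameters are hypothetical.

```python
import numpy as np

def nmf_kl(V, r, n_iter=200, seed=0, eps=1e-12):
    """Factor non-negative V (n x m) as W (n x r) @ H (r x m).

    Assumption: Lee-Seung multiplicative updates for the generalized
    KL divergence D(V || WH); the slide's own rules are not in the transcript.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + eps  # random positive initialization
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        # H_au <- H_au * sum_i W_ia V_iu / (WH)_iu / sum_k W_ka
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        WH = W @ H + eps
        # W_ia <- W_ia * sum_u H_au V_iu / (WH)_iu / sum_v H_av
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH); lower means a better fit."""
    return float(np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH))
```

Because the updates only multiply by non-negative factors, W and H stay non-negative without any explicit projection step.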

10
NMF vs. PCA Experiments
  • Dataset: 10 classes of natural textures
  • Clouds, grass, ice, trees, sand, sky, etc.
  • 932 color images total
  • Each image tessellated into 10x10 patches
  • 1000 patches for training, 1000 for testing
  • Each patch classified as a single texture
  • Raw feature vectors: color histograms
  • Each region histogrammed into 8 bins per color,
    16 colors → 512-dimensional vectors
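
One way to arrive at 512 dimensions is a joint histogram with 8 bins on each of three color channels (8 × 8 × 8 = 512). The sketch below assumes that reading and an RGB `uint8` patch; the "16 colors" figure on the slide is unclear and is not modeled. The function name is hypothetical.

```python
import numpy as np

def color_histogram(patch, bins_per_channel=8):
    """Joint RGB histogram of an (h, w, 3) uint8 patch, normalized to sum to 1.

    Assumption: "8 bins per color" means 8 bins per channel, giving 8**3 = 512 bins.
    """
    pixels = patch.reshape(-1, 3).astype(np.int64)
    # Quantize each 0..255 channel value into bins_per_channel levels.
    q = pixels * bins_per_channel // 256
    # Combine the three per-channel bin indices into one joint bin index.
    idx = (q[:, 0] * bins_per_channel + q[:, 1]) * bins_per_channel + q[:, 2]
    hist = np.bincount(idx, minlength=bins_per_channel ** 3).astype(float)
    return hist / hist.sum()
```

The resulting vectors are non-negative, which is what NMF needs as input.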

11
NMF vs. PCA Experiments
  • Learn both NMF and PCA subspaces for each class
    of histogram
  • For both NMF and PCA
  • Project queries onto the learned subspaces of
    each class
  • Label each query by the subspace that best
    reconstructs the query
  • This seems like a poor scheme for NMF
  • (Other experiments allow better schemes)
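
The labeling scheme above can be sketched for the PCA case as follows. This is an illustration under assumptions, not the authors' code: each class subspace is a mean plus the top-k principal directions, and all function names are hypothetical. For NMF the projection step would instead need a non-negative least-squares fit, which is not shown.

```python
import numpy as np

def fit_pca_subspace(X, k):
    """X: (num_samples, dim). Returns (mean, basis) with basis of shape (dim, k)."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T

def reconstruction_error(x, subspace):
    """Distance between x and its projection onto the class subspace."""
    mean, basis = subspace
    centered = x - mean
    projected = basis @ (basis.T @ centered)
    return float(np.linalg.norm(centered - projected))

def classify(x, subspaces):
    """subspaces: dict mapping class label -> (mean, basis)."""
    return min(subspaces, key=lambda label: reconstruction_error(x, subspaces[label]))
```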

12
NMF vs. PCA Results
  • NMF works best for dispersed classes
  • PCA works best for compact classes
  • Both seem useful; try combining them
  • But, why are less than half of the sky vectors
    best reconstructed by PCA when for sky PCA has a
    mean reconstruction error less than 1/4 that of
    NMF? Mistakes?

13
NMF+PCA Experiments
  • During training, we learned whether NMF or PCA
    worked best for each class
  • Project a query to a class using only the method
    that works best for that class
  • Result: 2.3% improvement in the recognition rate
    over NMF alone (PCA: 5.8%), but is this
    significant at 60%?

14
Hierarchy Experiments
  • At level k of the hierarchy, project the query
    onto each original class NMF or PCA subspace
  • But, to choose the direction to descend the
    hierarchy, we only care about the level k
    super-class containing the matching class
  • Furthermore, for each class the choice of PCA vs.
    NMF can be independently set at each level of the
    hierarchy

15
Hierarchy Results
  • 2% improvement in recognition rate
  • I really suspect that this is insignificant, and
    resulting only from the additional degrees of
    freedom
  • They employ various additional neighborhood-based
    hacks to increase their accuracy further, but I
    don't see any relevance to NMF specifically

16
Need for a better metric
  • Want to classify based on nearest neighbor,
    rather than reprojection error
  • Unfortunately, NMF generates a nonorthonormal
    basis, and so the relative distance to a base
    depends on the uniqueness of that base
  • Bases will share a lot of pixels in common areas

17
Earth Mover's Distance (EMD)
  • Defined as the minimal amount of work that must
    be performed to transform one feature
    distribution into the other
  • A special case of the transportation problem
    from linear optimization
  • Let $I$ = set of suppliers, $J$ = set of consumers,
    $c_{ij}$ = cost to ship from $i$ to $j$, $f_{ij}$ = amount shipped
    from $i$ to $j$
  • Distance = cost to make datasets equal
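
The transportation problem described above can be handed to a general linear-programming solver. The sketch below is one such setup, assuming equal total supply and demand (so every unit is shipped) and using `scipy.optimize.linprog`; it is not the code from the paper, and the function name `emd` is hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def emd(supply, demand, cost):
    """Earth Mover's Distance between two equal-mass, non-negative vectors.

    supply: (p,), demand: (q,), cost: (p, q) with cost[i, j] = c_ij.
    Assumption: sum(supply) == sum(demand), so all constraints are equalities.
    """
    supply = np.asarray(supply, dtype=float)
    demand = np.asarray(demand, dtype=float)
    cost = np.asarray(cost, dtype=float)
    p, q = cost.shape
    # Flows f_ij are flattened row-major: variable index = i * q + j.
    A_eq = np.zeros((p + q, p * q))
    for i in range(p):
        A_eq[i, i * q:(i + 1) * q] = 1.0  # sum_j f_ij = supply_i
    for j in range(q):
        A_eq[p + j, j::q] = 1.0           # sum_i f_ij = demand_j
    b_eq = np.concatenate([supply, demand])
    res = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    if not res.success:
        raise ValueError(res.message)
    # Normalize the minimal work by the total amount shipped.
    return res.fun / supply.sum()
```

For example, moving one unit of mass between two bins at unit cost, `emd([1, 0], [0, 1], [[0, 1], [1, 0]])`, costs exactly one unit of work.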

18
Earth Mover's Distance (EMD)
  • Based on finding a measure of correlation between
    bases to define its cost matrix
  • The cost matrix weights the transition of one
    basis (bi) to another (bj)
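  • $c_{ij} = \mathrm{dist}_{\mathrm{angle}}(b_i, b_j) = -\dfrac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$, with $x = b_i$ and $y = b_j$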
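
The cost formula on this slide is partly garbled in the transcript. The sketch below assumes it is the negative cosine of the angle between two basis vectors, and builds the full cost matrix from the columns of W. The function names are hypothetical, and the sign convention should be checked against the paper.

```python
import numpy as np

def dist_angle(x, y):
    """Negative cosine of the angle between x and y (assumed reading of the slide)."""
    return -float(x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

def cost_matrix(W):
    """W: (n, r) with one basis vector per column. Returns the (r, r) matrix c_ij."""
    # Normalize every column to unit length, then take all pairwise dot products.
    unit = W / np.linalg.norm(W, axis=0, keepdims=True)
    return -(unit.T @ unit)
```

Bases that overlap heavily get a cost near -1 and orthogonal bases get a cost of 0, so moving weight between similar bases is cheaper than moving it between dissimilar ones.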

19
EMD Transportation Problem
  • $f_{ij}$ = quantity shipped from $i \to j$
  • Consumers don't ship
  • Don't exceed demand
  • Don't exceed supply
  • Demand must equal supply for EMD to be a metric
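
The equations that these annotations labeled are missing from the transcript. A standard way to write the transportation problem, following the usual EMD formulation, is given below. The symbols $s_i$ (supply at $i$) and $d_j$ (demand at $j$) are introduced here for illustration and may differ from the slide's notation.

```latex
\begin{aligned}
\min_{f}\quad & \sum_{i \in I} \sum_{j \in J} c_{ij} f_{ij} \\
\text{s.t.}\quad
  & f_{ij} \ge 0 \quad \forall i \in I,\ j \in J
    && \text{(consumers don't ship)} \\
  & \sum_{i \in I} f_{ij} \le d_j \quad \forall j \in J
    && \text{(don't exceed demand)} \\
  & \sum_{j \in J} f_{ij} \le s_i \quad \forall i \in I
    && \text{(don't exceed supply)} \\
  & \sum_{i \in I} \sum_{j \in J} f_{ij}
      = \min\Bigl(\sum_{i \in I} s_i,\ \sum_{j \in J} d_j\Bigr)
    && \text{(ship as much as possible)}
\end{aligned}
\qquad
\mathrm{EMD} = \frac{\sum_{i,j} c_{ij} f_{ij}}{\sum_{i,j} f_{ij}}
```

When total supply equals total demand, both inequality constraints hold with equality at the optimum.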

20
EMD vs. Other Experiments
  • Digit recognition from MNIST digit database
  • 60,000 training images, 10,000 for test
  • Classify by NN and 5NN in the subspace
  • Result: EMD works best in low-dimensional
    subspaces, but in high-dimensional subspaces EMD
    does not work well
  • More specifically, EMD works well when the bases
    contain some intersecting pixels
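
Nearest-neighbor classification in the subspace can be sketched with a pluggable distance, so that Euclidean distance, an angle-based distance, or EMD can be swapped in. This is an illustrative sketch; the function name and parameters are hypothetical, and `k=5` corresponds to the 5NN setting.

```python
import numpy as np
from collections import Counter

def knn_classify(query, train, labels, k=1, dist=None):
    """Label `query` by majority vote among its k nearest training vectors.

    train: (num_samples, r) subspace coefficients.
    dist: callable(a, b) -> float; defaults to Euclidean distance.
    """
    if dist is None:
        dist = lambda a, b: float(np.linalg.norm(a - b))
    d = np.array([dist(query, t) for t in train])
    # A stable sort keeps the original order among equally distant samples.
    nearest = np.argsort(d, kind="stable")[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]
```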

21
Occlusion Experiments
  • Randomly occlude either 1 or 2 of the 4 quadrants
    of an image (25% and 50% occlusion)
  • Why does dist_angle do so well?

Best subspace distance with occlusions

                 Low dim.                  High dim.
25% occlusion    NMF+dist_angle            PCA sometimes better
50% occlusion    NMF+dist_angle or EMD     NMF+dist_angle

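
The occlusion procedure can be sketched as below. The fill value (black, 0) and the function name are assumptions; the slides do not say what the occluded pixels are replaced with.

```python
import numpy as np

def occlude_quadrants(image, num_quadrants, rng=None, fill=0):
    """Return a copy of a 2-D image with `num_quadrants` random quadrants set to `fill`.

    num_quadrants=1 gives 25% occlusion and num_quadrants=2 gives 50%.
    """
    if num_quadrants not in (1, 2):
        raise ValueError("num_quadrants must be 1 or 2")
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()
    h, w = image.shape
    rows = [slice(0, h // 2), slice(h // 2, h)]
    cols = [slice(0, w // 2), slice(w // 2, w)]
    quadrants = [(r, c) for r in rows for c in cols]
    # Pick distinct quadrants so that two occlusions never overlap.
    for q in rng.choice(4, size=num_quadrants, replace=False):
        r, c = quadrants[q]
        out[r, c] = fill
    return out
```
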
22
Demo
  • NMF difficulties
  • EMD experiments instead
  • Demonstrate using existing code within the
    desired framework of a cost matrix
  • Their code: http://robotics.stanford.edu/~rubner/emd/default.htm
  • My code: http://www.vialab.org/john/Pres9-code/

23
Conclusion
  • NMF is a parts-based alternative to PCA
  • NMF and PCA should be combined for
    minimum-reprojection-error classification
  • For nearest-neighbor classification, NMF needs a
    better metric
  • When the subspace dimensionality is chosen
    appropriately for good bases, NMF+EMD or
    NMF+dist_angle have the highest recognition rates