Sharing features for multiclass object detection
1
Sharing features for multi-class object detection
  • Antonio Torralba, Kevin Murphy and Bill Freeman
  • MIT
  • Computer Science and Artificial Intelligence
    Laboratory (CSAIL)
  • Oct. 10, 2004

2
Antonio Torralba
Kevin Murphy
3
Goal
  • We want a machine to be able to identify
    thousands of different objects as it looks around
    the world.

4
Multi-class object detection
5
Where is the field?
The problem of multi-class and multi-view object
detection is still largely unsolved.
6
Why multi-object detection is a hard problem
Styles, lighting conditions, etc.
Need to detect Nclasses x Nviews x Nstyles
combinations, in clutter. There is a lot of variability
within classes and across viewpoints.
7
Standard approach for multiclass object detection
(vision community)
Using a set of independent binary classifiers is
the dominant strategy
8
Characteristics of one-vs-all multiclass
approaches: cost
Computational cost grows linearly with Nclasses x
Nviews x Nstyles. Surely, this will not scale
well to 30,000 object classes.
9
Characteristics of one-vs-all approaches:
representation
Part-based object representation (looking for
meaningful parts)
  • A. Agarwal and D. Roth
  • M. Weber, M. Welling and P. Perona


Ullman, Vidal-Naquet, and Sali (2004): features
of intermediate complexity are most informative
for (single-object) classification.
10
Other vision-related approaches
11
Multi-class classifiers (machine learning
community)
  • Error correcting output codes (Dietterich &
    Bakiri, 1995)
  • But these only use classification decisions
    (+1/-1), not real values.
  • Reducing multi-class to binary (Allwein et al.,
    2000)
  • Showed that the best code matrix is
    problem-dependent, but don't address how to
    design the code matrix.
  • Bunching algorithm (Dekel and Singer, 2002)
  • Also learns the code matrix and classifies, but is
    more complicated than our algorithm and not
    applied to object detection.
  • Multitask learning (Caruana, 1997)
  • Trains tasks in parallel to improve
    generalization and share features, but not applied
    to object detection, nor in a boosting framework.

12
Our approach
  • Share features across objects, automatically
    selecting the best sharing pattern.
  • Benefits of shared features
  • Efficiency
  • Sharing computations across classes
  • Accuracy
  • Generalization ability
  • Sharing generic knowledge about detecting objects
    (e.g., distinguishing any object from the background).

13
Independent features
Object class 1
Total number of hyperplanes (features): 4 x 6 =
24. Scales linearly with the number of classes.
14
Shared features
Total number of shared hyperplanes (features): 8.
May scale sub-linearly with the number of classes,
and may generalize better.
15
Aside: the sharing structure is a graph, not a tree.
Sharing graph for the 8 features (feature groups 1-2,
3-4, 5-6 and 7-8 are each shared by a different
subset of classes).
16
At the algorithmic level
  • Our approach is a variation on boosting that
    allows for sharing features in a natural way.
  • So let's review boosting (AdaBoost demo)

17
Boosting demo: application to vision
18
Additive models for classification
19
Feature sharing in additive models
  • It is simple to share terms between additive models
  • Each term hm can be mapped to a single feature
    (see the sketch below)
(G1: terms used only by class 1; G2: terms used only
by class 2; G1,2: terms shared by classes 1 and 2.)
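To make the shared-term bookkeeping concrete, here is a minimal sketch in LaTeX notation (H is the per-class strong classifier, h_m the weak learners; the grouping into G_1, G_{1,2}, G_2 follows the diagram above and is only illustrative):

  % Additive model: the strong classifier for class c sums the weak learners it uses.
  H(v, c) = \sum_{m=1}^{M} h_m(v, c)
  % Two-class sharing example: G_{1,2} collects the terms shared by classes 1 and 2,
  % G_1 and G_2 the class-specific terms.
  H(v, 1) = G_1(v) + G_{1,2}(v), \qquad H(v, 2) = G_2(v) + G_{1,2}(v)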
20
Flavors of boosting
  • Different boosting algorithms use different loss
    functions or minimization procedures
    (Freund & Schapire, 1995; Friedman, Hastie &
    Tibshirani, 1998).
  • We base our approach on gentle boosting, which
    learns faster than the alternatives
    (Friedman, Hastie & Tibshirani, 1998;
    Lienhart, Kuranov & Pisarevsky, 2003).

21
Multi-class Boosting
We use the exponential multi-class cost function

J = \sum_{c=1}^{C} E\left[ e^{-z^c H(v, c)} \right]

where the sum runs over the C classes, H(v, c) = \sum_m h_m(v, c)
is the classifier output for class c, and z^c \in \{-1, +1\}
encodes membership in class c.
22
Weak learners are shared
At each boosting round, we add a perturbation or
weak learner which is shared across some
classes
23
Use Newton's method to select weak learners
Treat hm as a perturbation of the current classifier,
H(v, c) + hm(v, c), and expand the cost J to second
order in hm. Minimizing the quadratic approximation at
each boosting round reduces to minimizing a squared
error in which each training example is reweighted by
the current classifier:

w_i^c = e^{-z_i^c H(v_i, c)}
24
Multi-class Boosting
Weighted squared error over the training data:

J_{wse} = \sum_{c=1}^{C} \sum_{i=1}^{N} w_i^c \left( z_i^c - h_m(v_i, c) \right)^2,
\qquad w_i^c = e^{-z_i^c H(v_i, c)}
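A minimal code sketch of this reweighting and cost (NumPy; the array names z, H and hm are illustrative, not from the slides):

import numpy as np

def weighted_squared_error(z, H, hm):
    """Weighted squared error J_wse for one candidate weak learner.

    z  : (N, C) class memberships in {-1, +1}
    H  : (N, C) current strong-classifier outputs H(v_i, c)
    hm : (N, C) candidate weak-learner outputs h_m(v_i, c)
    """
    w = np.exp(-z * H)                # boosting weights w_i^c = exp(-z_i^c H(v_i, c))
    return np.sum(w * (z - hm) ** 2)  # summed over classes and training examples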
25
Specialize weak learners to decision stumps
(Plot: the stump output hm(v, c) as a step function of
the feature output v, with a threshold.)
26
Find weak learner parameters analytically
For the classes that share the feature, hm(v, c) = a if
v > theta, else b; for the remaining classes hm(v, c) = kc,
a class-specific constant. Given the threshold theta, the
values of a, b and kc that minimize the weighted squared
error are weighted means over the training data (see the
sketch below).
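A minimal sketch of the analytic fit, under the stump form above (the function name, array layout, and returning per-class constants for all classes are illustrative assumptions; the search over candidate thresholds and features is omitted):

import numpy as np

def fit_shared_stump(v, z, w, shared, theta):
    """Closed-form stump parameters for one feature, threshold and sharing set.

    v      : (N,) feature values on the training examples
    z      : (N, C) class memberships in {-1, +1}
    w      : (N, C) boosting weights w_i^c
    shared : (C,) boolean mask of the classes that share this feature
    theta  : stump threshold
    Returns (a, b, k): stump outputs a (v > theta) and b (v <= theta) for the
    shared classes, and per-class constants k for the classes outside the set.
    """
    above = (v > theta)[:, None]               # (N, 1) indicator of v > theta
    ws, zs = w[:, shared], z[:, shared]        # restrict to the shared classes
    a = np.sum(ws * zs * above) / np.sum(ws * above)    # weighted mean where v > theta
    b = np.sum(ws * zs * ~above) / np.sum(ws * ~above)  # weighted mean where v <= theta
    k = np.sum(w * z, axis=0) / np.sum(w, axis=0)       # class-specific constants
    return a, b, k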
27
Joint Boosting: select the sharing pattern and weak
learner that minimize the cost.
Conceptually:
  for all features
    for all class sharing patterns
      find the optimal decision stump, hm(v,c)
    end
  end
  select the hm(v,c) and sharing pattern that minimize
  the weighted squared error Jwse for this boosting round.
28
Example selected weak learner, hm(v,c)
(Per-class stump outputs shown for objects 1-5.)
Algorithm details in Torralba, Murphy and Freeman,
CVPR 2004.
29
Approximate best sharing
To avoid exploring all 2^C - 1 possible sharing
patterns, use best-first (greedy forward) search
(see the code sketch below):
  S = {}                      (grow the set of shared classes, S)
  while length(S) < Nc
    for each object class ci not in S
      for all features hm
        evaluate the cost J of hm shared over S U {ci}
      end
    end
    S = S U {c_min_cost}
  end
  Pick the sharing pattern S and feature hm which gave
  the minimum multi-class cost J.
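A minimal, self-contained sketch of this greedy search (best_stump_cost is an illustrative stand-in for the analytic stump fit of slide 26, not a name from the paper):

def greedy_sharing_search(classes, features, best_stump_cost):
    """Best-first approximation to the optimal sharing pattern for one round.

    classes         : list of class labels
    features        : list of candidate features (e.g. patch/mask pairs)
    best_stump_cost : function(feature, shared_classes) -> weighted squared
                      error Jwse of the best stump for that feature shared
                      over that set of classes
    Returns the best (shared_classes, feature, cost) found along the greedy path.
    """
    shared = []                            # S: classes added so far, in greedy order
    best = (None, None, float("inf"))      # best (sharing pattern, feature, cost) so far
    while len(shared) < len(classes):
        round_best = (None, None, float("inf"))
        for c in classes:
            if c in shared:
                continue
            candidate = shared + [c]       # try adding class c to the shared set
            for f in features:
                cost = best_stump_cost(f, frozenset(candidate))
                if cost < round_best[2]:
                    round_best = (candidate, f, cost)
        shared = round_best[0]             # keep the class that helped most this round
        if round_best[2] < best[2]:
            best = round_best
    return best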
30
The heuristic for approximate best sharing works
well
(C = 9 classes, D = 2 dimensions, synthetic data)
31
Effect of pattern of feature sharing on number of
features required (synthetic example)
32
Effect of pattern of feature sharing on number of
features required (synthetic example)
(best first search heuristic)
33
Database of 2500 images
Annotated instances
34
Now, apply this to images. Image features (weak
learners)
32x32 training image of an object
35
The candidate features
Each candidate feature is defined by a template
(image patch) and a position mask.
36
The candidate features
Dictionary of 2000 candidate patches and position
masks, randomly sampled from the training images
(see the sketch below).
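A minimal sketch of how such a patch-plus-mask feature could be evaluated (an illustrative assumption: the template response is pooled by taking its maximum inside the position mask; the slides do not specify the exact response or pooling the authors use):

import numpy as np
from scipy.signal import correlate2d

def patch_mask_feature(image, patch, mask):
    """Evaluate one candidate feature on a grayscale image window.

    image : 2-D array (e.g. a 32x32 training window)
    patch : small 2-D template sampled from a training image
    mask  : 2-D boolean position mask with the same shape as the image
    Returns a single feature value v.
    """
    response = np.abs(correlate2d(image, patch, mode="same"))  # template response map
    return float(response[mask].max())                         # pool inside the position mask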
37
Multiclass object detection
We use 20 - 50 training samples per object, and
about 20 times as many background examples as
object examples.
38
Feature sharing at each boosting round during
training
39
Feature sharing at each boosting round during
training
40
Example shared feature (weak classifier)
Response histograms for background (blue) and
class members (red)
At each round of running joint boosting on the
training set, we get a feature and a sharing
pattern.
41
How the features were shared across objects
(features sorted left-to-right from generic to
specific)
42
Performance evaluation
ROC curve: correct detection rate vs. false alarm rate;
performance is summarized by the area under the ROC
(a curve with area 0.9 is shown).
43
Performance improvement over training
Significant benefit to sharing features using
joint boosting.
44
ROC curves for our 21-object database
  • How will this work under training-starved or
    feature-starved conditions?
  • Presumably, in the real world, we will always be
    starved for training data and for features.

45
70 features, 20 training examples (left)
46
15 features, 20 training examples (mid)
70 features, 20 training examples (left)
47
70 features, 20 training examples (left)
15 features, 20 training examples (middle)
15 features, 2 training examples (right)
48
Scaling
Joint Boosting shows sub-linear scaling of
features with objects (for area under ROC = 0.9).
Results averaged over 8 training sets, and
different combinations of objects. Error bars
show variability.
49
Red: shared features. Blue: independent
features.
50
Red: shared features. Blue: independent
features.
51
What makes good features?
  • Depends on whether we are doing single-class or
    multi-class detection

52
Generic vs. specific features
53
Shared feature
Non-shared feature
54
Shared feature
Non-shared feature
55
Qualitative comparison of features, for
single-class and multi-class detectors
56
An application of feature sharing: object
clustering
Count the number of common features between objects.
57
Multi-view object detection: train for object and
orientation
Sharing features is a natural approach to
view-invariant object detection.
View invariant features
View specific features
58
Multi-view object detection
The sharing pattern is not a tree; it also depends on
3D symmetries.


59
Multi-view object detection
60
Multi-view object detection
Strong learner H response for a car as a function of
the assumed view angle
61
Visual summary
62
Features
Object units
64
Summary
  • Feature sharing is essential for scaling up object
    detection to many objects and viewpoints.
  • Joint boosting generalizes boosting.
  • Initial results (up to 30 objects) show the
    desired scaling behavior.
  • The shared features
  • generalize better,
  • allow learning from fewer examples,
  • and require fewer features.