Title: Visual Object Recognition

1. Visual Object Recognition
- Bastian Leibe
- Computer Vision Laboratory
- ETH Zurich
- Chicago, 14.07.2008
- Kristen Grauman
- Department of Computer Sciences
- University of Texas at Austin
2. Outline
- Detection with Global Appearance: Sliding Windows
- Local Invariant Features: Detection & Description
- Specific Object Recognition with Local Features
- Coffee Break
- Visual Words: Indexing, Bags-of-Words Categorization
- Matching Local Features
- Part-Based Models for Categorization
- Current Challenges and Research Directions
K. Grauman, B. Leibe
3. Recognition of Object Categories
- We no longer have exact correspondences
- On a local level, we can still detect similar parts
- Represent objects by their parts
  → Bag-of-features
- How can we improve on this?
  → Encode structure
Slide credit: Rob Fergus
4. Part-Based Models
- Fischler & Elschlager, 1973
- Model has two components:
  - parts (2D image fragments)
  - structure (configuration of parts)
5. Different Connectivity Structures
[Figure: graphical models with different part-connectivity structures and their matching complexities: O(N^6), O(N^2), O(N^3), O(N^2). Models: Fergus et al. 03, Fei-Fei et al. 03; Leibe et al. 04, 08, Crandall et al. 05, Fergus et al. 05; Felzenszwalb & Huttenlocher 05; Bouchard & Triggs 05; Carneiro & Lowe 06; Csurka 04, Vasconcelos 00]
from Carneiro & Lowe, ECCV'06
6. Spatial Models Considered Here
- Fully connected shape model
  - e.g. Constellation Model
  - Parts fully connected
  - Recognition complexity: O(N^P)
  - Method: exhaustive search
- Star shape model
  - e.g. ISM
  - Parts mutually independent
  - Recognition complexity: O(NP)
  - Method: Generalized Hough Transform
Slide credit: Rob Fergus
7. Constellation Model
- Joint model for appearance and shape
8. Constellation Model
9. Constellation Model: Learning Procedure
- Goal: find regions and their location, scale, and appearance
- Initialize model parameters
- Use EM and iterate to convergence:
  - E-step: compute assignments for which regions are foreground/background
  - M-step: update model parameters
- Trying to maximize likelihood consistency in shape and appearance
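The E/M loop above can be illustrated with a deliberately simplified sketch: a two-component, 1-D Gaussian mixture that soft-assigns candidate regions to foreground vs. background. This is not the constellation model itself (which jointly models part shape and appearance); the function name and the scalar appearance score are assumptions for illustration only.

```python
import numpy as np

def em_foreground_background(x, iters=50):
    """Toy EM in the spirit of the learning loop above: each candidate
    region's 1-D appearance score x[i] is explained by either a
    'background' (component 0) or a 'foreground' (component 1) Gaussian.
    E-step: soft-assign regions; M-step: re-fit means/variances/weights."""
    mu = np.array([x.min(), x.max()], dtype=float)  # crude initialization
    sig = np.array([x.std() + 1e-3] * 2)
    pi = np.array([0.5, 0.5])
    for _ in range(iters):
        # E-step: responsibilities r[i, k] = p(component k | region i)
        lik = pi * np.exp(-0.5 * ((x[:, None] - mu) / sig) ** 2) / sig
        r = lik / lik.sum(axis=1, keepdims=True)
        # M-step: maximum-likelihood updates given the soft assignments
        n = r.sum(axis=0)
        mu = (r * x[:, None]).sum(axis=0) / n
        sig = np.sqrt((r * (x[:, None] - mu) ** 2).sum(axis=0) / n) + 1e-6
        pi = n / len(x)
    return r.argmax(axis=1), mu
```

Regions with scores near one mode end up softly assigned to that component; the hard labels returned here correspond to the converged responsibilities.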
10. Example: Motorbikes
11. Example: Motorbikes (2)
12. Example: Spotted Cats
13. Discussion: Constellation Model
- Advantages:
  - Works well for many different object categories
  - Can adapt well to categories where
    - shape is more important, or
    - appearance is more important
  - Everything is learned from training data
  - Weakly-supervised training possible
- Disadvantages:
  - Model contains many parameters that need to be estimated
  - Cost increases exponentially with the number of parts
  → Fully connected model restricted to a small number of parts
14. Implicit Shape Model (ISM)
- Basic ideas:
  - Learn an appearance codebook
  - Learn a star-topology structural model
    - Features are considered independent given the object center
- Algorithm: probabilistic Generalized Hough Transform
  - Exact correspondences → probabilistic match to object part
  - NN matching → soft matching
  - Feature location on object → part location distribution
  - Uniform votes → probabilistic vote weighting
  - Quantized Hough array → continuous Hough space
15. Codebook Representation
- Extraction of local object features
  - Interest points (e.g. Harris detector)
  - Sparse representation of the object appearance
- Collect features from the whole training set
- Example
16. Agglomerative Clustering
- Algorithm (average-link):
  - Start with each patch as a cluster of its own
  - Repeat: merge the two most similar clusters X and Y, where the similarity between two clusters is defined as the average similarity between their members
  - Until: no remaining pair of clusters is more similar than a cutoff threshold
- Commonly used similarity measures:
  - Normalized correlation
  - Euclidean distance
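The average-link procedure above can be sketched directly. This is a minimal, clarity-over-efficiency version assuming patch descriptors arrive as rows of a NumPy array and normalized correlation as the similarity measure; the function name and threshold parameter are illustrative.

```python
import numpy as np

def average_link_clustering(patches, sim_threshold):
    """Greedy average-link agglomerative clustering.
    patches: (n, d) array of patch descriptors.
    Similarity: normalized correlation (mean-centered cosine)."""
    # Normalize each descriptor so a dot product equals normalized correlation.
    X = patches - patches.mean(axis=1, keepdims=True)
    X /= np.linalg.norm(X, axis=1, keepdims=True) + 1e-12
    S = X @ X.T                       # pairwise member similarities
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > 1:
        # Find the pair of clusters with the highest average similarity.
        best, best_pair = -np.inf, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = S[np.ix_(clusters[a], clusters[b])].mean()
                if s > best:
                    best, best_pair = s, (a, b)
        if best < sim_threshold:      # Until: no pair is similar enough
            break
        a, b = best_pair
        clusters[a] += clusters.pop(b)  # merge Y into X
    return clusters
```

The cluster means of the result would then be stored as the appearance codebook (next slide); a production implementation would cache and incrementally update the inter-cluster similarities instead of recomputing them.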
17. Appearance Codebook
- Clustering results:
  - Visual similarity preserved
  - Wheel parts, window corners, fenders, ...
- Store cluster centers as the Appearance Codebook
18. Generalized Hough Transform with Local Features
- For every feature, store possible occurrences:
  - Object identity
  - Pose
  - Relative position
- For a new image, let the matched features vote for possible object positions
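The voting step can be sketched under simplifying assumptions: 2-D object position only, each matched feature's vote mass split uniformly over its stored occurrences, and a coarse binned accumulator. Function and variable names are made up for illustration; the real ISM uses continuous voting and probabilistic weights.

```python
from collections import defaultdict

def hough_vote(matches, occurrences, bin_size=10.0):
    """matches: list of (codebook_id, (fx, fy)) features found in a test image.
    occurrences: dict codebook_id -> list of (dx, dy) training offsets from
    the feature to the object center. Returns the best coarse center and
    its accumulated vote mass."""
    acc = defaultdict(float)
    for cid, (fx, fy) in matches:
        occs = occurrences.get(cid, [])
        if not occs:
            continue
        w = 1.0 / len(occs)  # split this feature's vote mass over its occurrences
        for dx, dy in occs:
            cx, cy = fx + dx, fy + dy  # vote for this object-center position
            acc[(int(cx // bin_size), int(cy // bin_size))] += w
    if not acc:
        return None, 0.0
    bx, by = max(acc, key=acc.get)
    # Report the center of the winning bin (coarse, accurate to ~bin_size/2).
    return (bx * bin_size + bin_size / 2, by * bin_size + bin_size / 2), acc[(bx, by)]
```

Features that agree on an object center pile their votes into the same bin, so consistent configurations produce a sharp accumulator maximum even amid clutter.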
19. Implicit Shape Model: Representation
- Learn appearance codebook
  - Extract local features at interest points
  - Agglomerative clustering → codebook
- Learn spatial distributions
  - Match codebook to training images
  - Record matching positions on the object, plus local figure-ground labels
20. Leibe04, Leibe08
21. Implicit Shape Model: Recognition
- Interest points
Leibe04, Leibe08
22. Implicit Shape Model: Recognition
- Interest points
Leibe04, Leibe08
23. Leibe04, Leibe08
24. Example Results on Cows
25. Example Results on Cows
26. Example Results on Cows
27. Example Results on Cows
28. Example Results on Cows: 1st hypothesis
29. Example Results on Cows: 2nd hypothesis
30. Example Results on Cows: 3rd hypothesis
31. Scale-Invariant Voting
- Scale-invariant feature selection
  - Scale-invariant interest points
  - Rescale extracted patches
  - Match to constant-size codebook
- Generate scale votes
  - Scale as 3rd dimension in the voting space
  - Search for maxima in the 3D voting space
32. Scale Voting: Adaptive Search Window
- Voting equations
  → Relative error, proportional to the hypothesis scale
  → Vote density decreases with increasing scale
- Adapt the search window:
  - Increase its size with the hypothesis scale
  - Intuitive interpretation: detection tolerance
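The voting equations referenced on this slide are not reproduced in the text; reconstructed from the ISM formulation (assuming a detected feature at position $(x_f, y_f)$ with scale $s_f$, matched to a stored training occurrence at $(x_{\mathrm{occ}}, y_{\mathrm{occ}})$ with scale $s_{\mathrm{occ}}$), they read:

```latex
x_{\mathrm{vote}} = x_f - x_{\mathrm{occ}} \left( s_f / s_{\mathrm{occ}} \right)
\qquad
y_{\mathrm{vote}} = y_f - y_{\mathrm{occ}} \left( s_f / s_{\mathrm{occ}} \right)
\qquad
s_{\mathrm{vote}} = s_f / s_{\mathrm{occ}}
```

A fixed error in the stored occurrence is multiplied by $s_f / s_{\mathrm{occ}}$, which is why the voting error is relative and proportional to the hypothesis scale, and why the search window must grow with it.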
33. Scale Voting: Efficient Computation
- Mean-shift formulation for refinement
- Scale-adaptive balloon density estimator applied to the scale votes
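A balloon density estimator in the above spirit can be sketched as follows: the kernel bandwidth grows with the hypothesis scale, so detection tolerance adapts to object size. The uniform kernel, the isotropic distance over (x, y, s), and the `b0` parameter are simplifying assumptions for illustration, not the paper's exact kernel.

```python
import numpy as np

def balloon_score(hyp, votes, weights, b0=1.0):
    """Scale-adaptive balloon density estimate at hypothesis hyp = (x, y, s).
    votes: (n, 3) array of (x, y, s) votes; weights: (n,) vote weights.
    The bandwidth b grows linearly with the hypothesis scale s."""
    hyp = np.asarray(hyp, dtype=float)
    b = b0 * hyp[2]                       # bandwidth proportional to scale
    d2 = ((votes - hyp) ** 2).sum(axis=1) / b ** 2
    K = np.where(d2 < 1.0, 1.0, 0.0)      # uniform kernel with radius b
    return (weights * K).sum() / b ** 3   # normalize by kernel volume ~ b^3
```

Mean-shift refinement would then iteratively move the hypothesis toward the weighted mean of the votes inside the current kernel until the score stops increasing.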
34. Leibe04, Leibe08
35. Detection Results
- Qualitative performance:
  - Recognizes different kinds of objects
  - Robust to clutter, occlusion, noise, low contrast
36. Figure-Ground Segregation
- Problem extensively studied in psychophysics
  - Experiments with ambiguous figure-ground stimuli
- Results:
  - Evidence that object recognition can and does operate before figure-ground organization
  - Interpreted as the Gestalt cue "familiarity"
M.A. Peterson, "Object Recognition Processes Can and Do Operate Before Figure-Ground Organization", Current Directions in Psychological Science, 3:105-111, 1994.
37. ISM Top-Down Segmentation
Leibe04, Leibe08
38. Segmentation: Probabilistic Formulation
- Influence of a patch on the object hypothesis (vote weight)
- Backprojection to features f and pixels p
Leibe04, Leibe08
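The two quantities named above can be written out; this is a reconstruction from the cited ISM papers, with $C_i$ the codebook entries, $f$ a feature observed at location $\ell$, and $(o_n, x)$ the object hypothesis. The vote weight marginalizes the soft codebook match:

```latex
p(o_n, x \mid f, \ell) = \sum_i p(o_n, x \mid C_i, \ell)\, p(C_i \mid f)
```

and the backprojected per-pixel figure probability sums, over all features $(f, \ell)$ containing pixel $\mathbf{p}$, the stored figure-ground label weighted by each vote's contribution to the hypothesis:

```latex
p(\mathbf{p} = \mathrm{figure} \mid o_n, x)
= \sum_{\mathbf{p} \in (f,\ell)} \sum_i
  p(\mathbf{p} = \mathrm{figure} \mid o_n, x, C_i, \ell)\,
  \frac{p(o_n, x \mid C_i, \ell)\, p(C_i \mid f)\, p(f, \ell)}{p(o_n, x)}
```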
39. Segmentation: Probabilistic Formulation
- Hypothesis generation
- Segmentation
Leibe04, Leibe08
40. Derivation: Top-Down Segmentation
Leibe04, Leibe08
41. Derivation: Top-Down Segmentation
Leibe04, Leibe08
42. Derivation: Top-Down Segmentation
- Hypothesis generation
- Segmentation
Leibe04, Leibe08
43. Derivation: Top-Down Segmentation
- Hypothesis generation
- Segmentation
Leibe04, Leibe08
44. Derivation: Top-Down Segmentation
- Hypothesis generation
- Segmentation
Leibe04, Leibe08
45. Leibe04, Leibe08
46. Segmentation
- Interpretation of the p(figure) map:
  - Per-pixel confidence in the object hypothesis
  - Use for hypothesis verification
Leibe04, Leibe08
47. Example Results: Motorbikes
48. Example Results: Cows
- Training: 112 hand-segmented images
- Results on novel sequences
  - Single-frame recognition (no temporal continuity used!)
Leibe04, Leibe08
49. Example Results: Chairs
- Dining room chairs
- Office chairs
50. Inferring Other Information: Part Labels
Thomas07
51. Inferring Other Information: Part Labels (2)
Thomas07
52. Inferring Other Information: Depth Maps
- Depth from a single image
Thomas07
53. Application: Pedestrian Detection
- Estimating articulation
- Rotation-invariant detection
Leibe, Seemann, Schiele, CVPR'05
Mikolajczyk, Leibe, Schiele, CVPR'06
54. Outline
- Detection with Global Appearance: Sliding Windows
- Local Invariant Features: Detection & Description
- Specific Object Recognition with Local Features
- Coffee Break
- Visual Words: Indexing, Bags-of-Words Categorization
- Matching Local Features
- Part-Based Models for Categorization
- Current Challenges and Research Directions