Title: Visual Object Recognition
1Visual Object Recognition
- Rob Fergus
- Courant Institute, New York University
http://cs.nyu.edu/fergus/icml_tutorial/
2Agenda
- Introduction
- Bag-of-words models
- Visual words with spatial location
- Part-based models
- Discriminative methods
- Segmentation and recognition
- Recognition-based image retrieval
- Datasets and Conclusions
3Recognizing and Learning Object Categories: Year 2007
- Li Fei-Fei, Princeton
- Rob Fergus, NYU
- Antonio Torralba, MIT
http://people.csail.mit.edu/torralba/shortCourseRLOC
4Agenda
- Introduction
- Bag-of-words models
- Visual words with spatial location
- Part-based models
- Discriminative methods
- Segmentation and recognition
- Recognition-based image retrieval
- Datasets and Conclusions
5So what does object recognition involve?
6Classification: are there street-lights in the image?
7Detection: localize the street-lights in the image
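The difference between the two tasks above shows up in what each one returns. The following is a minimal sketch, assuming a hypothetical detector that has already produced scored boxes; the `Detection` type, the scores, and the 0.5 threshold are illustrative assumptions, not taken from the slides.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """Hypothetical scored bounding box (pixels, origin at top-left)."""
    x: int
    y: int
    w: int
    h: int
    score: float


def classify(detections: List[Detection], threshold: float = 0.5) -> bool:
    """Classification: is at least one instance of the category present?"""
    return any(d.score >= threshold for d in detections)


def detect(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Detection: where are the instances? Returns the confident boxes."""
    return [d for d in detections if d.score >= threshold]


# Made-up detector output for a single image.
candidates = [
    Detection(40, 10, 12, 60, 0.91),
    Detection(200, 15, 10, 55, 0.78),
    Detection(120, 90, 30, 30, 0.12),
]

print(classify(candidates))     # one yes/no answer per image
print(len(detect(candidates)))  # one box per confident instance
```

Classification collapses the image to a single answer, while detection keeps the location of every confident instance.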
8Object categorization
mountain
tree
building
banner
street lamp
vendor
people
9Scene and context categorization
10Application: Assisted driving
Pedestrian and car detection
Lane detection
- Collision warning systems with adaptive cruise control
- Lane departure warning systems
- Rear object detection systems
11Application: Computational photography
12Application: Improving online search
Query: STREET
Organizing photo collections
13Challenges 1: viewpoint variation
Michelangelo 1475-1564
14Challenges 2: scale
15Challenges 3: illumination
Slide credit: S. Ullman
16Challenges 4: background clutter
Bruegel, 1564
17Challenges 5: occlusion
http://lh5.ggpht.com/_wJc6t2hDl2M/RrL7Gh6sS7I/AAAAAAAAAYY/n3xaHc2opls/DSC00633.JPG
18Challenges 6: deformation
http://img.timeinc.net/time/asia/magazine/2007/1112/racehorse_1112.jpg
Xu, Beihong 1943
19History: single object recognition
Object 1
Object 2
Object 3
20Single object recognition history: Geometric methods
David Lowe 1985
Rothwell et al. 1992
21Single object recognition history: Appearance-based methods
- Murase and Nayar 1995
- Schmid and Mohr 1997
- Lowe et al. 1999, 2003
- Mahamud and Hebert, 2000
- Ferrari et al. 2004
- Rothganger et al. 2004
- Moreels and Perona, 2005
- ...
22Challenges 7: intra-class variation
Shoe class
Instance 1
Instance 2
Instance 3
23History: early object categorization
24
- Fischler and Elschlager, 1973
- Turk and Pentland, 1991
- Belhumeur, Hespanha, Kriegman, 1997
- Rowley and Kanade, 1998
- Schneiderman and Kanade, 2004
- Viola and Jones, 2000
- Heisele et al., 2001
- Amit and Geman, 1999
- LeCun et al. 1998
- Belongie and Malik, 2002
- DeCoste and Schölkopf, 2002
- Simard et al. 2003
- Poggio et al. 1993
- Agarwal and Roth, 2002
- ...
25 10,000 to 30,000
26Three main issues
- Representation
  - How to represent an object category
- Learning
  - How to form the classifier, given training data
- Recognition
  - How the classifier is to be used on novel data
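The three issues above can be read as three functions of one pipeline. The following is a minimal sketch under stated assumptions: images are reduced to lists of hypothetical feature labels, the representation is a normalized histogram, and the classifier is a plain nearest-mean rule. None of these choices come from the slides; they only make the division of labour concrete.

```python
from collections import Counter
from typing import Dict, List

Histogram = Dict[str, float]


def represent(features: List[str]) -> Histogram:
    """Representation: normalized histogram of feature labels."""
    counts = Counter(features)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


def learn(training: Dict[str, List[List[str]]]) -> Dict[str, Histogram]:
    """Learning: average the histograms of each category's training images."""
    model: Dict[str, Histogram] = {}
    for category, images in training.items():
        mean: Histogram = {}
        for image in images:
            for label, value in represent(image).items():
                mean[label] = mean.get(label, 0.0) + value / len(images)
        model[category] = mean
    return model


def recognize(model: Dict[str, Histogram], features: List[str]) -> str:
    """Recognition: return the category whose mean histogram is nearest (L1)."""
    query = represent(features)

    def distance(mean: Histogram) -> float:
        labels = set(query) | set(mean)
        return sum(abs(query.get(k, 0.0) - mean.get(k, 0.0)) for k in labels)

    return min(model, key=lambda category: distance(model[category]))


# Made-up training data: each inner list stands for one image.
training = {
    "motorbike": [["wheel", "wheel", "handlebar"], ["wheel", "seat", "handlebar"]],
    "face": [["eye", "eye", "nose"], ["eye", "mouth", "nose"]],
}
model = learn(training)
print(recognize(model, ["wheel", "handlebar", "eye"]))
```

Swapping any one function (a different representation, a different learner, a different decision rule) leaves the other two interfaces unchanged, which is why the issues can be discussed separately.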
27Representation
- Generative / discriminative / hybrid
28Representation
- Generative / discriminative / hybrid
- Appearance only or location and appearance
29Representation
- Generative / discriminative / hybrid
- Appearance only or location and appearance
- Invariances
  - Viewpoint
  - Illumination
  - Occlusion
  - Scale
  - Deformation
  - Clutter
  - etc.
30Representation
- Generative / discriminative / hybrid
- Appearance only or location and appearance
- Invariances
- Part-based or global with sub-window
31Representation
- Generative / discriminative / hybrid
- Appearance only or location and appearance
- Invariances
- Parts or global w/sub-window
- Use set of features or each pixel in image
32Learning
- Unclear how to model categories, so learn rather than manually specify
33Learning
- Unclear how to model categories, so learn rather than manually specify
- Methods of training: generative vs. discriminative
34Learning
- Unclear how to model categories, so learn rather than manually specify
- Methods of training: generative vs. discriminative
- Level of supervision
  - Manual segmentation; bounding box; image labels; noisy labels
Contains a motorbike
35Learning
- Unclear how to model categories, so learn rather than manually specify
- Methods of training: generative vs. discriminative
- Level of supervision
  - Manual segmentation; bounding box; image labels; noisy labels
- Training images
  - Issue of over-fitting (typically limited training data)
  - Negative images for discriminative methods
36Learning
- Unclear how to model categories, so learn rather than manually specify
- Methods of training: generative vs. discriminative
- Level of supervision
  - Manual segmentation; bounding box; image labels; noisy labels
- Training images
  - Issue of over-fitting (typically limited training data)
  - Negative images for discriminative methods
- Priors
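The generative vs. discriminative distinction, the need for negative examples, and the role of priors can be illustrated on a toy problem. The following is a minimal sketch, assuming a single scalar feature per image, a Gaussian class-conditional model for the generative route, and logistic regression trained by gradient descent for the discriminative route. The data, function names, and hyperparameters are illustrative assumptions, not taken from the slides.

```python
import math
from typing import List, Tuple


def fit_gaussian(xs: List[float]) -> Tuple[float, float]:
    """Maximum-likelihood mean and variance of 1-D samples."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var


def log_gaussian(x: float, mu: float, var: float) -> float:
    """Log density of a 1-D Gaussian."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


def generative_predict(x: float, pos: List[float], neg: List[float],
                       prior_pos: float = 0.5) -> bool:
    """Generative: model p(x | class) per class, then apply Bayes' rule."""
    log_pos = log_gaussian(x, *fit_gaussian(pos)) + math.log(prior_pos)
    log_neg = log_gaussian(x, *fit_gaussian(neg)) + math.log(1.0 - prior_pos)
    return log_pos > log_neg


def train_logistic(pos: List[float], neg: List[float],
                   steps: int = 2000, lr: float = 0.1) -> Tuple[float, float]:
    """Discriminative: fit p(class | x) directly; negatives are required."""
    data = [(x, 1.0) for x in pos] + [(x, 0.0) for x in neg]
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in data:
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            grad_w += (p - y) * x
            grad_b += p - y
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return w, b


def discriminative_predict(x: float, w: float, b: float) -> bool:
    """Decide by the sign of the learned linear score."""
    return w * x + b > 0


# Made-up feature values for object (pos) and background (neg) images.
pos = [2.0, 2.5, 3.0, 3.5]
neg = [-1.0, -0.5, 0.0, 0.5]
w, b = train_logistic(pos, neg)

print(generative_predict(1.3, pos, neg))                 # equal priors
print(generative_predict(1.3, pos, neg, prior_pos=0.1))  # rare-object prior
print(discriminative_predict(3.0, w, b))
```

In this sketch the generative model could be fitted to each class separately, whereas the discriminative model only makes sense once negative examples are supplied. The prior enters the generative decision as an explicit term, so a borderline input can flip when the object is assumed to be rare.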
37Recognition
- Scale / orientation range to search over
- Speed
- Context
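The first two points interact: every extra scale (or orientation) in the search range multiplies the number of sub-windows the classifier must evaluate. The following is a minimal sketch of a sliding-window enumeration, assuming square windows, a hypothetical 64-pixel base size, and a stride of a quarter of the window size; all of these parameters are illustrative assumptions.

```python
from typing import Iterator, Sequence, Tuple


def sliding_windows(img_w: int, img_h: int, base: int = 64,
                    scales: Sequence[float] = (1.0, 1.5, 2.0),
                    stride_frac: float = 0.25) -> Iterator[Tuple[int, int, int]]:
    """Yield (x, y, size) for every square window at every scale."""
    for scale in scales:
        size = int(round(base * scale))
        stride = max(1, int(size * stride_frac))
        # Only keep windows that lie fully inside the image.
        for y in range(0, img_h - size + 1, stride):
            for x in range(0, img_w - size + 1, stride):
                yield x, y, size


# Number of classifier evaluations for a 320x240 image.
one_scale = sum(1 for _ in sliding_windows(320, 240, scales=(1.0,)))
three_scales = sum(1 for _ in sliding_windows(320, 240))
print(one_scale, three_scales)
```

The count grows with both image size and the number of scales searched, which is why the search range and the speed of a single evaluation have to be considered together.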
38Recognition
- Context enables pruning of detector output
Hoiem, Efros, Hebert, 2006
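As a rough illustration of how context can prune detector output, the sketch below discards candidate boxes for ground-standing objects whose bottom edge lies above an assumed horizon line. This is a deliberately crude stand-in chosen for brevity and is not the method of the cited work; the `Box` type, the horizon value, and the example boxes are hypothetical.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Hypothetical detector output (pixels, y grows downward)."""
    x: int
    y: int
    w: int
    h: int
    score: float


def prune_by_horizon(boxes: List[Box], horizon_y: int) -> List[Box]:
    """Keep boxes whose bottom edge lies below the horizon line.

    Assumes the objects stand on the ground plane, so a box that ends
    above the horizon is treated as implausible and dropped.
    """
    return [box for box in boxes if box.y + box.h > horizon_y]


# Made-up pedestrian candidates in a 240-pixel-high image.
candidates = [
    Box(50, 100, 20, 60, 0.80),   # feet at y=160, below the horizon
    Box(150, 20, 20, 60, 0.75),   # feet at y=80, floating in the sky
    Box(220, 110, 15, 45, 0.60),  # feet at y=155, below the horizon
]
kept = prune_by_horizon(candidates, horizon_y=120)
print(len(candidates), len(kept))
```

Even this single scene-level cue removes a confident but implausible candidate without re-running the detector.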