Title: Visual Object Recognition
1 Visual Object Recognition
- Bastian Leibe
- Computer Vision Laboratory
- ETH Zurich
- Chicago, 14.07.2008
- Kristen Grauman
- Department of Computer Sciences
- University of Texas at Austin
2 Outline
- Detection with Global Appearance & Sliding Windows
- Local Invariant Features: Detection & Description
- Specific Object Recognition with Local Features
- Coffee Break
- Visual Words: Indexing, Bags of Words, Categorization
- Matching Local Features
- Part-Based Models for Categorization
- Current Challenges and Research Directions
- Highlight of some research topics not covered in the main tutorial
K. Grauman, B. Leibe
3 Benchmark Data
- What degree of difficulty do current datasets have?
4 Example: Caltech-101
A dataset that has been all but mastered
Images from Caltech-101: a 101-way multi-class classification problem
5 Example: Caltech-256
Images from Caltech-256: a 256-way multi-class recognition problem
6 Example: Pascal Visual Object Classes Challenge
Pascal VOC 2007: binary detection problems
http://pascallin.ecs.soton.ac.uk/challenges/VOC/
7 Example: LabelMe
http://labelme.csail.mit.edu/
8 Current challenges, ongoing research
- Multi-cue integration
- Finer level categorization
- View invariant recognition
- Unsupervised category discovery
- Learning from noisily labeled images
- Integration of segmentation and recognition
- Learning with text and images/video
- Use of video
- Context and scene layout
9 Multi-cue integration
- Single cues often not sufficient.
- Integrate multiple local and global cues.
10 Multi-Category Discrimination
- Distinguish similar categories.
- Need to look at specific details!
11 Multi-Aspect Recognition
- Detectors for different viewpoints → How can this be improved?
12 Multi-Aspect Recognition
Hoiem, Rother, Winn, CVPR07
13 Multi-Aspect Recognition
Rothganger et al., CVPR03
Savarese & Fei-Fei, ICCV07
14 Unsupervised, semi-supervised category discovery
Topic models for images
Latent Dirichlet Allocation (LDA)
[Figure: LDA plate diagram with variables c, z, w and plates N, D]
Sivic et al. ICCV 2005, Fei-Fei et al. ICCV 2005
Figure credit Fei-Fei Li
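As a rough illustration of the LDA model named on this slide, the generative process for one image, treated as a bag of visual words, can be sketched as follows. The function names, the toy topic-word table, and the Dirichlet parameters are hypothetical; this is a sketch of standard LDA sampling under those assumptions, not the cited authors' code.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalised Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Draw an index i with probability probs[i]."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1  # guard against floating-point round-off

def generate_image_words(topic_word, alpha, n_words, rng):
    """Generate the visual words of one image under LDA.

    topic_word[k][v] is the probability of visual word v under topic k
    (an assumed, already-learned table). alpha is the Dirichlet prior
    over the per-image topic mixture.
    """
    theta = sample_dirichlet(alpha, rng)          # per-image topic mixture
    words = []
    for _ in range(n_words):
        z = sample_categorical(theta, rng)        # latent topic of this word
        w = sample_categorical(topic_word[z], rng)  # observed visual word
        words.append(w)
    return theta, words

# Toy example: 2 topics over a vocabulary of 4 visual words.
rng = random.Random(0)
topic_word = [[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]]
theta, words = generate_image_words(topic_word, [0.5, 0.5], 20, rng)
```

Unsupervised category discovery runs this process in reverse: given only the observed words of many images, it infers the topic mixtures and the topic-word table.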
15 Unsupervised, semi-supervised category discovery
Clustering cluttered images
Learning from noisy keyword-based image search results
Grauman & Darrell, CVPR 2006
Fergus et al. ECCV 2004, ICCV 2005
Li Fei-Fei, CVPR 2007
16 Learning with text and images/video
Barnard et al. JMLR 2003
Berg, Berg, Edwards, Forsyth, NIPS 2006
Gupta et al. ECML 2008
17 Integrating segmentation & recognition
Borenstein & Ullman, ECCV 2002
Kumar et al. CVPR 2005
Kannan, Winn, Rother, NIPS 2006
Tu, Chen, Yuille, Zhu, ICCV 2003
18 Role of context, understanding scene layout
Antonio Torralba, IJCV 2003
19 Role of context, understanding scene layout
Image
World
Hoiem, Efros, Hebert, CVPR 2006
20 Integration with Scene Geometry
- Goal: find the ground plane
- Restrict object location
- Assume Gaussian size prior
- → Significantly reduced search space
Search corridor
Hough Volume
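The ground-plane constraint and size prior listed on this slide can be illustrated with a minimal sketch. It assumes a pinhole camera with a roughly horizontal optical axis at a known height above a flat ground plane, so that an object's real-world height follows from its image rows relative to the horizon. The function names, the 1.7 m mean, the 0.1 m standard deviation, and the rejection threshold are illustrative assumptions, not values from the tutorial.

```python
import math

def world_height(v_top, v_bottom, v_horizon, cam_height):
    """Approximate real-world height (metres) of an object standing on the ground.

    Image rows grow downwards, so the feet (v_bottom) must lie below the
    horizon row (v_bottom > v_horizon). Assumes a horizontal optical axis.
    """
    if v_bottom <= v_horizon:
        raise ValueError("object base must lie below the horizon")
    return cam_height * (v_bottom - v_top) / (v_bottom - v_horizon)

def size_prior(height_m, mean=1.7, sigma=0.1):
    """Unnormalised Gaussian size prior, equal to 1.0 at the mean height."""
    z = (height_m - mean) / sigma
    return math.exp(-0.5 * z * z)

def filter_boxes(boxes, v_horizon, cam_height, min_prior=0.05):
    """Keep only (v_top, v_bottom) candidates that are plausible pedestrians."""
    kept = []
    for v_top, v_bottom in boxes:
        if v_bottom <= v_horizon:
            continue  # base is not on the ground plane
        h = world_height(v_top, v_bottom, v_horizon, cam_height)
        if size_prior(h) >= min_prior:
            kept.append((v_top, v_bottom))
    return kept

# Camera 1.7 m above ground, horizon at image row 200.
candidates = [(200, 400), (350, 400), (50, 150)]
plausible = filter_boxes(candidates, v_horizon=200, cam_height=1.7)
```

Only the first candidate survives here: the second would correspond to an object about 0.4 m tall, and the third lies entirely above the horizon. Rejecting such candidates before running a detector is what shrinks the search space.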
21 Extensions
- Combination with 3D Geometry
- Mobile Pedestrian Detection
Leibe, Cornelis, Cornelis, Van Gool, CVPR07
Ess, Leibe, Van Gool, ICCV07
22 Detections Using Ground Plane Constraints
Left camera, 1175 frames
Leibe et al. CVPR07
23 Extensions: Tracking-by-Detection
- Spacetime trajectory analysis
- Link up detections to form physically plausible spacetime (ST) trajectories
- Select the set of ST trajectories that best explains the data
Leibe et al. CVPR07
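The linking step described on this slide can be sketched in a deliberately simplified form. The code below greedily joins each trajectory to the nearest detection in the next frame, and treats a link as physically plausible only if the implied displacement stays under a maximum step. This greedy rule, the function name, and the `max_step` parameter are assumptions made for illustration; it is a stand-in for, not a reproduction of, the trajectory selection used in the cited work.

```python
import math

def link_detections(frames, max_step):
    """Greedily link per-frame detections into trajectories.

    frames: list (one entry per time step) of lists of (x, y) ground-plane
            positions.
    max_step: largest displacement between consecutive frames that is
              still considered physically plausible.
    Returns a list of trajectories, each a list of (t, x, y) tuples.
    """
    trajectories = []
    active = []  # trajectories that were extended in the previous frame
    for t, detections in enumerate(frames):
        unused = list(detections)
        next_active = []
        for traj in active:
            _, px, py = traj[-1]
            best, best_d = None, max_step
            for det in unused:
                d = math.hypot(det[0] - px, det[1] - py)
                if d <= best_d:
                    best, best_d = det, d
            if best is not None:
                unused.remove(best)
                traj.append((t, best[0], best[1]))
                next_active.append(traj)
        # Every unmatched detection starts a new trajectory.
        for det in unused:
            traj = [(t, det[0], det[1])]
            trajectories.append(traj)
            next_active.append(traj)
        active = next_active
    return trajectories

frames = [[(0.0, 0.0), (10.0, 0.0)],
          [(0.5, 0.0), (10.5, 0.0)],
          [(1.0, 0.0)]]
tracks = link_detections(frames, max_step=1.0)
```

A greedy pass like this commits to each link immediately. Selecting the set of trajectories that best explains all detections jointly, as the slide describes, requires scoring competing hypotheses against each other rather than linking frame by frame.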
24 Dynamic Scene Analysis Results
Leibe et al. CVPR07
25 Extensions (2)
- Combination with 3D Reconstruction
Cornelis, Leibe, Cornelis, Van Gool, 3DPVT06
26 Textured 3D Model
- Run-times
- SfM and bundle adjustment: 27-30 fps on CPU
- Dense reconstruction: 36 fps on GPU
Cornelis, Cornelis, Van Gool, CVPR06
27 Improved 3D City Model
- Enhancing your driving experience
Cornelis, Leibe, Cornelis, Van Gool, 3DPVT06
28 Putting It All Together
29 Mobile Pedestrian Tracking
Ess, Leibe, Schindler, Van Gool, CVPR08
30 Mobile Tracking Through Crowds
Ess, Leibe, Schindler, Van Gool, CVPR08
31 Extension: Recovering Articulations
- Idea: only perform articulated tracking where it's easy!
- Multi-person tracking
- Solves hard data association problem
- Articulated tracking
- Only on individual tracklets between occlusions
Gammeter, Ess, Jaeggli, Schindler, Leibe, Van Gool, ECCV08
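The idea of restricting articulated tracking to occlusion-free tracklets can be sketched as a simple segmentation of one person's trajectory. The sketch assumes that the multi-person tracker supplies a per-frame occlusion flag; the function name and the `min_length` parameter are hypothetical and not taken from the cited paper.

```python
def split_into_tracklets(occluded, min_length=1):
    """Split one trajectory into occlusion-free tracklets.

    occluded: list of booleans, one per frame, True where the person is
              occluded (assumed to come from the multi-person tracker).
    min_length: shortest tracklet (in frames) worth passing on to the
                articulated tracker.
    Returns a list of (start, end) frame index pairs, end inclusive.
    """
    tracklets = []
    start = None
    for t, is_occluded in enumerate(occluded):
        if not is_occluded and start is None:
            start = t                      # a visible run begins
        elif is_occluded and start is not None:
            if t - start >= min_length:
                tracklets.append((start, t - 1))
            start = None                   # the visible run ends
    if start is not None and len(occluded) - start >= min_length:
        tracklets.append((start, len(occluded) - 1))
    return tracklets

flags = [False, False, True, False, False, False, True, True, False]
segments = split_into_tracklets(flags)
long_segments = split_into_tracklets(flags, min_length=2)
```

Articulated pose estimation would then run independently on each returned frame range, so it never has to bridge an occlusion.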
32 Articulated Multi-Person Tracking
- Multi-Person tracking
- Recovers trajectories and solves data association
- Estimates 3D walking direction and speed
- Detects occlusion events
Gammeter, Ess, Jaeggli, Schindler, Leibe, Van Gool, ECCV08
33 Articulated Tracking under Egomotion
Gammeter, Ess, Jaeggli, Schindler, Leibe, Van Gool, ECCV08
35 Summary
- Visual recognition is a challenging and very active research area.
- We've covered some basic models and representations that have been shown to be effective, and highlighted some ongoing issues.
- See tutorial website for slides, links, references.
- http://www.vision.ee.ethz.ch/bleibe/teaching/tutorial-aaai08/
- Thank you!