Title: The Beauty of Local Invariant Features
1. The Beauty of Local Invariant Features
- Svetlana Lazebnik
- Beckman Institute, University of Illinois at Urbana-Champaign
IMA Recognition Workshop, University of Minnesota, May 22, 2006
2. What are Local Invariant Features?
- Descriptors of image patches that are invariant to certain classes of geometric and photometric transformations
Lowe (2004)
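To make "invariant to photometric transformations" concrete, here is a minimal sketch, not taken from the talk. It assumes the photometric change is an affine intensity change a*I + b with positive gain a, and it uses a plain list of pixel values as the patch. The function name normalize_patch and the sample values are illustrative assumptions.

```python
import math

def normalize_patch(patch):
    """Return a zero-mean, unit-norm copy of a flat list of pixel intensities."""
    mean = sum(patch) / len(patch)
    centered = [p - mean for p in patch]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        # A constant patch carries no pattern; return all zeros.
        return [0.0] * len(patch)
    return [c / norm for c in centered]

# A 3x3 patch flattened to a list (illustrative values).
patch = [10.0, 12.0, 40.0, 35.0, 8.0, 20.0, 22.0, 31.0, 5.0]

# The same patch under an assumed affine intensity change: gain 1.8, offset 25.
brighter = [1.8 * p + 25.0 for p in patch]

d1 = normalize_patch(patch)
d2 = normalize_patch(brighter)

# Subtracting the mean cancels the offset and dividing by the norm cancels the
# gain, so the two descriptors agree up to floating-point rounding.
max_diff = max(abs(x - y) for x, y in zip(d1, d2))
```

Geometric invariance is handled separately, by detecting a region that transforms together with the image and normalizing the patch to that region before it is described.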
3. A Historical Perspective
4. Feature Detection and Description
1. Detect regions
covariant detection
5. Advantages
- Locality
  - Robustness to clutter and occlusion
- Repeatability
  - The same feature occurs in multiple images of the same scene or class
- Distinctiveness
  - Salient appearance pattern that provides strong matching constraints
- Invariance
  - Allows matching despite scale changes, rotations, viewpoint changes
- Sparseness
  - Relatively few features per image, compact and efficient representation
- Flexibility
  - Many existing types of detectors, descriptors
6. Scale-Covariant Detectors
- Laplacian, Hessian, Difference-of-Gaussian (blobs): Lindeberg (1998), Lowe (1999, 2004)
- Harris-Laplace (corners): Mikolajczyk & Schmid (2001)
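The idea behind Laplacian-based scale selection can be sketched on a synthetic example. This is a toy illustration under stated assumptions, not the implementation of any cited detector: the image is a single Gaussian blob, the response is evaluated only at the known blob center, and the candidate scales are a small hand-picked list. All function names are hypothetical.

```python
import math

def log_kernel(x, y, sigma):
    """Laplacian of a 2D Gaussian with standard deviation sigma, sampled at (x, y)."""
    r2 = x * x + y * y
    s2 = sigma * sigma
    gauss = math.exp(-r2 / (2.0 * s2)) / (2.0 * math.pi * s2)
    return (r2 - 2.0 * s2) / (s2 * s2) * gauss

def normalized_log_response(image, cx, cy, sigma, radius=30):
    """Magnitude of the scale-normalized Laplacian-of-Gaussian response at (cx, cy)."""
    total = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            total += log_kernel(dx, dy, sigma) * image[cy + dy][cx + dx]
    # Multiplying by sigma^2 makes responses comparable across scales.
    return sigma * sigma * abs(total)

def characteristic_scale(image, cx, cy, sigmas):
    """Candidate scale with the strongest normalized response at (cx, cy)."""
    return max(sigmas, key=lambda s: normalized_log_response(image, cx, cy, s))

def make_blob(size, sigma0):
    """Synthetic test image: one bright Gaussian blob centered in a square image."""
    c = size // 2
    return [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * sigma0 ** 2))
             for x in range(size)] for y in range(size)]

sigmas = [2.0, 3.0, 4.0, 5.0, 6.0, 8.0]
small = make_blob(81, 4.0)
large = make_blob(81, 6.0)

# The selected scale follows the blob size, which is what "scale-covariant" means.
scale_small = characteristic_scale(small, 40, 40, sigmas)
scale_large = characteristic_scale(large, 40, 40, sigmas)
```

A full detector would also search over image position and keep local maxima in both space and scale. The sketch fixes the position so that the scale-selection step is visible on its own.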
7. Scale-Covariant Detectors
- Salient (high entropy) regions: Kadir & Brady (2001)
- Circular edge-based regions: Jurie & Schmid (2003)
8. Affine-Covariant Detectors
- Laplacian, Hessian-Affine (blobs): Gårding & Lindeberg (1996), Mikolajczyk et al. (2004)
- Harris-Affine (corners): Mikolajczyk & Schmid (2002)
9. Affine-Covariant Detectors
- Edge- and intensity-based regions: Tuytelaars & Van Gool (2004)
- Maximally stable extremal regions (MSER): Matas et al. (2002)
10. Types of Descriptors
- Differential invariants: Koenderink & Van Doorn (1987), Florack et al. (1991)
- Filter banks: complex, Gabor, steerable
- Multidimensional histograms
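As one illustration of a histogram-style descriptor, the sketch below builds a magnitude-weighted histogram of gradient orientations over a small patch. It is a simplified stand-in in the spirit of SIFT-like descriptors, not a reproduction of any of them: there is no spatial grid, no Gaussian weighting, and no interpolation between bins. The function name, the bin count, and the test patch are assumptions made for the example.

```python
import math

def orientation_histogram(patch, bins=8):
    """Magnitude-weighted histogram of gradient orientations, L2-normalized."""
    height, width = len(patch), len(patch[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the image gradient.
            gx = (patch[y][x + 1] - patch[y][x - 1]) / 2.0
            gy = (patch[y + 1][x] - patch[y - 1][x]) / 2.0
            mag = math.hypot(gx, gy)
            if mag == 0.0:
                continue
            angle = math.atan2(gy, gx) % (2.0 * math.pi)
            b = int(angle / (2.0 * math.pi) * bins) % bins
            hist[b] += mag
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0.0 else hist

# Illustrative 6x6 intensity ramp I(x, y) = x + 2y with one gradient direction.
ramp = [[x + 2.0 * y for x in range(6)] for y in range(6)]

# The same ramp after an assumed gain of 3 and offset of 7.
ramp_changed = [[3.0 * v + 7.0 for v in row] for row in ramp]

h1 = orientation_histogram(ramp)
h2 = orientation_histogram(ramp_changed)
hist_diff = max(abs(a - b) for a, b in zip(h1, h2))
```

Because gradients ignore a constant offset and the final normalization removes a constant gain, the two histograms coincide. All of the weight falls into the single bin containing the ramp's gradient direction.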
11. Applications (1)
- Wide-baseline matching and recognition of specific objects
Tuytelaars & Van Gool (2004)
Ferrari, Tuytelaars & Van Gool (2005)
Rothganger, Lazebnik, Schmid & Ponce (2005)
Lowe (2004)
12. Applications (2)
- Category-level recognition based on geometric correspondence
Lazebnik, Schmid & Ponce (2004)
Berg, Berg & Malik (2005)
13. Applications (3)
- Learning parts and visual vocabularies
Constellation model
Fergus, Perona & Zisserman (2003)
Weber, Welling & Perona (2000)
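A visual vocabulary is commonly built by clustering local descriptors and then describing each image by how often each cluster ("visual word") occurs. The sketch below assumes plain k-means with a deterministic initialization and uses toy 2D points in place of real high-dimensional descriptors. The slide does not say which clustering method the cited work uses, so this is an illustration only, and all names are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(centers, p):
    """Index of the center closest to descriptor p."""
    return min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))

def kmeans(points, k, iters=10):
    """Plain k-means; the first k points serve as initial centers (an assumption)."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centers, p)].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers

def bag_of_words(centers, descriptors):
    """Histogram counting how many descriptors fall on each visual word."""
    hist = [0] * len(centers)
    for d in descriptors:
        hist[nearest(centers, d)] += 1
    return hist

# Toy "descriptors" pooled from training images: two well-separated groups.
training = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
vocabulary = kmeans(training, k=2)

# Descriptors from one new image, quantized against the learned vocabulary.
image_descriptors = [(0.2, 0.1), (9.8, 10.3), (10.1, 9.9)]
word_counts = bag_of_words(vocabulary, image_descriptors)
```

The resulting count vector is the kind of fixed-length image representation that can be passed to a classifier, regardless of how many local features were detected in the image.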
14. Applications (4)
- Building global image models invariant to a wide range of deformations
Lazebnik, Schmid & Ponce (2005)
15. Comparative Evaluations
- Flat scenes: Mikolajczyk & Schmid (2004), Mikolajczyk et al. (2004)
  - MSER and Hessian regions have the highest repeatability
  - Harris and Hessian regions provide the most correspondences
  - SIFT (GLOH, PCA-SIFT) descriptors have the highest performance
- 3D objects: Moreels & Perona (2006)
  - Features on 3D objects are much more unstable than on planar objects
  - All detectors and descriptors perform poorly for viewpoint changes > 30°
  - Hessian with SIFT or shape context perform best
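For flat scenes, repeatability can be measured because a known homography relates the two views. The sketch below is a simplified point-based version: a feature counts as repeated if its projection lands within a pixel tolerance of some detection in the second image. Published evaluations typically compare region overlap rather than point distance and restrict counting to the area visible in both images. The tolerance, the normalization by the number of features in the first image, and the sample data are assumptions for illustration.

```python
import math

def apply_homography(H, pt):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

def repeatability(points_a, points_b, H, tol=1.5):
    """Fraction of points in image A re-detected in image B within tol pixels."""
    if not points_a:
        return 0.0
    repeated = 0
    for p in points_a:
        qx, qy = apply_homography(H, p)
        if any(math.hypot(qx - bx, qy - by) <= tol for bx, by in points_b):
            repeated += 1
    return repeated / len(points_a)

# Illustrative ground truth: image B is image A shifted by (+5, -3) pixels.
H = [[1.0, 0.0, 5.0],
     [0.0, 1.0, -3.0],
     [0.0, 0.0, 1.0]]

detections_a = [(10.0, 10.0), (20.0, 15.0), (30.0, 40.0), (50.0, 5.0)]
detections_b = [(15.0, 7.0), (25.5, 12.2), (80.0, 80.0)]

# Two of the four projected points have a nearby detection in image B.
score = repeatability(detections_a, detections_b, H)
```

No comparable ground-truth mapping exists for general 3D objects, which is one reason evaluations on 3D objects need a different protocol.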
16. Comparative Evaluations
- Object classes: Mikolajczyk, Leibe & Schiele (2005)
  - Hessian regions with GLOH perform best
  - Salient regions work well for object classes
- Texture and object classes: Zhang, Marszalek, Lazebnik & Schmid (2005)
  - Laplacian regions with SIFT perform best
  - Combining multiple detectors and descriptors improves performance
  - Scale+rotation invariance is sufficient for most datasets
17. Sparse vs. Dense Features: UIUC texture dataset
25 classes, 40 samples each
Lazebnik, Schmid & Ponce (2005)
18. Sparse vs. Dense Features: UIUC texture dataset
[Plot: multi-class classification accuracy vs. training set size. Curves: invariant local features, non-invariant dense patches, and a global-feature baseline, with SVM and NN classifiers.]
- A system with intrinsically invariant features can learn from fewer training examples
Zhang, Marszalek, Lazebnik & Schmid (2005)
19. Sparse vs. Dense Features: CUReT dataset
Dana, van Ginneken, Nayar, and Koenderink (1999)
61 classes, 92 samples each, 43 training
[Plot legend, in the order shown: non-invariant features (SVM), non-invariant features (NN), invariant local features (SVM), baseline global features, invariant local features (NN)]
Relative Strengths
Sparse locally invariant features | Dense non-invariant features
High-resolution images | Low-resolution images
Non-homogeneous patterns | Homogeneous, high-frequency patterns
Viewpoint changes | Lighting changes
20. Anticipating Criticism
- Existing local features are not ideal for category-level recognition and scene understanding
  - Designed for wide-baseline matching and specific object recognition
  - Describe texture and albedo pattern, not shape
  - Do not explain the whole image
- A little invariance goes a long way
  - It is best to use features with the lowest level of invariance required by a given task
  - Scale+rotation is sufficient for most datasets: Zhang, Marszalek, Lazebnik & Schmid (2005)
- Denser sets of local features are more effective
  - Hessian detector produces the most regions and performs best in several evaluations
  - Regular grid of fixed-size patches is best for scene category recognition: Fei-Fei & Perona (2005)
21. Future Work
- Systematic evaluation of sparse vs. dense features
- Combining sparse and dense representations, e.g., keypoints and segments: Russell, Efros, Sivic, Freeman & Zisserman (2006)
- Learning detectors and descriptors automatically
- Developing shape-based features