Title: Shape-based features for localization
1 Shape-based features for localization
- Classes with characteristic shape
- appearance patches are poorly suited to them
- shape-based descriptors are necessary
Shotton et al., ICCV 2005; Opelt et al., ECCV 2006; Leordeanu et al., CVPR 2007
2 Shape features
- Classes with characteristic shape
- appearance patches are poorly suited to them
- shape-based descriptors are necessary
- Our approach: pairs of adjacent segments
Ferrari, Fevrier, Jurie & Schmid, PAMI 2008
3 Pairs of adjacent segments (PAS)
- Contour segment network
- Ferrari et al., ECCV 2006
- Edgels extracted with the Berkeley boundary detector
- Edgel-chains partitioned into straight contour segments
- Segments connected at edgel-chain endpoints and junctions
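As a rough illustration of the connection step, the sketch below links straight segments whose endpoints nearly coincide. The function name `build_segment_network`, the `tol` threshold and the toy coordinates are assumptions made for this example; the actual network is built from edgel chains and junctions, which this simplified version does not reproduce.

```python
from itertools import combinations
from math import hypot


def build_segment_network(segments, tol=2.0):
    """Return {segment index: set of indices of connected segments}.

    `segments` is a list of ((x1, y1), (x2, y2)) straight segments.
    Two segments are linked when any pair of their endpoints lies within
    `tol` pixels (an assumed stand-in for shared chain endpoints/junctions).
    """
    network = {i: set() for i in range(len(segments))}
    for i, j in combinations(range(len(segments)), 2):
        close = any(
            hypot(p[0] - q[0], p[1] - q[1]) <= tol
            for p in segments[i]
            for q in segments[j]
        )
        if close:
            network[i].add(j)
            network[j].add(i)
    return network


# Toy example: three segments forming a chain plus one isolated segment.
segments = [
    ((0, 0), (10, 0)),
    ((10, 0), (10, 8)),
    ((10, 8), (3, 12)),
    ((40, 40), (50, 40)),
]
network = build_segment_network(segments)
```

In this toy input, segments 0-1 and 1-2 share an endpoint, so they are connected, while segment 3 stays isolated.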
4 Pairs of adjacent segments (PAS)
Contour segment network
PAS: groups of two connected segments
PAS descriptor
- encodes geometric properties of the PAS
- scale and translation invariant
- compact, 5D
5 Features: pairs of adjacent segments (PAS)
Example PAS
Why PAS?
- intermediate complexity: good repeatability-informativeness trade-off
- scale-translation invariant
- connected: natural grouping criterion (no need to choose a grouping neighborhood or scale)
6 PAS codebook
PAS descriptors are clustered into a vocabulary
Figure: a few types from 15 indoor images
- Frequently occurring PAS have intuitive, natural shapes
- As we add images, the number of PAS types converges to just 100
- Very similar codebooks come out, regardless of source images
=> general, simple features
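To make the idea of clustering descriptors into a vocabulary concrete, here is a minimal sketch using greedy leader clustering. This is a swapped-in technique chosen for brevity and determinism, not the clustering method used by the authors; `build_codebook`, the `radius` parameter and the toy 5D descriptors are all assumptions.

```python
from math import dist


def build_codebook(descriptors, radius):
    """Greedy leader clustering (assumed stand-in for the real method).

    A descriptor joins the first type whose centre lies within `radius`;
    otherwise it founds a new type. Returns (centres, members).
    """
    centres, members = [], []
    for d in descriptors:
        for t, centre in enumerate(centres):
            if dist(d, centre) <= radius:
                members[t].append(d)
                n = len(members[t])
                # Keep the centre at the mean of the descriptors in the type.
                centres[t] = tuple(
                    sum(m[i] for m in members[t]) / n for i in range(len(d))
                )
                break
        else:
            centres.append(tuple(d))
            members.append([d])
    return centres, members


# Toy 5D descriptors: two tight groups.
descriptors = [
    (0.0, 0.0, 0.0, 0.0, 0.0),
    (0.1, 0.0, 0.0, 0.0, 0.0),
    (1.0, 1.0, 1.0, 1.0, 1.0),
    (1.1, 1.0, 1.0, 1.0, 1.0),
    (0.0, 0.1, 0.0, 0.0, 0.0),
]
centres, members = build_codebook(descriptors, radius=0.3)
type_sizes = [len(m) for m in members]
```

The type sizes give a crude view of which shapes occur frequently; the result of leader clustering depends on the input order, which a production method would want to avoid.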
7 Window descriptor
1. Subdivide window into tiles
2. Compute a separate bag of PAS per tile
3. Concatenate these semi-local bags
- distinctive: records which PAS appear where; weight PAS by average edge strength
- flexible: soft-assign PAS to types, coarse tiling
- fast: computation with Integral Histograms
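The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the Gaussian soft-assignment kernel, its `sigma`, the feature tuple layout `(x, y, edge_strength, descriptor)` and the function names are hypothetical, and the histogram is computed directly rather than with integral histograms.

```python
from math import dist, exp


def soft_assign(descriptor, codebook, sigma=0.5):
    """Normalised weights of one descriptor over all types (assumed Gaussian kernel)."""
    weights = [exp(-dist(descriptor, c) ** 2 / (2 * sigma ** 2)) for c in codebook]
    total = sum(weights) or 1.0  # avoid division by zero if every weight underflows
    return [w / total for w in weights]


def window_descriptor(features, window, codebook, tiles_x, tiles_y):
    """Concatenate one soft bag of PAS per tile into a single vector."""
    x0, y0, width, height = window
    n_types = len(codebook)
    hist = [0.0] * (tiles_x * tiles_y * n_types)
    for x, y, strength, descriptor in features:
        if not (x0 <= x < x0 + width and y0 <= y < y0 + height):
            continue  # feature lies outside the window
        tx = min(int((x - x0) * tiles_x / width), tiles_x - 1)
        ty = min(int((y - y0) * tiles_y / height), tiles_y - 1)
        base = (ty * tiles_x + tx) * n_types
        # Each PAS votes for every type, weighted by its edge strength.
        for t, weight in enumerate(soft_assign(descriptor, codebook)):
            hist[base + t] += strength * weight
    return hist


# Toy setup: two 1D "types", a 10x10 window split into 2x2 tiles.
codebook = [(0.0,), (1.0,)]
features = [
    (1, 1, 2.0, (0.0,)),    # top-left tile, close to type 0
    (9, 9, 1.0, (1.0,)),    # bottom-right tile, close to type 1
    (20, 20, 5.0, (0.0,)),  # outside the window, ignored
]
hist = window_descriptor(features, (0, 0, 10, 10), codebook, tiles_x=2, tiles_y=2)
```

The vector has one slot per (tile, type) pair, so it records which PAS appear where while the soft assignment and coarse tiles tolerate small shape and position changes.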
8 Training
1. Learn mean positive window dimensions
2. Determine number of tiles T
3. Collect positive example descriptors
4. Collect negative example descriptors: slide window over negative training images
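Step 4 amounts to enumerating window positions over each negative image. A minimal sketch, assuming a fixed window size and a pixel `stride` (both hypothetical parameters; the slides do not state the sampling density):

```python
def sliding_windows(img_w, img_h, win_w, win_h, stride):
    """Yield (x, y, w, h) for every window position that fits inside the image."""
    for y in range(0, img_h - win_h + 1, stride):
        for x in range(0, img_w - win_w + 1, stride):
            yield (x, y, win_w, win_h)


# A 100x60 image scanned with a 40x20 window every 20 pixels.
windows = list(sliding_windows(100, 60, 40, 20, 20))
```

Each window would then be turned into a descriptor and stored as a negative example.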
9 Training
5. Train a linear SVM from positive and negative window descriptors
A few of the highest-weighted descriptor vector dimensions (each a 'PAS + tile' pair)
lie on the object boundary (local shape structures common to many training exemplars)
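To show how a linear SVM exposes the most informative descriptor dimensions, here is a small sketch that trains one by sub-gradient descent on the regularised hinge loss and then reads off the largest weight. The solver, its hyperparameters (`lam`, `epochs`, `lr`) and the toy 4D data are assumptions for illustration; the slides only state that a linear SVM is used.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Sub-gradient descent on the L2-regularised hinge loss; labels are +1/-1."""
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            # The regulariser shrinks the weights on every step.
            w = [wj * (1 - lr * lam) for wj in w]
            if margin < 1:
                # Hinge loss is active: move the hyperplane towards the example.
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


# Toy descriptors: only dimension 2 separates positives from negatives.
X = [
    [0.2, 0.1, 1.0, 0.0],
    [0.1, 0.2, 0.9, 0.1],
    [0.2, 0.1, 0.0, 0.1],
    [0.1, 0.2, 0.1, 0.0],
]
y = [1, 1, -1, -1]
w, b = train_linear_svm(X, y)

# The highest-weighted dimension is the one the classifier relies on most.
top_dimension = max(range(len(w)), key=lambda j: w[j])
scores = [sum(wj * xj for wj, xj in zip(w, xi)) + b for xi in X]
```

With real window descriptors, each dimension index maps back to one (tile, PAS type) pair, which is how the highest-weighted dimensions can be drawn on the image.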
10 Testing
1. Slide a window of the aspect ratio learned in training at multiple scales
2. Classify each window with the SVM, then apply non-maxima suppression to obtain detections
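The suppression step can be sketched as a greedy procedure: keep the best-scoring window, discard windows that overlap it too much, and repeat. The overlap measure (intersection over union) and the `max_overlap` threshold are assumptions; the slides do not specify which criterion is used.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def non_maxima_suppression(scored_windows, max_overlap=0.5):
    """Greedy NMS over (score, box) pairs; returns the surviving pairs."""
    kept = []
    for score, box in sorted(scored_windows, key=lambda sw: sw[0], reverse=True):
        if all(iou(box, kept_box) <= max_overlap for _, kept_box in kept):
            kept.append((score, box))
    return kept


# Three windows firing on the same object and one on a different object.
scored = [
    (0.9, (10, 10, 40, 20)),
    (0.8, (12, 11, 40, 20)),
    (0.7, (100, 50, 40, 20)),
    (0.3, (14, 10, 44, 22)),
]
detections = non_maxima_suppression(scored)
kept_scores = [score for score, _ in detections]
```

Because windows from all scales are pooled before suppression, a single object that fires at neighbouring scales still yields one detection.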
11 Experimental results: INRIA horses
Dataset: 170 positive + 170 negative images (training: 50 pos + 50 neg); wide range of scales; clutter
- tiling brings a substantial improvement; optimum at T = 30, used for all other experiments
- works well: 86% det-rate at 0.3 FPPI (false positives per image), with 50 pos + 50 neg training images
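A figure such as "det-rate at 0.3 FPPI" can be computed by lowering the score threshold until the allowed number of false positives is used up. The sketch below assumes each detection has already been marked correct or incorrect (the matching criterion is not given in the slides) and that scores are distinct; the function name and toy numbers are hypothetical.

```python
def detection_rate_at_fppi(detections, n_objects, n_images, fppi):
    """Fraction of objects found while allowing `fppi` false positives per image.

    `detections` holds (score, is_correct) pairs pooled over all test images.
    """
    max_false = fppi * n_images
    true_pos = false_pos = 0
    best = 0
    for _, is_correct in sorted(detections, key=lambda d: d[0], reverse=True):
        if is_correct:
            true_pos += 1
        else:
            false_pos += 1
        if false_pos > max_false:
            break  # threshold is now too low for the allowed FPPI
        best = true_pos
    return best / n_objects


# Toy run: 5 objects in 4 images, so 0.3 FPPI allows one false positive.
toy_detections = [
    (0.9, True), (0.8, True), (0.7, False),
    (0.6, True), (0.5, False), (0.4, True),
]
rate = detection_rate_at_fppi(toy_detections, n_objects=5, n_images=4, fppi=0.3)
```

Here three objects are found before the second false positive appears, giving a detection rate of 3/5.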
12 Experimental results: INRIA horses
Dataset: 170 positive + 170 negative images (training: 50 pos + 50 neg); wide range of scales; clutter
PAS better than any interest point detector
- all interest point (IP) comparisons with T = 10 and 120 feature types (optimum over INRIA horses and ETHZ Shape Classes)
- IP codebooks are class-specific
13 Results: ETHZ Shape Classes
Dataset: 255 images, 5 classes; large scale changes, clutter
Training: half of the positive images for a class + the same number from the other classes (1/4 from each)
Testing: all other images
15 Results: ETHZ Shape Classes
Figure panels: Apple logos, Bottles, Giraffes, Mugs, Swans
- mean det-rate at 0.4 FPPI: 79%
- class-specific IP codebooks
- PAS >> IP for apple logos, bottles, mugs
- PAS ≈ IP for giraffes (texture!)
- PAS < IP for swans
- overall best IP: Harris-Laplace
16 Results: Caltech 101
Dataset: Fei-Fei et al., GMBV 2004
Images: 42 anchor, 62 chair, 57 cup; lots of background images
Training: half of the positive images for a class + the same number of background images
Testing: other half of the positive images + the same number of background images
Scale changes only; little clutter
17 Results: Caltech 101
Dataset: Fei-Fei et al., GMBV 2004
PAS much better than Harris-Laplace
mean det-rate at 0.4 FPPI: 85%
18 Comparison to HOG (Dalal & Triggs, CVPR 2005)
19 Comparison to HOG (Dalal & Triggs, CVPR 2005)
- overall mean det-rate at 0.4 FPPI: PAS 82% >> HOG 58%
- PAS >> HOG for 6 datasets
- PAS ≈ HOG for 2 datasets
- PAS < HOG for 2 datasets
20 Generalizing PAS to kAS
kAS: any path of length k through the contour segment network
Figure panels: 3AS, 4AS, segment network
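Enumerating kAS can be sketched as a depth-first search for simple paths of k segments in the network graph. The adjacency-dictionary representation and the name `enumerate_kas` are assumptions made for this example.

```python
def enumerate_kas(network, k):
    """Return every simple path of k connected segments, each reported once.

    `network` maps a segment index to the set of segments connected to it.
    """
    paths = set()

    def extend(path):
        if len(path) == k:
            # A path and its reverse are the same kAS: keep one canonical form.
            paths.add(min(tuple(path), tuple(reversed(path))))
            return
        for nxt in network[path[-1]]:
            if nxt not in path:
                extend(path + [nxt])

    for start in network:
        extend([start])
    return sorted(paths)


# A chain of four segments: 0 - 1 - 2 - 3.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
pas = enumerate_kas(chain, 2)
three_as = enumerate_kas(chain, 3)
```

With k = 1 the function returns individual segments, and with k = 2 it returns the PAS, so one routine covers the whole family.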
- scale+translation invariant descriptor with dimensionality 4k-2
- k = feature complexity: higher k is more informative, but less repeatable
- overall mean det-rates (%):
            1AS  PAS  3AS  4AS
  0.3 FPPI   69   77   64   57
  0.4 FPPI   76   82   70   64
PAS do best!
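One way to obtain a scale- and translation-invariant vector with 4k-2 entries is sketched below: 2(k-1) midpoint offsets relative to the first segment, k orientations and k lengths, with offsets and lengths divided by the longest segment. This layout and normalisation are assumptions chosen to match the stated dimensionality, not a confirmed reproduction of the authors' descriptor; for k = 2 it yields six numbers, so it is not necessarily the 5D PAS descriptor mentioned earlier.

```python
from math import atan2, hypot, pi


def kas_descriptor(segments):
    """Assumed descriptor layout: offsets, orientations, lengths (4k-2 values).

    `segments` is an ordered list of ((x1, y1), (x2, y2)) segments.
    """
    mids = [((x1 + x2) / 2, (y1 + y2) / 2) for (x1, y1), (x2, y2) in segments]
    lengths = [hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in segments]
    # Segments are undirected, so orientation is taken modulo pi.
    angles = [atan2(y2 - y1, x2 - x1) % pi for (x1, y1), (x2, y2) in segments]
    scale = max(lengths)  # dividing by a length removes the overall scale
    offsets = []
    for mx, my in mids[1:]:
        # Offsets from the first midpoint remove the absolute position.
        offsets += [(mx - mids[0][0]) / scale, (my - mids[0][1]) / scale]
    return offsets + angles + [length / scale for length in lengths]


# The same PAS, then scaled by 3 and shifted by (5, -2).
original = [((0, 0), (10, 0)), ((10, 0), (10, 8))]
moved = [((5, -2), (35, -2)), ((35, -2), (35, 22))]
d_original = kas_descriptor(original)
d_moved = kas_descriptor(moved)
max_difference = max(abs(a - b) for a, b in zip(d_original, d_moved))
```

The two descriptors agree up to rounding, which is the invariance property the bullet above refers to.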
21 Discussion
- Summary
- Excellent results for shape-based classes
- Shape features outperform interest points for these classes
- PAS have the best intermediate complexity among kAS
- Extensions
- Less rigid matching
- Flexible shape matching
22 Results: Weizmann horses
Dataset: 327 pos. + 327 neg. images (training: 50 pos + 50 neg), Shotton et al., 2005; no scale changes; modest clutter
Figure label: Shotton's EER
- exact comparison to Shotton: use their images, search at a single scale
- PAS: same performance (92% precision-recall EER), but
- no need for segmented training images (only bounding boxes)
- can detect objects at multiple scales (see other experiments)
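The precision-recall equal error rate is the point where precision equals recall. With detections ranked by score, precision is TP/(TP+FP) and recall is TP/N for N objects, so the two coincide when exactly N detections are accepted. The sketch below uses that observation; it assumes at least N detections exist, distinct scores, and pre-computed correctness flags, and the function name is hypothetical.

```python
def precision_recall_eer(detections, n_objects):
    """Precision (= recall) when the top `n_objects` detections are accepted.

    `detections` holds (score, is_correct) pairs pooled over all test images.
    """
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    accepted = ranked[:n_objects]
    true_pos = sum(1 for _, is_correct in accepted if is_correct)
    return true_pos / n_objects


# Toy run with 4 objects: three of the top four detections are correct.
ranked_toy = [(0.9, True), (0.8, True), (0.6, False), (0.5, True), (0.2, False)]
eer = precision_recall_eer(ranked_toy, n_objects=4)
```

At that operating point both precision and recall are 3/4 in this toy case.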
23 Comparison to HOG (Dalal & Triggs, CVPR 2005)
Figure panels: Bottles, INRIA horses