Title: Ivan Laptev
1Boosted Histograms for Improved Object Detection
Ivan Laptev, IRISA/INRIA, Rennes, France
September 7, 2006
2Histograms for object recognition
Remarkable success of recognition methods using histograms of local image measurements
- Swain & Ballard 1991 - Color histograms
- Schiele & Crowley 1996 - Receptive field histograms
- Lowe 1999 - Localized orientation histograms (SIFT)
- Schneiderman & Kanade 2000 - Localized histograms of wavelet coefficients
- Leung & Malik 2001 - Texton histograms
- Belongie et al. 2002 - Shape context
- Dalal & Triggs 2005 - Dense orientation histograms
Likely explanation: histograms are robust to image variations such as limited geometric transformations and object class variability.
3Histograms: What vs. Where
What to measure?
- Histograms
Where to measure?
- No guarantee for optimal recognition
- Different regions may have different discriminative power
4Idea
Boosting: selected features, each with a weak classifier
AdaBoost with Haar features
- Efficient discriminative classifier [Freund & Schapire 97]
- Good performance for face detection [Viola & Jones 01]
SVM, Neural Networks with histogram features
- Too heavy
AdaBoost with histogram features: ? ? ?
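To make the boosting step concrete, here is a minimal sketch of discrete AdaBoost selecting from a fixed pool of candidate weak classifiers. It is an illustration only, not the implementation used in the talk: the function names `adaboost` and `strong_classify` and the input layout (a table of precomputed +1/-1 outputs per candidate) are assumptions made for this example.

```python
import math


def adaboost(predictions, labels, rounds):
    """Discrete AdaBoost over a fixed pool of candidate weak classifiers.

    predictions[j][i] is the +1/-1 output of candidate j on sample i,
    labels[i] is the true +1/-1 label.
    Returns a list of (candidate_index, alpha) pairs.
    """
    n = len(labels)
    weights = [1.0 / n] * n
    selected = []
    for _ in range(rounds):
        # Weighted error of every candidate under the current sample weights.
        errors = [
            sum(w for w, p, y in zip(weights, preds, labels) if p != y)
            for preds in predictions
        ]
        best = min(range(len(predictions)), key=errors.__getitem__)
        err = min(max(errors[best], 1e-10), 1 - 1e-10)
        if err >= 0.5:
            break  # no candidate is better than chance
        alpha = 0.5 * math.log((1 - err) / err)
        selected.append((best, alpha))
        # Raise the weight of misclassified samples, then renormalize.
        weights = [
            w * math.exp(-alpha * y * p)
            for w, p, y in zip(weights, predictions[best], labels)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return selected


def strong_classify(selected, predictions, i):
    """Sign of the alpha-weighted vote of the selected candidates on sample i."""
    score = sum(alpha * predictions[j][i] for j, alpha in selected)
    return 1 if score >= 0 else -1
```

In this sketch each candidate would correspond to one histogram feature paired with a weak classifier, so the list of selected indices plays the role of the "selected features" on the slide.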
5Weak learner
Possible approach
1-dim. projections onto predefined vectors
Example 1
6Weak learner
Possible approach
1-dim. projections onto predefined vectors
Example 2
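The projection-based weak learner can be sketched as a threshold ("stump") on the 1-D projection of each histogram onto a predefined vector, fitted to minimize the weighted error. This is a hedged illustration: `train_projection_stump` and `stump_predict` are hypothetical names, and the choice of a single threshold with a polarity is an assumption, since the slides show only figures for the two examples.

```python
def train_projection_stump(histograms, labels, weights, direction):
    """Fit a threshold on the projection of each histogram onto `direction`.

    Returns (threshold, polarity, weighted_error). The stump predicts
    `polarity` when the projection exceeds the threshold, else `-polarity`.
    """
    proj = [sum(h * d for h, d in zip(hist, direction)) for hist in histograms]
    order = sorted(range(len(proj)), key=proj.__getitem__)
    total = sum(weights)
    # Threshold below every sample: all predicted +1, so negatives are errors.
    err = sum(w for w, y in zip(weights, labels) if y != 1)
    candidates = [(float("-inf"), err)]
    for k, i in enumerate(order):
        # Raising the threshold past sample i flips its prediction to -1.
        err += weights[i] if labels[i] == 1 else -weights[i]
        last = k + 1 == len(order)
        if not last and proj[order[k + 1]] == proj[i]:
            continue  # equal projections cannot be separated
        thr = proj[i] if last else 0.5 * (proj[i] + proj[order[k + 1]])
        candidates.append((thr, err))
    best = None
    for thr, e in candidates:
        # Polarity -1 inverts every prediction, so its error is total - e.
        for polarity, e_pol in ((1, e), (-1, total - e)):
            if best is None or e_pol < best[2]:
                best = (thr, polarity, e_pol)
    return best


def stump_predict(stump, histogram, direction):
    """Classify one histogram with a stump returned by train_projection_stump."""
    thr, polarity, _ = stump
    p = sum(h * d for h, d in zip(histogram, direction))
    return polarity if p > thr else -polarity
```

Choosing `direction` as a unit vector with a single non-zero entry reduces this to a threshold on one histogram bin.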
7Fisher weak learner
Alternative approach
- Assume Normal distribution of features (hopefully valid at least for some of the 10^5 features!)
- Compute projection direction by FLD (Fisher linear discriminant)
- Can be modified to minimize the error of weighted samples (required for boosting)
Quantities used: feature mean, feature covariance
Fisher learner
1-bin learner
Evidence from real image training data
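A weighted Fisher linear discriminant can be sketched as follows: class means and covariances are computed with the boosting sample weights, and the projection direction solves (S_pos + S_neg) w = m_pos - m_neg. The exact formula on the slide was not preserved in the transcript, so this is a textbook FLD under stated assumptions. The small `ridge` term, the midpoint threshold between the projected class means, and all function names are choices made for this example.

```python
def weighted_mean(X, w):
    total = sum(w)
    return [sum(wi * x[k] for wi, x in zip(w, X)) / total
            for k in range(len(X[0]))]


def weighted_cov(X, w, mean):
    total, d = sum(w), len(mean)
    cov = [[0.0] * d for _ in range(d)]
    for wi, x in zip(w, X):
        diff = [x[k] - mean[k] for k in range(d)]
        for a in range(d):
            for b in range(d):
                cov[a][b] += wi * diff[a] * diff[b] / total
    return cov


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def train_fld(X, y, w, ridge=1e-6):
    """Weighted FLD: returns (direction, threshold)."""
    Xp = [x for x, yi in zip(X, y) if yi == 1]
    wp = [wi for wi, yi in zip(w, y) if yi == 1]
    Xn = [x for x, yi in zip(X, y) if yi == -1]
    wn = [wi for wi, yi in zip(w, y) if yi == -1]
    mp, mn = weighted_mean(Xp, wp), weighted_mean(Xn, wn)
    Sp, Sn = weighted_cov(Xp, wp, mp), weighted_cov(Xn, wn, mn)
    d = len(mp)
    # Within-class scatter; the ridge (an assumption) keeps it invertible.
    Sw = [[Sp[a][b] + Sn[a][b] + (ridge if a == b else 0.0)
           for b in range(d)] for a in range(d)]
    direction = solve(Sw, [mp[k] - mn[k] for k in range(d)])
    proj = lambda v: sum(di * vi for di, vi in zip(direction, v))
    # Midpoint of the projected class means (a simplifying assumption).
    threshold = 0.5 * (proj(mp) + proj(mn))
    return direction, threshold


def fld_predict(direction, threshold, x):
    return 1 if sum(d * v for d, v in zip(direction, x)) > threshold else -1
```

Inside a boosting loop, `w` would be the current sample weights, so samples misclassified in earlier rounds pull the means and covariances toward themselves.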
8Histogram features
9Training data
10Training Selected Features
0.999 correct classification, 10^-5 false positives
376 of 10^5 features selected
11Object detection
- Scan and classify image windows at different positions and scales
- Cluster detections in the position-scale space
- Use cluster size as the detection confidence
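The three detection steps above can be sketched as a window generator plus a greedy clustering of positive windows. This is only an illustration under assumptions: the scale step, the stride, the tolerances, and the greedy grouping rule are placeholders chosen for this example, and `scan_windows` and `cluster_detections` are hypothetical names. The slides do not specify the clustering method.

```python
import math


def scan_windows(img_w, img_h, base_w, base_h, scale_step=1.25, stride_frac=0.25):
    """Yield (x, y, w, h) windows over a grid of positions and scales."""
    scale = 1.0
    while base_w * scale <= img_w and base_h * scale <= img_h:
        w, h = int(base_w * scale), int(base_h * scale)
        step = max(1, int(stride_frac * w))
        for y in range(0, img_h - h + 1, step):
            for x in range(0, img_w - w + 1, step):
                yield (x, y, w, h)
        scale *= scale_step


def cluster_detections(dets, pos_tol=0.5, scale_tol=0.5):
    """Greedily group detections that are close in position and log-scale.

    dets is a list of (x, y, w, h) windows classified as positive.
    Returns (x, y, w, h, confidence) per cluster, where the box is the
    cluster mean and the confidence is the number of members.
    """
    clusters = []
    for d in dets:
        cx, cy = d[0] + d[2] / 2, d[1] + d[3] / 2
        for c in clusters:
            r = c[0]  # the first member acts as the cluster representative
            rcx, rcy = r[0] + r[2] / 2, r[1] + r[3] / 2
            if (abs(cx - rcx) <= pos_tol * r[2]
                    and abs(cy - rcy) <= pos_tol * r[3]
                    and abs(math.log(d[2] / r[2])) <= scale_tol):
                c.append(d)
                break
        else:
            clusters.append([d])
    out = []
    for c in clusters:
        n = len(c)
        mean = [sum(d[k] for d in c) / n for k in range(4)]
        out.append((*mean, n))
    return out
```

In use, each window from `scan_windows` would be passed to the boosted classifier, and only windows classified as the object would be handed to `cluster_detections`.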
12PASCAL Visual Object Classes Challenge 2005
(VOC05)
- motorbikes: 217 / 220
- bicycles: 123 / 123
- people: 152 / 149
- cars: 320 / 341
13Evaluation criteria
Ground truth annotation
Detection results
- >50% overlap of bounding box with ground truth (GT)
- one bounding box for each object
- confidence value for each detection
Precision-Recall (PR) curve
Average Precision (AP) value
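The criteria above can be sketched in code. The assumptions are stated loudly here: overlap is taken to be intersection over union, a second detection of an already matched object counts as a false positive, and AP is the 11-point interpolated average commonly described for the early VOC challenges. The slides themselves state only the 50% threshold, one box per object, and a confidence per detection. All function names are hypothetical, and a single image is used for brevity.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def precision_recall(detections, ground_truth, min_overlap=0.5):
    """detections: list of (confidence, box); ground_truth: list of boxes.

    Returns one (precision, recall) point per detection, in order of
    decreasing confidence.
    """
    matched = set()
    tp = fp = 0
    curve = []
    for _, box in sorted(detections, key=lambda d: -d[0]):
        overlaps = [iou(box, gt) for gt in ground_truth]
        j = max(range(len(overlaps)), key=overlaps.__getitem__)
        if overlaps[j] > min_overlap and j not in matched:
            matched.add(j)  # one bounding box for each object
            tp += 1
        else:
            fp += 1  # poor overlap, or a duplicate of a matched object
        curve.append((tp / (tp + fp), tp / len(ground_truth)))
    return curve


def average_precision(curve):
    """11-point interpolated AP over recall levels 0.0, 0.1, ..., 1.0."""
    total = 0.0
    for k in range(11):
        t = k / 10
        total += max((p for p, r in curve if r >= t), default=0.0)
    return total / 11
```

Sweeping the confidence threshold from high to low traces the PR curve; the AP value summarizes that curve in a single number.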
14Evaluation of detection
PR-curves for the Motorbike validation dataset
FLD learner
1-bin classifier
15Results for VOC05 Challenge
People test1
Bicycles test1
Cars test1
Motorbikes test1
16Results for VOC05 Challenge
Average Precision values
19PASCAL Visual Object Classes Challenge 2006
(VOC06)
20Results for VOC06 Challenge
Competition "comp3" (train on VOC data), class "bicycle"
Examples
21Results for VOC06 Challenge
Competition "comp3" (train on VOC data), class "cow"
Examples
22Results for VOC06 Challenge
Competition "comp3" (train on VOC data), class "horse"
Examples
23Results for VOC06 Challenge
Competition "comp3" (train on VOC data), class "motorbike"
24Results for VOC06 Challenge
Competition "comp3" (train on VOC data), class "person"
25Results for VOC06 Challenge
Average Precision values
26Final Notes
- All results are obtained with a single set of parameters
- Small number of training samples is sufficient
- Efficient detection: 10 fps on 320x280 images
- Extension to texton/color histogram features is straightforward
Open questions
- Other free-shape regions better? How to find them?
- Better weak learner that takes advantage of histogram properties
- View transformations
- Detection tasks in VOC05 and VOC06 are far from being solved; it is a challenge!