Title: Object Recognition with Informative Features and Linear Classification
Slide 1: Object Recognition with Informative Features and Linear Classification
- Authors: Vidal-Naquet and Ullman
- Presenter: David Bradley
Slide 2: Vs.
- Image fragments make good features
  - Especially when training data is limited
- Image fragments contain more information than wavelets, which allows for simpler classifiers
- Information theory framework for feature selection
Slide 3: Intermediate complexity
Slide 4: What's in a feature?
- You and your favorite learning algorithm settle down for a nice game of 20 questions
- Except since it is a learning algorithm it can't talk, and the game really becomes 20 answers
- Have you asked the right questions?
- What information are you really giving it?
- How easy will it be for it to say "Aha, you are thinking of the side view of a car!"
- 10110010110000111001
Slide 5: Pseudo-Inverse
- In general, image reconstruction from features provides a good intuition of what information they are providing
Slide 6: Wavelet coefficients
- Asks the question: how much is the current block of pixels like my wavelet pattern?
- This set of wavelets can entirely represent a 2x2 pixel block
- So if you give your learning algorithm all of the wavelet coefficients, then you have given it all of the information it could possibly need, right?
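A minimal sketch of this completeness claim, using 2x2 Haar-like patterns (the pattern names and plain-sum normalization here are illustrative choices, not taken from the slides): the four patterns are orthogonal, so their coefficients carry all the information in the block and the block can be reconstructed exactly.

```python
import numpy as np

# Four 2x2 Haar-like basis patterns (illustrative normalization: plain
# signed sums). Together they form an invertible transform of the block.
BASES = {
    "average":    np.array([[ 1,  1], [ 1,  1]]),
    "horizontal": np.array([[ 1,  1], [-1, -1]]),
    "vertical":   np.array([[ 1, -1], [ 1, -1]]),
    "diagonal":   np.array([[ 1, -1], [-1,  1]]),
}

def haar_coefficients(block):
    """Project a 2x2 pixel block onto each basis pattern."""
    return {name: float(np.sum(block * b)) for name, b in BASES.items()}

def reconstruct(coeffs):
    """Invert the transform: the bases are orthogonal with squared norm 4."""
    return sum(coeffs[name] * b for name, b in BASES.items()) / 4.0
```

Because `reconstruct(haar_coefficients(block))` returns the original block, nothing is lost; the question the slides raise is whether the information is in a *usable* form for the classifier.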
Slide 7: Performed best in their test with cars
Slide 8: Sometimes wavelets work well
- Viola and Jones face detector
  - Trained on 24x24 pixel windows
  - Cascade structure (32 classifiers total)
  - Initial 2-feature classifier rejects 60% of non-faces
  - Second, 5-feature classifier rejects 80% of non-faces
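The cascade idea above can be sketched in a few lines (the stage interface is a hypothetical simplification, not Viola and Jones' implementation): each window passes through a sequence of classifiers ordered cheapest-first, and is rejected the moment any stage scores it below its threshold.

```python
def cascade_classify(window, stages):
    """Run a detection window through a cascade of classifiers.

    `stages` is a list of (classifier, threshold) pairs, cheapest first.
    A window is rejected as soon as any stage scores it below threshold,
    so most non-face windows exit after the first one or two cheap
    stages; only promising windows pay for the expensive later stages.
    """
    for classifier, threshold in stages:
        if classifier(window) < threshold:
            return False  # rejected early, later stages never run
    return True  # survived every stage: report a detection
```

This is why the first 2-feature stage matters so much: rejecting 60% of non-faces there means the expensive 200-feature stages see only a small fraction of the windows.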
Slide 9: But they can require a lot of training data to use correctly
- Rest of the Viola and Jones face detector:
  - 3 20-feature classifiers
  - 2 50-feature classifiers
  - 20 200-feature classifiers
- In the later stages it is tough to learn what combinations of wavelet questions to ask
- Surely there must be an easier way
Slide 10: Image fragments
- Represent the opposite extreme: wavelets are basic image building blocks, while fragments are highly specific to the patterns they come from
- A fragment is present in the image if its cross-correlation with the image exceeds a threshold
- Ideally, if one could label all possible images (and search them quickly):
  - Use whole images as fragments
  - All vision problems become easy: just look for the match
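The fragment detection rule (binary feature = cross-correlation above a threshold) can be sketched as a sliding-window normalized cross-correlation. The function name and the 0.6 threshold are illustrative assumptions, not the paper's tuned values.

```python
import numpy as np

def fragment_present(image, fragment, threshold=0.6):
    """Binary fragment feature: max normalized cross-correlation over
    all positions, compared against a threshold (0.6 is illustrative)."""
    fh, fw = fragment.shape
    f = fragment - fragment.mean()
    fnorm = np.sqrt((f ** 2).sum())
    best = -1.0
    for y in range(image.shape[0] - fh + 1):
        for x in range(image.shape[1] - fw + 1):
            patch = image[y:y + fh, x:x + fw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * fnorm
            if denom > 0:  # skip flat patches (undefined correlation)
                best = max(best, float((p * f).sum() / denom))
    return best > threshold
```

Normalizing each patch makes the response invariant to local brightness and contrast, which is part of why fragments generalize beyond the exact image they were cut from.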
Slide 11: Dealing with the non-ideal world
- Want to find fragments that:
  - Generalize well
  - Are specific to the class
  - Add information that other fragments haven't already given us
- What metric should we use to find the best fragments?
Slide 12: Information Theory Review
- Entropy: the minimum number of bits required to encode a signal
- Shannon entropy: H(C) = -Σ_c p(c) log2 p(c)
- Conditional entropy: H(C|F) = Σ_f p(f) H(C|F=f) = -Σ_f p(f) Σ_c p(c|f) log2 p(c|f)
Slide 13: Mutual Information
[Diagram: entropy, conditional entropy, class, and feature]
- I(C; F) = H(C) - H(C|F)
- High mutual information means that knowing the feature value reduces the number of bits needed to encode the class
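The identity I(C; F) = H(C) - H(C|F) can be checked numerically with a small sketch for a discrete class and feature. The joint-probability-table interface and function names below are illustrative choices, not the paper's code.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits: H = -sum p log2 p (0 log 0 taken as 0)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(joint):
    """I(C; F) = H(C) - H(C|F) from a joint probability table joint[c, f]."""
    p_c = joint.sum(axis=1)  # marginal over the feature
    p_f = joint.sum(axis=0)  # marginal over the class
    # H(C|F) = sum_f p(f) * H(C | F=f)
    h_c_given_f = sum(
        p_f[f] * entropy(joint[:, f] / p_f[f])
        for f in range(joint.shape[1]) if p_f[f] > 0
    )
    return entropy(p_c) - h_c_given_f
```

A perfectly predictive binary feature yields I = 1 bit (the feature answers the class question completely); an independent feature yields I = 0.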
Slide 14: Picking features with Mutual Information
- Not practical to exhaustively search for the combination of features with the highest mutual information
- Instead, do a greedy search: at each step add the feature whose minimum pair-wise information gain, taken over the features already chosen, is the highest
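A sketch of this greedy max-min selection under our reading of the criterion: a candidate's merit is min_j I(C; F_cand, F_j) - I(C; F_j) over the already-selected features j, so a candidate redundant with even one chosen feature scores near zero. `empirical_mi`, `greedy_select`, and the binary-feature assumption are ours, not the paper's implementation.

```python
import numpy as np
from itertools import product

def empirical_mi(labels, feats):
    """Empirical mutual information (bits) between binary labels and the
    joint value of the binary feature arrays in `feats`."""
    n = len(labels)
    mi = 0.0
    for vals in product((0, 1), repeat=1 + len(feats)):
        c, fvals = vals[0], vals[1:]
        fmask = np.ones(n, dtype=bool)
        for f, v in zip(feats, fvals):
            fmask &= (f == v)
        joint = ((labels == c) & fmask).sum() / n
        if joint == 0:
            continue
        p_c = (labels == c).sum() / n
        p_f = fmask.sum() / n
        mi += joint * np.log2(joint / (p_c * p_f))
    return mi

def greedy_select(labels, features, k):
    """Greedy max-min selection sketch: start from the single most
    informative feature, then repeatedly add the candidate whose
    minimum pairwise gain min_j I(C; F_cand, F_j) - I(C; F_j) over the
    selected set is largest."""
    selected = [max(range(len(features)),
                    key=lambda i: empirical_mi(labels, [features[i]]))]
    while len(selected) < k:
        def merit(i):
            return min(empirical_mi(labels, [features[i], features[j]])
                       - empirical_mi(labels, [features[j]])
                       for j in selected)
        rest = [i for i in range(len(features)) if i not in selected]
        selected.append(max(rest, key=merit))
    return selected
```

In the test below, the exact duplicate of the first-chosen feature has zero gain, so the complementary feature is picked instead.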
Slide 15: Picking features with Mutual Information
[Diagram: candidate features X1, X2, X3, X4]
- Low pair-wise information gain indicates the variables are dependent
- Pick the most pair-wise independent variable
Slide 16: Features picked for cars
Slide 17: (image only)
Slide 18: The Details
- Image database:
  - 573 14x21 pixel car side-view images (cars occupied approx. 10x15 pixels)
  - 461 14x21 pixel non-car images
- 4 classifiers were trained for 20 cross-validation iterations to generate results
  - 200 car and 200 non-car images in the training set
  - 100 car images to extract fragments from
Slide 19: (image only)
Slide 20: Features
- Extracted 59,200 fragments from the first 100 images
  - 4x4 to 10x14 pixel image patches
  - Taken from the 10x15 pixel region containing the car
  - Location restricted to a 5x5 area around the original location
- Used 2 scales of wavelets from the 10x15 region
- Selected 168 features total
Slide 21: Classifiers
- Linear SVM
- Tree-Augmented Network (TAN)
  - Models each feature's dependency on the class, plus the strongest pairwise feature dependency
  - Quadratic decision surface in feature space
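To make the linear-classifier side concrete, here is a minimal linear SVM trained by subgradient descent on the regularized hinge loss, assuming binary fragment responses as input. The hyperparameters (`lam`, `lr`, `epochs`) are illustrative, and this is a sketch of a generic linear SVM, not the paper's training procedure.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=500):
    """Subgradient descent on lam/2 ||w||^2 + mean hinge loss.
    X is n x d (e.g. binary fragment responses), y is in {-1, +1}."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1.0  # samples violating the margin
        grad_w = lam * w - (y[active][:, None] * X[active]).sum(axis=0) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict(X, w, b):
    """Sign of the linear decision function."""
    return np.sign(X @ w + b)
```

The slides' point is that informative fragments make even this simple linear decision surface sufficient, whereas the TAN's quadratic surface adds little.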
Slide 22: (image only)
Slide 23: Occasional information loss due to overfitting
Slide 24: (image only)
Slide 25: More Information About Fragments
- Torralba et al., "Sharing Visual Features for Multiclass and Multiview Object Detection," CVPR 2004
  - http://web.mit.edu/torralba/www/extendedCVPR2004.pdf
- ICCV Short Course (great MATLAB demo)
  - http://people.csail.mit.edu/torralba/iccv2005/
Slide 26: Objections
- Wavelet features chosen are very weak
  - Images were very low resolution, maybe too low-res for more complicated wavelets
- Data set is too easy
  - Side views of cars have low intra-class variability
  - Cars and faces have very stable and predictable appearances
  - Not hard enough to stress the fragment linear SVM classifier, so TAN shows no improvement
- Didn't compare fragments against a successful wavelet application
  - Schneiderman and Kanade car detector
- Do the fragment-based classifiers effectively get 100 more training images?