Title: Towards Sublinear Time Multiclass Object Detection
1Towards Sublinear Time Multiclass Object Detection
2The Challenge
- Recognize objects in images
- Many object classes
- Many 3D views
- Feasible on consumer hardware
3Applications
- Cars that drive themselves
- Other robots
- Assistive devices for the blind
4This Talk
- Use an existing object representation Crandall 05
- Propose a faster detection algorithm
- equivalent accuracy
- Present initial experiments that suggest
- It scales well with classes x views
- Empirically sublinear
5Talk Overview
- Past Work
- Part-based detection
- 1-Fan/Star Model
- Proposed Algorithm
- Results
- Next Steps
- Feature Sharing
6Past Work State of the Art
- Part-based
- Shape
- Appearance
- Relatively high accuracy
- (for this presentation, assume good enough)
- Mostly single view, single class
- Linear running time in C (classes x views)
- (or parallelize with N processors -- !)
- Multiclass part sharing Torralba 2004
- Improves running time to an empirical O(log C)
- Restricted shape model
7Past Work Part-Based Detection
- Rigid pieces held together by springs.
- The springs joining the rigid pieces
- Constrain relative movement
- Measure the cost of the movement
- Cost of an embedding
- Measure the tension on each spring, and
- A local evaluation of how well each coherent piece is embedded
Fischler, Elschlager 1973
8Past Work Part-Based Detection
- Global measurement (shape)
- Constellation / arrangement of part positions
- Spring stretching / compressing
- Cost / energy associated with relative positions of pairs of parts
- Local measurement (appearance)
- Rigid local part from image information
- Independently measured for each part
9Past Work Part-Based Detection
- Find best location of all the parts (highest sum of weighted votes)
- Minimize spring tension and part matching energies
- MAP estimation: maximum probability of part locations for a test image
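The minimization on this slide is commonly written as an energy over part locations. The following is a sketch in the usual pictorial-structures notation, not taken from the slides: the symbols $l_i$ (location of part $i$), $m_i$ (appearance mismatch cost), $d_{ij}$ (spring deformation cost) and $E$ (set of connected part pairs) are assumptions introduced here for illustration.

```latex
L^{*} = \arg\min_{L = (l_1, \dots, l_P)}
        \left( \sum_{i=1}^{P} m_i(l_i)
             + \sum_{(i,j) \in E} d_{ij}(l_i, l_j) \right)
```

If the energy is read as a negative log posterior over part locations, minimizing it is the same as MAP estimation.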
10Past Work 1-Fan/Star Model
- Restrict all parts to only be connected to the center part
11Past Work 1-Fan/Star Model
- Restrict all parts to only be connected to the center part
- More efficient detection (dynamic programming)
- Shown to be reasonably accurate Crandall 2005, Fergus 2005
12Past Work 1-Fan/Star Model
- Hough Transform
- Each part votes for location of the center part
- Votes are weighted according to spring definitions
13Past Work 1-Fan/Star Model
Use Gaussians for shape models Crandall 2005, Fergus 2005
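The voting scheme of the 1-fan model can be sketched in a few lines. This is a minimal brute-force illustration, not the implementation from the talk: the function name `star_model_scores`, the use of log-domain appearance scores, isotropic Gaussians, and the "max" form of the vote are all assumptions made here. Each part casts a vote for every candidate center, discounted by a quadratic spring cost (the negative log of a Gaussian, up to a constant).

```python
def star_model_scores(center_resp, part_resps, offsets, sigmas):
    """Score every candidate center location under a 1-fan (star) model.

    center_resp: H x W nested list, log-appearance score of the center part.
    part_resps:  list of H x W nested lists, one per outer part.
    offsets:     list of (dy, dx), mean displacement of each part from the center.
    sigmas:      list of Gaussian standard deviations in pixels (isotropic).
    All names and conventions here are hypothetical, chosen for illustration.
    """
    H, W = len(center_resp), len(center_resp[0])
    total = [[float(v) for v in row] for row in center_resp]
    for resp, (dy, dx), sigma in zip(part_resps, offsets, sigmas):
        for cy in range(H):
            for cx in range(W):
                # Ideal position of this part if the center sits at (cy, cx).
                iy, ix = cy + dy, cx + dx
                # Best vote from any part location: appearance score minus
                # the quadratic spring cost of deviating from the ideal spot.
                best = max(
                    resp[y][x]
                    - ((y - iy) ** 2 + (x - ix) ** 2) / (2.0 * sigma ** 2)
                    for y in range(H)
                    for x in range(W)
                )
                total[cy][cx] += best
    return total
```

The inner maximum here costs O(N) per center, so O(N^2) per part. For quadratic spring costs, a generalized distance transform computes the same quantity for all centers in O(N), which is what makes the star model efficient in practice.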
14Past Work 1-Fan/Star Model
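Running time per stage (N pixels, P parts, C classes x views):
- Each per-part step: O(N) (one step is O(N^2) if computed naively)
- x O(P) parts: O(PN)
- Combine parts: O(PN) (sum), O(N) (max), O(PN) overall
- x O(C) classes x views: O(CPN)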
15An Idea
16Proposed Algorithm
- Idea
- Run max, sum, distance transform computations all together
- Adaptively
- Divide into image pyramids
17Proposed Algorithm
- Key observation
- We can quickly calculate an upper bound of the distance transform in a desired image pyramid cell
- Then refine in the most promising areas
18Proposed Algorithm
- Start with a coarse approximation
- Ignore shape information altogether
- Think: the largest cell in the image pyramid groups all pixels into one
- Equivalent to bag-of-words (0-fan)
19Proposed Algorithm
- For the object that looks most promising, descend to a finer resolution in the hierarchy, and re-estimate the distance transform.
- Based on a hierarchical A* framework McAllester 07
- Admissible heuristic based on upper bound estimate for coarse estimates
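The refinement loop described on these slides can be sketched as a best-first search over (class, pyramid cell) pairs. This is an illustrative sketch under stated assumptions, not the algorithm from the talk: the model representation (a list of `(response map, mean offset, spring stiffness)` triples), the names `cell_bound` and `best_detection`, and the brute-force bound are all hypothetical. The bound replaces each spring cost with the distance to the nearest center in the cell, so it never underestimates the true score (admissible), and it becomes exact for a one-pixel cell.

```python
import heapq


def cell_bound(model, cell):
    """Upper bound on the star-model score of any center inside `cell`.

    model: list of (resp, (dy, dx), k) triples (hypothetical representation):
           resp is an H x W nested list of appearance scores, (dy, dx) the
           part's mean offset from the center, k the spring stiffness.
    cell:  (y0, y1, x0, x1), half-open ranges of candidate center positions.
    """
    y0, y1, x0, x1 = cell
    total = 0.0
    for resp, (dy, dx), k in model:
        best = float("-inf")
        for y, row in enumerate(resp):
            for x, a in enumerate(row):
                # Center implied by a part at (y, x), and its distance to the
                # nearest center in the cell (zero if it lies inside the cell).
                cy, cx = y - dy, x - dx
                gy = max(y0 - cy, 0, cy - (y1 - 1))
                gx = max(x0 - cx, 0, cx - (x1 - 1))
                best = max(best, a - k * (gy * gy + gx * gx))
        total += best
    return total


def best_detection(models, height, width):
    """Return (score, class name, (y, x)) of the best center over all classes."""
    heap = []
    for name, model in models.items():
        root = (0, height, 0, width)  # coarsest cell: the whole image
        heapq.heappush(heap, (-cell_bound(model, root), name, root))
    while heap:
        neg_bound, name, (y0, y1, x0, x1) = heapq.heappop(heap)
        if y1 - y0 == 1 and x1 - x0 == 1:
            # A one-pixel cell has an exact score, and no other cell in the
            # queue can beat it, so this is the global optimum.
            return -neg_bound, name, (y0, x0)
        # Refine only the most promising cell: split into up to four children.
        ym, xm = (y0 + y1 + 1) // 2, (x0 + x1 + 1) // 2
        for cy0, cy1 in ((y0, ym), (ym, y1)):
            for cx0, cx1 in ((x0, xm), (xm, x1)):
                if cy0 < cy1 and cx0 < cx1:
                    child = (cy0, cy1, cx0, cx1)
                    bound = cell_bound(models[name], child)
                    heapq.heappush(heap, (-bound, name, child))
    return None
```

Classes whose coarse bound is low are never refined, which is where the savings over scanning every class at full resolution would come from. The bound above is recomputed by brute force for clarity; the slides state that such a bound can be computed quickly but this sketch does not attempt to reproduce how.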
20???
21???
23max
30max
31Results Match Time
32Results Total Time
33Next Steps
- Recall
- Appearance correlation is still O(PC)
- P parts, C classes x views
- Even if shape matching is sublinear, we still have O(PC) + o(C) = O(PC)
- Need to make correlation sublinear as well.
34Past Work Feature Sharing
Torralba 2004
35Past Work Feature Sharing
empirically O(log(C))
36Next Steps
- Combine
- Sublinear appearance correlation (via feature sharing) with
- Sublinear shape searching (described here)
- We get
- o(C) + o(C) = o(C)