Title: Supervised Learning of Edges and Object Boundaries
1Supervised Learning of Edges and Object Boundaries
- Piotr Dollár
- Zhuowen Tu
- Serge Belongie
2The problem
?
3Outline
- I. Motivation
- II. Problem formulation
- III. Learning architecture (BEL)
- IV. Results
4Outline
- I. Motivation
- Why edges?
- Why not edges?
- Why learning?
- II. Problem formulation
- III. Learning architecture (BEL)
- IV. Results
5Why edges?
- Reduce dimensionality of data
- Preserve content information
- Useful in applications such as
- object detection
- structure from motion
- tracking
6Why not edges?
- But edges are not that useful in practice. Why?
- Difficulties
- Modeling assumptions
- Parameters
- Multiple sources of information (brightness, color, texture, etc.)
- Real world conditions
- Is edge detection even well defined?
7Canny edge detection
Canny is optimal w.r.t. some model.
8Canny edge detection
1. smooth
2. gradient
3. thresh, suppress, link
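The three steps above can be sketched in a few lines of pure Python. This is a simplified illustration, not Canny's full algorithm: it assumes a 3x3 box filter in place of Gaussian smoothing, central differences for the gradient, and a single global threshold, and it leaves out non-maximum suppression and edge linking. The function names and the threshold value are hypothetical.

```python
def smooth(img):
    """3x3 box filter with border replication (stand-in for Gaussian smoothing)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def gradient_magnitude(img):
    """Central-difference gradient magnitude with border replication."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def threshold(mag, t):
    """Mark pixels whose gradient magnitude exceeds t."""
    return [[1 if m > t else 0 for m in row] for row in mag]


# Vertical step edge: left half dark, right half bright.
image = [[0.0] * 4 + [1.0] * 4 for _ in range(6)]
edges = threshold(gradient_magnitude(smooth(image)), t=0.5)
```

On this toy step edge the thresholded response is two pixels wide, which is the kind of thick response that the suppression step is meant to thin.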
And yet
9Canny difficulties
- Modeling assumptions
- Step edges, junctions, etc.
- Parameters
- Scales, threshold, etc.
- Multiple sources of information
- Only handles brightness
- Real world conditions
- Gaussian i.i.d. noise? Texture?
10Modern methods
- Modeling assumptions
- Complex models, computationally prohibitive
- Parameters
- Many, may use learning to help tune
- Multiple sources of information
- Typically brightness, color, and texture cues
- Real world conditions
- Aimed at real images
11Modern methods (Pb)
Pb [Martin et al., PAMI 2004]
12Why learning?
- Modeling assumptions
- minimal
- Parameters
- none
- Multiple sources of information
- Automatically incorporated
- Real world conditions
- training data
13Outline
- I. Motivation
- II. Problem formulation
- III. Learning architecture (BEL)
- IV. Results
14Problem formulation (general)
- I: image
- W: scene interpretation that can include spatial location and extent of objects, regions, object boundaries, curves, etc.
- S_W: 0/1 function that encodes spatial extent of a component of W
Obtaining optimal or likely W or S_W can be difficult. Let
[equation]
We seek to learn this distribution directly from image data. To further reduce complexity, we can discard the absolute coordinates of S
[equation]
where N(c) is the neighborhood of I centered at c.
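The equations for this slide are not reproduced in the text. One plausible reading, offered as an assumption rather than the authors' exact notation, is that the distribution of interest is the posterior of S at a location c, marginalized over interpretations W, and that it is then approximated using only the local neighborhood N(c):

```latex
P\bigl(S(c) = 1 \mid I\bigr) = \sum_{W} S_W(c)\, P(W \mid I)
\qquad\text{and}\qquad
P\bigl(S(c) = 1 \mid I\bigr) \approx P\bigl(S(c) = 1 \mid N(c)\bigr)
```

Under this reading, dropping absolute coordinates means the same patch-level predictor is applied at every location c.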
15Problem formulation (edges)
- W: image segmentation
- S_W: 1 on boundaries of segments, 0 elsewhere
16Discriminative framework
Goal is to learn from human-labeled images.
Given an image I and n interpretations W obtained by manual annotation, we can compute
[equation]
Sample positive and negative patches according to the above.
Finally, train a classifier!
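A minimal sketch of the sampling step, assuming the quantity computed from the n annotations is the per-pixel fraction of annotators who marked a boundary (the slide's formula is not reproduced here). The function names, the patch radius, and the rule of labeling a patch positive with probability equal to that fraction are illustrative assumptions.

```python
import random


def edge_fraction(annotations):
    """Per-pixel fraction of annotators that marked a boundary.

    annotations: list of n binary maps (lists of rows of 0/1), all the same size.
    """
    n = len(annotations)
    h, w = len(annotations[0]), len(annotations[0][0])
    return [[sum(a[y][x] for a in annotations) / n for x in range(w)]
            for y in range(h)]


def sample_patches(image, prob, radius, count, rng):
    """Draw (patch, label) pairs; label is 1 with probability prob at the patch center."""
    h, w = len(image), len(image[0])
    samples = []
    for _ in range(count):
        cy = rng.randrange(radius, h - radius)
        cx = rng.randrange(radius, w - radius)
        patch = [row[cx - radius:cx + radius + 1]
                 for row in image[cy - radius:cy + radius + 1]]
        label = 1 if rng.random() < prob[cy][cx] else 0
        samples.append((patch, label))
    return samples


# Toy data: a 5x5 image and three hypothetical annotators who disagree slightly.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
ann_a = [[0, 0, 1, 0, 0] for _ in range(5)]
ann_b = [[0, 0, 1, 0, 0] for _ in range(5)]
ann_c = [[0, 1, 0, 0, 0] for _ in range(5)]
prob = edge_fraction([ann_a, ann_b, ann_c])
samples = sample_patches(image, prob, radius=1, count=20, rng=random.Random(0))
```

The resulting (patch, label) pairs are what a classifier would be trained on; in practice the patches would be far larger than 3x3.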
17Discriminative framework
Edge point present in center?
NO
YES
18Outline
- I. Motivation
- II. Problem formulation
- III. Learning architecture (BEL)
- IV. Results
19Learning architecture
- Large training set O(10^8)
- but correlated
- very variable data
- Want generic, efficient features
- applicability to any domain
- fast computation essential
- Boosting a natural choice
20AdaBoost
Taken from tutorial by Jiri Matas and Jan Sochman
21Decision Stumps
(where f is some feature of x)
22AdaBoost (decision stumps)
. . .
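Since the figures for these slides are not reproduced, here is a minimal sketch of standard discrete AdaBoost over decision stumps, where each stump thresholds a single feature f(x). It assumes labels in {+1, -1} and an exhaustive stump search; the function names and the toy data are hypothetical, and this is not claimed to match the tutorial's exact formulation.

```python
import math


def stump_predict(stump, x):
    """Decision stump: threshold one feature, f(x) = x[feature]."""
    feature, thresh, polarity = stump
    return polarity if x[feature] > thresh else -polarity


def best_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        for thresh in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                stump = (feature, thresh, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def adaboost(X, y, rounds):
    """Discrete AdaBoost; labels y must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Sign of the weighted vote of all stumps."""
    score = sum(alpha * stump_predict(s, x) for alpha, s in ensemble)
    return 1 if score > 0 else -1


# 1-D toy problem that no single stump can solve: positives sit in the middle.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [-1, -1, 1, 1, -1, -1]
model = adaboost(X, y, rounds=3)
predictions = [predict(model, x) for x in X]
```

Each stump alone misclassifies at least two of the six points; the weighted vote of three stumps separates them.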
23Cascaded classifiers
- Minimize computation during testing
- Especially useful for skewed prior
- Viola-Jones face/pedestrian detection
24Cascade (AdaBoost)
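A cascade can be sketched as a list of stages evaluated in order, with a sample rejected as soon as one stage scores it below that stage's threshold. The stage functions and thresholds below are hypothetical stand-ins; in a real cascade each stage would be a boosted classifier.

```python
def cascade_classify(stages, x):
    """Return (label, stages_evaluated) for a sample x.

    stages: list of (score_fn, threshold) pairs, cheapest first.
    A sample is rejected by the first stage whose score falls below its threshold.
    """
    for i, (score_fn, threshold) in enumerate(stages, start=1):
        if score_fn(x) < threshold:
            return 0, i          # early reject: later stages never run
    return 1, len(stages)        # survived every stage


# Hypothetical stages on a 2-feature sample.
stages = [
    (lambda x: x[0], 0.2),           # cheap test that rejects most negatives
    (lambda x: x[0] + x[1], 1.0),    # stricter test for the survivors
    (lambda x: x[0] * x[1], 0.5),
]

easy_negative = cascade_classify(stages, (0.1, 0.9))
hard_negative = cascade_classify(stages, (0.6, 0.7))
positive = cascade_classify(stages, (0.9, 0.8))
```

With a skewed prior most samples behave like `easy_negative` and cost a single stage, which is where the savings at test time come from.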
25Probabilistic boosting trees
- Expected amount of computation decreases significantly
- Once a mistake is made, it cannot be undone
- Cascade also made problem easier! Ideally, splitting data creates two sub-problems, each much easier than the original
26Probabilistic boosting trees
27Probabilistic boosting trees
- Retain efficiency of cascades
- Add power when necessary
- Prone to overfitting
- Tree was necessary to obtain good results.
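A rough sketch of how a probabilistic boosting tree might compute a posterior, following the general idea that each internal node holds a classifier and that confident samples descend only one branch while ambiguous samples descend both. The class layout, the margin `eps`, and the toy node classifiers are assumptions for illustration, not the authors' implementation.

```python
class Node:
    def __init__(self, q_pos=None, left=None, right=None, prior=0.5):
        self.q_pos = q_pos    # function x -> q(+1 | x); None at a leaf
        self.left = left      # subtree for samples routed as negative
        self.right = right    # subtree for samples routed as positive
        self.prior = prior    # empirical P(y = +1) of training data at this node


def posterior(node, x, eps=0.1):
    """Approximate P(y = +1 | x) by recursing through the tree."""
    if node.q_pos is None:
        return node.prior
    q = node.q_pos(x)
    if q > 0.5 + eps:
        # Confident positive: descend right only, use the left child's prior.
        return q * posterior(node.right, x, eps) + (1 - q) * node.left.prior
    if q < 0.5 - eps:
        # Confident negative: descend left only, use the right child's prior.
        return q * node.right.prior + (1 - q) * posterior(node.left, x, eps)
    # Ambiguous: descend both subtrees (this is where extra power is spent).
    return q * posterior(node.right, x, eps) + (1 - q) * posterior(node.left, x, eps)


# Hypothetical two-level tree; node classifiers simply read one feature each.
right_subtree = Node(q_pos=lambda x: x[1],
                     left=Node(prior=0.3), right=Node(prior=0.9), prior=0.6)
root = Node(q_pos=lambda x: x[0], left=Node(prior=0.1), right=right_subtree)

p_ambiguous_child = posterior(root, (0.9, 0.5))
p_confident_negative = posterior(root, (0.2, 0.5))
```

The one-branch shortcut keeps the cost close to that of a cascade, while the two-branch case lets the tree revisit samples a cascade would have committed on.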
28Haar features
- Feature response
- (image response to green squares)
- (image response to red squares)
- Applied to many views of the data
- grayscale, color, Gabor filter outputs, etc.
- at many orientations, locations, etc
- Fast computation using integral images
- Hundreds of thousands of candidate features
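The integral-image trick can be sketched as follows: after one pass over the image, the sum over any rectangle costs four table lookups, so a Haar response (sum over one set of rectangles minus the sum over another) is cheap regardless of rectangle size. The two-rectangle layout and the function names are assumptions for illustration; which half plays the role of the green or red squares is arbitrary here.

```python
def integral_image(img):
    """ii[y][x] = sum of img over rows < y and columns < x (zero-padded)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect(ii, top, left, height, width):
    """Left half minus right half of a rectangle; width must be even."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


img = [[1, 1, 5, 5],
       [1, 1, 5, 5],
       [1, 1, 5, 5]]
ii = integral_image(img)
response = haar_two_rect(ii, top=0, left=0, height=3, width=4)
inner = rect_sum(ii, top=1, left=1, height=2, width=2)
```

The same table can be built once per view of the data (grayscale, color channel, filter output) and reused for every candidate feature.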
29Outline
- I. Motivation
- II. Problem formulation
- III. Learning architecture (BEL)
- IV. Results
- Gestalt laws
- Natural images
- Road detection
- Object Boundaries
30Results
- Boosted edge learning (BEL)
- Compare to method with best known performance (Pb), and also to Canny
- Comparison not quite fair
Pb [Martin et al., PAMI 2004]
31Gestalt laws
- Gestalt laws of perceptual organization
- Symmetry, closure, parallelism, etc.
- Govern how component parts are organized into overall patterns
- The hard part of edge detection
- What can and cannot be achieved in our framework?
32Analogies
33Gestalt laws: parallelism
34Gestalt laws: modal completion
35Gestalt laws: alternate interpretation
36Outline
- I. Motivation
- II. Problem formulation
- III. Learning architecture (BEL)
- IV. Results
- Gestalt laws
- Natural images
- Road detection
- Object Boundaries
37Natural Images
- Berkeley Segmentation Dataset and Benchmark
- Standard dataset for edge detection with 300 manually annotated images
- Modern benchmark for comparing edge detection algorithms
- Notes
- Edge detection in natural images is hard
- Possibly ill-defined problem
- Evil but necessary comparison
38Natural Images: results
39Natural Images: results
40Natural Images: probabilities
human
BEL
Pb
image
41Outline
- I. Motivation
- II. Problem formulation
- III. Learning architecture (BEL)
- IV. Results
- Gestalt laws
- Natural images
- Road detection
- Object Boundaries
42Road detection
- W: location of roads in scene
- S_W: 1 if pixel is on the road, 0 elsewhere
- Road detection is not edge detection
- But same learning architecture
- Ground truth obtained from map data
43Road detection (training)
(the 2 training images)
44Road detection (testing)
(the testing image)
(Winchester Dr. was not detected)
45Outline
- I. Motivation
- II. Problem formulation
- III. Learning architecture (BEL)
- IV. Results
- Gestalt laws
- Natural images
- Road detection
- Object Boundaries
46Object boundaries
- W: location and extent of object of interest
- S_W: 1 on boundaries of object, 0 elsewhere
- Must tune to specific type of edge
- Algorithms that model edges not applicable
- Potentially most useful application
47Object boundaries (context)
48Object boundaries (training)
49Object boundaries (ground truth)
50Object boundaries (Canny)
F-score .10
51Object boundaries (Pb)
F-score .13
52Object boundaries (BEL)
F-score .79
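The slides do not say how the F-scores were computed. Assuming the usual definition, the harmonic mean of precision and recall, a sketch looks like this. The pixel-exact matching used here is a simplification chosen for brevity (benchmarks typically allow a small localization tolerance), and the function names and toy pixel sets are hypothetical.

```python
def precision_recall(detected, truth):
    """Pixel-exact precision and recall for sets of (row, col) edge pixels."""
    true_pos = len(detected & truth)
    precision = true_pos / len(detected) if detected else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return precision, recall


def f_score(precision, recall):
    """Harmonic mean of precision and recall; 0 if both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy example: 4 detected pixels, 3 ground-truth pixels, 2 in common.
detected = {(0, 0), (0, 1), (0, 2), (1, 2)}
truth = {(0, 1), (0, 2), (0, 3)}
p, r = precision_recall(detected, truth)
f = f_score(p, r)
```

Because the harmonic mean is dominated by the smaller of the two values, a detector that fires on many irrelevant edges scores low even if its recall is high.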
53Algorithm roundup
54Summary
- Define edges only in terms of labeled data, minimal modeling assumptions
- Minimize human effort in adapting algorithm to particular domain
- Fast, affordable edge detection for the masses!
55Thank you!