Title: Cue Integration in Figure/Ground Labeling
1Cue Integration in Figure/Ground Labeling
- Xiaofeng Ren, Charless Fowlkes and Jitendra Malik
2Abstract
- We present a model of edge and region grouping
using a conditional random field built over a
scale-invariant representation of images to
integrate multiple cues. Our model includes
potentials that capture low-level similarity,
mid-level curvilinear continuity and high-level
object shape. Maximum likelihood parameters for
the model are learned from human-labeled
groundtruth on a large collection of horse images
using belief propagation. Using held-out test
data, we quantify the information gained by
incorporating generic mid-level cues and
high-level shape.
3Introduction
Conditional random fields (CRFs) on triangulated
images, trained to integrate low-, mid- and
high-level grouping cues
4Inference on the CDT Graph
Contour variables Xe
Region variables Yt
Object variable Z
Integrating Xe, Yt and Z: low/mid/high-level cues
5Grouping Cues
- Low-level Cues
  - L1(Xe|I): edge energy along edge e
  - L2(Ys,Yt|I): brightness/texture similarity between two regions s and t
- Mid-level Cues
  - M1(XV|I): edge collinearity and junction frequency at vertex V
  - M2(Xe,Ys,Yt): consistency between edge e and two adjoining regions s and t
- High-level Cues
  - H1(Yt|I): texture similarity of region t to exemplars
  - H2(Yt,Z|I): compatibility of region shape with pose
  - H3(Xe,Z|I): compatibility of local edge shape with pose
6Conditional Random Fields for Cue Integration
Estimate the marginal posteriors of X, Y and Z
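To make "marginal posteriors" concrete, here is a minimal sketch that enumerates every joint state of a toy model with one contour variable, two region variables and a two-valued object hypothesis. It assumes a log-linear form, P proportional to exp(score); the weights, cue values and the name `toy_score` are invented for illustration. The slides use loopy belief propagation over the full CDT graph, and exhaustive enumeration like this is only feasible at toy sizes.

```python
import itertools
import math

def toy_score(x_e, y_s, y_t, z):
    """Hypothetical log-linear score; all weights and cue values are made up."""
    edge_energy = 0.9        # stand-in for a low-level edge cue on edge e
    region_similarity = 0.2  # stand-in for brightness/texture similarity of s and t
    score = 0.0
    # Low-level: strong edge energy supports X_e = 1.
    score += 2.0 * (edge_energy - 0.5) * x_e
    # Low-level: similar regions prefer the same figure/ground label.
    score += 1.5 * region_similarity * (1 if y_s == y_t else 0)
    # Mid-level: a contour should separate differently labeled regions.
    score += 1.0 * (1 if x_e == int(y_s != y_t) else 0)
    # High-level: hypothesis z = 0 expects region s to be figure, z = 1 expects t.
    score += 0.7 * (1 if (z == 0 and y_s == 1) or (z == 1 and y_t == 1) else 0)
    return score

def marginals():
    """Return P(variable = 1 | I) for each variable by exhaustive enumeration."""
    states = list(itertools.product([0, 1], repeat=4))
    weights = [math.exp(toy_score(*s)) for s in states]
    total = sum(weights)  # partition function of the toy model
    names = ["X_e", "Y_s", "Y_t", "Z"]
    return {
        name: sum(w for s, w in zip(states, weights) if s[i] == 1) / total
        for i, name in enumerate(names)
    }

if __name__ == "__main__":
    for name, p in marginals().items():
        print(f"P({name} = 1 | I) = {p:.3f}")
```

Each marginal is the summed weight of the joint states in which that variable equals 1, divided by the partition function. Belief propagation approximates the same quantities without enumerating the joint states.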
7Encoding Object Knowledge
8H3(Xe,Z|I): local shape and pose
Let S(x,y) be the shapeme at image location
(x,y), and let (xo,yo) be the object location in Z.
Compute the average log likelihood SON(e,Z) as
[equation not in transcript]
SOFF(e,Z) is defined similarly.
Then we have
[equation not in transcript]
[Figure: shapeme i (horizontal line) with distribution ON(x,y,i); shapeme j (vertical pairs) with distribution ON(x,y,j)]
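One way to read the average log likelihood SON(e,Z) is sketched below. It assumes the edge e is given as a list of pixel locations, S is a 2-D integer array of shapeme indices, and ON is a lookup table of shapeme probabilities indexed by position relative to the object location and by shapeme index. The array layout, the centering of offsets, the `eps` smoothing and the function name are assumptions for illustration, not details from the slides.

```python
import numpy as np

def average_log_likelihood(edge_pixels, shapeme_map, dist, obj_xy, eps=1e-12):
    """Mean of log dist[x - xo, y - yo, S(x, y)] over the pixels of one edge.

    dist has shape (W, H, K): relative x offset, relative y offset, shapeme index.
    Offsets are shifted by (W // 2, H // 2) so the object location maps to the
    centre of the table (an assumed convention).
    """
    if len(edge_pixels) == 0:
        raise ValueError("edge must contain at least one pixel")
    xo, yo = obj_xy
    w, h, _ = dist.shape
    logs = []
    for x, y in edge_pixels:
        dx = x - xo + w // 2
        dy = y - yo + h // 2
        if 0 <= dx < w and 0 <= dy < h:
            p = dist[dx, dy, shapeme_map[x, y]]
        else:
            p = 0.0  # offset outside the table: treat as never observed
        logs.append(np.log(p + eps))  # eps avoids log(0)
    return float(np.mean(logs))

if __name__ == "__main__":
    k = 4
    shapemes = np.zeros((20, 20), dtype=int)  # every pixel has shapeme 0
    on_table = np.full((9, 9, k), 1.0 / k)    # uniform "on-object" table
    off_table = np.full((9, 9, k), 0.1)       # lower "off-object" probabilities
    edge = [(10, 10), (11, 10), (12, 10)]
    s_on = average_log_likelihood(edge, shapemes, on_table, (10, 10))
    s_off = average_log_likelihood(edge, shapemes, off_table, (10, 10))
    print(s_on, s_off)
```

Calling the same function with an off-object table gives the counterpart SOFF(e,Z). How the two scores are combined into H3 is not recoverable from the transcript, so it is not shown here.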
9Training and Testing
- Trained on half (172) of the grayscale horse images from the Borenstein & Ullman '02 Horse Dataset.
- Use human-marked segmentations to construct groundtruth labels on both CDT edges and triangles.
- Uses loopy belief propagation for approximate inference; takes < 1 second to converge for a typical image.
- Parameter estimation with gradient descent for maximum likelihood converges in 1000 iterations.
- Tested on the other half of the horse images in grayscale.
- Quantitative evaluation against groundtruth: precision-recall curves for both contours and regions.
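The precision-recall evaluation can be sketched as follows. The sketch assumes that each CDT edge or triangle has a posterior marginal and a binary groundtruth label, both available as arrays. The threshold sweep, the convention for empty predictions and the function name are illustrative; the slides do not say how the curves were computed.

```python
import numpy as np

def precision_recall_curve(scores, labels, thresholds):
    """Return (threshold, precision, recall) triples for thresholded scores."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    curve = []
    for t in thresholds:
        predicted = scores >= t
        tp = np.sum(predicted & labels)   # true positives
        fp = np.sum(predicted & ~labels)  # false positives
        fn = np.sum(~predicted & labels)  # false negatives
        # Convention (an assumption): precision is 1.0 when nothing is predicted.
        precision = tp / (tp + fp) if tp + fp > 0 else 1.0
        recall = tp / (tp + fn) if tp + fn > 0 else 0.0
        curve.append((float(t), float(precision), float(recall)))
    return curve

if __name__ == "__main__":
    marginal_scores = [0.9, 0.8, 0.3, 0.1]  # hypothetical posterior marginals
    groundtruth = [1, 0, 1, 0]              # hypothetical human labels
    for t, p, r in precision_recall_curve(marginal_scores, groundtruth, [0.0, 0.5, 0.85]):
        print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
```

Running it once on edge marginals and once on triangle marginals gives the two curves mentioned above.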
12Results
[Figures, slides 12-14: example results, each with four panels: Input, Input Pb, Output Contour, Output Figure]
15Conclusion
- Constrained Delaunay Triangulation provides a scale-invariant discrete structure which enables efficient probabilistic inference.
- Conditional random fields combine joint contour and region grouping and can be efficiently trained.
- Mid-level cues are useful for figure/ground labeling, even when powerful object-specific cues are present.
16Thank You