Title: Joint Optimisation for Object Class Segmentation and Dense Stereo Reconstruction
1 Joint Optimisation for Object Class Segmentation and Dense Stereo Reconstruction
Lubor Ladický, Paul Sturgess, Christopher Russell, Sunando Sengupta, Yalin Bastanlar, William Clocksin, Philip H.S. Torr
Oxford Brookes University
http://cms.brookes.ac.uk/research/visiongroup/
2-3 Joint Object Class Segmentation and Dense Stereo Reconstruction
Objective: Joint Estimation
[Diagram: Left Camera Image, Right Camera Image, Black Box Solver, Object Class Segmentation, Dense Stereo Reconstruction]
4 Dense Stereo Reconstruction
- Assigns a disparity label y to each pixel
- Disparities come from the discrete set {0, 1, ..., D}
[Figure: left camera image, right camera image, dense stereo result]
5-8 Dense Stereo Reconstruction
Unary Potential
- Unary cost depends on the similarity of patches, e.g. cross-correlation
[Figure: patch matching illustrated at disparities 0, 5, 10 and 15]
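The patch-similarity cost can be sketched as follows. This is only an illustration, assuming rectified grey-level images stored as lists of rows, a left pixel (r, c) that matches the right pixel (r, c - d), and a cost defined as one minus the normalised cross-correlation. The names `patch`, `ncc`, `unary_cost` and the window size `half` are hypothetical and do not come from the slides.

```python
import math

def patch(img, r, c, half):
    """Flatten the (2*half+1)^2 patch centred at (r, c), clamping at the borders."""
    h, w = len(img), len(img[0])
    return [img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
            for dr in range(-half, half + 1)
            for dc in range(-half, half + 1)]

def ncc(a, b):
    """Normalised cross-correlation of two equally sized patches, in [-1, 1]."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [v - ma for v in a]
    db = [v - mb for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        return 0.0  # flat patch: correlation is undefined, treat as uninformative
    return sum(x * y for x, y in zip(da, db)) / denom

def unary_cost(left, right, r, c, d, half=1):
    """Cost of giving disparity d to left pixel (r, c); low when the patches agree."""
    if c - d < 0:
        return 2.0  # assumed maximum cost: the match would fall outside the right image
    return 1.0 - ncc(patch(left, r, c, half), patch(right, r, c - d, half))
```

Because the correlation is normalised, a constant brightness offset between the two cameras does not change the cost, but a textureless patch yields no useful signal at any disparity.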
9 Dense Stereo Reconstruction
Pairwise Potential
- Encourages label consistency between adjacent pixels
- Cost based on the distance between labels
[Figure: truncated linear and truncated quadratic cost functions]
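The two pairwise costs named on this slide can be written in a couple of lines. The sketch below assumes integer disparity labels; the `weight` and `trunc` parameters and their default values are illustrative choices, not values from the talk.

```python
def truncated_linear(y_i, y_j, weight=1.0, trunc=5):
    """Cost grows linearly with the label difference, then saturates at trunc."""
    return weight * min(abs(y_i - y_j), trunc)

def truncated_quadratic(y_i, y_j, weight=1.0, trunc=25):
    """Cost grows quadratically with the label difference, then saturates at trunc."""
    return weight * min((y_i - y_j) ** 2, trunc)
```

The truncation keeps the penalty bounded, so a genuine depth discontinuity between neighbouring pixels is not punished more than any other large jump.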
10-14 Dense Stereo Reconstruction
- Graph-cut based range-move inference (Kumar et al. NIPS09, Veksler et al. CVPR09)
[Figure: original image, initial solution, after 1st expansion, after 2nd expansion, after 3rd expansion, final solution]
15-17 Dense Stereo Reconstruction
Does not work for road scenes!
[Figure: original image and its dense stereo reconstruction]
- On flat surfaces, a patch can be matched to any other patch
- The two cameras differ in brightness
Could object recognition for road scenes help?
- Recognition of road scenes is relatively easy (Sturgess et al., BMVC09)
18 Object Class Segmentation
- Aims to assign a class label to each pixel of an image
- Classifier trained on the training set
- Evaluated on previously unseen test images
19 Object Class Segmentation
Unary Potential
- Likelihood of a pixel taking a label (Shotton et al. ECCV06, He et al. CVPR04, Ladický et al. ICCV09)
20 Object Class Segmentation
Pairwise Potential
- Contrast-sensitive Potts model
- Encourages label consistency between adjacent pixels
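A contrast-sensitive Potts term is commonly written as a label-disagreement penalty that shrinks across strong image edges. The sketch below uses that common form; the parameters `theta_p`, `theta_v` and `theta_beta` and their default values are assumptions made for illustration, not values from the talk.

```python
import math

def contrast_sensitive_potts(x_i, x_j, colour_i, colour_j,
                             theta_p=1.0, theta_v=4.0, theta_beta=0.01):
    """Pairwise cost for neighbouring pixels with labels x_i, x_j and colours colour_i, colour_j."""
    if x_i == x_j:
        return 0.0  # equal labels are never penalised
    # Squared colour difference: large across image edges, small inside uniform regions.
    diff2 = sum((a - b) ** 2 for a, b in zip(colour_i, colour_j))
    return theta_p + theta_v * math.exp(-theta_beta * diff2)
```

With these defaults a label change between two identically coloured pixels costs 5.0, while a change across a strong edge costs close to 1.0, so label boundaries are drawn towards image edges.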
21 Object Class Segmentation
Higher Order Potential
- Encourages consistency within superpixels (Kohli et al. CVPR08)
- Merges information at different scales (Ladický et al. ICCV09)
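One way to picture the superpixel consistency term is a cost that grows with the number of pixels disagreeing with the dominant label and then saturates. The sketch below is a simplified form in the spirit of the robust higher-order potential cited on this slide, not the exact potential used in the talk; `gamma_max` and `truncation_ratio` are hypothetical parameters.

```python
from collections import Counter

def robust_pn_cost(labels_in_superpixel, gamma_max=10.0, truncation_ratio=0.3):
    """Higher-order cost of one non-empty superpixel given the labels of its pixels."""
    n = len(labels_in_superpixel)
    dominant_count = Counter(labels_in_superpixel).most_common(1)[0][1]
    disagreeing = n - dominant_count
    # Number of disagreeing pixels at which the cost reaches its maximum.
    q = truncation_ratio * n
    return min(disagreeing * gamma_max / q, gamma_max)
```

A perfectly consistent superpixel costs nothing, a few outliers cost a little, and once the superpixel is badly mixed the cost stops growing, so a poor superpixel cannot dominate the energy.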
22-26 Object Class Segmentation
- Graph-cut based α-expansion inference (Boykov et al. ICCV99)
[Figure: original image; initial solution (grass); building expansion (building, grass); sky expansion (sky, building, grass); tree expansion (sky, tree, building, grass); final solution (sky, tree, building, aeroplane, grass)]
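The expansion loop shown in the figure can be sketched on a toy problem. To stay self-contained, the example below works on a 1D chain of pixels with a Potts pairwise cost and solves each binary "keep or switch to α" move exactly by dynamic programming instead of a graph cut; that substitution, and all function names, are assumptions made for illustration.

```python
def energy(labels, unary, potts_weight):
    """Sum of unary costs plus a Potts penalty for each pair of differing neighbours."""
    e = sum(unary[i][l] for i, l in enumerate(labels))
    e += sum(potts_weight for a, b in zip(labels, labels[1:]) if a != b)
    return e

def expansion_move(labels, alpha, unary, potts_weight):
    """Best labelling in which every node keeps its label or switches to alpha."""
    n = len(labels)
    # options[i][b]: label of node i for binary choice b (0 = keep, 1 = take alpha)
    options = [(labels[i], alpha) for i in range(n)]
    cost = [unary[0][options[0][b]] for b in (0, 1)]
    back = []
    for i in range(1, n):
        new_cost, choice = [], []
        for b in (0, 1):
            cands = [cost[p] + (0.0 if options[i - 1][p] == options[i][b] else potts_weight)
                     for p in (0, 1)]
            best = 0 if cands[0] <= cands[1] else 1
            new_cost.append(cands[best] + unary[i][options[i][b]])
            choice.append(best)
        cost = new_cost
        back.append(choice)
    # Backtrack from the cheaper final state.
    b = 0 if cost[0] <= cost[1] else 1
    result = [None] * n
    for i in range(n - 1, -1, -1):
        result[i] = options[i][b]
        if i > 0:
            b = back[i - 1][b]
    return result

def alpha_expansion(unary, potts_weight, num_labels):
    """Cycle through the labels, accepting every move that lowers the energy."""
    labels = [0] * len(unary)
    improved = True
    while improved:
        improved = False
        for alpha in range(num_labels):
            candidate = expansion_move(labels, alpha, unary, potts_weight)
            if energy(candidate, unary, potts_weight) < energy(labels, unary, potts_weight):
                labels, improved = candidate, True
    return labels
```

Each accepted move strictly lowers the energy, so the loop terminates. On image grids the binary move is no longer a chain problem, which is where the graph cut comes in.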
27-31 Object Class Segmentation vs. Dense Stereo Reconstruction
- Object class and 3D location are mutually informative
  - Sky is always at infinity (disparity 0)
  - Cars, buses and pedestrians have their typical height
  - Road and pavement lie on the ground plane
  - Buildings and pavement are on the sides
- Both problems are formulated as CRFs
- Is a joint approach possible?
[Figure: road scene annotated with road, building, car and sky]
32 Joint Formulation
- Each pixel takes a label $z_i = (x_i, y_i) \in L_1 \times L_2$
- Dependency of $x_i$ and $y_i$ is encoded as a unary and a pairwise potential, e.g.
  - strong correlation between $x_i$ = road and $y_i$ near the ground plane
  - strong correlation between $x_i$ = sky and $y_i = 0$
- Correlation of edges in the object class and disparity domains
33 Joint Formulation
Unary Potential
[Diagram: object layer and disparity layer connected by joint unary links]
- Weighted sum of object class, depth and joint potentials
- Joint unary potential based on histograms of height
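The weighted sum can be sketched for a single pixel as follows. The slide does not say how the height histogram is turned into a cost; the negative log of the histogram value used here is an assumption, as are the function name, the weights and the precomputed `height_bin` lookup (the height bin implied by each disparity at this pixel, which in practice would come from the camera geometry).

```python
import math

def joint_unary(obj_cost, disp_cost, height_hist, height_bin,
                w_obj=1.0, w_disp=1.0, w_joint=1.0, eps=1e-6):
    """Return cost[x][y] for every object label x and disparity label y at one pixel.

    obj_cost[x]       : object class unary cost
    disp_cost[y]      : stereo matching unary cost
    height_hist[x][b] : normalised histogram of heights observed for class x
    height_bin[y]     : height bin implied by disparity y at this pixel
    """
    return [[w_obj * obj_cost[x]
             + w_disp * disp_cost[y]
             - w_joint * math.log(height_hist[x][height_bin[y]] + eps)
             for y in range(len(disp_cost))]
            for x in range(len(obj_cost))]
```

When the object and stereo costs alone are ambiguous, the joint term breaks the tie: a class is cheap only in combination with disparities that place the pixel at a height typical for that class.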
34 Joint Formulation
Pairwise Potential
[Diagram: object layer and disparity layer connected by joint pairwise links]
- Object class and depth edges are correlated
- Transitions in depth often occur at object boundaries
35 Joint Formulation
36-37 Inference
- Standard α-expansion
  - In each expansion move, each node keeps its old label or takes a new label $(x, y)$ with $x \in L_1$, $y \in L_2$
  - Possible in the case of metric pairwise potentials
Too many moves ($|L_1| \cdot |L_2|$)! Impractical!
38 Inference
- Projected move for product label space
  - One or some of the label components remain constant after the move
- Set of projected moves
  - α-expansion in the object class projection
  - Range-expansion in the depth projection
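The saving from projected moves can be made concrete by enumerating the moves in one sweep. The sketch below assumes one expansion move per joint label in the full product space, one α-expansion per object label, and one range move per window of consecutive disparities; the window width `range_width` and the sliding-window scheme are illustrative assumptions, not details given in the slides.

```python
def full_expansion_moves(object_labels, disparity_labels):
    """One expansion move per joint label (x, y) in the product space."""
    return [(x, y) for x in object_labels for y in disparity_labels]

def projected_moves(object_labels, disparity_labels, range_width):
    """Moves that change one label component while the other stays fixed."""
    moves = [("object", x) for x in object_labels]
    n = len(disparity_labels)
    # Each range move considers a window of consecutive disparities at once.
    starts = range(max(n - range_width, 0) + 1)
    moves += [("disparity", tuple(disparity_labels[s:s + range_width])) for s in starts]
    return moves
```

For 7 object classes and 100 disparities, the full product space needs 700 moves per sweep, while the projected set with a window of 10 disparities needs 7 + 91 = 98.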
39 Dataset
- Leuven Road Scene dataset
- Contained
  - 3 sequences
  - 643 pairs of images
- We labelled
  - 50 training and 20 test images
  - Object class (7 labels)
  - Disparity (100 labels)
- Available on our website
  - http://cms.brookes.ac.uk/research/visiongroup/files/Leuven.zip
[Figure: left camera, right camera, object GT, disparity GT]
40 Qualitative results
[Figure: original image, object GT, object result, disparity GT, disparity alone, disparity jointly]
- Large improvement for dense stereo estimation
- Minor improvement in object class segmentation
41 Quantitative disparity results
[Plot: ratio of pixels labelled correctly to within the maximum allowed error $\delta$, as a function of $\delta$]
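The plotted quantity can be computed as below. This is a minimal sketch, assuming flattened lists of predicted and ground-truth disparities in which `None` marks pixels without a ground-truth label; the function name and that convention are hypothetical.

```python
def ratio_within_delta(predicted, ground_truth, delta):
    """Fraction of labelled pixels whose absolute disparity error is at most delta."""
    pairs = [(p, g) for p, g in zip(predicted, ground_truth) if g is not None]
    if not pairs:
        return 0.0  # no labelled pixels to evaluate
    return sum(abs(p - g) <= delta for p, g in pairs) / len(pairs)
```

Evaluating this for a range of `delta` values gives one curve per method, which is how a comparison such as disparity alone versus joint estimation can be drawn.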
42 On-going and Future Work
- Application to monocular sequences
- Making the method (close to) real time
- Application to multi-view problems
- Optical flow / motion estimation
43 Summary
- First dataset with both object class and disparity labels
- Joint estimation significantly improves disparity results
- Projected moves make inference much faster
- Questions?