Title: Stereo and Multiview Sequence Processing
1 Stereo and Multiview Sequence Processing
2 Outline
- Stereopsis
- Stereo Imaging Principle
- Disparity Estimation
- Intermediate View Synthesis
- Stereo Sequence Coding
3 Stereopsis
- Retinal disparity
  - The horizontal distance between the corresponding left and right image points of the superimposed retinal images.
  - The disparity is zero at the point on which the eyes converge.
- Stereopsis
  - The sense of depth that the mind constructs by combining the two different perspective views.
4 Stereo Imaging Principle (1)
- Arbitrary Camera Configuration
  Xl = Rl X + Tl,  Xr = Rr X + Tr   (12.2.1)   (Rl, Rr orthonormal)
  Xr = Rrl Xl + Trl,  where Rrl = Rr Rl^T,  Trl = Tr - Rr Rl^T Tl   (12.2.2)
- Perspective projection in each camera (12.2.3, 12.2.4) ⇒ (12.2.5)
- Given xl and xr ⇒ Zr and Zl ⇒ Xl, Xr, Yl, Yr ⇒ (X, Y, Z)
(Figure: world coordinate center Cw and camera centers Cl, Cr.)
5 Stereo Imaging Principle (2)
- Parallel Camera Configuration (baseline B along the X axis, focal length F)
  xl = F X / Z,  yl = F Y / Z   (12.2.6)
  xr = F (X - B) / Z,  yr = F Y / Z   (12.2.7)
  dy = yl - yr = 0,  dx = xl - xr = F B / Z   (12.2.8)
  ⇒ Z = F B / dx   (12.2.9)
(Figure: 3-D view and X-Z view (Y = 0) of the parallel camera geometry.)
6 Results of Eq. 12.2.8
- Basis for deriving the depth from the disparity information (a small numeric illustration follows below).
- The disparity value of a 3-D point (X, Y, Z) is independent of its X and Y coordinates and is inversely proportional to its Z value.
- The range of the disparity increases with the baseline B, the distance between the two cameras.
- ⇒ dx > 0 (the disparity is always positive with this configuration).
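As a small illustration of Eq. 12.2.8, the sketch below inverts the relation dx = F B / Z to recover depth from disparities for a parallel camera pair. The focal length, baseline, and disparity values are assumed example numbers, not taken from the slides.

```python
import numpy as np

def depth_from_disparity(d_x, F, B, eps=1e-6):
    """Invert Eq. 12.2.8 (dx = F*B/Z) to recover depth Z = F*B/dx.

    d_x : horizontal disparities in pixels (assumed > 0, parallel cameras)
    F   : focal length expressed in pixels
    B   : baseline between the two cameras (same unit as the returned Z)
    """
    d_x = np.asarray(d_x, dtype=np.float64)
    return (F * B) / np.maximum(d_x, eps)   # guard against zero disparity

# Hypothetical numbers: F = 1000 px, B = 0.1 m
print(depth_from_disparity([50.0, 25.0, 10.0], F=1000.0, B=0.1))
# -> [2., 4., 10.]: larger disparity means smaller depth, since dx is proportional to 1/Z
```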
7 Stereo Imaging Principle (3)
- Converging Camera Configuration
- The two cameras are rotated toward each other so that their optical axes converge; Eqs. 12.2.10 and 12.2.11 describe this camera geometry.
- Applying Eqs. 12.2.2 and 12.2.4 then yields the disparity relation for the converging configuration (12.2.12).
(Figure: 3-D view and X-Z view (Y = 0) of the converging camera geometry.)
8 Stereo Imaging Principle (4)
- Epipolar Geometry
- Epipolar constraint
  - For any imaged point that falls on the left epipolar line, its corresponding pixel in the right image must lie on the right epipolar line.
- Fundamental matrix
  - The relation between an image point and its epipolar line can be characterized by a 3 x 3 matrix F (see the sketch below).
(Figure: the epipolar plane, the epipolar lines epl and epr, the left epipole el, and the right epipole er.)
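To make the fundamental-matrix relation concrete, here is a minimal sketch that computes the right-image epipolar line l_r = F x_l and checks the constraint x_r^T F x_l = 0. The matrix F_parallel below assumes an ideal, calibrated parallel pair with a purely horizontal baseline (an assumption, not stated on the slide), for which F reduces, up to scale, to the skew-symmetric matrix of the baseline direction.

```python
import numpy as np

def epipolar_line(F, x_left):
    """Right-image epipolar line l_r = F @ x_left (homogeneous line a*x + b*y + c = 0)."""
    return F @ np.asarray(x_left, dtype=float)

def satisfies_epipolar_constraint(F, x_left, x_right, tol=1e-6):
    """Check the epipolar constraint x_right^T F x_left ~ 0."""
    return abs(np.asarray(x_right, float) @ F @ np.asarray(x_left, float)) < tol

# Assumed ideal parallel setup: F is the skew-symmetric matrix of the
# baseline direction (1, 0, 0), up to scale.
F_parallel = np.array([[0.0, 0.0,  0.0],
                       [0.0, 0.0, -1.0],
                       [0.0, 1.0,  0.0]])

x_l = np.array([120.0, 45.0, 1.0])            # homogeneous left image point
print(epipolar_line(F_parallel, x_l))          # (0, -1, 45): the horizontal line y = 45
print(satisfies_epipolar_constraint(F_parallel, x_l, np.array([90.0, 45.0, 1.0])))
```

The resulting epipolar line is horizontal with the same y coordinate as the left point, which is exactly the simplification discussed on the next slide.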
9 Stereo Imaging Principle (4)
- Parallel camera configuration
- The epipoles are at infinity, and the epipolar lines are parallel.
- For any given point, the left and right epipolar lines associated with this point are horizontal lines with the same y coordinate as the point.
- This simplifies the disparity estimation problem to a one-dimensional search along the scan line.
10 Disparity Estimation (1)
- Constraints on Disparity Distribution
- Epipolar constraint
- Unidirectionality with parallel cameras
  - With the parallel camera configuration, the DV has only a horizontal component and is always positive.
- Ordering constraint
  - Let xr,1 and xr,2 be two points in the right image on the same horizontal line.
  - xr,1 < xr,2 ⇒ xl,1 < xl,2 ⇒ dx,2 > dx,1 + (xr,1 - xr,2),
    since xl,2 > xl,1 ⇒ xl,2 - xr,2 > xl,1 - xr,2 ⇒ dx,2 > (xl,1 - xr,1) + (xr,1 - xr,2).
  - (A simple check of these constraints is sketched below.)
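A minimal sketch of checking the unidirectionality and ordering constraints for a set of candidate correspondences on one scan line; the coordinate pairs are assumed example values.

```python
def check_disparity_constraints(matches):
    """matches: list of (x_left, x_right) pairs on the same scan line, sorted
    by x_right.  Returns True if every disparity is positive
    (unidirectionality) and the ordering constraint is respected."""
    for i, (xl, xr) in enumerate(matches):
        if xl - xr <= 0:                        # dx = xl - xr must be > 0
            return False
        if i > 0:
            xl_prev, xr_prev = matches[i - 1]
            if not (xr_prev < xr and xl_prev < xl):   # order must be preserved
                return False
    return True

# Hypothetical matches (x_left, x_right) with disparities 12, 10, 15
print(check_disparity_constraints([(52, 40), (75, 65), (95, 80)]))   # True
```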
11 Disparity Estimation (2)
- Models for the Disparity Function
- A simple case: the surface of the imaged scene is approximated by a plane,
  a X + b Y + c Z = 1   (12.3.1)
- Dividing by Z and using the projection equations 12.2.6 and 12.2.7 (xl = F X / Z, yl = F Y / Z):
  1 / Z = (a xl + b yl) / F + c   (12.3.2, 12.3.3)
- Substituting into dx = F B / Z (12.2.8):
  dx(xl, yl) = a B xl + b B yl + c F B   (12.3.4)
- ⇒ The disparity function is affine in the image coordinates when the surface is a plane.
12 Patch-Based Model (the planar condition holds within each patch)
- The image is divided into small patches such that each patch is approximately planar.
- The disparity estimation problem then becomes:
  ⇒ the estimation of three affine parameters for each patch,
  ⇒ or, equivalently, the estimation of the disparity (dx only) at the three corner points,
  ⇒ i.e., the estimation of the disparity at nodal points; the disparity function within each patch can then be interpolated from the nodal points using the affine model (see the sketch below).
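A minimal sketch of the patch-wise interpolation, assuming a triangular patch: the three corner disparities determine the affine parameters, and the disparity can then be evaluated anywhere inside the patch. The corner coordinates and disparities are assumed example values.

```python
import numpy as np

def affine_disparity_from_corners(corners, corner_disp):
    """Fit d(x, y) = p0*x + p1*y + p2 to the disparities at the three
    corners of an (approximately planar) triangular patch."""
    corners = np.asarray(corners, dtype=float)        # shape (3, 2): (x, y)
    A = np.column_stack([corners, np.ones(3)])        # rows [x, y, 1]
    return np.linalg.solve(A, np.asarray(corner_disp, dtype=float))

def disparity_at(p, x, y):
    """Evaluate the affine disparity model at an interior point."""
    return p[0] * x + p[1] * y + p[2]

# Hypothetical triangular patch and nodal disparities
corners = [(10, 10), (60, 12), (30, 55)]
corner_disp = [8.0, 5.5, 7.0]
p = affine_disparity_from_corners(corners, corner_disp)
print(disparity_at(p, 35, 25))   # interpolated disparity inside the patch
```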
13 Disparity Estimation (3)
- Block-Based Approach
- Within each block, the disparity function is described by a constant or a low-order polynomial.
- The parameters are determined by minimizing the error between the two views after warping, based on the estimated disparity function.
- Solved by exhaustive or gradient-descent search, subject to the constraints listed on Page 10 (see the sketch below).
- The search range needs to be much larger than in motion estimation.
- The constant-disparity model is only appropriate for a flat surface that is parallel to the image plane; it is a good approximation when the block size is small.
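A minimal sketch of the exhaustive block-based search for a rectified (parallel-camera) pair, with a constant disparity per block and the unidirectionality constraint dx >= 0. Block size, search range, and the use of SAD as the matching error are assumed example choices; the images are assumed to be grayscale arrays.

```python
import numpy as np

def block_disparity_map(left, right, block=8, max_disp=64):
    """Exhaustive 1-D disparity search for a rectified stereo pair.
    For each block of the left image, find the horizontal shift dx >= 0
    that minimizes the SAD against the right image."""
    h, w = left.shape
    dmap = np.zeros((h // block, w // block))
    for by in range(h // block):
        for bx in range(w // block):
            y0, x0 = by * block, bx * block
            blk = left[y0:y0 + block, x0:x0 + block].astype(np.float64)
            best_d, best_sad = 0, np.inf
            for d in range(0, min(max_disp, x0) + 1):       # dx = xl - xr >= 0
                cand = right[y0:y0 + block, x0 - d:x0 - d + block].astype(np.float64)
                sad = np.abs(blk - cand).sum()
                if sad < best_sad:
                    best_sad, best_d = sad, d
            dmap[by, bx] = best_d
    return dmap

# Usage with a synthetic test pair (hypothetical):
# L = np.random.rand(64, 64)
# R = np.roll(L, -4, axis=1)          # right view shifted so that dx = 4
# print(block_disparity_map(L, R, block=8, max_disp=16))
```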
14 Disparity Estimation (4)
- Two-Dimensional Mesh-Based Approach
- Nodal displacements are found by minimizing the disparity-compensated prediction error between corresponding elements, summed over the FOUR elements attached to the node (see the sketch after this slide).
- With the parallel set-up, only horizontal disparities must be searched.
(Figure: a 3-D point X projects to xl and xr inside the corresponding mesh elements Bm,l and Bm,r.)
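The sketch below is a much-simplified version of the mesh-based idea, assuming a parallel set-up: nodal horizontal disparities live on a regular grid, the dense disparity field is obtained by bilinear interpolation of the nodal values, and each node is refined by minimizing the prediction error over the region covered by the four elements attached to it. The node spacing, integer search range, and coordinate-descent refinement are simplifying assumptions, not the book's exact algorithm; images are assumed grayscale float arrays.

```python
import numpy as np

def interp_nodal_disparity(nodal_d, spacing, shape):
    """Bilinearly interpolate a grid of nodal horizontal disparities to a
    dense per-pixel disparity field of the given image shape."""
    h, w = shape
    ys = np.arange(h) / spacing
    xs = np.arange(w) / spacing
    y0 = np.clip(ys.astype(int), 0, nodal_d.shape[0] - 2)
    x0 = np.clip(xs.astype(int), 0, nodal_d.shape[1] - 2)
    fy = (ys - y0)[:, None]
    fx = (xs - x0)[None, :]
    d00 = nodal_d[y0][:, x0]
    d01 = nodal_d[y0][:, x0 + 1]
    d10 = nodal_d[y0 + 1][:, x0]
    d11 = nodal_d[y0 + 1][:, x0 + 1]
    return ((1 - fy) * (1 - fx) * d00 + (1 - fy) * fx * d01
            + fy * (1 - fx) * d10 + fy * fx * d11)

def predict_right(left, nodal_d, spacing):
    """Predict the right view by sampling the left view at x + d(x, y),
    using dx = xl - xr and only horizontal disparities (parallel set-up)."""
    h, w = left.shape
    d = interp_nodal_disparity(nodal_d, spacing, (h, w))
    xs = np.clip(np.arange(w)[None, :] + np.rint(d).astype(int), 0, w - 1)
    return left[np.arange(h)[:, None], xs]

def refine_node(left, right, nodal_d, spacing, i, j, search=4):
    """One coordinate-descent step: try integer offsets for a single nodal
    disparity and keep the one minimizing the prediction error over the
    four elements attached to node (i, j)."""
    base = nodal_d[i, j]
    y0 = max((i - 1) * spacing, 0)
    y1 = min((i + 1) * spacing, left.shape[0])
    x0 = max((j - 1) * spacing, 0)
    x1 = min((j + 1) * spacing, left.shape[1])
    best_err, best_off = np.inf, 0
    for off in range(-search, search + 1):
        nodal_d[i, j] = base + off
        err = np.abs(predict_right(left, nodal_d, spacing)[y0:y1, x0:x1]
                     - right[y0:y1, x0:x1]).sum()
        if err < best_err:
            best_err, best_off = err, off
    nodal_d[i, j] = base + best_off
```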
15 Example: Mesh-Based vs. Block-Based Prediction
(Figures: original left and right images; regular mesh on the left image and the corresponding mesh on the right image; right image predicted by the mesh-based scheme (27.48 dB) and by BMA (32.03 dB).)
- Although BMA attains the higher PSNR here, the mesh-based scheme yields a visually more accurate prediction.
16 Disparity Estimation (5)
- Intra-Line Edge Matching Using Dynamic Programming
- The stereo matching process can be considered as finding a path in a graph whose axes are the edge points of the left scan line and the edge points of the right scan line (see the sketch below).
(Figure: matching path through the grid of left and right scan-line edge points.)
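A minimal dynamic-programming sketch of this path-finding view: states are (left edge index, right edge index), a monotonic path enforces the ordering constraint, and skipped edge points model occlusions. The match cost used here (absolute difference of the edge x-positions) and the skip penalty are stand-ins; a real system would use an intensity- or feature-based dissimilarity between the edge points.

```python
import numpy as np

def dp_scanline_match(left_edges, right_edges, skip_cost=10.0):
    """Match edge points of one left scan line to one right scan line by
    dynamic programming: minimum-cost monotonic path through the
    (left index, right index) grid."""
    L, R = len(left_edges), len(right_edges)
    cost = np.full((L + 1, R + 1), np.inf)
    cost[0, 0] = 0.0
    back = {}
    for i in range(L + 1):
        for j in range(R + 1):
            if i > 0 and j > 0:                  # match left i-1 with right j-1
                c = cost[i - 1, j - 1] + abs(left_edges[i - 1] - right_edges[j - 1])
                if c < cost[i, j]:
                    cost[i, j], back[(i, j)] = c, (i - 1, j - 1)
            if i > 0 and cost[i - 1, j] + skip_cost < cost[i, j]:   # skip left point
                cost[i, j], back[(i, j)] = cost[i - 1, j] + skip_cost, (i - 1, j)
            if j > 0 and cost[i, j - 1] + skip_cost < cost[i, j]:   # skip right point
                cost[i, j], back[(i, j)] = cost[i, j - 1] + skip_cost, (i, j - 1)
    matches, node = [], (L, R)                   # trace the optimal path back
    while node != (0, 0):
        prev = back[node]
        if prev == (node[0] - 1, node[1] - 1):
            matches.append((node[0] - 1, node[1] - 1))
        node = prev
    return matches[::-1], cost[L, R]

# Hypothetical edge x-positions on one left and one right scan line
print(dp_scanline_match([12, 40, 77, 90], [5, 33, 70]))
```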
17 Disparity Estimation (6)
- Joint Structure and Motion Estimation
- The surface of the imaged object is modeled with a 3-D mesh.
- The 3-D mesh projects to 2-D meshes in the left and right images (see the sketch below).
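A minimal sketch of the projection step, assuming the parallel configuration of Eqs. 12.2.6 and 12.2.7: each 3-D mesh node is projected into the left and right images, and the resulting 2-D node positions differ only by the horizontal disparity F B / Z. The focal length, baseline, and node coordinates are assumed example values.

```python
import numpy as np

def project_mesh_nodes(nodes_3d, F=1000.0, B=0.1):
    """Project 3-D mesh nodes into the left and right images of a parallel
    camera pair (Eqs. 12.2.6-12.2.7).  F and B are assumed example values."""
    X, Y, Z = np.asarray(nodes_3d, dtype=float).T
    left = np.stack([F * X / Z, F * Y / Z], axis=1)
    right = np.stack([F * (X - B) / Z, F * Y / Z], axis=1)
    return left, right

nodes = [(0.0, 0.0, 2.0), (0.2, 0.1, 2.5), (-0.1, 0.3, 3.0)]   # hypothetical
xl, xr = project_mesh_nodes(nodes)
print(xl - xr)   # horizontal disparity F*B/Z per node, zero vertical disparity
```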
18 Intermediate View Synthesis (1)
- Naïve approach: linear interpolation without considering disparity,
  Ic(x, y) = (1 - Dcl / B) Il(x, y) + (Dcl / B) Ir(x, y),
  where Dcl is the baseline distance from the central (intermediate) view to the left view and B is the total baseline.
- Because the disparity is ignored, this yields blurred images.
19 Intermediate View Synthesis (2)
- Disparity-compensated interpolation
  - The intermediate view Ic(x) is interpolated from the disparity-compensated pixels in the left and right views, using the disparities dcl(x) and dcr(x) from the central view to the left and right views.
  - This supposes that dcl(x) and dcr(x) are known.
  - In reality, only dlr(x), the disparity between the left and right views, can be estimated, and it is not easy to generate dcl(x) and dcr(x) from dlr(x).
20 Intermediate View Synthesis (3)
- The problem is solved if dlr(x) is estimated by the mesh-based approach: the nodal correspondences between the left and right meshes can be interpolated linearly to place the mesh nodes in the intermediate view, and the texture within each element is warped accordingly (see the sketch below).
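A minimal sketch of disparity-compensated interpolation for a rectified pair, assuming a dense left-to-right disparity map dlr is available (for example, interpolated from the mesh nodes as above). Each left pixel is forward-warped to the intermediate position and blended with its disparity-compensated right sample; the integer rounding, hole handling, and blending weight lam = Dcl / B are simplifications.

```python
import numpy as np

def synthesize_intermediate(left, right, d_lr, lam):
    """Synthesize a view at fractional baseline position lam in [0, 1]
    (0 = left view, 1 = right view) from a rectified pair and a dense
    disparity map d_lr[y, x] = xl - xr defined on the left image."""
    h, w = left.shape
    out = np.zeros((h, w))
    weight = np.zeros((h, w))
    ys = np.arange(h)[:, None]
    xs = np.arange(w)[None, :]
    # a left pixel at xl maps to xc = xl - lam * dx in the intermediate view
    xc = np.clip(np.rint(xs - lam * d_lr).astype(int), 0, w - 1)
    xr = np.clip(np.rint(xs - d_lr).astype(int), 0, w - 1)
    # blend the disparity-compensated left and right samples
    vals = (1.0 - lam) * left + lam * right[ys, xr]
    rows = np.broadcast_to(ys, (h, w))
    np.add.at(out, (rows, xc), vals)
    np.add.at(weight, (rows, xc), 1.0)
    return out / np.maximum(weight, 1.0)   # unfilled pixels stay 0 (holes)
```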
21 Stereo Sequence Coding (1)
- Multiview profile of MPEG-2
- The left-view sequence Sl is coded first. Each frame of the right-view sequence is then predicted from the corresponding frame in Sl based on an estimated disparity field, and the disparity field and prediction-error image are coded.
(Prediction structure: left-view frames coded as I, B, B, P; right-view frames coded as P, B, B, B, each predicted from the corresponding left-view frame.)
22 Stereo Sequence Coding (2)
- Incomplete 3-D representation of multiview sequences: an augmented texture map, a region segmentation, and disparity information for each region.
- The texture maps of all the different regions are put into one augmented image.
(Example figures: original left and right images, augmented texture map, and disparity map.)
23 Stereo Sequence Coding (3)
- Mixed-resolution coding
- Based on properties of the HVS, the resolution of one of the two images can be considerably reduced when the image is presented for a short time.
- One of the left and right sequences is coded at high resolution, while the other is first down-sampled spatially and temporally and then coded (see the sketch below).
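A minimal sketch of the asymmetric pre-processing, assuming down-sampling factors of 2 and sequences stored as (frames, height, width) arrays; a real encoder would low-pass filter before decimating and would up-sample the reduced view again at the decoder.

```python
import numpy as np

def mixed_resolution_split(full_view, other_view, spatial=2, temporal=2):
    """Keep one view at full resolution; decimate the other spatially and
    temporally before coding (mixed-resolution stereo coding)."""
    reduced = other_view[::temporal, ::spatial, ::spatial]
    return full_view, reduced

left = np.zeros((30, 288, 352))       # hypothetical sequence dimensions
right = np.zeros((30, 288, 352))
full, reduced = mixed_resolution_split(left, right)
print(reduced.shape)                  # (15, 144, 176)
```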
24 Stereo Sequence Coding (4)
(Block diagram of the object-based coder, operating on the left and right sequences:
- object segmentation and motion and structure estimation ⇒ shape and motion parameter coding ⇒ shape and motion bits
- reference texture image extraction ⇒ reference texture image coding ⇒ texture bits
- coded view synthesis ⇒ synthesis error image coding ⇒ synthesis error bits)
25 3-D Object-Based Coding
- Instead of deriving 2-D motion and disparity for performing MCP and DCP, 3-D structure and motion parameters are estimated from the stereo or multiple views.
- The structure, motion, and surface texture of each object are coded, instead of individual image frames.
- At the decoder, the desired views are synthesized.
- Advantages
  - accurate 3-D estimation
  - with the 3-D information derived from the stereo pair, one can generate any intermediate view
  - the coded 3-D information enables manipulation of the imaged object and scene
- Coded data: wire-frame object, nodal positions, nodal displacement vectors, segmentation map, I3D.
26 Stereo Sequence Coding (5)
- 3-D model-based coding
- It is very difficult to derive the 3-D structure of the objects in a scene automatically.
- Instead, a generic model is built for each potential object.
- This is practical when only a few object types appear in the scene, e.g. teleconferencing applications, where pre-designed generic face and body models can be used.