Stereo and Multiview Sequence Processing - PowerPoint PPT Presentation

About This Presentation
Title:

Stereo and Multiview Sequence Processing

Description:

The disparity function is affine in the image coordinate when the surface is a plane ... the estimation of three affine parameters for each patch ... – PowerPoint PPT presentation

Slides: 27
Provided by: kaicha

Transcript and Presenter's Notes

Title: Stereo and Multiview Sequence Processing


1
Stereo and Multiview Sequence Processing
2
Outline
  • Stereopsis
  • Stereo Imaging Principle
  • Disparity Estimation
  • Intermediate View Synthesis
  • Stereo Sequence Coding

3
Stereopsis
  • Retinal disparity
  • The horizontal distance between the corresponding
    left and right image points of the superimposed
    retinal images.
  • The disparity is zero at the point on which the eyes converge.
  • Stereopsis
  • The sense of depth combined from two different
    perspective views by the mind.

4
Stereo Imaging Principle (1)
  • Arbitrary Camera Configuration

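$\mathbf{X}_l = R_l \mathbf{X} + \mathbf{T}_l, \quad \mathbf{X}_r = R_r \mathbf{X} + \mathbf{T}_r$   (12.2.1)
($R_l$, $R_r$ orthonormal)
$\mathbf{X}_r = R_{rl} \mathbf{X}_l + \mathbf{T}_{rl}$, where $R_{rl} = R_r R_l^T$ and $\mathbf{T}_{rl} = \mathbf{T}_r - R_r R_l^T \mathbf{T}_l$   (12.2.2)
Perspective projection: (12.2.3), (12.2.4) ⇒ (12.2.5)
Given $x_l$ and $x_r$ ⇒ $Z_r$ and $Z_l$ ⇒ $X_l, X_r, Y_l, Y_r$ ⇒ $(X, Y, Z)$
Figure labels: $C_w$, $C_l$, $C_r$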
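
The relative-pose relation (12.2.2) can be checked numerically. The following is a minimal sketch using plain Python lists as matrices; the helper names (`matmul`, `rot_y`, `relative_pose`, `to_camera`) and the sample poses are illustrative assumptions, not taken from the slides.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def rot_y(theta):
    """Orthonormal rotation about the Y axis (illustrative choice of pose)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def relative_pose(R_l, T_l, R_r, T_r):
    """R_rl = R_r R_l^T and T_rl = T_r - R_r R_l^T T_l, as in (12.2.2)."""
    R_rl = matmul(R_r, transpose(R_l))
    T_rl = [t - u for t, u in zip(T_r, matvec(R_rl, T_l))]
    return R_rl, T_rl

def to_camera(R, T, X):
    """Apply X_cam = R X + T."""
    return [a + t for a, t in zip(matvec(R, X), T)]

# Hypothetical poses: each camera rotated slightly about Y and shifted along X.
R_l, T_l = rot_y(0.1), [0.5, 0.0, 0.0]
R_r, T_r = rot_y(-0.1), [-0.5, 0.0, 0.0]
R_rl, T_rl = relative_pose(R_l, T_l, R_r, T_r)

X = [0.3, -0.2, 5.0]                         # a world point
X_l = to_camera(R_l, T_l, X)                 # left-camera coordinates
X_r = to_camera(R_r, T_r, X)                 # right-camera coordinates
X_r_from_left = to_camera(R_rl, T_rl, X_l)   # should agree with X_r
```

Because $R_l$ is orthonormal, its inverse is its transpose, which is why no matrix inversion is needed.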
5
Stereo Imaging Principle (2)
  • Parallel Camera Configuration

Equations (12.2.6), (12.2.7), (12.2.8), (12.2.9)
Figure panels: 3-D view; X-Z view ($Y = 0$)
6
Results of eq. 12.2.8
  • Basis for deriving depth from disparity
    information
  • The disparity value of a 3-D point (X, Y, Z) is
    independent of the X and Y coordinates, and is
    inversely proportional to the Z value.
  • The range of the disparity increases with the
    baseline B, the distance between the two cameras.
  • The disparity is always positive: $d_x > 0$
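
The properties listed on this slide can be illustrated with a short sketch. It assumes that (12.2.8) has the standard parallel-camera form $d_x = x_l - x_r = FB/Z$, with focal length $F$ and the world origin midway between the two cameras; the equation itself is not reproduced in the transcript, and the function names are hypothetical.

```python
def project_parallel(X, Y, Z, F, B):
    """Project a 3-D point into two parallel cameras separated by baseline B.

    Assumes the world origin lies midway between the camera centres.
    """
    x_l = F * (X + B / 2) / Z
    x_r = F * (X - B / 2) / Z
    y = F * Y / Z              # same vertical coordinate in both views
    return (x_l, y), (x_r, y)

def depth_from_disparity(d_x, F, B):
    """Invert d_x = F * B / Z to recover the depth Z."""
    if d_x <= 0:
        raise ValueError("disparity must be positive for parallel cameras")
    return F * B / d_x

# The disparity depends on Z only: changing X and Y leaves it unchanged.
(xl_a, _), (xr_a, _) = project_parallel(1.0, 0.5, 4.0, F=2.0, B=0.2)
(xl_b, _), (xr_b, _) = project_parallel(-3.0, 2.0, 4.0, F=2.0, B=0.2)
d_a, d_b = xl_a - xr_a, xl_b - xr_b
Z_est = depth_from_disparity(d_a, F=2.0, B=0.2)
```

Doubling `B` doubles every disparity, which is why a longer baseline widens the disparity range.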

7
Stereo Imaging Principle (3)
  • Converging Camera Configuration

Equations (12.2.10), (12.2.11)
(12.2.2) and (12.2.4) ⇒ (12.2.12)
Figure panels: 3-D view; X-Z view ($Y = 0$)
8
Stereo Imaging Principle (4)
  • Epipolar Geometry
  • Epipolar Constraint
  • For any imaged point that falls on the left
    epipolar line, its corresponding pixel in the
    right image must be on the right epipolar line
  • Fundamental matrix
  • The relation between an image point and its
    epipolar line can be characterized by a 3 by 3
    matrix, F

Figure legend: epipolar plane; $ep_l$, $ep_r$: epipolar lines; $e_l$: left epipole; $e_r$: right epipole
9
Stereo Imaging Principle (4)
  • Parallel camera
  • Epipoles are at infinity, and epipolar lines are
    parallel
  • For any given point, the left and right epipolar
    lines associated with this point are horizontal
    lines with the same y coordinate as this point
  • This simplifies the disparity estimation problem
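
A small sketch of the parallel-camera case. It assumes the common convention $\tilde{x}_r^T F \tilde{x}_l = 0$ for homogeneous image points and normalized image coordinates, under which the fundamental matrix for a purely horizontal camera displacement is, up to scale, the matrix `F_PARALLEL` below. The convention, the matrix, and the function names are assumptions, not taken from the slides.

```python
# Fundamental matrix (up to scale) for two parallel cameras displaced along x,
# assuming normalized image coordinates and the convention x_r^T F x_l = 0.
F_PARALLEL = [[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]]

def epipolar_line(F, x_l):
    """Right-image epipolar line (a, b, c), meaning a*x + b*y + c = 0."""
    xh = (x_l[0], x_l[1], 1.0)   # homogeneous coordinates
    return tuple(sum(F[i][k] * xh[k] for k in range(3)) for i in range(3))

def epipolar_residual(F, x_l, x_r):
    """Value of x_r^T F x_l; zero when the pair satisfies the constraint."""
    a, b, c = epipolar_line(F, x_l)
    return a * x_r[0] + b * x_r[1] + c

line = epipolar_line(F_PARALLEL, (3.0, 2.0))   # describes the line y = 2
```

The coefficient of `x` in the returned line is zero, so the line is horizontal and passes through the same `y` as the left point, matching the statement on this slide.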

10
Disparity Estimation (1)
  • Constraints on Disparity Distribution
  • Epipolar constraint
  • Unidirectionality with parallel cameras
  • With the parallel camera configuration, the disparity
    vector (DV) has only a horizontal component, which is
    always positive.
  • Ordering constraint
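  • Let $x_{r,1}$ and $x_{r,2}$ be two points in the right
    image on the same horizontal line.
  • $x_{r,1} < x_{r,2} \Rightarrow x_{l,1} < x_{l,2} \Rightarrow d_{x,2} > d_{x,1} + x_{r,1} - x_{r,2}$
  • $x_{l,2} > x_{l,1} \Rightarrow x_{l,2} - x_{r,2} > x_{l,1} - x_{r,2} \Rightarrow d_{x,2} > x_{l,1} - x_{r,1} + x_{r,1} - x_{r,2}$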
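
As an illustration, the unidirectionality and ordering constraints can be checked on one scan line of an estimated disparity map. This sketch assumes integer right-image columns, so that neighbouring points satisfy $x_{r,1} - x_{r,2} = -1$; the function name is hypothetical.

```python
def satisfies_constraints(disparity_row):
    """Check one scan line; disparity_row[x_r] is d_x at right-image column x_r.

    Unidirectionality: every disparity is positive.
    Ordering: for neighbouring columns, d2 > d1 + x_r1 - x_r2 = d1 - 1,
    which is the same as requiring x_l = x_r + d_x to increase strictly.
    """
    if any(d <= 0 for d in disparity_row):
        return False
    return all(d2 > d1 - 1
               for d1, d2 in zip(disparity_row, disparity_row[1:]))

ok = satisfies_constraints([3, 3, 4, 4])    # x_l = 3, 4, 6, 7: strictly increasing
bad = satisfies_constraints([3, 3, 4, 3])   # x_l = 3, 4, 6, 6: order is lost
```

Checking neighbouring pairs is enough, because a sequence is strictly increasing exactly when each consecutive pair is.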

11
Disparity Estimation (2)
  • Models for the Disparity Function
  • A simple case: the surface of the imaged scene
    is approximated by a plane.

(12.3.1)
(12.2.6) and (12.2.7) ⇒ (12.3.2), (12.3.3)
⇒ The disparity function is affine in the image
coordinates when the surface is a plane   (12.3.4)
12
Patches on which the planar condition holds
  • The image is divided into small patches such that each patch
    is approximately planar
  • The disparity estimation problem
  • ⇒ the estimation of three affine parameters for
    each patch
  • ⇒ the estimation of the disparity ($d_x$ only) at
    three corner points
  • ⇒ the estimation of the disparity at nodal points;
    the disparity function within each patch can
    then be interpolated from the nodal points using
    the affine model
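
The equivalence between three affine parameters and three corner disparities can be sketched as follows. The parameterization $d_x(x, y) = a_0 + a_1 x + a_2 y$ is an assumed form of the affine model (the transcript does not reproduce (12.3.4)), and the function names are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_affine(corners, disparities):
    """Solve a0 + a1*x + a2*y = d at three corner points (Cramer's rule)."""
    A = [[1.0, x, y] for x, y in corners]
    det = det3(A)
    if abs(det) < 1e-12:
        raise ValueError("corner points are collinear")
    params = []
    for k in range(3):
        A_k = [row[:] for row in A]
        for i in range(3):
            A_k[i][k] = disparities[i]   # replace column k by the disparities
        params.append(det3(A_k) / det)
    return tuple(params)

def affine_disparity(params, x, y):
    """Interpolate the disparity inside the patch from the affine model."""
    a0, a1, a2 = params
    return a0 + a1 * x + a2 * y

# Triangular patch with disparities known at its three corners.
params = fit_affine([(0, 0), (4, 0), (0, 4)], [2.0, 6.0, 2.0])
d_inside = affine_disparity(params, 2, 2)
```

In a mesh, neighbouring patches share corner nodes, so estimating nodal disparities yields a disparity function that is continuous across patch boundaries.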

13
Disparity Estimation (3)
  • Block-Based Approach
  • The disparity function within each block is described by a
    constant or a low-order polynomial
  • Its parameters are determined by minimizing the error between
    the two views after warping with the estimated
    disparity function
  • Solved by exhaustive or gradient-descent search,
    subject to the constraints listed on slide 10
  • The search range should be much larger than is typical for
    block-based motion estimation.
  • The constant-disparity model is appropriate only for a flat
    surface that is parallel to the image plane.
  • It is a good approximation when the block size is small.
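
A minimal sketch of exhaustive block matching with a constant disparity per block. It assumes parallel cameras, so the search is horizontal only and restricted to non-negative $d_x = x_l - x_r$, and it uses the sum of absolute differences (SAD) as the error measure; the SAD criterion, the function name, and the synthetic images are illustrative choices, not taken from the slides.

```python
def block_disparity(left, right, x0, y0, size, d_max):
    """Find the constant disparity d in [0, d_max] for one right-image block.

    The block covers columns x0..x0+size-1 and rows y0..y0+size-1 of `right`
    and is compared with the left-image block shifted by d columns.
    Returns (best_d, best_sad).
    """
    width = len(left[0])
    best_d, best_sad = 0, float("inf")
    for d in range(d_max + 1):
        if x0 + d + size > width:   # shifted block would leave the image
            break
        sad = sum(abs(right[y][x] - left[y][x + d])
                  for y in range(y0, y0 + size)
                  for x in range(x0, x0 + size))
        if sad < best_sad:
            best_d, best_sad = d, sad
    return best_d, best_sad

# Synthetic test pair: the left image is the right image shifted by 3 columns.
H, W, SHIFT = 8, 16, 3
right = [[(7 * x * x + 13 * y) % 31 for x in range(W)] for y in range(H)]
left = [[right[y][x - SHIFT] if x >= SHIFT else 0 for x in range(W)]
        for y in range(H)]
d_hat, sad = block_disparity(left, right, x0=4, y0=2, size=4, d_max=6)
```

The search here recovers the shift of 3 used to build the left image. On real images the minimum is rarely exact, and a block that straddles a depth discontinuity violates the constant-disparity assumption.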

14
Disparity Estimation (4)
  • Two-dimensional mesh-based approach

Nodal displacements are found by minimizing the
disparity-compensated prediction error between
corresponding elements, summed over the four
elements attached to each node.
With a parallel set-up, only horizontal disparities need
to be searched.
Figure labels: $X$, $x_l$, $x_r$, $B_{m,l}$, $B_{m,r}$
15
Figure panels:
  • Original left
  • Original right
  • Regular mesh on the left image
  • Corresponding mesh on the right image
  • Predicted right image by mesh (27.48 dB)
  • Predicted right image by BMA (32.03 dB)
Although its PSNR is lower, the mesh-based scheme yields a
visually more accurate prediction.
16
Disparity Estimation (5)
  • Intra-Line Edge Matching Using Dynamic
    Programming
  • The stereo matching process can be considered as
    finding a path in a graph.

Figure: matching graph spanned by the edge points of the left scan line and the edge points of the right scan line.
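
The path-finding view can be sketched with a generic dynamic-programming alignment of two scan lines. This is an illustrative formulation, not necessarily the one used in the slides: each edge point is represented by a single feature value (for example its intensity), matching two points costs their absolute feature difference, and leaving a point unmatched costs a fixed, hypothetical `occlusion_cost`.

```python
def match_scanline(left, right, occlusion_cost=5):
    """Align edge-point features of a left and a right scan line.

    Returns a list of (left_index, right_index) matches along the
    minimum-cost monotone path, which respects the ordering constraint.
    """
    n, m = len(left), len(right)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * occlusion_cost
    for j in range(1, m + 1):
        cost[0][j] = j * occlusion_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]),  # match
                cost[i - 1][j] + occlusion_cost,   # left point unmatched
                cost[i][j - 1] + occlusion_cost)   # right point unmatched
    # Trace the optimal path back from the end of both scan lines.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + occlusion_cost:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs

# The first left edge point has no counterpart in the right scan line.
pairs = match_scanline([10, 50, 90, 30], [50, 90, 30])
```

Each matched pair gives one disparity sample (the difference of the two edge positions); disparities between edge points would then have to be interpolated.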
17
Disparity Estimation (6)
  • Joint Structure and Motion Estimation
  • Modeling the surface of the imaged object with a
    3-D mesh.

The 3-D mesh projects to 2-D meshes in the left
and right images
18
Intermediate View Synthesis (1)
  • Naïve approach
  • Linear interpolation without considering
    disparity
  • Dcl is the baseline distance from the central
    to the left view
  • yielding blurred images

19
Intermediate View Synthesis (2)
  • Disparity-compensated interpolation

Suppose $d_{cl}(x)$ and $d_{cr}(x)$ are known.
In reality, only $d_{lr}(x)$ can be estimated, and it is
not easy to derive $d_{cl}(x)$ and $d_{cr}(x)$ from $d_{lr}(x)$.
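
A one-dimensional sketch of both interpolation schemes. It assumes the commonly used form $\psi_c(x) = w_l \psi_l(x + d_{cl}(x)) + w_r \psi_r(x + d_{cr}(x))$ with weights $w_l = D_{cr}/(D_{cl} + D_{cr})$ and $w_r = D_{cl}/(D_{cl} + D_{cr})$; the interpolation formulas are not reproduced in the transcript, so this form, the integer disparities, and the function name are assumptions. Setting both disparity fields to zero gives the naive linear interpolation.

```python
def interpolate_view(left, right, d_cl, d_cr, D_cl, D_cr):
    """Synthesize one scan line of the central view.

    d_cl[x], d_cr[x]: integer disparities from the central view to the left
    and right views. D_cl, D_cr: baseline distances from the central view
    to the left and right views. The nearer view gets the larger weight.
    """
    w_l = D_cr / (D_cl + D_cr)
    w_r = D_cl / (D_cl + D_cr)
    n = len(left)
    out = []
    for x in range(n):
        x_l = min(max(x + d_cl[x], 0), n - 1)   # clamp to the image border
        x_r = min(max(x + d_cr[x], 0), n - 1)
        out.append(w_l * left[x_l] + w_r * right[x_r])
    return out

# Central scan line and the two views it would produce with disparities of
# +1 (towards the left view) and -1 (towards the right view).
central = [0, 0, 10, 20, 30, 0, 0, 0]
left = [0] + central[:-1]
right = central[1:] + [0]
n = len(central)

compensated = interpolate_view(left, right, [1] * n, [-1] * n, 1.0, 1.0)
naive = interpolate_view(left, right, [0] * n, [0] * n, 1.0, 1.0)
```

With the correct disparities the central scan line is reproduced, whereas the naive average mixes misaligned samples, which is the blurring mentioned for the naive approach.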
20
Intermediate View Synthesis (3)
  • Solved if dlr(x) is estimated by the mesh-based
    approach

21
Stereo Sequence Coding (1)
  • Multiview profile of MPEG-2
  • The left view sequence $S_l$ is coded first. Each frame
    of the right view sequence is then predicted from the
    corresponding frame in $S_l$, based on an estimated
    disparity field, and the prediction error image
    is coded.

Figure: frame types. Right view: P B B B. Left view: I B B P.
22
Stereo Sequence Coding (2)
  • Incomplete 3-D representation of multiview
    sequences: augmented texture map, region
    segmentation, and disparity information for each region
  • Putting the texture maps of all the different
    regions in an augmented image.

Original left
Original right
Augmented texture
Disparity map
23
Stereo Sequence Coding (3)
  • Mixed-resolution coding
  • Based on properties of the human visual system (HVS),
    the resolution of one of the two images can be
    considerably reduced when the image is presented
    for a short time
  • One of the left and right sequences is coded at a
    high resolution, while the other is first
    down-sampled spatially and temporally, then coded

24
Stereo Sequence Coding (4)
  • 3-D object-based coding

Block diagram of the coder.
Input: left and right sequences.
Processing blocks: object segmentation; motion and structure estimation; shape and motion parameter coding; reference texture image extraction; reference texture image coding; coded view synthesis; synthesis error image coding.
Outputs: shape and motion bits; texture bits; synthesis error bits.
25
3-D object-based coding
  • Instead of deriving 2-D motion and disparity for
    motion-compensated prediction (MCP) and
    disparity-compensated prediction (DCP), 3-D structure
    and motion parameters are estimated from the stereo or
    multiple views
  • The structure, motion, and surface texture of
    each object are coded, instead of individual
    image frames
  • At the decoder, desired views are synthesized
  • Advantages
  • accurate 3-D estimation
  • with the 3-D info derived from the stereo pair,
    one can generate any intermediate view
  • the coded 3-D info enables manipulation of the
    imaged object and scene
  • Wire-framed object, nodal positions, nodal
    displacement vectors, segmentation map, I3D

26
Stereo Sequence Coding (5)
  • 3-D model-based coding
  • It is very difficult to derive the 3-D structure
    of the objects in a scene automatically
  • Building a generic model for each potential
    object
  • Only a few objects are in the scene
  • e.g., teleconferencing applications
  • Pre-designed generic face and body models can be
    used