Title: Stereo Vision
1. Stereo Vision
- CS485/685 Computer Vision
- Prof. Bebis
2. What is the goal of stereo vision?
- The recovery of the 3D structure of a scene using two or more images of the 3D scene, each acquired from a different viewpoint in space.
- The term binocular vision is used when two cameras are employed.
3. Stereo setup
Cameras in arbitrary position and orientation
More convenient setup
4. Stereo terminology
- Fixation point: the point of intersection of the optical axes.
- Baseline: the distance between the centers of projection.
- Conjugate pair: any point in the scene that is visible in both cameras will be projected to a pair of image points in the two images (corresponding points).
5. Stereo terminology (cont'd)
- Disparity: the distance between corresponding points when the two images are superimposed.
- Disparity map: the disparities of all points form the disparity map.
6. Triangulation
- Determines the position of a point in space by finding the intersection of the two lines, each passing through a center of projection and the projection of the point in the corresponding image.
7. The two problems of stereo
- (1) The correspondence problem.
- (2) The reconstruction problem.
8. The correspondence problem
- Finding pairs of matched points such that each
point in the pair is the projection of the same
3D point.
9. The correspondence problem (cont'd)
- Triangulation depends crucially on the solution of the correspondence problem!
- Ambiguous correspondences may lead to several different consistent interpretations of the scene.
10. The reconstruction problem
- Given the corresponding points, we can compute the disparity map.
- The disparity map can be converted to a 3D map of the scene assuming that the stereo geometry is known.
11. Recovering depth (i.e., reconstruction)
- Consider recovering the position of P from its projections pl and pr.
- Pl: P in left camera coordinates, Pl = (Xl, Yl, Zl)
- Pr: P in right camera coordinates, Pr = (Xr, Yr, Zr)
12. Recovering depth (cont'd)
- Suppose that the two cameras are related by the following transformation: Pr = R(Pl - T) (i.e., (R, -T) aligns the right camera with the left camera).
- Using Zr = Zl = Z and Xr = Xl - T we have

  xl = f Xl / Z,   xr = f Xr / Z   =>   Z = f T / d

- where d = xl - xr is the disparity (i.e., the difference in the position between the corresponding points in the two images).
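The depth-disparity relation on this slide (Z = fT/d) can be sketched numerically; the focal length, baseline, and image coordinates below are made-up illustrative values:

```python
# A minimal sketch of depth-from-disparity, Z = f*T/d, for the simple
# setup above; f, T, xl, xr are made-up illustrative values.
def depth_from_disparity(f, T, xl, xr):
    """f: focal length, T: baseline, (xl, xr): x-coordinates of a conjugate pair."""
    d = xl - xr          # disparity
    return f * T / d

# f = 700 (pixels), baseline T = 0.1 (m), disparity d = 14 pixels:
Z = depth_from_disparity(700.0, 0.1, 50.0, 36.0)   # Z is about 5.0 (m)
```

Note that depth is inversely proportional to disparity: nearby points have large disparities, distant points have small ones.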
13. Stereo camera parameters
- Extrinsic parameters: (R, T) describe the relative position and orientation of the two cameras.
- Intrinsic parameters: characterize the transformation from image plane coordinates to pixel coordinates, in each camera.
14. Stereo camera parameters (cont'd)
- R and T can be determined from the extrinsic parameters of each camera:
- (Tl, Rl) aligns the left camera with the world; (Tr, Rr) aligns the right camera with the world.
(see homework)
15. Correspondence problem
- Some points in each image will have no corresponding points in the other image (e.g., the cameras might have different fields of view).
- A stereo system must be able to determine the image parts that should not be matched.
16. Correspondence problem (cont'd)
- Two main approaches:
- Intensity-based: attempt to establish a correspondence by matching image intensities.
- Feature-based: attempt to establish a correspondence by matching sparse sets of image features.
17. Intensity-based Methods
- Match image sub-windows between the two images
(e.g., using correlation).
18. Correlation-based Methods
19. Correlation-based Methods (cont'd)
- Normalized cross-correlation: normalize c(d) by subtracting the mean and dividing by the standard deviation.
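Window matching with normalized cross-correlation can be sketched as follows; the images, window size, and disparity range below are synthetic, made-up values (a minimal illustration, not a production matcher):

```python
import numpy as np

def ncc(w1, w2):
    """Normalized cross-correlation of two equal-size windows."""
    a = w1 - w1.mean()
    b = w2 - w2.mean()
    return float((a * b).sum() / (a.std() * b.std() * a.size))

def match_along_row(left, right, y, x, half, max_disp):
    """Find the disparity d maximizing NCC between the window at (y, x)
    in the left image and the window at (y, x - d) in the right image."""
    wl = left[y-half:y+half+1, x-half:x+half+1]
    best_d, best_c = 0, -np.inf
    for d in range(0, max_disp + 1):
        xr = x - d
        if xr - half < 0:           # window would fall off the image
            break
        wr = right[y-half:y+half+1, xr-half:xr+half+1]
        c = ncc(wl, wr)
        if c > best_c:
            best_c, best_d = c, d
    return best_d

# Synthetic check: the right image is the left image shifted 3 pixels left
rng = np.random.default_rng(0)
left = rng.random((20, 40))
right = np.roll(left, -3, axis=1)
d = match_along_row(left, right, y=10, x=20, half=3, max_disp=8)   # -> 3
```

The search here already assumes the epipolar constraint discussed later: candidate matches lie on the same row.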
20. Correlation-based Methods (cont'd)
- The success of correlation-based methods depends
on whether the image window in one image exhibits
a distinctive structure that occurs infrequently
in the search region of the other image.
21. Correlation-based Methods (cont'd)
- How to choose the window size W?
- Too small a window may not capture enough image structure and may be too sensitive to noise (i.e., less distinctive, many false matches).
- Too large a window makes matching less sensitive to noise (desired) but also harder to match.
22. Correlation-based Methods (cont'd)
- An adaptive search window has been proposed in the literature (i.e., adapt both shape and size):
T. Kanade and M. Okutomi, "A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 16, No. 9, 1994.
23. Correlation-based Methods (cont'd)
- How to choose the size and location of R(pl)?
- If the distance of the fixation point from the cameras is much larger than the baseline, the location of R(pl) can be chosen to be the same as the location of pl.
- The size of R(pl) can be estimated from the maximum range of disparities (or depths) we expect to find in the scene.
24. Wide Baseline Stereo Matching
- More powerful methods are needed when dealing with wide baseline stereo images (i.e., affine invariant region matching):
T. Tuytelaars and L. Van Gool, "Wide Baseline Stereo Matching based on Local, Affinely Invariant Regions", British Machine Vision Conference, 2000.
25. Feature-based Methods
- Look for a feature in one image that matches a feature in the other, e.g.:
- Edge points
- Line segments
- Corners
- A set of geometric features is used for matching.
26. Feature-based Methods (cont'd)
- For example, a line feature descriptor could contain:
- the length, l
- the orientation, θ
- the coordinates of the midpoint, m
- the average intensity along the line, i
- Similarity measures are based on matching feature descriptors; e.g., line matching can be done by maximizing

  S = 1 / (w0(ll - lr)^2 + w1(θl - θr)^2 + w2(ml - mr)^2 + w3(il - ir)^2)

- where w0, ..., w3 are weights.
27. Intensity-based vs feature-based approaches
- Intensity-based methods
- Provide a dense disparity map.
- Need textured images to work well.
- Sensitive to illumination changes.
- Feature-based methods
- Faster than correlation-based methods.
- Provide sparse disparity maps.
- Relatively insensitive to illumination changes.
28. Structured lighting
- Feature-based methods cannot be used when objects have smooth surfaces or surfaces of uniform intensity.
- Patterns of light can be projected onto the surface of objects, creating interesting points even in regions which would otherwise be smooth.
29. Search Region R(pl)
- Finding corresponding points is time consuming!
- Can we reduce the size of the search region? YES
30. Epipolar Plane and Lines
- Epipolar plane: the plane passing through the centers of projection and the point in the scene.
- Epipolar line: the intersection of the epipolar plane with the image plane.
31. Epipoles
- Left epipole: the projection of Or on the left image plane.
- Right epipole: the projection of Ol on the right image plane.
32. Epipoles (cont'd)
All epipolar lines go through the camera's epipole.
33. Epipoles (cont'd)
If the line through the centers of projection is parallel to one of the image planes, the corresponding epipole is at infinity.
two epipoles at infinity
one epipole at infinity
34. Epipolar constraint
- Given pl, P can lie anywhere on the ray from Ol through pl.
- The image of this ray in the right image is the epipolar line through the corresponding point pr.
Mapping between points in the left image and lines in the right image, and vice versa.
35. Importance of epipolar constraint
- The search for correspondences is reduced to 1D.
- Very effective for rejecting false matches due to
occlusion.
(figure: the search region R(pl) for a point pl in the left image Il reduces to an epipolar line in the right image Ir)
36. Epipolar Lines Example
37. Ordering of corresponding points
- Conjugate points along corresponding epipolar lines have the same order in each image.
- Exception: when corresponding points lie on the same epipolar plane and are imaged from different sides.
38. Estimating the epipolar geometry
- Given a point in one image, how do we find its epipolar line in the other image?
- We need to estimate the epipolar geometry, which is encoded by two matrices:
- the essential matrix
- the fundamental matrix
39. Quick Review
- Cross product
- Homogeneous representation of lines
40. Vector Cross Product
- a × b = (a2 b3 - a3 b2, a3 b1 - a1 b3, a1 b2 - a2 b1)
- The cross product can be written as a matrix multiplication, a × b = S b, where S is the skew-symmetric matrix

  S = [  0  -a3   a2
        a3    0  -a1
       -a2   a1    0 ]
41. Homogeneous (projective) representation of lines
- A line ax + by + c = 0 is represented by the homogeneous vector l = (a, b, c)^T (projective line).
- Any nonzero scalar multiple k(a, b, c)^T represents the same line.
42. Homogeneous (projective) representation of lines
- Duality: in homogeneous (projective) coordinates, points and lines are dual (we can interchange their roles).
- Some properties involving points and lines:
- (1) The point x lies on the line l iff x^T l = 0
- (2) Two lines l, m define a point p: p = l × m
- (3) Two points p, q define a line l: l = p × q
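The duality properties above can be illustrated numerically; the points and lines below are made-up values:

```python
import numpy as np

# A small numeric illustration (values made up) of the duality properties:
# two points define a line, and two lines define a point, both via the
# cross product of their homogeneous vectors.
p = np.array([1.0, 2.0, 1.0])             # the point (1, 2)
q = np.array([3.0, 4.0, 1.0])             # the point (3, 4)
l = np.cross(p, q)                        # line through p and q
on_line = (p @ l, q @ l)                  # -> (0.0, 0.0): both points lie on l

m = np.cross(np.array([0.0, 1.0, -2.0]),  # the line y = 2
             np.array([1.0, 0.0, -1.0]))  # the line x = 1
x = m / m[2]                              # their intersection -> (1, 2, 1)
```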
43. Homogeneous (projective) representation of lines (Quick Review)
44. Homogeneous (projective) representation of lines
45. Homogeneous (projective) representation of lines
- For parallel lines, the intersection point has w = 0; therefore, they intersect at infinity!
46. The essential matrix E
- The equation of the epipolar plane is given by the following co-planarity condition (assume that the world coordinate system is aligned with the left camera):

  (Pl - T)^T (T × Pl) = 0

- (T × Pl is the normal to the epipolar plane)
47. The essential matrix E (cont'd)
- Using Pr = R(Pl - T) we have

  (R^T Pr)^T (T × Pl) = 0

- Expressing the cross product as matrix multiplication, T × Pl = S Pl, we have

  Pr^T R S Pl = 0,  i.e.,  Pr^T E Pl = 0

- where E = RS is called the essential matrix.
(the matrix S is always rank deficient, i.e., rank(S) = 2)
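The derivation above can be checked numerically: with S the skew-symmetric matrix of T and E = RS, any conjugate pair satisfies Pr^T E Pl = 0. The rotation, baseline, and 3D point below are made-up values:

```python
import numpy as np

# Numerical check (made-up geometry) of the co-planarity condition above:
# S @ P equals np.cross(T, P), E = R @ S, and Pr^T E Pl = 0.
def skew(t):
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

a = 0.3                                  # arbitrary rotation about z
R = np.array([[np.cos(a), -np.sin(a), 0],
              [np.sin(a),  np.cos(a), 0],
              [0, 0, 1]])
T = np.array([0.5, 0.1, 0.0])            # arbitrary baseline

E = R @ skew(T)                          # essential matrix E = R S
Pl = np.array([0.2, -0.4, 3.0])          # a 3D point, left camera frame
Pr = R @ (Pl - T)                        # the same point, right camera frame

residual = Pr @ E @ Pl                   # -> 0 (up to round-off)
```

The identity holds for any choice of R, T, and Pl, since (Pl - T), T, and Pl are coplanar by construction.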
48. The essential matrix E (cont'd)
- Using image plane coordinates pl = f Pl / Zl and pr = f Pr / Zr (Zl, Zr ≠ 0), the condition becomes

  pr^T E pl = 0

- The equation defines a mapping between points and epipolar lines (in homogeneous coordinates).
We will revisit this later.
49. The essential matrix E (cont'd)
- Properties of the essential matrix E:
- (1) Encodes the extrinsic parameters only
- (2) Has rank 2
- (3) Its two nonzero singular values are equal
50. The fundamental matrix F
- Suppose that Ml and Mr are the matrices of the intrinsic parameters of the left and right camera; then the pixel coordinates p̄l and p̄r of pl and pr are

  p̄l = Ml pl,   p̄r = Mr pr

- Using the above equations and pr^T E pl = 0 we have

  p̄r^T F p̄l = 0

- where F = Mr^-T E Ml^-1 is called the fundamental matrix.
51. The fundamental matrix F (cont'd)
- Properties of the fundamental matrix:
- (1) Encodes both the extrinsic and intrinsic parameters
- (2) Has rank 2
52. Estimating F (or E): the eight-point algorithm
- We can estimate the fundamental matrix from point correspondences only (i.e., without any information on the extrinsic or intrinsic camera parameters).
- Each correspondence leads to a homogeneous equation of the form

  p̄r^T F p̄l = 0

which is linear in the nine entries of F.
53. Estimating F (or E): the eight-point algorithm (cont'd)
- We can determine the entries of the matrix F (up to an unknown scale factor) by establishing at least 8 correspondences, stacking one equation per correspondence into a homogeneous system A f = 0.
- A is rank deficient (i.e., rank(A) = 8).
- The solution is unique up to a scale factor (i.e., proportional to the last column of V, where A = U D V^T).
54. Estimating F (or E): the eight-point algorithm (cont'd)
- The estimated F will not, in general, satisfy the rank-2 constraint; enforce it by computing the SVD of F and zeroing its smallest singular value (i.e., dropping the term corresponding to the smallest singular value).
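The procedure above can be sketched compactly; the correspondences below are synthesized from a made-up (R, T), and Hartley's normalization step is omitted for brevity:

```python
import numpy as np

# Sketch of the eight-point algorithm: one linear equation per
# correspondence, null vector of the stacked system A via SVD, then the
# rank-2 constraint enforced by zeroing the smallest singular value.
def eight_point(pts_l, pts_r):
    A = np.array([np.outer(pr, pl).ravel() for pl, pr in zip(pts_l, pts_r)])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)               # null vector of A, up to scale
    U, D, Vt2 = np.linalg.svd(F)
    D[2] = 0.0                             # enforce rank(F) = 2
    return U @ np.diag(D) @ Vt2

# synthetic noise-free correspondences from a made-up (R, T)
rng = np.random.default_rng(1)
a = 0.2
R = np.array([[np.cos(a), 0, np.sin(a)],
              [0, 1, 0],
              [-np.sin(a), 0, np.cos(a)]])
T = np.array([1.0, 0.0, 0.0])
P = rng.uniform([-1, -1, 4], [1, 1, 8], size=(10, 3))  # points, left frame
pts_l = P / P[:, 2:3]                                  # homogeneous image points
Q = (P - T) @ R.T                                      # Pr = R(Pl - T)
pts_r = Q / Q[:, 2:3]

F_hat = eight_point(pts_l, pts_r)
residuals = [abs(pr @ F_hat @ pl) for pl, pr in zip(pts_l, pts_r)]  # all ~0
```

With noisy correspondences, the normalization of slide 56 becomes important for numerical stability.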
55. Estimating F (or E): the eight-point algorithm (cont'd)
(figure: epipolar lines from the uncorrected F vs. the corrected F)
Epipolar lines must intersect at the epipole!
56. Estimating F (or E): the eight-point algorithm (cont'd)
- Normalization: the numerical stability of the eight-point algorithm improves if the image points are first translated and scaled so that they are centered at the origin with a small average norm:
R. Hartley, "In Defense of the Eight-Point Algorithm", IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(6):580-593, 1997.
57. Finding the epipolar lines
- The equation p̄r^T F p̄l = 0 defines a mapping between points and epipolar lines: F p̄l is the epipolar line in the right image corresponding to p̄l, and F^T p̄r is the epipolar line in the left image corresponding to p̄r.
- Reminder: a point x lies on a line l iff x^T l = 0.
58. Finding the epipolar lines (cont'd)
59. Locating the epipoles from F (or E)
- Every epipolar line passes through the epipole, so the epipoles are the null vectors of F: F ēl = 0 and ēr^T F = 0.
60. Locating the epipoles from F (or E) (cont'd)
The solution is proportional to the last column of U in the SVD decomposition F = U D V^T.
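Both epipoles can be read off one SVD; the F below corresponds to a made-up pure translation T = (0.5, 0, 1) with R = I:

```python
import numpy as np

# Sketch of locating the epipoles from F via the SVD F = U D V^T: the left
# epipole is the right null vector (last column of V, F @ el ~ 0) and the
# right epipole is the left null vector (last column of U, er @ F ~ 0).
def epipoles(F):
    U, _, Vt = np.linalg.svd(F)
    el = Vt[-1]                   # F @ el ~ 0
    er = U[:, -1]                 # er @ F ~ 0
    return el / el[2], er / er[2]

# made-up example: E = skew(T) for T = (0.5, 0, 1), R = I
F = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, -0.5],
              [0.0, 0.5, 0.0]])
el, er = epipoles(F)              # both -> (0.5, 0, 1)
```

Normalizing by the third coordinate fails when the epipole is at infinity (w = 0), the case discussed on slide 33.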
61. Rectification
(figure: before rectification, the epipolar lines are at arbitrary locations and orientations; after rectification, they are collinear and parallel to the baseline)
62. Rectification (cont'd)
- Rectification is a transformation which makes pairs of conjugate epipolar lines become collinear and parallel to the horizontal axis (i.e., the baseline).
- Searching for corresponding points becomes much simpler for the case of rectified images.
63. Rectification (cont'd)
- Disparities between the images are in the x-direction only (i.e., no y disparity).
64. Rectification Example
before rectification
after rectification
65. Rectification (cont'd)
- Main steps (assuming knowledge of the extrinsic/intrinsic stereo parameters):
- (1) Rotate the left camera so that the epipolar lines become parallel to the horizontal axis (i.e., the epipole is mapped to infinity).
66. Rectification (cont'd)
- (2) Apply the same rotation to the right camera to recover the original geometry.
- (3) Align the right camera with the left camera using the rotation R.
- (4) Adjust the scale in both camera frames.
67. Rectification (cont'd)
- Consider step (1) only (i.e., the other steps are easy).
- Construct a coordinate system (e1, e2, e3) centered at Ol, aligned with the image plane coordinate system.
68. Rectification (cont'd)
- (1.1) e1 is a unit vector along the vector T (the baseline): e1 = T / ||T||
69. Rectification (cont'd)
- (1.2) e2 must be perpendicular to e1 (i.e., take the cross product of the optical axis z = (0, 0, 1)^T with e1 and normalize): e2 = (z × e1) / ||z × e1||
70. Rectification (cont'd)
- (1.3) Choose e3 as the cross product of e1 and e2: e3 = e1 × e2
71. Rectification (cont'd)
The rotation matrix that maps the left epipole to infinity is the transformation that aligns (e1, e2, e3) with (i, j, k), i.e., the matrix whose rows are e1^T, e2^T, and e3^T.
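Steps (1.1)-(1.3) can be sketched as follows; the baseline T below is a made-up value (assumed not parallel to the optical axis, or the cross product in (1.2) vanishes):

```python
import numpy as np

# Sketch of steps (1.1)-(1.3): build (e1, e2, e3) from the baseline T and
# stack them as rows, which aligns (e1, e2, e3) with (i, j, k).
def rectifying_rotation(T):
    e1 = T / np.linalg.norm(T)             # (1.1) unit vector along T
    z = np.array([0.0, 0.0, 1.0])          # optical axis
    e2 = np.cross(z, e1)                   # (1.2) perpendicular to e1
    e2 = e2 / np.linalg.norm(e2)
    e3 = np.cross(e1, e2)                  # (1.3) completes the frame
    return np.vstack([e1, e2, e3])

T = np.array([0.2, 0.05, 0.02])            # made-up baseline
R_rect = rectifying_rotation(T)
mapped = R_rect @ T   # -> (||T||, 0, 0): the epipole direction becomes horizontal
```

Since e2 and e3 are both orthogonal to T, the rotated baseline has only an x-component, which is exactly the "epipole at infinity" condition of step (1).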
72. The reconstruction problem
- (1) Both intrinsic and extrinsic parameters are known: we can solve the reconstruction problem unambiguously by triangulation.
- (2) Only the intrinsic parameters are known: we can solve the reconstruction problem only up to an unknown scaling factor.
- (3) Neither the extrinsic nor the intrinsic parameters are available: we can solve the reconstruction problem only up to an unknown, global projective transformation.
73. (1) Reconstruction by triangulation
- Assumptions and problem statement:
- (1) Both the extrinsic and intrinsic camera parameters are known.
- (2) Compute the location of the 3D points from their projections pl and pr.
74. What is the solution?
- The point P lies at the intersection of the two
rays from Ol through pl and from Or through pr
75. Practical issues
- The two rays will not intersect exactly in space because of errors in the location of the corresponding features.
- Find the point that is closest to both rays (i.e., the midpoint P' of the line segment that is perpendicular to both rays).
76. Parametric line representation
- The parametric representation of the line passing through P1 and P2 is given by

  P(t) = P1 + t (P2 - P1),   t ∈ [0, 1]

- The direction of the line is given by the vector P2 - P1.
77. Computing P'
- Parametric representation of the ray l through Ol and pl: a pl, a ∈ ℝ
- Parametric representation of the ray r through Or and pr: T + b R^T pr, b ∈ ℝ (since Pr = R(Pl - T) implies Pl = R^T Pr + T)
78. Computing P' (cont'd)
- Suppose that P1 and P2 are given by P1 = a0 pl and P2 = T + b0 R^T pr (the closest points on the two rays).
- The parametric equation of the line passing through P1 and P2 is given by P(c) = P1 + c (P2 - P1).
- The desired point P' (midpoint) is computed for c = 1/2.
79. Computing a0 and b0
- Consider the vector w orthogonal to both l and r: w = pl × R^T pr
- The line s going through P1 with direction w is given by s: P1 + c w, i.e., s: a0 pl + c (pl × R^T pr).
80. Computing a0 and b0 (cont'd)
- The lines s and r intersect at P2 (assume s passes through P2 for c = c0).
- We can obtain a0, b0, and c0 by solving the following system of equations:

  P1 + c0 w = P2,  i.e.,  a0 pl + c0 (pl × R^T pr) = T + b0 R^T pr
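The midpoint method of slides 77-80 can be sketched end to end; the geometry (R, T, and the 3D point) below is made up for the check:

```python
import numpy as np

# Sketch of the midpoint method: solve the 3x3 system
# a0*pl + c0*(pl x R^T pr) = T + b0*(R^T pr) for (a0, b0, c0), then take
# the midpoint of the segment joining the closest points on the two rays.
def midpoint_triangulation(pl, pr, R, T):
    d_l = pl                        # direction of ray l (through Ol)
    d_r = R.T @ pr                  # direction of ray r, in left coordinates
    w = np.cross(d_l, d_r)          # orthogonal to both rays
    a0, b0, c0 = np.linalg.solve(np.column_stack([d_l, -d_r, w]), T)
    P1 = a0 * d_l                   # closest point on l
    P2 = T + b0 * d_r               # closest point on r
    return 0.5 * (P1 + P2)          # midpoint P'

a = 0.1
R = np.array([[np.cos(a), 0, np.sin(a)],
              [0, 1, 0],
              [-np.sin(a), 0, np.cos(a)]])
T = np.array([1.0, 0.0, 0.0])
P = np.array([0.3, -0.2, 5.0])      # true 3D point (left frame)
pl = P / P[2]                       # left projection (unit focal length)
Pr = R @ (P - T)
pr = Pr / Pr[2]                     # right projection
P_hat = midpoint_triangulation(pl, pr, R, T)   # -> approximately P
```

In this noise-free check the rays intersect exactly (c0 = 0), so the midpoint coincides with the true point; with noisy projections, c0 becomes nonzero and P' is the compromise between the two rays.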
81. (2) Reconstruction up to a scale factor
- Only the intrinsic camera parameters are known.
- We cannot recover the true scale of the viewed scene since we do not know the baseline T (recall that Z = fT/d).
- Reconstruction is unique only up to an unknown scaling factor.
- This factor can be determined if we know the distance between two points in the scene.
82. (2) Reconstruction up to a scale factor (cont'd)
- Estimate E:
- Estimate E using the 8-point algorithm.
- The solution is unique up to an unknown scale factor.
- Recover T (from E).
83. (2) Reconstruction up to a scale factor (cont'd)
- To simplify the recovery of T, consider the matrix E^T E, where E has been normalized to account for the unknown scale.
- Recover R (from E):
- It can be shown that R can be computed in closed form from E and the recovered T.
84. (2) Reconstruction up to a scale factor (cont'd)
- The equations for recovering T are quadratic.
- The sign of T (and of E) is not fixed (from SVD).
- The ambiguity will be resolved during 3D reconstruction:
- Only one combination of signs gives consistent reconstructions (positive Zl, Zr).
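An equivalent route to the same four-fold ambiguity (a common alternative, not the slides' closed-form derivation) decomposes E by SVD into four (R, T) candidates; the geometry below is made up:

```python
import numpy as np

# Four-fold ambiguity of (R, T) from E, via the standard SVD decomposition
# (an alternative to the slides' quadratic equations). Since E = R S
# equals [RT]_x R, the translation recovered here is the direction of
# R @ T, up to the unknown scale.
def decompose_E(E):
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0: U = -U      # keep proper rotations
    if np.linalg.det(Vt) < 0: Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    t = U[:, 2]                          # translation direction, sign free
    return [(U @ W @ Vt, t), (U @ W @ Vt, -t),
            (U @ W.T @ Vt, t), (U @ W.T @ Vt, -t)]

def skew(t):
    return np.array([[0, -t[2], t[1]], [t[2], 0, -t[0]], [-t[1], t[0], 0]])

a = 0.2                                  # made-up geometry for the check
R = np.array([[np.cos(a), 0, np.sin(a)], [0, 1, 0], [-np.sin(a), 0, np.cos(a)]])
T = np.array([1.0, 0.2, 0.1])
E = R @ skew(T)                          # essential matrix as in slide 47

t_true = R @ T
t_true = t_true / np.linalg.norm(t_true)
candidates = decompose_E(E)
# exactly one of the four candidates matches the true (R, direction of RT);
# as on this slide, it is identified by requiring positive depths Zl, Zr.
```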
85. (2) Reconstruction up to a scale factor (cont'd)
86. (2) Reconstruction up to a scale factor (cont'd)
87. (2) Reconstruction up to a scale factor (cont'd)
- Algorithm
- Note: the algorithm should not go through more than 4 iterations (one per sign combination).
88. (3) Reconstruction up to a projective transformation
- More complicated: see the book chapter.