Stereo Vision - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Stereo Vision


1
  • Lecture 25
  • Stereo Vision

CSE 4392/6367 Computer Vision, Spring 2009
Vassilis Athitsos
University of Texas at Arlington
3
Image Projection Review
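  • Let A be a point in world coordinates, R the
    rotation matrix, T the translation matrix, and H
    the matrix that maps to pixel coordinates.
  • $P(A) = R\,T\,A$ gives us the projection (in
    world coordinates) of A on an image plane of
    focal length 1.
  • $H(P(A))$ gives us the pixel coordinates
    corresponding to $P(A)$. For simplicity, the focal
    length is encoded in H.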

4
Image-to-World Projection
  • Let A, R, T, and H be as in the image projection
    review.
  • Given pixel location $W = (u, v)$, how can we get
    the world coordinates of the corresponding
    position on the image plane?

10
Image-to-World Projection
  • Define G as the matrix that maps
    $(x_0, y_0, 1)^T$ to $(u, v, 1)^T$.
  • $(x_0, y_0)$ are the normalized image coordinates
    corresponding to $(u, v)$.
  • $G^{-1}$ maps $(u, v)$ to normalized image
    coordinates $(x_0, y_0)$.
  • In camera coordinates, what is the z coordinate
    of $G^{-1}(u, v)$? $z = -1$.
  • Remember, $G^{-1}$ maps pixels into an image plane
    corresponding to focal length $f = 1$.
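
The mapping between pixels and normalized image coordinates can be sketched in a few lines of Python. The transcript does not preserve the entries of G, so the matrix built here is a hypothetical axis-aligned form assembled from the intrinsic parameters f, Sx, Sy, u0, v0. The function names are illustrative assumptions, and the lecture's actual G may use different signs.

```python
def make_G(f, Sx, Sy, u0, v0):
    """Hypothetical G; the lecture's actual matrix is not shown in the transcript."""
    return [[f * Sx, 0.0, u0],
            [0.0, f * Sy, v0],
            [0.0, 0.0, 1.0]]

def normalized_to_pixel(G, x0, y0):
    """Apply G to (x0, y0, 1)^T and return the pixel (u, v)."""
    h = [G[r][0] * x0 + G[r][1] * y0 + G[r][2] for r in range(3)]
    return h[0] / h[2], h[1] / h[2]

def pixel_to_normalized(G, u, v):
    """Apply G^-1 to (u, v, 1)^T; this shortcut is valid only for the
    axis-aligned G built by make_G (no skew terms)."""
    x0 = (u - G[0][2]) / G[0][0]
    y0 = (v - G[1][2]) / G[1][1]
    return x0, y0

def image_plane_point(G, u, v):
    """Camera-coordinate position of pixel (u, v) on the image plane z = -1."""
    x0, y0 = pixel_to_normalized(G, u, v)
    return (x0, y0, -1.0)
```

With a full 3x3 G that includes skew, the inverse would have to be computed as a general matrix inverse instead of the two divisions used here.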

12
Image-to-World Projection
  • Now we have mapped pixel $(u, v)$ to image plane
    position $(x_0, y_0, -1)$.
  • Next step: map the image plane position to a
    position in the world.
  • First, in camera coordinates.
  • What world position does image plane position
    $(x_0, y_0, -1)$ map to?
  • $(x_0, y_0, -1)$ maps to a line. In camera
    coordinates, the line goes through the origin.
  • How can we write that line in camera coordinates?

15
Image-to-World Projection
  • $(x_0, y_0, -1)$ maps to a line. Suppose that the
    line goes through point $(x, y, z)$. What equations
    does that point have to satisfy?
  • $x / x_0 = z / (-1) \Rightarrow z = -x / x_0$.
  • $y / y_0 = x / x_0 \Rightarrow y = x\,y_0 / x_0$.
  • These equations define a line $(y, z) = f(x)$.
    Borderline cases: $x_0 = 0$, $y_0 = 0$.
  • Given a point on this line, how do we map it to
    world coordinates?
  • World-to-camera mapping of A is done by
    $\mathrm{camera}(A) = R\,T\,A$.
  • Camera-to-world mapping is done by
    $T^{-1} R^{-1}\,\mathrm{camera}(A)$.
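
The back-projection steps on this slide can be sketched in Python under stated assumptions: R is taken to be a 3x3 rotation matrix (so its inverse is its transpose), and T is taken to translate by a vector t. Both are hypothetical forms, since the transcript does not show the actual matrices. The parametric form s * (x0, y0, -1) is used for the line because it also covers the borderline cases x0 = 0 and y0 = 0.

```python
import math

def rotation_z(angle):
    """Example rotation about the z axis, used only to build a test R."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def world_to_camera(A, R, t):
    """camera(A) = R T A, assuming T translates A by the vector t."""
    shifted = [A[i] + t[i] for i in range(3)]
    return [sum(R[i][k] * shifted[k] for k in range(3)) for i in range(3)]

def camera_to_world(C, R, t):
    """T^-1 R^-1 camera(A): undo the rotation (R^-1 = R^T), then the translation."""
    unrotated = [sum(R[k][i] * C[k] for k in range(3)) for i in range(3)]
    return [unrotated[i] - t[i] for i in range(3)]

def ray_point(x0, y0, s):
    """Point s * (x0, y0, -1) on the line through the camera origin.
    It satisfies z = -x / x0 and y = x * y0 / x0 whenever x0 != 0."""
    return [s * x0, s * y0, -s]
```

Mapping two points of the ray, for example s = 0 and s = 1, through camera_to_world gives two world points that define the same line in world coordinates.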

16
Stereo Vision
  • Also called stereopsis.
  • Key idea:
  • Each point in an image corresponds to a line in
    the 3D world.
  • To compute that line, we need to know the camera
    matrix.
  • If the same point is visible from two images, the
    two corresponding lines intersect in a single 3D
    point.
  • Challenges:
  • Identify correspondences between images from the
    two cameras.
  • Compute the camera matrix.

17
A Simple Stereo Setup
  • Simple arrangement:
  • Both cameras have the same intrinsic parameters.
  • Image planes belong to the same world plane.
  • Then, correspondences appear on the same
    horizontal line.
  • The displacement from one image to the other is
    called disparity.
  • Disparity is inversely proportional to depth.
  • External calibration parameters are not needed.

20
A Simple Stereo Setup
  • Assume that:
  • Both cameras have pinholes at $z = 0$, $y = 0$.
  • Both image planes correspond to $f = 1$.
  • Both cameras have the same intrinsic parameters
    f, Sx, Sy, u0, v0.
  • Both camera coordinate systems have the same x,
    y, z axes.
  • Cameras only differ at the x coordinate of the
    pinhole.
  • Camera 1 is at $(x_1, 0, 0)$, camera 2 is at
    $(x_2, 0, 0)$.
  • Then:
  • Suppose a point A is at $(x_A, y_A, z_A)$.
  • On camera 1, A maps to normalized image
    coordinates
    $(x_{1A}, y_{1A}) = ((x_A - x_1) / z_A,\; y_A / z_A)$.
  • On camera 2, A maps to normalized image
    coordinates
    $(x_{2A}, y_{2A}) = ((x_A - x_2) / z_A,\; y_A / z_A)$.
  • $x_{1A} - x_{2A} = ((x_A - x_1) - (x_A - x_2)) / z_A = (x_2 - x_1) / z_A = c / z_A$.
  • $x_{1A} - x_{2A}$ is called disparity. Disparity is
    inversely proportional to $z_A$.

22
A Simple Stereo Setup
  • Suppose a point A is at $(x_A, y_A, z_A)$.
  • On camera 1, A maps to normalized image
    coordinates
    $(x_{1A}, y_{1A}) = ((x_A - x_1) / z_A,\; y_A / z_A)$.
  • On camera 2, A maps to normalized image
    coordinates
    $(x_{2A}, y_{2A}) = ((x_A - x_2) / z_A,\; y_A / z_A)$.
  • $x_{1A} - x_{2A} = ((x_A - x_1) - (x_A - x_2)) / z_A = (x_2 - x_1) / z_A = c / z_A$.
  • $x_{1A} - x_{2A}$ is called disparity. Disparity is
    inversely proportional to $z_A$.
  • If we know $(x_{1A}, y_{1A})$ and $(x_{2A}, y_{2A})$
    (i.e., we know the locations of A in each image),
    what else do we need to know in order to figure
    out $z_A$?
  • We need to know $c = x_2 - x_1$.
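
As a small worked sketch of the relation disparity = c / zA, the Python functions below recover the depth and then the full 3D position from the two normalized image locations. The function names are illustrative, and the sign conventions follow the slide's formulas exactly as written.

```python
def depth_from_disparity(x1A, x2A, c):
    """zA = c / (x1A - x2A), where c = x2 - x1 is the distance between pinholes."""
    d = x1A - x2A
    if d == 0:
        raise ValueError("zero disparity: depth cannot be recovered")
    return c / d

def triangulate_simple(p1, p2, x1, x2):
    """Recover (xA, yA, zA) from normalized coordinates p1 = (x1A, y1A) on
    camera 1 and p2 = (x2A, y2A) on camera 2 in the simple stereo setup."""
    zA = depth_from_disparity(p1[0], p2[0], x2 - x1)
    xA = x1 + p1[0] * zA      # invert x1A = (xA - x1) / zA
    yA = p1[1] * zA           # invert y1A = yA / zA
    return (xA, yA, zA)
```

For example, with cameras at x1 = 0 and x2 = 0.5, a point at (1.0, 0.4, 4.0) appears at x1A = 0.25 and x2A = 0.125. The disparity is 0.125, and 0.5 / 0.125 gives back zA = 4.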

25
A More General Case
  • Suppose that we start with the simple system:
  • Both cameras have pinholes at $z = 0$, $y = 0$.
  • Both image planes correspond to $f = 1$, $z = -1$.
  • Both cameras have the same intrinsic parameters
    f, Sx, Sy, u0, v0.
  • Both camera coordinate systems have the same x,
    y, z axes.
  • Cameras only differ at the x coordinate of the
    pinhole.
  • Camera 1 is at $(x_1, 0, 0)$, camera 2 is at
    $(x_2, 0, 0)$.
  • Then we rotate by R and translate by T the whole
    system.
  • To find point A, we just need to:
  • Go back to simple coordinates, by translating
    back and rotating back. This is done via matrix
    $R^{-1} T^{-1}$.
  • Find simple(A) in the simplified coordinate
    system.
  • simple(A) is just shorthand for the position of A
    in the simplified system.
  • Map A to original world coordinates:
    $A = T\,R\,\mathrm{simple}(A)$.

27
The General Case
  • Given two calibrated cameras, and a corresponding
    pair of locations, we compute two lines.
  • In the mathematically ideal case, the lines
    intersect.
  • In practice, they don't intersect because of
    rounding/measurement errors (pixels are
    discretized).
  • Best estimate for the 3D point is obtained by:
  • Finding the shortest line segment that connects
    the two lines.
  • Returning the midpoint of that segment.

28
Finding Connecting Segment
  • $((P_1 + a_1 u_1) - (Q_1 + a_2 u_2)) \cdot u_1 = 0$
  • $((P_1 + a_1 u_1) - (Q_1 + a_2 u_2)) \cdot u_2 = 0$
  • Here $\cdot$ stands for dot product.
  • $P_1$: point on first line.
  • $Q_1$: point on second line.
  • $u_1$: unit vector parallel to first line.
  • $u_2$: unit vector parallel to second line.
  • $P_1 + a_1 u_1$: intersection of segment with first
    line.
  • $Q_1 + a_2 u_2$: intersection of segment with second
    line.
  • Only unknowns are $a_1$ and $a_2$.
  • We have two equations, two unknowns, can solve.
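
The two equations are linear in a1 and a2, so they can be solved directly. The Python sketch below expands the dot products into a 2x2 system, solves it with Cramer's rule, and returns the midpoint of the connecting segment. Function names are illustrative, and the direction vectors do not have to be unit length for this formulation.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def closest_points(P1, u1, Q1, u2):
    """Endpoints of the shortest segment between line P1 + a1*u1 and line Q1 + a2*u2."""
    d = [p - q for p, q in zip(P1, Q1)]
    # Expanding ((P1 + a1 u1) - (Q1 + a2 u2)) . u1 = 0 and the same with u2:
    #   a1 (u1.u1) - a2 (u1.u2) = -(d.u1)
    #   a1 (u1.u2) - a2 (u2.u2) = -(d.u2)
    m11, m12 = dot(u1, u1), -dot(u1, u2)
    m21, m22 = dot(u1, u2), -dot(u2, u2)
    b1, b2 = -dot(d, u1), -dot(d, u2)
    det = m11 * m22 - m12 * m21
    if abs(det) < 1e-12:
        raise ValueError("lines are parallel: no unique connecting segment")
    a1 = (b1 * m22 - m12 * b2) / det
    a2 = (m11 * b2 - m21 * b1) / det
    X1 = [p + a1 * u for p, u in zip(P1, u1)]
    X2 = [q + a2 * u for q, u in zip(Q1, u2)]
    return X1, X2

def triangulate_midpoint(P1, u1, Q1, u2):
    """Best 3D estimate: midpoint of the shortest connecting segment."""
    X1, X2 = closest_points(P1, u1, Q1, u2)
    return [(a + b) / 2.0 for a, b in zip(X1, X2)]
```

For the x axis (P1 = (0, 0, 0), u1 = (1, 0, 0)) and the line through (0, 1, 1) with direction (0, 1, 0), the closest points are (0, 0, 0) and (0, 0, 1), so the estimate is (0, 0, 0.5).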

29
Essential Matrix
  • We define a stereo pair given two cameras (in an
    arbitrary configuration).
  • The essential matrix E of this stereo pair is a
    matrix that has the following property:
  • If $W$ and $W'$ are homogeneous normalized image
    coordinates in image 1 and image 2, and these
    locations correspond to the same 3D point, then
    $W^T E\, W' = 0$.

30
Estimating the Essential Matrix
  • E has size 3x3. To estimate E, we need to
    estimate 9 unknowns.
  • Observations:
  • A trivial and not useful exact solution is $E = 0$.
  • If E is a solution, then cE is also a solution,
    for any real number c. So, strictly speaking we
    can only solve up to scale, and we only need to
    estimate 8 unknowns.
  • To avoid the $E = 0$ solution, we impose an
    additional constraint:
  • sum(sum(E.*E)) = 1 in Matlab notation, i.e.,
    $\sum_{i,j} e_{ij}^2 = 1$.

31
Using a Single Correspondence
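  • Suppose $(u_1, v_1, w_1)$ in image plane 1 matches
    $(u_2, v_2, w_2)$ in image plane 2.
  • Remember, $(u_1, v_1, w_1)$ and $(u_2, v_2, w_2)$ are
    given in homogeneous normalized image coordinates.
  • We know that
    $(u_1, v_1, w_1)\, E\, (u_2, v_2, w_2)^T = 0$.
  • Let $E = \begin{pmatrix} e_{11} & e_{12} & e_{13} \\ e_{21} & e_{22} & e_{23} \\ e_{31} & e_{32} & e_{33} \end{pmatrix}$.
  • We obtain
    $(u_1, v_1, w_1)\, E\, (u_2, v_2, w_2)^T = 0 \Rightarrow$
  • $(u_1 e_{11} + v_1 e_{21} + w_1 e_{31},\; u_1 e_{12} + v_1 e_{22} + w_1 e_{32},\; u_1 e_{13} + v_1 e_{23} + w_1 e_{33})\,(u_2, v_2, w_2)^T = 0 \Rightarrow$
  • $u_1 u_2 e_{11} + v_1 u_2 e_{21} + w_1 u_2 e_{31} + u_1 v_2 e_{12} + v_1 v_2 e_{22} + w_1 v_2 e_{32} + u_1 w_2 e_{13} + v_1 w_2 e_{23} + w_1 w_2 e_{33} = 0 \Rightarrow$
  • $(u_1 u_2,\, v_1 u_2,\, w_1 u_2,\, u_1 v_2,\, v_1 v_2,\, w_1 v_2,\, u_1 w_2,\, v_1 w_2,\, w_1 w_2)\,(e_{11}, e_{21}, e_{31}, e_{12}, e_{22}, e_{32}, e_{13}, e_{23}, e_{33})^T = 0$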
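
The rewriting of the bilinear constraint as a dot product of two 9-vectors can be checked numerically. In the Python sketch below, the helper names are hypothetical. The ordering of the unknowns follows the slide, which lists the entries of E column by column.

```python
def bilinear(W1, E, W2):
    """Evaluate (u1, v1, w1) E (u2, v2, w2)^T."""
    return sum(W1[i] * E[i][j] * W2[j] for i in range(3) for j in range(3))

def constraint_row(W1, W2):
    """(u1u2, v1u2, w1u2, u1v2, v1v2, w1v2, u1w2, v1w2, w1w2)."""
    return [W1[i] * W2[j] for j in range(3) for i in range(3)]

def flatten_column_major(E):
    """(e11, e21, e31, e12, e22, e32, e13, e23, e33)."""
    return [E[i][j] for j in range(3) for i in range(3)]

def row_times_unknowns(W1, E, W2):
    """Dot product of the constraint row with the flattened E."""
    return sum(a * e for a, e in zip(constraint_row(W1, W2), flatten_column_major(E)))
```

For any E, W1 and W2, row_times_unknowns and bilinear return the same number, which is what makes each correspondence one linear equation in the nine unknowns.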

32
Using Multiple Correspondences
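  • From the previous slide: if $(u_1, v_1, w_1)$ in image
    plane 1 matches $(u_2, v_2, w_2)$ in image plane 2, then
  • $(u_1 u_2,\, v_1 u_2,\, w_1 u_2,\, u_1 v_2,\, v_1 v_2,\, w_1 v_2,\, u_1 w_2,\, v_1 w_2,\, w_1 w_2)\,(e_{11}, e_{21}, e_{31}, e_{12}, e_{22}, e_{32}, e_{13}, e_{23}, e_{33})^T = 0$
  • If we have J correspondences:
  • $(u_{1,j}, v_{1,j}, w_{1,j})$ in image plane 1 matches
    $(u_{2,j}, v_{2,j}, w_{2,j})$ in image plane 2, for
    $j = 1, \dots, J$.
  • Define
    $$A = \begin{pmatrix}
    u_{1,1}u_{2,1} & v_{1,1}u_{2,1} & w_{1,1}u_{2,1} & u_{1,1}v_{2,1} & v_{1,1}v_{2,1} & w_{1,1}v_{2,1} & u_{1,1}w_{2,1} & v_{1,1}w_{2,1} & w_{1,1}w_{2,1} \\
    u_{1,2}u_{2,2} & v_{1,2}u_{2,2} & w_{1,2}u_{2,2} & u_{1,2}v_{2,2} & v_{1,2}v_{2,2} & w_{1,2}v_{2,2} & u_{1,2}w_{2,2} & v_{1,2}w_{2,2} & w_{1,2}w_{2,2} \\
    u_{1,3}u_{2,3} & v_{1,3}u_{2,3} & w_{1,3}u_{2,3} & u_{1,3}v_{2,3} & v_{1,3}v_{2,3} & w_{1,3}v_{2,3} & u_{1,3}w_{2,3} & v_{1,3}w_{2,3} & w_{1,3}w_{2,3} \\
    \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\
    u_{1,J}u_{2,J} & v_{1,J}u_{2,J} & w_{1,J}u_{2,J} & u_{1,J}v_{2,J} & v_{1,J}v_{2,J} & w_{1,J}v_{2,J} & u_{1,J}w_{2,J} & v_{1,J}w_{2,J} & w_{1,J}w_{2,J}
    \end{pmatrix}$$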

33
Using Multiple Correspondences
  • Using A from the previous slide, the following
    holds:
    $A\,(e_{11}, e_{21}, e_{31}, e_{12}, e_{22}, e_{32}, e_{13}, e_{23}, e_{33})^T = 0$.
  • A: Jx9 matrix.
  • Matrix of unknowns $e_{ij}$: size 9x1.
  • Result: a zero matrix of size Jx1.
  • This is a system of linear homogeneous equations
    that can be solved using SVD.
  • In Matlab:
  • [u, d, v] = svd(A, 0);
  • x = v(:, end);
  • After the above two lines, x is the 9x1 matrix of
    unknowns.
  • This way, using multiple correspondences, we have
    computed the essential matrix.
  • Strictly speaking, we have computed one out of
    many essential matrices.
  • Solution up to scale.
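
A rough Python counterpart of the estimation step is sketched below. It swaps in a different technique from the slide, stated plainly: the Python standard library has no SVD, so the sketch finds the eigenvector of A^T A with the smallest eigenvalue using cyclic Jacobi rotations. Up to sign, that is the same vector as the last right singular vector of A. All function names are hypothetical. The result is normalized so that the squared entries of E sum to 1.

```python
import math

def constraint_row(W1, W2):
    """Row of A for one correspondence, unknowns ordered column by column."""
    return [W1[i] * W2[j] for j in range(3) for i in range(3)]

def smallest_eigenvector(S, sweeps=60):
    """Cyclic Jacobi rotations on a symmetric matrix S; returns the unit
    eigenvector that belongs to the smallest eigenvalue."""
    n = len(S)
    S = [row[:] for row in S]
    V = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(S[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < 1e-30:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if S[p][q] == 0.0:
                    continue
                diff = S[q][q] - S[p][p]
                # Rotation angle that zeroes S[p][q]; |theta| <= pi/4.
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * S[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):            # S <- S J
                    a, b = S[k][p], S[k][q]
                    S[k][p], S[k][q] = c * a - s * b, s * a + c * b
                for k in range(n):            # S <- J^T S
                    a, b = S[p][k], S[q][k]
                    S[p][k], S[q][k] = c * a - s * b, s * a + c * b
                for k in range(n):            # V <- V J (accumulate eigenvectors)
                    a, b = V[k][p], V[k][q]
                    V[k][p], V[k][q] = c * a - s * b, s * a + c * b
    m = min(range(n), key=lambda i: S[i][i])
    x = [V[k][m] for k in range(n)]
    norm = math.sqrt(sum(t * t for t in x))
    return [t / norm for t in x]

def estimate_essential(pairs):
    """pairs: list of (W1, W2) homogeneous normalized coordinates.
    Returns E as a 3x3 nested list, defined only up to scale and sign."""
    A = [constraint_row(W1, W2) for W1, W2 in pairs]
    S = [[sum(r[i] * r[j] for r in A) for j in range(9)] for i in range(9)]
    x = smallest_eigenvector(S)
    return [[x[3 * j + i] for j in range(3)] for i in range(3)]
```

Forming A^T A squares the condition number of the problem, so a true SVD, as in the Matlab lines, is the numerically safer choice when one is available.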

35
Epipoles and Epipolar Lines
  • In each image of a stereo pair, the epipole is
    the pixel location where the pinhole of the other
    camera is mapped.
  • Given a pixel in an image, where can the
    corresponding pixel be in the other image?
  • The essential matrix defines a line.
  • All such lines are called epipolar lines, because
    they always go through the epipole.
  • Why?
  • Because for any pixel in image 1, the pinhole of
    camera 1 is a possible 3D location.

36
Epipoles and Epipolar Lines
  • Given a pixel in one image, the epipolar line in
    the other image can be computed using the
    essential matrix.
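
How the essential matrix yields an epipolar line can be sketched as follows. The sketch assumes the convention of the earlier slides, where the image 1 coordinates multiply E from the left. A line is represented by coefficients (a, b, c), and a homogeneous point (u, v, w) lies on it when a u + b v + c w = 0. The function names and the example matrix are illustrative assumptions.

```python
def epipolar_line_in_image2(E, W1):
    """Coefficients E^T W1: every match W2 of W1 satisfies line . W2 = 0."""
    return [sum(E[i][j] * W1[i] for i in range(3)) for j in range(3)]

def epipolar_line_in_image1(E, W2):
    """Coefficients E W2: every match W1 of W2 satisfies line . W1 = 0."""
    return [sum(E[i][j] * W2[j] for j in range(3)) for i in range(3)]

def on_line(line, W, tol=1e-9):
    """True when homogeneous point W lies on the line (within tol)."""
    return abs(sum(a * b for a, b in zip(line, W))) < tol

# Illustrative E for the simple stereo setup (cameras displaced along x only):
# W1^T E W2 = w1*v2 - v1*w2, which vanishes when both points share the same v.
E_SIMPLE = [[0.0, 0.0, 0.0],
            [0.0, 0.0, -1.0],
            [0.0, 1.0, 0.0]]
```

With E_SIMPLE and W1 = (0.3, 0.2, 1), the line in image 2 has coefficients (0, 1, -0.2), i.e., the horizontal line v = 0.2. This agrees with the simple setup, where correspondences appear on the same horizontal line.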