Title: Onto 3D
1. Onto 3D
- Coordinate systems
- 3-D homogeneous transformations
- Translation, scaling, rotation
- Changes of coordinates
- Rigid transformations
2. Vector Projection
- The projection of vector a onto u is the component of a in the direction of u: proj_u(a) = ((a · u) / (u · u)) u
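As a quick sanity check of this formula, here is a minimal NumPy sketch (the function name `project` and the example vectors are mine, not from the slides):

```python
import numpy as np

def project(a, u):
    """Return the component of vector a in the direction of u."""
    a = np.asarray(a, dtype=float)
    u = np.asarray(u, dtype=float)
    return (a @ u) / (u @ u) * u      # ((a . u) / (u . u)) u

# Projecting onto the x-axis keeps only the x component.
print(project([3.0, 4.0, 0.0], [1.0, 0.0, 0.0]))   # -> [3. 0. 0.]
```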
3Vector Cross Product
- Definition If a (xa, ya, za)T and
- b (xb, yb, zb)T, then
- c a X b
- c is orthogonal to both a and b
from Hill
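A small sketch verifying the orthogonality property, using NumPy's built-in cross product (the example vectors are arbitrary):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])
c = np.cross(a, b)            # c = a x b, computed componentwise as in the definition

# c is orthogonal to both a and b, so both dot products are (numerically) zero.
print(c, np.dot(c, a), np.dot(c, b))   # -> [-3.  6. -3.] 0.0 0.0
```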
4. Coordinate System Definitions
- Let x = (x, y, z)^T be a point in 3-D space (R^3). What do these values mean?
- A coordinate system in R^n is defined by an origin o and n orthogonal basis vectors
- In R^3, the positive direction of each axis X, Y, Z is indicated by the unit vector i, j, k, respectively, where k = i × j (in a right-handed system)
- A coordinate is the length of the projection of the vector from the origin to the point onto the axis basis vector, e.g., x = x · i (with x the vector from o to the point)
5. 3-D Camera Coordinates
- Right-handed system
- From the point of view of the camera looking out into the scene:
- +X right, -X left
- +Y down, -Y up
- +Z in front of camera, -Z behind
6. Going from 2-D to 3-D
- Points: add a z coordinate
- Transformations: become 4 × 4 matrices with an extra row/column for the z component, e.g., translation
7. 3-D Scaling
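A minimal sketch of the 4 × 4 homogeneous translation and scaling matrices these two slides refer to (the helper names and example values are mine, not from the slides):

```python
import numpy as np

def translation(tx, ty, tz):
    """4x4 homogeneous translation matrix."""
    T = np.eye(4)
    T[:3, 3] = [tx, ty, tz]
    return T

def scaling(sx, sy, sz):
    """4x4 homogeneous scaling matrix."""
    return np.diag([sx, sy, sz, 1.0])

p = np.array([1.0, 2.0, 3.0, 1.0])     # point in homogeneous coordinates
print(translation(10, 0, 0) @ p)       # -> [11.  2.  3.  1.]
print(scaling(2, 2, 2) @ p)            # -> [2. 4. 6. 1.]
```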
8. 3-D Rotations
- In 2-D, we are always rotating in the plane of the image, but in 3-D the axis of rotation itself is a variable
- The three canonical rotation axes are the coordinate axes X, Y, Z
- These are sometimes referred to in aviation terms as pitch, yaw (or heading), and roll, respectively
from Hill
Pitch is the angle that its longitudinal axis (running from tail to nose, along n) makes with the horizontal plane.
from Hill
9. 3-D Euler Rotation Matrices
- Similar to 2-D rotation matrices, but with the coordinate corresponding to the rotation axis held constant
- E.g., a rotation about the X axis by θ radians
10. 3-D Rotation Matrices
- General form is
- Properties
- R^T = R^-1
- Preserves vector lengths, angles between vectors
- The upper-left 3 × 3 block R_3x3 is an orthogonal matrix
- Rows form an orthonormal basis (as do columns): length 1, mutually orthogonal
- So R_3x3 x projects the point x onto the unit vectors represented by the rows of R_3x3
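A sketch of the three canonical (Euler) rotation matrices and a check of the properties listed above, assuming the usual right-handed, radians convention (function names are mine):

```python
import numpy as np

def rot_x(t):   # rotation about the X axis by t radians
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(t):   # rotation about the Y axis
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(t):   # rotation about the Z axis
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

R = rot_z(0.3) @ rot_y(-0.2) @ rot_x(0.1)
print(np.allclose(R.T @ R, np.eye(3)))     # R^T = R^-1  -> True
print(np.linalg.norm(R @ [1.0, 0, 0]))     # vector lengths preserved -> 1.0
```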
11. Coordinate System Conversion
- Camera coordinates C: origin at the center of the camera, Z axis pointed in the viewing direction
- World coordinates W: arbitrary origin and axes
- A way to specify camera location and orientation (aka pose) in the same frame as scene objects (we want to move the camera to the world frame, so as to convert world coordinates into camera coordinates)
- Cx, Wx: the same point in different coordinates
14. Change of Coordinates: Special Case of Same Axes
- Distinct origins, parallel basis vectors
- If B is the world frame, Ax (camera) can be obtained from Bx (world) by subtracting the offset of A's origin
15. Change of Coordinates: Special Case of Same Origin
- Just need to rotate the basis vectors so that they are aligned
- The rotation matrix is built from projections of the basis vectors onto the new frame:
  [ ia·ib  ja·ib  ka·ib ]
  [ ia·jb  ja·jb  ka·jb ]
  [ ia·kb  ja·kb  ka·kb ]
- Check by multiplying by (1, 0, 0)^T, etc.
16. 3-D Rigid Transformations
- Combination of a rotation followed by a translation, without scaling
- Moves an object from one 3-D position and orientation (pose) to another
- M = T R (rotation R, then translation T)
17. 3-D Transformations: Arbitrary Change of Coordinates
- A rigid transformation can be used to represent a general change in the coordinate system that expresses a point's location
18. Rigid Transformations: Homogeneous Coordinates
- Points in one coordinate system are transformed to the other as follows
- The transformation takes the camera to the world origin, transforming world coordinates to camera coordinates
- If A is the camera frame and B is the world frame: apply the inverse translation and the inverse rotation
19. Camera Projection Matrix
- Using homogeneous coordinates, we can describe perspective projection as the result of multiplying by a 3 × 4 matrix P
- (By the rule for converting between homogeneous and regular coordinates, this is perspective division)
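A small sketch of this step, assuming the simple pinhole form P = [diag(f, f, 1) | 0] with a placeholder focal length f:

```python
import numpy as np

f = 1.0                                      # assumed focal length
P = np.array([[f, 0, 0, 0],
              [0, f, 0, 0],
              [0, 0, 1, 0]], dtype=float)    # 3x4 projection matrix

X = np.array([2.0, 4.0, 10.0, 1.0])          # 3-D point, homogeneous camera coords
x = P @ X                                    # homogeneous image point
u, v = x[:2] / x[2]                          # perspective division
print(u, v)                                  # -> 0.2 0.4  (i.e., f*X/Z, f*Y/Z)
```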
20. Camera Projection Matrix: Image Offsets
- The center of the CCD matrix usually does not coincide with the principal point C0. This adds u0 and v0, which define the position of C0 in pixel units in the retinal coordinate system.
21. Factoring the Camera Matrix
- Another way to write it:
- P = K ( Id | 0 )
- K is the camera calibration matrix
- ( Id | 0 ) is the identity form of the rigid transformation (with the 4th row dropped)
22. Camera Calibration Matrix
- A more general matrix allows
- Image coordinates with an offset origin (e.g., the convention of the upper-left corner)
- Non-square pixels: different effective horizontal vs. vertical focal lengths, fu = f su and fv = f sv
- These four variables (fu, fv, u0, v0) are known as the camera's intrinsic parameters
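A sketch of the calibration matrix described above, with placeholder values for the four intrinsics (fu, fv, u0, v0):

```python
import numpy as np

fu, fv = 800.0, 820.0     # effective focal lengths in pixels (placeholders)
u0, v0 = 320.0, 240.0     # principal point offset in pixels (placeholders)

K = np.array([[fu, 0.0, u0],
              [0.0, fv, v0],
              [0.0, 0.0, 1.0]])

# Applying K to a normalized camera-coordinate direction (X/Z, Y/Z, 1)
# yields pixel coordinates with the offset origin.
x_norm = np.array([0.1, -0.05, 1.0])
u, v, _ = K @ x_norm
print(u, v)               # -> 400.0 199.0
```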
23. Dealing with World Coordinates
- Thus far we have assumed that points are in camera coordinates
- Recall the definition of the world-to-camera coordinate rigid transformation
- In simpler form
24. Combining Intrinsic and Extrinsic Parameters
- The transformation performed by a pinhole camera on an arbitrary point in world coordinates can be written as
- The 3 × 4 projective camera matrix P has 10 degrees of freedom (DOF): 4 intrinsic, 3 rotation, 3 translation
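Putting the pieces together, a hedged sketch of projecting a world point with P = K [R | t], where R, t are the world-to-camera rotation and translation (all numeric values are placeholders):

```python
import numpy as np

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])   # intrinsics (4 DOF)
R = np.eye(3)                                                 # rotation (3 DOF)
t = np.array([0.0, 0.0, 5.0])                                 # translation (3 DOF)

P = K @ np.hstack([R, t.reshape(3, 1)])                       # 3x4 camera matrix

Xw = np.array([1.0, 0.5, 5.0, 1.0])                           # world point (homogeneous)
x = P @ Xw
u, v = x[:2] / x[2]                                           # perspective division
print(u, v)                                                   # -> 400.0 280.0
```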
25. Skew Ignored
- The textbook includes a skew parameter (p. 29).
- Since the camera coordinate system may also be skewed due to some manufacturing error, the angle θ between the two image axes may not be exactly 90 degrees (though it is usually close). This adds another unknown parameter.
- Easy to incorporate; it just makes it 11 unknowns
26. Applications
- Estimates of the camera matrix parameters are critical in order to
- Know where the camera is and how it is moving
- Deduce structural characteristics of the scene (i.e., 3-D information)
- Place known objects (e.g., computer graphics) into a camera image correctly
27. Camera Matrix
- Linear systems of equations
- Least-squares estimation
- Application: Estimating the camera matrix
28. Linear System
- A general set of m simultaneous linear equations
in n variables can be written as
29. Matrix Form of Linear System
- This can be represented as a matrix-vector product
- Compactly, we write this as A x = b
30. Solving Linear Systems
- If m = n (A is a square matrix), then we can obtain the solution by simple inversion: x = A^-1 b
- If m > n, then the system is over-constrained and A is not invertible
- Use the pseudoinverse A+ = (A^T A)^-1 A^T to obtain the least-squares solution x = A+ b
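A minimal sketch of the over-constrained case in NumPy (the example A and b are arbitrary); np.linalg.lstsq gives the same least-squares solution as the explicit pseudoinverse formula:

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])   # m = 3 equations, n = 2 unknowns
b = np.array([1.0, 2.0, 2.0])

x_pinv = np.linalg.inv(A.T @ A) @ A.T @ b            # x = (A^T A)^-1 A^T b
x_lsq, *_ = np.linalg.lstsq(A, b, rcond=None)        # library least-squares solver
print(x_pinv, np.allclose(x_pinv, x_lsq))            # same answer -> True
```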
31. Fitting Lines
- A 2-D point x = (x, y) is on a line with slope m and intercept b if and only if y = mx + b
- Equivalently, [x 1] (m, b)^T = y
- So the line defined by two points x1, x2 is the solution to the following system of equations
32. Fitting Lines
- With more than two points, there is no guarantee that they will all be on the same line
- The least-squares solution obtained from the pseudoinverse is the line that is closest to all of the points
courtesy of Vanderbilt U.
33. Example: Fitting a Line
- Suppose we have the points (2, 1), (5, 2), (7, 3), and (8, 3)
- Then A = [2 1; 5 1; 7 1; 8 1] and b = (1, 2, 3, 3)^T
- and x = A+ b = (0.3571, 0.2857)^T
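This example is easy to reproduce; a short check of the numbers above, building A and b from the four points:

```python
import numpy as np

A = np.array([[2.0, 1], [5, 1], [7, 1], [8, 1]])   # rows [x_i, 1]
b = np.array([1.0, 2, 3, 3])                       # y_i values

x = np.linalg.pinv(A) @ b                          # least-squares (m, b)
print(x)                                           # -> approximately [0.3571 0.2857]
```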
34. Example: Fitting a Line
35. Homogeneous Systems of Equations
- Suppose we want to solve A x = 0
- There is a trivial solution x = 0, but we don't want this. For what other values of x is A x close to 0?
- This is satisfied by computing the singular value decomposition (SVD) A = U D V^T (a non-negative diagonal matrix between two orthogonal matrices) and taking x as the last column of V (the unit singular vector corresponding to the smallest singular value)
- Note that Matlab returns [U, D, V] = svd(A)
- This is usually subject to a constraint such as ||x|| = 1
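A sketch of this recipe in NumPy (the example matrix is arbitrary and deliberately rank-deficient); note that np.linalg.svd returns V^T, so the desired x is the last row of the returned factor:

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.1],
              [1.0, 2.0, 2.9]])          # second column = 2 * first column, so rank 2

U, D, Vt = np.linalg.svd(A)              # A = U diag(D) V^T
x = Vt[-1]                               # unit singular vector for the smallest singular value
print(np.linalg.norm(x), np.linalg.norm(A @ x))   # ||x|| = 1, ||A x|| is (numerically) zero
```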
36. Line-Fitting as a Homogeneous System
- A 2-D homogeneous point x = (x, y, 1)^T is on the line l = (a, b, c)^T only when ax + by + c = 0
- We can write this equation as a dot product, x · l = 0, and hence the following system is implied for multiple points x1, x2, ..., xn
37. Example: Homogeneous Line-Fitting
- Again we have 4 points, but now in homogeneous form: (2, 1, 1), (5, 2, 1), (7, 3, 1), and (8, 3, 1)
- Our system is A l = 0, where the rows of A are the points xi^T
- Taking the SVD of A, we get the line l from the last column of V
- Compare to x = (0.3571, 0.2857)^T
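A sketch reproducing this comparison: the rows of A are the homogeneous points, and the recovered l = (a, b, c) should describe roughly the same line as the slope/intercept fit above:

```python
import numpy as np

A = np.array([[2.0, 1, 1],
              [5.0, 2, 1],
              [7.0, 3, 1],
              [8.0, 3, 1]])              # rows are homogeneous points (x, y, 1)

_, _, Vt = np.linalg.svd(A)
a, b, c = Vt[-1]                         # line l minimizing ||A l|| subject to ||l|| = 1
print(-a / b, -c / b)                    # slope and intercept; close to 0.3571, 0.2857
```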
38. Camera Calibration
- Camera calibration is the name given to the process of discovering the projection matrix (and its decomposition into the camera matrix and the position and orientation of the camera) from an image of a controlled scene. For example, we might set up the camera to view a calibrated grid of some sort.
39. A Vision Problem: Estimating P
- Given a number of correspondences between 3-D points and their 2-D image projections, Xi ↔ xi, we would like to determine the camera projection matrix P such that xi = P Xi for all i
40. A Calibration Target
courtesy of B. Wilburn
41. Estimating P: The Direct Linear Transformation (DLT) Algorithm
- xi = P Xi is an equation involving homogeneous vectors, so P Xi and xi need only be in the same direction, not strictly equal
- We can specify the same directionality by using a cross-product formulation: xi × P Xi = 0
42. DLT Camera Matrix Estimation: Preliminaries
- Let the image point xi = (xi, yi, wi)^T (remember that Xi has 4 elements)
- Denoting the jth row of P by p_j^T (a 4-element row vector), we have P Xi = (p_1^T Xi, p_2^T Xi, p_3^T Xi)^T
43. DLT Camera Matrix Estimation: Step 1
- Then by the definition of the cross product, xi × P Xi = 0 is
  ( yi p_3^T Xi - wi p_2^T Xi )
  ( wi p_1^T Xi - xi p_3^T Xi )  =  0
  ( xi p_2^T Xi - yi p_1^T Xi )
44. DLT Camera Matrix Estimation: Step 2
- The dot product commutes, so p_j^T Xi = Xi^T p_j, and we can rewrite the preceding as
  ( yi Xi^T p_3 - wi Xi^T p_2 )
  ( wi Xi^T p_1 - xi Xi^T p_3 )  =  0
  ( xi Xi^T p_2 - yi Xi^T p_1 )
45. DLT Camera Matrix Estimation: Step 3
- Collecting terms, this can be rewritten as a matrix product:
  [   0^T      -wi Xi^T    yi Xi^T ]
  [  wi Xi^T     0^T      -xi Xi^T ]  p  =  0
  [ -yi Xi^T    xi Xi^T      0^T   ]
- where 0^T = (0, 0, 0, 0). This is a 3 × 12 matrix times a 12-element column vector p = (p_1^T, p_2^T, p_3^T)^T
46. What We Just Did
47. DLT Camera Matrix Estimation: Step 4
- There are only two linearly independent rows here
- The third row is obtained by adding xi times the first row to yi times the second and scaling the sum by -1/wi
48. DLT Camera Matrix Estimation: Step 4 (continued)
- So we can eliminate one row to obtain the following linear matrix equation for the ith pair of corresponding points:
  [   0^T      -wi Xi^T    yi Xi^T ]  p  =  0
  [  wi Xi^T     0^T      -xi Xi^T ]
- Write this as Ai p = 0
49. DLT Camera Matrix Estimation: Step 5
- Remember that there are 11 unknowns which generate the 3 × 4 homogeneous matrix P (represented in vector form by p)
- Each point correspondence yields 2 equations (the two rows of Ai)
- We need at least 5½ point correspondences to solve for p
- Stack the Ai to get the homogeneous linear system A p = 0
50. Direct Linear Transform (DLT)
- Each per-point coefficient matrix is a rank-2 matrix (only two of its rows are linearly independent)
51. Direct Linear Transform (DLT)
Minimal solution:
- P has 11 dof, and each point gives 2 independent equations
- 5½ correspondences needed (say 6)
Over-determined solution:
- n ≥ 6 points (usually around 30 points are needed)
- use SVD
52. Degenerate Configurations
- Points are collinear, or lie on a single line passing through the projection center
- Camera and points on a twisted cubic
53. Data Normalization
- Scale data to values of order 1
- Move the center of mass to the origin
- Scale to yield order-1 values
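A sketch of this normalization for 2-D image points (an analogous version applies to the 3-D points); the target scale below follows the common convention of average distance √2 from the origin, which is one reasonable reading of "order 1":

```python
import numpy as np

def normalize_points_2d(pts):
    """Similarity transform T so the normalized points have zero mean
    and average distance sqrt(2) from the origin (a common convention)."""
    pts = np.asarray(pts, dtype=float)
    centroid = pts.mean(axis=0)                          # move center of mass to origin
    d = np.linalg.norm(pts - centroid, axis=1).mean()    # current average distance
    s = np.sqrt(2) / d                                   # scale to order-1 values
    T = np.array([[s, 0, -s * centroid[0]],
                  [0, s, -s * centroid[1]],
                  [0, 0, 1]])
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    return (T @ pts_h.T).T, T

pts_n, T = normalize_points_2d([[100, 200], [140, 220], [90, 260], [130, 240]])
print(pts_n[:, :2].mean(axis=0))                         # ~ [0, 0]
```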
54. Geometric error
55. Gold Standard Algorithm
- Objective
- Given n ≥ 6 2D-to-3D point correspondences Xi ↔ xi, determine the Maximum Likelihood estimate of P
- Algorithm
- Linear solution:
- Normalization
- DLT
- Minimization of geometric error, using the linear estimate as a starting point
- Denormalization
56. Calibration Example
- Canny edge detection
- Straight-line fitting to the detected edges
- Intersecting the lines to obtain the image corners
- Typically precision < 1/10 pixel
- (HZ rule of thumb: 5n constraints for n unknowns)
57. Errors in the image (standard case)
Errors in the world
Errors in the image and in the world
58. Radial Distortion
- Due to spherical lenses (cheap)
- Model: barrel distortion and pincushion distortion
- Straight lines are not straight anymore
http://foto.hut.fi/opetus/260/luennot/11/atkinson_6-11_radial_distortion_zoom_lenses.jpg
59. Radial distortion example
60. Some Typical Calibration Algorithms
Tsai calibration
Reg Willson's implementation: http://www-2.cs.cmu.edu/rgw/TsaiCode.html
Zhang's calibration
Z. Zhang. A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11):1330-1334, 2000.
Z. Zhang. Flexible Camera Calibration By Viewing a Plane From Unknown Orientations. International Conference on Computer Vision (ICCV'99), Corfu, Greece, pages 666-673, September 1999.
http://research.microsoft.com/zhang/calib/
Jean-Yves Bouguet's Matlab implementation: http://www.vision.caltech.edu/bouguetj/calib_doc/
61. Recovery of World Position
- Given u, v we cannot uniquely determine the position of the point in the world.
- Each observed image point (u, v) gives us two equations in three unknowns (X, Y, Z). These equations define a line (i.e., a ray) in space, on which the world point must lie.
- For general 3-D scene interpretation, we need to use more than one view. Later in this course we will take a detailed look at stereo vision and structure from motion.
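A sketch of this back-projection, assuming the simple calibrated form where the ray direction is proportional to K^-1 (u, v, 1)^T (the intrinsics K below are placeholders): the observed pixel fixes only a ray, so every depth λ along it is consistent with the measurement.

```python
import numpy as np

K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])   # assumed intrinsics

def backproject(u, v):
    """Unit direction (in camera coordinates) of the ray through pixel (u, v)."""
    d = np.linalg.inv(K) @ np.array([u, v, 1.0])
    return d / np.linalg.norm(d)

ray = backproject(400.0, 280.0)
# Any point lambda * ray (lambda > 0) projects to the same pixel, so one view
# cannot recover depth; a second view (stereo) or camera motion is needed.
for lam in (1.0, 2.0, 10.0):
    X = lam * ray
    x = K @ X
    print(x[:2] / x[2])            # -> [400. 280.] for every lambda
```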