Title: Structure and Motion from Line Segments in Multiple Images
1Structure and Motion from Line Segments in
Multiple Images
- Camillo J. Taylor, David J. Kriegman
Presented by David Lariviere
2Primary Goal
- Given a series of images with known corresponding
line segments, calculate the relative locations
of the cameras imaging the scene and the
three-dimensional locations of the line segments.
3Some Previous Work
- (1981) Longuet-Higgins. A computer algorithm for
reconstructing a scene from two projections.
- (1990) Vieville. Estimation of 3D-motion and
structure from tracking 2D-lines in a sequence of
images.
- (1992) Tomasi, Kanade. Shape and motion from
image streams under orthography.
4Problem Characterization
- Instead of using generalized scenes and points,
focus on rigid scenes with clear edges as
features.
- Advantages of lines as features:
- Occur frequently in man-made environments.
- Easily located and tracked.
- More accurately localized than points because
many pixels along an edge corroborate its
position.
5Algorithm Overview
- Determine a non-linear objective function whose
minimization leads to an estimate of scene
structure.
- In this case, estimate 3D camera
locations/orientations and locations of line
segments in 3D, and then reproject the lines onto
the estimated image planes.
- The difference between the predicted projected
lines and the actually observed lines is the
error function to minimize.
6Objective Function
- p_i: the ith 3D line
- q_j: the jth camera position/orientation
- u_ij: observed edge i in image j
- m: number of images
- n: number of lines
- F: reprojection of line p_i onto the image plane
of camera q_j
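Assembled from the symbols defined on this slide, the objective plausibly takes the form below. Error(·,·) is an assumed name for the per-edge disparity measure described under Reprojection Error; the double sum simply runs over every line in every image.

```latex
O = \sum_{j=1}^{m} \sum_{i=1}^{n} \mathrm{Error}\bigl(F(p_i, q_j),\; u_{ij}\bigr)
```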
7Notation: Line Representation
- Represent a line in 3D space by (v, d)
- v: unit vector pointing in the direction of the
line
- d: vector from the origin to the closest point on
the line
- m: normal vector of the plane defined by the
camera center and the line
- Edge in image plane defined by
m_x x + m_y y + m_z = 0
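A minimal sketch of this representation, assuming the line is already expressed in the camera frame, the camera center is the origin, and the focal length is 1. The function name `plane_normal` and the numeric values are illustrative only.

```python
import numpy as np

def plane_normal(v, d):
    """Normal m of the plane through the camera center (origin) and the line (v, d)."""
    return np.cross(v, d)

# Hypothetical line expressed in the camera frame.
v = np.array([1.0, 0.0, 0.0])   # unit direction of the line
d = np.array([0.0, 1.0, 4.0])   # closest point on the line to the origin (d . v = 0)
m = plane_normal(v, d)

# Any 3D point on the line projects to the image point (X/Z, Y/Z, 1), assuming f = 1.
point = d + 2.5 * v
image_point = point / point[2]

# The projected point satisfies the edge equation m_x*x + m_y*y + m_z = 0.
residual = float(m @ image_point)
```

Because every point of the line lies in the plane with normal m, its projection lies on the image edge, so `residual` vanishes for any choice of the offset along `v`.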
8Notation: Reference Frames
- Relate location/orientation of each camera to
some world base frame.
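The change of frame for a line can be sketched as follows. The convention is an assumption: t is the camera position in world coordinates and R rotates world coordinates into the camera frame, so a point maps as x_c = R (x_w - t). The function name `line_to_camera` is hypothetical.

```python
import numpy as np

def line_to_camera(R, t, v_w, d_w):
    """Express a world-frame line (v_w, d_w) in a camera frame (R, t)."""
    v_c = R @ v_w
    # Closest point on the line to the new origin t, then rotated into the camera frame.
    d_c = R @ (d_w - t + (t @ v_w) * v_w)
    return v_c, d_c

# Hypothetical camera: rotated 90 degrees about the z-axis, positioned at (1, 2, 3).
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])

# Hypothetical vertical line passing through (2, 0, 0).
v_w = np.array([0.0, 0.0, 1.0])
d_w = np.array([2.0, 0.0, 0.0])

v_c, d_c = line_to_camera(R, t, v_w, d_w)
```

The transformed pair keeps the defining property of the representation: `d_c` is still perpendicular to `v_c`.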
9Summary of Parameters
- Camera Location (t_j): 3 DOF
- Camera Orientation (R_j): 3 DOF
- Line Location/Orientation (v, d): 4 DOF
- Requires at least 6 edge correspondences in 3
images.
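A counting argument consistent with the minimum stated above, under three assumptions: the first camera fixes the world frame, the overall scale is unobservable, and each observed edge contributes two independent measurements. The count is a necessary condition only, not a proof of uniqueness.

```python
def free_parameters(num_images, num_lines):
    """Unknowns: 6 DOF per camera after the first, 4 DOF per line, minus 1 for global scale."""
    return 6 * (num_images - 1) + 4 * num_lines - 1

def measurements(num_images, num_lines):
    """Each observed image edge is assumed to constrain 2 DOF."""
    return 2 * num_images * num_lines

unknowns_3x6 = free_parameters(3, 6)      # 35
observed_3x6 = measurements(3, 6)         # 36
enough_3x6 = observed_3x6 >= unknowns_3x6
enough_3x5 = measurements(3, 5) >= free_parameters(3, 5)
```

With 3 images, 6 lines give 36 measurements against 35 unknowns, while 5 lines give only 30 against 31. With 2 images the measurements (4n) never reach the unknowns (4n + 5), whatever the number of lines.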
10Reprojection Error
- Visible endpoints (x1,y1) (x2,y2)
- Calculate minimal distance between observed and
predicted lines for every point, integrated over
the interval between endpoints.
- Normalize error by dividing by length of observed
edge.
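One way to evaluate this measure, sketched under the assumption that the integrand is the squared perpendicular distance to the predicted line m_x x + m_y y + m_z = 0. The distance varies linearly along the observed segment, so the integral has a closed form in the two endpoint distances. The function name `edge_error` is hypothetical.

```python
import numpy as np

def edge_error(m, p1, p2, normalize=True):
    """Integrated squared distance from the observed segment p1-p2 to the predicted line m."""
    mx, my, mz = m
    scale = np.hypot(mx, my)
    h1 = (mx * p1[0] + my * p1[1] + mz) / scale   # signed distance of endpoint 1
    h2 = (mx * p2[0] + my * p2[1] + mz) / scale   # signed distance of endpoint 2
    length = np.hypot(p2[0] - p1[0], p2[1] - p1[1])
    # Integral of (h1 + (h2 - h1) * s / length)^2 for s from 0 to length.
    total = length / 3.0 * (h1**2 + h1 * h2 + h2**2)
    return total / length if normalize else total

predicted = (0.0, 1.0, 0.0)                               # the line y = 0
parallel = edge_error(predicted, (0.0, 1.0), (2.0, 1.0))  # segment one unit above the line
raw = edge_error(predicted, (0.0, 1.0), (2.0, 1.0), normalize=False)
on_line = edge_error(predicted, (1.0, 0.0), (5.0, 0.0))   # segment lying on the line
```

For the parallel segment every point is at distance 1, so the raw integral equals the segment length (2) and the normalized error is 1.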
11Algorithm
- Primary algorithm for minimizing the non-linear
function: minimize line reprojection error
through gradient descent to find a local minimum
- Randomly generate initial values.
- Iteratively follow the function along steepest
descent to reach a local minimum.
- If the local minimum error is below a certain
threshold, accept.
- Else, generate new initial values and try again.
- The quality of the initial values heavily
influences the number of iterations required
before the function converges.
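The loop above can be sketched on a toy problem. Everything here is illustrative: the one-dimensional objective stands in for the reprojection error, and the fixed step size, iteration count, and function names are assumptions rather than the authors' settings.

```python
import numpy as np

def gradient_descent(f, grad, x0, step=0.01, iters=2000):
    """Follow the negative gradient from x0 with a fixed step size."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - step * grad(x)
    return x, f(x)

def minimize_with_restarts(f, grad, sample_start, threshold, max_restarts=50, seed=0):
    """Restart from random initial values until a local minimum is below the threshold."""
    rng = np.random.default_rng(seed)
    for attempt in range(1, max_restarts + 1):
        x, value = gradient_descent(f, grad, sample_start(rng))
        if value < threshold:
            return x, value, attempt
    raise RuntimeError("no acceptable minimum found")

# Toy objective: global minimum at x = 1 (error 0), poor local minimum near x = -0.82.
def objective(x):
    return (x**2 - 1.0) ** 2 + 0.3 * (x - 1.0) ** 2

def objective_grad(x):
    return 4.0 * x * (x**2 - 1.0) + 0.6 * (x - 1.0)

x_best, error, attempts = minimize_with_restarts(
    objective, objective_grad,
    sample_start=lambda rng: rng.uniform(-2.0, 2.0),
    threshold=1e-6,
)
```

Starts that fall into the left basin converge to an error of about 1.1 and are rejected by the threshold, which is why the quality of the initial values matters so much for running time.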
12Initial Value Estimation
- In order to decrease computational cost,
additional steps are added to acquire acceptable
starting values for gradient descent:
(1) User inputs a range for the camera orientations
(R_j), and values of R_j within that range are
randomly chosen.
(2) Holding the estimates from (1) constant,
estimate v_i subject to a constraint equation.
(3) Improve the estimate from (2) by minimizing the
same constraint equation with both v_i and R_j as
free parameters.
(4) Generate initial estimates of d_i and t_j, using
a second constraint equation.
(5) Provide the estimates from (3) and (4) as
starting values for gradient descent.
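Step (2) can be sketched as a small linear problem. This assumes the first constraint has the form m_ij · (R_j v_i) = 0, where m_ij is the plane normal measured for line i in image j, and it uses an SVD to find the unit vector that best satisfies all such constraints. The SVD is a choice made for this sketch, not necessarily the authors' procedure, and all names and values are hypothetical.

```python
import numpy as np

def estimate_direction(rotations, normals):
    """Unit vector v minimizing the sum over images of (m_j . (R_j v))^2."""
    rows = np.stack([R.T @ m for R, m in zip(rotations, normals)])
    _, _, vt = np.linalg.svd(rows)
    return vt[-1]   # right singular vector with the smallest singular value

def rot_y(angle):
    """Rotation about the y-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

# Synthetic ground truth: one line seen by three cameras.
v_true = np.array([1.0, 2.0, 2.0]) / 3.0
d_true = np.array([0.0, 1.0, -1.0])
rotations = [rot_y(0.0), rot_y(0.3), rot_y(-0.4)]
positions = [np.zeros(3), np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.5])]

# Noise-free normals in each camera frame: m = R (v x (d - t)).
normals = [R @ np.cross(v_true, d_true - t) for R, t in zip(rotations, positions)]

v_est = estimate_direction(rotations, normals)
alignment = abs(float(v_est @ v_true))   # 1.0 when v_est equals v_true up to sign
```

The direction is recovered only up to sign, which is why the check compares the absolute value of the dot product.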
13Constraint Equations
- From the defined relations
- One can derive
- Which provides two constraint equations
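A plausible reconstruction of the relations this slide refers to, built from the notation of the earlier slides. It assumes that t_j is the camera position in the world frame and that R_j rotates world coordinates into camera frame j; m_ij denotes the normal measured for line i in image j, defined up to scale.

```latex
% Defined relations (line i expressed in camera frame j)
v_i^{c} = R_j\, v_i, \qquad
d_i^{c} = R_j \bigl( d_i - t_j + (t_j \cdot v_i)\, v_i \bigr), \qquad
m_{ij} = v_i^{c} \times d_i^{c}

% Derived form
m_{ij} = R_j \bigl( v_i \times (d_i - t_j) \bigr)

% Two constraint equations
m_{ij}^{\top} R_j\, v_i = 0, \qquad
m_{ij}^{\top} R_j\, (d_i - t_j) = 0
```

The first constraint involves only orientations and line directions, which matches its use in steps (2) and (3) of the initial value estimation; the second brings in d_i and t_j, as in step (4).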
14Results
- Simulation Results
- Measuring tolerance to noise, the returns from
increasing the number of images/features, and the
rate of convergence of the global minimization.
- Comparing the proposed method to previous linear
methods.
- Real-world Results
15Simulation Results
- Main Results
- The algorithm is much more sensitive to errors in
edge endpoints than to errors in the calibrated
camera center.
- Holding maximum baseline constant, increasing the
number of images beyond 6 or the number of lines
beyond 50 does not improve accuracy.
- A small number of large-baseline images is
superior to many small-baseline images.
- Rate of convergence of the global descent
minimization algorithm is highly dependent on the
initial range of theta.
16Simulation Results Continued
17Comparison to Linear Method
- This method is significantly less sensitive to
noise than the leading linear algorithm [1]
[1] J. Weng, Y. Liu, T. S. Huang, and N. Ahuja,
Estimating motion/structure from line
correspondences
18Real-world Results
19Real-world Results
20Real-world Results: Hallway
21Discussion
- Initial estimation optimizations improve
calculation speed.
- Algorithm is significantly less sensitive to
noise than the linear method it was compared
against.
- Future improvements
- Automate edge correspondence tracking by using
video.
- Impose edge-intersection and other geometric
restrictions (coplanarity, parallelism, etc.).
22Modeling and Rendering Architecture from
Photographs: A Hybrid Geometry- and Image-Based
Approach
- Paul E. Debevec, Camillo J. Taylor, Jitendra Malik
23Overview
- Apply the previous paper's methods to modeling
architectural scenes with restricted geometry.
- Utilize model-based stereo to extract precise
geometry from a sparse set of large-baseline
photographs.
- Utilize 3D models and view-dependent photographs
to construct photorealistic computer-generated
views.
24Architectural Models: Blocks
- User starts by choosing geometric primitives
(blocks) to represent the basic geometry of the
building.
- Block: hierarchical model of a parametric
polyhedral primitive.
- Parameterized by a base vertex Po and various
other properties (width, height, length, etc.).
25Block Relations
- A hierarchy of blocks is used to describe the
various geometric primitives that make up the
basic architecture.
- User manually maps corresponding edges in images
to the edges of the blocks.
- Blocks are related by constraints on their
relations in terms of location and orientation.
- For example, ensure that the bottom of one block
sits on the top of another block.
- Values of blocks are stored symbolically, meaning
that if one specifies a series of blocks to be
parallel, then only one variable is used to
enforce this restriction across all blocks.
- g_i(X): rigid transformation mapping one block to
an adjacent block
- P_w(X): block vertex in world coordinates
- v_w(X): line orientation in world coordinates
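The hierarchy can be sketched as a chain of rigid transformations. This is only an illustration of the idea: the `Block` class, its fields, and the example building are hypothetical, and the symbolic parameter sharing is reduced here to reusing one Python variable.

```python
import numpy as np
from dataclasses import dataclass
from typing import Optional

@dataclass
class Block:
    rotation: np.ndarray             # orientation relative to the parent block
    translation: np.ndarray          # position relative to the parent block
    parent: Optional["Block"] = None

    def to_world(self, point):
        """P_w: apply this block's transform, then each ancestor's in turn."""
        p = self.rotation @ point + self.translation
        return p if self.parent is None else self.parent.to_world(p)

    def direction_to_world(self, v):
        """v_w: directions are rotated but never translated."""
        w = self.rotation @ v
        return w if self.parent is None else self.parent.direction_to_world(w)

identity = np.eye(3)
quarter_turn = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])

base_height = 3.0   # one shared value: the roof sits on top of the base
base = Block(identity, np.zeros(3))
roof = Block(identity, np.array([0.0, 0.0, base_height]), parent=base)
tower = Block(quarter_turn, np.array([0.0, 0.0, 1.0]), parent=roof)

roof_vertex_world = roof.to_world(np.array([1.0, 1.0, 0.0]))
tower_vertex_world = tower.to_world(np.array([1.0, 0.0, 0.0]))
tower_edge_world = tower.direction_to_world(np.array([1.0, 0.0, 0.0]))
```

Changing `base_height` moves the roof and the tower together, which is the practical benefit of expressing each block relative to its parent.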
26Block Relations Continued
27Advantages of Blocks
- Model most architectural scenes well.
- Implicitly contain features commonly found in
architecture (e.g. parallel edges, right angles).
- Manipulation by the user is easier due to the
reduced number of parameters.
- Surfaces are pre-defined by the model, removing
the need to calculate them from edges.
- The number of parameters is greatly reduced when
performing minimization of the cost function.
28Single Image Examples
29Estimation of 3D Structure
- Very similar to the previous paper: estimate the
parameters of the cameras (R, t) and edges (v, d)
that minimize the reprojection error.
- Differences:
- Many edges are defined in relation to one
another, meaning fewer variables.
- Apply horizontal/vertical constraints on v_i to
more accurately estimate R_j.
- Instead of using gradient descent, the authors use
the Newton-Raphson method to minimize the
non-linear error function.
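A generic Newton-Raphson minimization step, sketched on a toy objective that stands in for the reprojection error. The function `newton_minimize`, the tolerance, and the example objective are assumptions for illustration; nothing here reproduces the authors' implementation.

```python
import numpy as np

def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=50):
    """Iterate x <- x - H(x)^-1 g(x) until the gradient norm falls below tol."""
    x = np.asarray(x0, dtype=float)
    for iteration in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            return x, iteration
        x = x - np.linalg.solve(hess(x), g)
    return x, max_iter

# Toy objective f(x, y) = (x - 1)^4 + (x - 1)^2 + (y + 2)^2, minimized at (1, -2).
def grad(p):
    u = p[0] - 1.0
    return np.array([4.0 * u**3 + 2.0 * u, 2.0 * (p[1] + 2.0)])

def hess(p):
    u = p[0] - 1.0
    return np.array([[12.0 * u**2 + 2.0, 0.0], [0.0, 2.0]])

x_min, iterations = newton_minimize(grad, hess, x0=[3.0, 0.0])
```

Using second-derivative information lets the iteration converge in a handful of steps near the minimum, at the cost of forming and solving with the Hessian at each step.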
30View-Dependent Texture Mapping
- Once camera and edge locations/orientations are
known, project images onto block models.
- If multiple images of the same area exist, apply
weighted averaging to fuse the images.
- Weights are inversely proportional to the
difference in angle between the virtual view
being synthesized and the camera
location/orientation which took the particular
image.
- Possible to divide planes into faces, and only
calculate the weighted average for one value and
apply it to the entire face.
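The weighting rule can be sketched as follows, assuming each view is summarized by a viewing direction toward the surface point. The function name `blend_weights`, the small `eps` guard against division by zero, and the sample colors are all hypothetical.

```python
import numpy as np

def blend_weights(virtual_dir, camera_dirs, eps=1e-6):
    """Weights inversely proportional to the angle between virtual and camera views."""
    virtual_dir = virtual_dir / np.linalg.norm(virtual_dir)
    weights = []
    for c in camera_dirs:
        cos_angle = np.clip(virtual_dir @ (c / np.linalg.norm(c)), -1.0, 1.0)
        weights.append(1.0 / (np.arccos(cos_angle) + eps))
    weights = np.array(weights)
    return weights / weights.sum()   # normalize so the weights sum to 1

def direction(deg):
    """Viewing direction tilted by the given angle away from the z-axis."""
    a = np.radians(deg)
    return np.array([np.sin(a), 0.0, np.cos(a)])

virtual = direction(0.0)
cameras = [direction(10.0), direction(30.0)]       # 10 and 30 degrees off the virtual view
weights = blend_weights(virtual, cameras)

# Colors the two photographs give for the same surface point (RGB).
colors = np.array([[200.0, 100.0, 50.0], [100.0, 100.0, 150.0]])
blended = weights @ colors
```

The camera 10 degrees away receives three times the weight of the camera 30 degrees away, so the blended color stays close to the nearer photograph. Computing one weight set per face, as the last bullet suggests, amounts to calling `blend_weights` once per face rather than once per pixel.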
31Example of Texture-Mapping
32Model-based Stereopsis
- Use known scene geometry and camera locations to
rectify large-baseline images before performing
stereo.
- Allows for the avoidance of foreshortening
problems, which can be very large when images are
taken far apart.
- Maintain the epipolar constraint by projecting
the offset image onto the model and then
reprojecting onto the key image's image plane to
create a rectified image for use in stereopsis.
33Model-based Stereopsis Example
34Discussion
- For architectural scenes that generally fit the
allowed geometric primitives, approach works
quite well.
- Future Possible Improvements
- Additional models: surfaces of revolution
- Estimate BRDF
- Devise method of selecting best images to use for
rendering of novel views.
35Questions?