Title: Volumetric Scene Reconstruction: Past, Present, and Future
1Volumetric Scene Reconstruction: Past, Present, and Future
- A spooky presentation by
- Greg Slabaugh
- October 27, 2000
2Motivation
3D reconstruction
Input: Calibrated images / video
Output: Photo-consistent voxels
3Motivation
- Why is this useful?
- Generating photo-realistic 3D models is difficult and tedious with traditional CAD-based tools
- In contrast, taking photographs is simple
- Digital imaging becoming ubiquitous
- Many applications
Interactive television
Digital modeling
E-commerce
Robot navigation
Special effects
Motion tracking and capture
Visual inspection
Virtual and augmented reality
4Outline
- Taking and Calibrating Photographs
- Camera calibration
- Multi-view ray intersection
- Photo Hulls
- Standard Approaches to Reconstruction
- Reconstructing Extended Environments
- Visual Hulls
- Object-based Visual Hulls
- Image-based Visual Hulls
- Future Directions
5Taking and Calibrating Photos
- Each surface visible in at least two images
- No restriction on camera placement
- No restriction on camera orientation
6Taking and Calibrating Photos
- Calibration
- 3D to 2D: Tells us how points in 3D space project into the image plane
- 2D to 3D: Allows us to form rays that shoot into 3D space through a pixel location
- Gives extrinsic (rotation, translation) and intrinsic (focal length, radial lens distortion, etc.) parameters
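The two directions of calibration can be sketched with a simple pinhole model. This is a minimal illustration, not the calibration code behind these slides: it ignores lens distortion, assumes the convention x_cam = R X + t (so the camera center is -R^T t), and the matrix K and all numeric values are hypothetical.

```python
import numpy as np

def project(K, R, t, X):
    """3D to 2D: map a world point X to pixel coordinates (u, v)."""
    x_cam = R @ X + t            # world -> camera coordinates (extrinsic)
    x_img = K @ x_cam            # camera -> homogeneous image coordinates (intrinsic)
    return x_img[:2] / x_img[2]  # perspective divide

def pixel_ray(K, R, t, uv):
    """2D to 3D: return (origin, unit direction) of the world ray through pixel uv."""
    origin = -R.T @ t                                           # camera center in world coords
    d_cam = np.linalg.solve(K, np.array([uv[0], uv[1], 1.0]))   # ray direction, camera frame
    d_world = R.T @ d_cam                                       # rotate into the world frame
    return origin, d_world / np.linalg.norm(d_world)

# Hypothetical camera: focal length 500 pixels, principal point (250, 250),
# no rotation, world origin 5 units in front of the camera.
K = np.array([[500.0, 0.0, 250.0],
              [0.0, 500.0, 250.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.0, 0.0, 5.0])

X = np.array([0.2, -0.1, 1.0])
uv = project(K, R, t, X)
origin, direction = pixel_ray(K, R, t, uv)
```

The ray returned for the projected pixel passes back through the original 3D point, which is the property that multi-view ray intersection relies on.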
7Taking and Calibrating Photos
Glyphs: Known 3D position
8Taking and Calibrating Photos
Glyph Extraction
3D World Coord
2D Screen Coord
Tsai/Wilson Camera Calibration
Transformation Matrix P
9Taking and Calibrating Photos
- Experience
- Planar calibration results in great calibration on the plane, but not off the plane
- It is important to calibrate over 3D space
- Problem
- Need to calibrate non-planar points, but we only know the 3D world coordinates of points on the plane
- How do we find world coordinates of 3D points off the plane?
- Solution
- Multi-view Ray Intersection (Triangulation)
10Taking and Calibrating Photos
Due to correspondence and camera calibration
inaccuracy, rays do not exactly intersect in 3D
correspondence
11Taking and Calibrating Photos
Q: How should this be done when you have N cameras?
A: Find the point that is closest to all N rays
12Taking and Calibrating Photos
Least squares problem: Can be minimized analytically, resulting in a linear solution
Solved by inverting a 3x3 matrix
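One common way to pose this least squares problem is sketched below. It is an assumption, not necessarily the formulation from the original slide: each ray i has origin c_i and unit direction d_i, and the point p minimizing the summed squared distances to all rays solves the 3x3 system (sum_i (I - d_i d_i^T)) p = sum_i (I - d_i d_i^T) c_i. The function name is hypothetical.

```python
import numpy as np

def closest_point_to_rays(origins, directions):
    """Least squares point closest to N rays (rows of origins and directions)."""
    origins = np.asarray(origins, dtype=float)
    directions = np.asarray(directions, dtype=float)
    d = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    # I - d d^T projects onto the plane orthogonal to each ray direction
    M = np.eye(3)[None, :, :] - d[:, :, None] * d[:, None, :]
    A = M.sum(axis=0)                        # 3x3 matrix
    b = np.einsum('nij,nj->i', M, origins)   # 3-vector
    return np.linalg.solve(A, b)             # fails if all rays are parallel

# Three synthetic rays that all pass through the same point
target = np.array([1.0, 2.0, 3.0])
origins = np.array([[0.0, 0.0, 0.0],
                    [5.0, 0.0, 0.0],
                    [0.0, 5.0, 1.0]])
point = closest_point_to_rays(origins, target - origins)
```

With noisy correspondences the rays no longer meet, and the same solve returns the point that balances the distances to all of them.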
13Taking and Calibrating Photos
Calibration procedure
Extract planar glyphs
Planar calibration
Extract non-planar features in images
Multi-view ray intersection to find 3D
Non-planar calibration
14Photo Hulls
- Description
- A photo hull is a 3D reconstruction of a scene computed by matching colors across images.
- Approaches
- Voxel Coloring
- Space Carving
- Generalized Voxel Coloring
- Multi-Hypothesis Algorithms
15Photo Hulls
Photo-consistency: A voxel is said to be photo-consistent if all cameras that can see the voxel agree on its color
red
red
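A minimal sketch of one possible photo-consistency test follows. It assumes the colors seen by the cameras have already been gathered, and it treats "agree" as "per-channel standard deviation below a threshold". The function name and the threshold value are hypothetical choices, not taken from the slides.

```python
import numpy as np

def is_photo_consistent(colors, threshold=20.0):
    """colors: one RGB row per camera that can see the voxel."""
    colors = np.asarray(colors, dtype=float)
    if len(colors) < 2:
        return True  # fewer than two views: nothing to disagree with
    # Consistent when every color channel varies little across the views
    return bool(np.all(colors.std(axis=0) <= threshold))

red_red = is_photo_consistent([[250, 5, 5], [245, 10, 0]])
green_red = is_photo_consistent([[0, 255, 0], [255, 0, 0]])
```

Two cameras that both see red agree, while one green and one red observation do not.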
16Photo Hulls
Key idea 1: If a voxel is photo-consistent, it is likely to be on the surface
red
red
17Photo Hulls
Key idea 2: If a voxel is not photo-consistent, it is unlikely to be on the surface.
green
red
18Photo Hulls
- Reconstruction methodology
- Break up bounding volume into voxels
- Determine photo-consistency of visible voxels
- Remove (carve) voxels that are not photo-consistent
- Color voxels that are photo-consistent
- Loop until no more voxels can be carved
- Can process in a coarse-to-fine fashion
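The carving loop above can be sketched on a boolean occupancy grid. This is a simplified illustration under loud assumptions: true visibility from the cameras is replaced by "has an empty 6-neighbor", the photo-consistency test is a caller-supplied hypothetical callback `is_consistent`, and coloring and coarse-to-fine processing are omitted.

```python
import numpy as np

def surface_mask(occ):
    """Occupied voxels with at least one empty (or out-of-grid) 6-neighbor."""
    padded = np.pad(occ, 1, constant_values=False)
    core = padded[1:-1, 1:-1, 1:-1]
    all_neighbors_full = (
        padded[:-2, 1:-1, 1:-1] & padded[2:, 1:-1, 1:-1] &
        padded[1:-1, :-2, 1:-1] & padded[1:-1, 2:, 1:-1] &
        padded[1:-1, 1:-1, :-2] & padded[1:-1, 1:-1, 2:]
    )
    return core & ~all_neighbors_full

def carve(occ, is_consistent):
    """Remove inconsistent surface voxels until a full sweep carves nothing."""
    occ = occ.copy()
    while True:
        carved = False
        for idx in zip(*np.nonzero(surface_mask(occ))):
            if not is_consistent(idx, occ):
                occ[idx] = False   # carve this voxel
                carved = True
        if not carved:
            return occ

# Toy example: pretend only the central 3x3x3 block is photo-consistent
truth = np.zeros((5, 5, 5), dtype=bool)
truth[1:4, 1:4, 1:4] = True
result = carve(np.ones((5, 5, 5), dtype=bool), lambda idx, occ: bool(truth[idx]))
```

In the toy run the outer shell is carved in the first sweep, the second sweep finds only consistent surface voxels, and the loop stops.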
19Photo Hulls
Example: Ghiradelli data set reconstructed with GVC
17 Photos at 768 x 576 resolution
20Photo Hulls
Example: Ghiradelli data set reconstructed with GVC
Result of camera calibration
21Photo Hulls
Example: Ghiradelli data set reconstructed with GVC
Multi-Res Reconstruction
Final Volume: 168 x 104 x 256 Voxels
22Photo Hulls
Example: Ghiradelli data set reconstructed with GVC
New View Synthesis
23Photo Hulls
- Reconstructing Extended Environments
- Voxel coloring methods are quite successful at reconstructing small-scale scenes. What about large-scale scenes?
- Application to large-scale scenes is challenging
- Large reconstruction volume - many voxels
- Far away surfaces do not require high resolution
- Ideally, spatially adaptive voxel size
Only surfaces located within the reconstruction
volume are reconstructed
24Photo Hulls
Goal: To represent an infinite (or semi-infinite) volume with a finite number of voxels. Accomplished by warping the voxel space.
Exterior space
Interior space
Pre-warped voxel space
25Photo Hulls
- Warping must satisfy the criteria that
- No voxels overlap
- No gaps exist between voxels
3D lattice topology
Warped voxel space
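To illustrate the idea only, the sketch below warps distance from the center of the voxel space. It is not the warping function used in the presentation: the function name, the radii, and the mapping s / (1 - s) are all assumptions. The interior is left unchanged, and the finite exterior shell is stretched out to infinity by a continuous, strictly increasing function, so warped voxels neither overlap nor leave gaps.

```python
import numpy as np

def warp_radius(r, r_int=1.0, r_ext=2.0):
    """Map pre-warped radius r in [0, r_ext) to warped radius in [0, inf)."""
    r = np.atleast_1d(np.asarray(r, dtype=float))
    out = r.copy()                            # interior space: identity
    ext = r > r_int                           # exterior space: stretched
    s = (r[ext] - r_int) / (r_ext - r_int)    # normalized depth into the shell, in (0, 1)
    out[ext] = r_int + (r_ext - r_int) * s / (1.0 - s)
    return out

warped = warp_radius([0.5, 1.0, 1.5])
```

Voxels near the outer boundary of the pre-warped space become very large after warping, which matches the observation that far away surfaces do not need high resolution.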
26Photo Hulls
- Stanford scene
- 10 panoramic (360 degree FOV) photographic images
- Resolution 2502 x 884 pixels
- Reconstructed in a 300 x 300 x 200 voxel space
- Inner 200 x 200 x 100 voxels in interior space
- Reconstructed using modified GVC
Source image
27Photo Hulls
- Stanford scene
Reconstruction
28Visual Hulls
- Description
- A visual hull is a 3D reconstruction of a scene computed using binary images. The visual hull contains (is a superset of) the photo hull.
- Consequences of throwing away color information
- Cannot reconstruct all concave surfaces
- Less accurate 3D model
- Simpler (and faster) reconstruction
- Approaches
- Object-based Visual Hulls
- Image-based Visual Hulls
29Visual Hulls
- Szeliski 93
- Multi-resolution approach
49Visual Hulls
- Szeliski 93
- Multi-resolution approach
- Voxels that project to only background pixels in any image are carved
- Voxels that project to only foreground pixels in all images remain
- Ambiguous voxels are subdivided
No subdivision at final resolution: ambiguous voxels remain in voxel space at final resolution
Resolution 4 complete
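The three rules above can be sketched as a classification step. This assumes the silhouette pixels covered by a voxel's projection have already been collected per image; the function name and the `footprints` input format are hypothetical and are not taken from Szeliski's implementation.

```python
import numpy as np

def classify_voxel(footprints):
    """footprints: one boolean array per image (True = foreground pixel)."""
    # Only background pixels in ANY image: the voxel is carved
    if any(not np.any(fp) for fp in footprints):
        return "carve"
    # Only foreground pixels in ALL images: the voxel remains
    if all(np.all(fp) for fp in footprints):
        return "keep"
    # Otherwise the voxel is ambiguous and is subdivided
    return "subdivide"

inside = classify_voxel([np.ones((4, 4), bool), np.ones((3, 5), bool)])
outside = classify_voxel([np.ones((4, 4), bool), np.zeros((3, 5), bool)])
boundary = classify_voxel([np.ones((4, 4), bool), np.array([[True, False]])])
```

At the final resolution a "subdivide" result would simply leave the voxel in place instead of splitting it.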
50Visual Hulls
- Image-based Visual Hulls
- Most of the work occurs in 2D instead of 3D
- System developed at MIT, published in SIGGRAPH 2000
- Capable of realtime (15 fps) reconstruction of time-varying scenes
51Future
- Future Directions
- Reconstructing more of the plenoptic function, a 7D function that completely characterizes the flow of light at all points in space, in all directions, at all times, and at all wavelengths.
- Reconstructing time-varying scenes
- Reconstructing non-Lambertian scenes
- Reconstructing BRDF models, light sources
- New applications: Surface tracking, 4D video coding (3D reconstruction + time), 3D ATR
- Further automation: Self-calibration
- Better camera placement (3D sampling) methodologies
- Different imaging modalities
52Future
53Future
System Diagram
Server
100 Mbps Ethernet
Genlocked Firewire Cameras 400 Mbps,
DMA (640x480x30x422)
Internet 2
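A rough per-camera data rate can be worked out from the figures in the diagram. The interpretation is an assumption: "640x480x30x422" is read here as 640 x 480 pixels at 30 frames per second in YUV 4:2:2, which averages 16 bits per pixel. Protocol overhead is ignored.

```python
width, height, fps = 640, 480, 30
bits_per_pixel = 16  # assumption: "422" means YUV 4:2:2 (16 bits per pixel on average)

camera_mbps = width * height * fps * bits_per_pixel / 1e6  # raw video rate per camera
firewire_mbps = 400   # from the diagram
ethernet_mbps = 100   # from the diagram

fits_on_firewire = camera_mbps < firewire_mbps
fits_on_ethernet = camera_mbps < ethernet_mbps
```

Under these assumptions one raw camera stream is about 147 Mbps, which fits on the Firewire link but not on the 100 Mbps Ethernet link.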
54Future
- How can you use such a system??
- The end
55References
Seitz and Dyer, Photo-realistic Scene Reconstruction by Voxel Coloring, IJCV, Feb. 1999.
Kutulakos and Seitz, A Theory of Shape by Space Carving, Proc. ICCV, 1999.
Culbertson, Malzbender, and Slabaugh, Generalized Voxel Coloring, LNCS Vision Algorithms: Theory and Practice, 1999.
Eisert, Steinbach, and Girod, Multi-Hypothesis, Volumetric Reconstruction of 3-D Objects From Multiple Calibrated Camera Views, Proc. ICASSP, 1999.
Slabaugh, Culbertson, Malzbender, Volumetric Warping for Voxel Coloring on an Infinite Domain, Proc. SMILE, 2000.
Szeliski, Rapid Octree Construction from Image Sequences, CVGIP, July 1993.
Matusik, Buehler, Raskar, McMillan, and Gortler, Image-based Visual Hulls, Proc. SIGGRAPH, 2000.
56Taking and Calibrating Photos
Camera parameters
- Extrinsic
- Location of camera center, T
- Orientation of camera, R
- Intrinsic
- Focal length, f
- Size of a pixel, each dimension, sx sy
- Principal point, Cx Cy
- Skew, ?
- Radial lens distortion, κ1 κ2
Camera calibration methods determine these
parameters
57Taking and Calibrating Photos
- World to camera coordinates (extrinsic matrix)
- Camera to image coordinates (intrinsic matrix)
- Perspective divide
- Account for radial lens distortion
58Taking and Calibrating Photos
- Radial lens distortion occurs from inexpensive optics
- If κ is positive, corners warped towards center of image
- Rule of thumb: About 5 pixels of distortion in a 500 x 500 image
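A minimal sketch of a one-coefficient radial model is given below. It assumes the common form x_u = c + (x_d - c)(1 + κ1 r_d^2), written in pixel units rather than the sensor units a calibration package would use. The function name and the value of κ1 are hypothetical, with κ1 chosen so that the corner shift is close to the rule of thumb.

```python
import numpy as np

def undistort(xy_d, center, kappa1):
    """Map a distorted pixel position to its undistorted position."""
    xy_d = np.asarray(xy_d, dtype=float)
    center = np.asarray(center, dtype=float)
    offset = xy_d - center
    r2 = float(np.sum(offset ** 2))          # squared distorted radius
    return center + offset * (1.0 + kappa1 * r2)

center = np.array([250.0, 250.0])            # middle of a 500 x 500 image
corner_d = np.array([0.0, 0.0])              # distorted corner pixel
corner_u = undistort(corner_d, center, kappa1=1.13e-7)
shift = np.linalg.norm(corner_u - corner_d)  # distortion at the corner, in pixels
```

With a positive coefficient the undistorted corner lies farther from the center than the observed one, which is the same as saying the observed corners are warped towards the center.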
59Taking and Calibrating Photos
- Calibration methods are of the following mold
- Identify the camera parameters to be calibrated
- Write out equations describing the projection of 3D world points into 2D image space using the camera parameters
- Identify several 2D / 3D matches
- Solve for parameters
- Tsai 87
- Extrinsic: R, the rotation matrix, and T, the camera center location
- Intrinsic: f, sx, sy, Cx, Cy, κ1
60Taking and Calibrating Photos
Tsai 87
where
61Taking and Calibrating Photos
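Solve system using SVD. Matrix A has rank 7.
Solution is the right singular vector corresponding to the smallest singular value (equivalently, the eigenvector of A^T A corresponding to the smallest eigenvalue).
Use this solution to find remaining camera parameters.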
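The SVD step can be sketched generically. The matrix below is synthetic, not Tsai's actual system: it is built to have 8 unknowns and rank 7 so that it has a single null direction, and the function name is hypothetical.

```python
import numpy as np

def null_vector(A):
    """Unit vector x minimizing ||A x||: last right singular vector of A."""
    _, _, vt = np.linalg.svd(A)
    return vt[-1]   # row for the smallest singular value

# Synthetic rank-7 system with a known solution x_true (up to sign)
rng = np.random.default_rng(0)
x_true = rng.normal(size=8)
x_true /= np.linalg.norm(x_true)
B = rng.normal(size=(12, 8))
A = B - np.outer(B @ x_true, x_true)   # remove each row's component along x_true

x = null_vector(A)
```

The solution of a homogeneous system is only defined up to scale and sign, so the remaining camera parameters have to be recovered with additional constraints.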
62Taking and Calibrating Photos
63Photo Hulls
Algorithm finds a surface consistent with the
input images, but not necessarily the actual
surface
Cusping artifact
64Photo Hulls
Floating voxel artifact
65Visual Hulls
A visual hull is a geometric shape obtained using
silhouettes of an object as seen from a number of
views. Each silhouette is extruded, creating a
cone-like volume that bounds the extent of the
object. The intersection of these volumes results
in a visual hull. As we add more and more views,
the intersection approximates the object better
and better. The visual hull, even with an
infinite number of views, might not be the same as
the object. For example, the visual hull does not
capture concavities, like the hole in the teapot
spout.
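The silhouette intersection and its blindness to concavities can be shown on a small voxel grid. This is a toy sketch under simplifying assumptions: three orthographic views along the grid axes stand in for real calibrated perspective cameras, so the extruded volumes are prisms rather than cones. The function name is hypothetical.

```python
import numpy as np

def visual_hull_orthographic(obj):
    """Intersect the extrusions of the three axis-aligned silhouettes of obj."""
    hull = np.ones_like(obj, dtype=bool)
    for axis in range(3):
        silhouette = obj.any(axis=axis)            # binary image seen along this axis
        hull &= np.expand_dims(silhouette, axis)   # extrude it back through the volume
    return hull

# A solid 5x5x5 cube with a one-voxel pit in the middle of one face
obj = np.ones((5, 5, 5), dtype=bool)
obj[2, 2, 4] = False

hull = visual_hull_orthographic(obj)
```

The pit never changes any silhouette, so the hull fills it in: the result contains the object but is not equal to it.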