Learn more at: http://people.csail.mit.edu

Title: 3D Scene Models

1
3D Scene Models
6.870 Object recognition and scene understanding
Krista Ehinger
2
Questions

What makes a good 3D scene model? How accurate
does it need to be?
How far can you get with automatic surface
detection? Where do you need human input?

3
Modelling the scene

Real scenes have way too many surfaces

4
Modelling the scene

Option 1: Diorama world

5
Tour Into the Picture (TIP)

Model the scene as 5 planes plus foreground objects
Easy implementation: planes/objects defined by
humans

Y. Horry, K.I. Anjyo and K. Arai. "Tour Into the
Picture: Using a spidery mesh user interface to
make animation from a single image". ACM SIGGRAPH
1997.
6
TIP Implementation

User defines vanishing point, rear wall of the
scene (inner rectangle)
Given some assumptions about the camera,
position/size of all planes can be computed...

7
Defining the box

Define planes: floor at y = 0, ceiling at y = H
Given horizon (vanishing point), corners of
floor, ceiling can be computed from 2D image
position
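
A minimal sketch of how a floor corner could be recovered from its 2D image position. The camera model is an assumption, not something the slides specify: a pinhole camera with focal length f (in pixels), optical axis parallel to the floor, vanishing point at (u0, v0), and a known camera height above the floor plane y = 0. All function and parameter names are hypothetical.

```python
import numpy as np

def backproject_to_floor(u, v, u0, v0, f, cam_height):
    """Map an image pixel below the horizon to a 3D point on the floor y = 0.

    Assumed setup (not from the slides): pinhole camera, focal length f in
    pixels, vanishing point at (u0, v0), image v axis pointing down, camera
    cam_height above the floor with its optical axis parallel to the floor.
    """
    dv = v - v0
    if dv <= 0:
        raise ValueError("pixel must lie below the horizon to hit the floor")
    z = f * cam_height / dv   # depth along the optical axis
    x = (u - u0) * z / f      # lateral offset from the optical axis
    return np.array([x, 0.0, z])

corner = backproject_to_floor(420, 490, 320, 240, 500.0, 1.5)
print(corner)
```

Under the same assumptions, ceiling corners follow by intersecting rays above the horizon with the plane y = H instead.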

8
Defining the box

Once the positions of the planes are known,
compute the texture of the planes
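
One way to compute a plane's texture is to estimate the homography between the plane's quadrilateral in the image and a rectangular texture, then resample. The sketch below uses the standard direct linear transform (DLT); it is an illustration under that assumption, not necessarily the procedure used in TIP, and the function names and example coordinates are hypothetical.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate a 3x3 H with dst ~ H @ src from four or more point pairs (DLT)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # assumes H[2, 2] is nonzero (true for this example)

def apply_homography(H, pts):
    """Map an (n, 2) array of points through H, with perspective division."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# Floor quadrilateral in the image (made-up coordinates) -> square texture.
image_quad = [(100, 400), (540, 400), (400, 250), (240, 250)]
texture_rect = [(0, 0), (256, 0), (256, 256), (0, 256)]
H = homography_from_points(image_quad, texture_rect)
print(apply_homography(H, image_quad).round(3))
```

In practice the texture is usually filled by mapping each texture pixel back into the image with the inverse of H and interpolating the image color there.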

9
What about foreground objects?

Assume a quadrangle attached to floor, compute
attachment points, upper points
Hierarchical model of foreground objects

10
Extracting foreground objects

Foreground objects removed, added to mask
Holes in background filled in using photo
completion software

11
TIP Demonstration
12
TIP Discussion

Pros
Accurate model (due to human input)
Deals with foreground objects, occlusions
Cons
Requires human input, not automatic
Model too simple for many real-world scenes

13
Modelling the scene

Option 2: Pop-up book world

14
Automatic Photo Pop-Up

Three classes of surface: ground, sky, vertical
Not just a box: can model more kinds of scenes
Automatic classification, no labeling

D. Hoiem, A.A. Efros, and M. Hebert, "Automatic
Photo Pop-up", ACM SIGGRAPH 2005.
15
Photo Pop-Up Implementation

Pixels -> superpixels -> constellations
Automatic labeling of constellations as ground,
vertical, or sky
Define angles of vertical planes (using
attachment to ground)
Map textures to vertical planes (as in TIP)

16
Superpixels, constellations

Superpixels are neighboring pixels that have
nearly the same color (Tao et al., 2001)
Superpixels assigned to constellations according
to how likely they are to share a label (ground,
vertical, sky) based on difference between
feature vectors
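
A rough sketch of how superpixels might be grouped into constellations from pairwise same-label likelihoods. The greedy scheme here (random seeds, then assign each remaining superpixel to the group it most likely shares a label with) is an assumed simplification for illustration; the function name and the toy likelihood matrix are hypothetical.

```python
import numpy as np

def form_constellations(same_label_prob, n_constellations, seed=0):
    """Greedily group superpixels using pairwise same-label probabilities.

    same_label_prob[i, j] is the estimated probability that superpixels
    i and j share a label (ground, vertical, sky).
    """
    rng = np.random.default_rng(seed)
    n = same_label_prob.shape[0]
    order = rng.permutation(n)
    log_p = np.log(np.clip(same_label_prob, 1e-9, 1.0))
    # Seed each constellation with one randomly chosen superpixel.
    groups = [[int(i)] for i in order[:n_constellations]]
    # Assign every other superpixel to the group with the highest
    # average log-likelihood of sharing its label.
    for i in order[n_constellations:]:
        scores = [log_p[i, g].mean() for g in groups]
        groups[int(np.argmax(scores))].append(int(i))
    return groups

# Toy example: superpixels 0-2 and 3-5 form two likely groups.
P = np.full((6, 6), 0.1)
P[:3, :3] = 0.9
P[3:, 3:] = 0.9
print(form_constellations(P, n_constellations=2))
```

Because the seeds are random, different runs (or different values of N) give different groupings, which is one reason to generate several sets of constellations.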

17
Feature vectors

Color features: RGB, hue, saturation
Texture features: difference of oriented
Gaussians, textons
Location (absolute and percentile)
N superpixels in constellation
Line and intersection detectors
Not used: constellation shape (contiguous, N
sides), some texture features
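
As a small illustration of the color part of the feature vector, the sketch below computes mean RGB plus the hue and saturation of that mean color for one superpixel. Taking hue and saturation from the mean color (rather than averaging per-pixel values) is an assumption made here for brevity, and the function name is hypothetical.

```python
import colorsys
import numpy as np

def color_features(pixels_rgb):
    """Return [mean R, mean G, mean B, hue, saturation] for one superpixel.

    pixels_rgb: (n, 3) array of RGB values scaled to [0, 1].
    """
    mean_rgb = np.asarray(pixels_rgb, dtype=float).mean(axis=0)
    hue, sat, _ = colorsys.rgb_to_hsv(*mean_rgb)
    return np.array([*mean_rgb, hue, sat])

# A superpixel of pure red pixels.
print(color_features([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]))
```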

18
Training process

For each of 82 labeled training images:
Compute superpixels, features, pairwise
likelihoods
Form a set of N constellations (N = 3 to 25),
each labeled with ground truth
Compute constellation features
Compute constellation label, homogeneity
likelihood

19
Training process

AdaBoost weak classifiers learn to estimate
whether superpixels have the same label (based on
feature vector)
Another set of AdaBoost weak classifiers learns
constellation label, homogeneity likelihood
(expressed as percent ground, vertical, sky,
mixed)
Emphasis on classifying larger constellations
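
To make the boosting idea concrete, here is a minimal sketch of discrete AdaBoost with threshold "stumps" as the weak classifiers. This is the textbook variant on toy data, chosen for illustration; the slides do not say which AdaBoost variant or weak learner was used, and all names below are hypothetical.

```python
import numpy as np

def stump_predict(X, feature, thr, polarity):
    """Weak classifier: +1 on one side of a threshold, -1 on the other."""
    return np.where(polarity * (X[:, feature] - thr) >= 0, 1, -1)

def fit_stump(X, y, w):
    """Find the stump with the lowest weighted error."""
    best = (np.inf, 0, 0.0, 1)  # (error, feature, threshold, polarity)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                err = w[stump_predict(X, j, thr, pol) != y].sum()
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost_fit(X, y, n_rounds=10):
    """y must be in {-1, +1}. Returns a list of (alpha, feature, thr, pol)."""
    w = np.full(len(y), 1.0 / len(y))
    model = []
    for _ in range(n_rounds):
        err, j, thr, pol = fit_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)    # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)   # weight of this weak classifier
        w *= np.exp(-alpha * y * stump_predict(X, j, thr, pol))
        w /= w.sum()                            # misclassified samples gain weight
        model.append((alpha, j, thr, pol))
    return model

def adaboost_predict(model, X):
    score = sum(a * stump_predict(X, j, t, p) for a, j, t, p in model)
    return np.sign(score)

# Toy data: label +1 means "same label", -1 means "different label".
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])
model = adaboost_fit(X, y, n_rounds=3)
print(adaboost_predict(model, X))
```

For the superpixel-pair task, each row of X would hold features describing a pair of superpixels (for example, differences between their feature vectors).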

20
Building the 3D model

Along vertical/ground boundary, fit line segments
(Hough transform): goal is to find simplest
shape (fewest lines)
Project lines up from corners of boundary lines,
cut and fold
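
The Hough transform step can be sketched as a voting procedure: every boundary point votes for all lines passing through it, and the line with the most votes wins. The code below finds a single best line among a set of points; fitting several segments and keeping the count low would build on this. Bin sizes, names, and the toy boundary are assumptions for illustration.

```python
import numpy as np

def hough_best_line(points, n_theta=180, rho_res=1.0):
    """Return (rho, theta) of the line x*cos(theta) + y*sin(theta) = rho
    that receives the most votes from the given (x, y) points."""
    pts = np.asarray(points, dtype=float)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    # rho of the line through each point at each candidate angle
    rhos = pts[:, 0:1] * np.cos(thetas) + pts[:, 1:2] * np.sin(thetas)
    max_rho = np.abs(rhos).max() + rho_res
    n_rho = int(np.ceil(2 * max_rho / rho_res)) + 1
    bins = np.round((rhos + max_rho) / rho_res).astype(int)
    cols = np.broadcast_to(np.arange(n_theta), bins.shape)
    acc = np.zeros((n_rho, n_theta), dtype=int)
    np.add.at(acc, (bins, cols), 1)  # accumulate votes
    r, t = np.unravel_index(acc.argmax(), acc.shape)
    return r * rho_res - max_rho, thetas[t]

# Toy ground/vertical boundary: a horizontal line at y = 5.
boundary = [(x, 5.0) for x in range(200)]
rho, theta = hough_best_line(boundary)
print(rho, np.degrees(theta))
```

The returned rho is only accurate to within the bin width rho_res, so a finer grid or a least-squares refinement on the voting points may be needed for a clean fit.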

21
Photo Pop-Up Demonstration
22
Photo Pop-Up Discussion

Pros
Automatic
Can handle a variety of scenes, not just boxes
Cons
No handling of foreground objects
Misclassification leads to very strange models
Only 2 kinds of surface: ground, vertical

23
Modelling the scene

Option 3: Actually try to model surface angles

24
3D Scene Structure from Still Image

Compute surface normal for each surface
No right-angle assumptions: surfaces can have any
angle
Automatic (trained on images with known depth
maps)
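
To show what "surface normal for each surface" buys you, the sketch below represents a plane by a single 3-vector alpha such that every point q on the plane satisfies alpha . q = 1. This parametrization is an assumption for illustration (the slides do not give one). With it, the normal is alpha / |alpha| and the depth along any camera ray follows directly. Function names are hypothetical.

```python
import numpy as np

def plane_normal(alpha):
    """Unit normal of the plane {q : alpha . q = 1}."""
    alpha = np.asarray(alpha, dtype=float)
    return alpha / np.linalg.norm(alpha)

def depth_along_ray(alpha, ray):
    """Distance from the camera center to the plane along a viewing ray."""
    ray = np.asarray(ray, dtype=float)
    ray = ray / np.linalg.norm(ray)
    denom = float(np.dot(alpha, ray))
    if denom <= 0:
        raise ValueError("ray does not hit the plane in front of the camera")
    return 1.0 / denom

# A fronto-parallel wall 5 units in front of the camera: z = 5.
alpha = np.array([0.0, 0.0, 0.2])
print(plane_normal(alpha))
print(depth_along_ray(alpha, [0.0, 0.0, 1.0]))
print(depth_along_ray(alpha, [3.0, 0.0, 4.0]))
```

Because any orientation of alpha is allowed, nothing forces planes to be horizontal or vertical, which is the "no right-angle assumptions" point above.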

25
3D Scene Implementation

Segment image into superpixels
Estimate surface normal of each superpixel (using
Markov Random Field model)
Optional: Detect and extract foreground objects
Map textures to planes

[Figure: original image and modeled depth map]
A. Saxena, M. Sun, A. Y. Ng. "Learning 3-D Scene
Structure from a Single Still Image". In ICCV
workshop on 3D Representation for Recognition
(3dRR-07), 2007
26
Image features

Superpixel features (x_i)
Color and texture features as in Photo Pop-Up
Vector also includes features of neighboring
superpixels
Boundary features (x_ij)
Color difference, texture difference, edge
detector

27
Markov Random Field Model

First term: model planes in terms of image
features of superpixels
Second term: model planes in terms of pairs of
superpixels, with constraints...

28
Model constraints

Connected structure: except where there is an
occlusion, neighboring superpixels are likely to
be connected
Coplanar structure: except where there are folds,
neighboring superpixels are likely to lie on the
same plane
Co-linearity: long straight lines in the image
correspond to straight lines in 3D

29
Foreground objects

Automatically detected foreground objects may be
removed from the model (for example, pedestrians,
using the Dalal-Triggs detector)
Detected objects add 3D cues (pedestrians are
basically vertical, occlude other surfaces)

30
3D Scene Demonstration
31
Results
32
3D Scene Discussion

Pros
Handles a variety of scene types
Fairly accurate (about 2/3 of scenes correct)
Automatic
Handles foreground objects
Cons
Still fails on 1/3 of scenes

33
Discussion

Simple 3D models are adequate for many scenes
You can get pretty far without human input (but
results would still be better with human
annotation of scenes)
Extensions?
Use photo completion techniques to handle
occlusions?
Massive training sets - better 3D models?
