Estimating 3D Body Pose using Uncalibrated Cameras

Transcript and Presenter's Notes

Title: Estimating 3D Body Pose using Uncalibrated Cameras

1
Estimating 3D Body Pose using Uncalibrated Cameras
Rómer Rosales, Matheen Siddiqui, Joni Alon, Stan Sclaroff
Image and Video Computing Group, Computer Science Department, Boston University
Now at University of Toronto
2
Problem Definition
(Figure: uncalibrated cameras 1, 2 and 3 observing a 3D body pose)
3
Motivation

Calibrated setups
Rarely remain calibrated for extended periods
Tend to be expensive
Cameras may be mobile
Previous approaches
Need feature correspondence
Very hard problem for large baselines

4
Background: Pose Estimation via Tracking

Tracking: match the image to an articulated model at
each frame
Limitations
Manual Initialization
Heavy dependency on previous estimate
Non-linear optimization at each step
Projective ambiguities
Probabilistic, learned motion models create too
strong a prior.

Example Tracking Systems
Rehg, Morris 98; Bregler 98
5
Step 1: From Visual Features to 2D Pose
Hypotheses
Hypotheses are generated independently for each
camera view using the Specialized Mappings
Architecture (SMA)
6
What Do the Hypotheses Look Like?
7
Specialized Mappings Architecture (SMA)
One-to-many relationship
Forward problem is easy
(Figure: poses a, b, c in the output space and observations x in the input space, with arrows for the forward problem, a.k.a. forward kinematics, and the inverse problem)
Single views are used to infer 2D poses using SMA (NIPS 2001)
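
The one-to-many relationship can be illustrated with a toy problem. This is only a sketch under stated assumptions: the scalar "pose", the forward map `forward` and the two `specialized_maps` are hypothetical stand-ins chosen for illustration, not the learned functions used in SMA.

```python
import numpy as np

def forward(pose):
    """Toy forward map (pose -> observation); cheap and single-valued."""
    return pose ** 2

# Hypothetical specialized inverse maps, each valid on one region of pose space.
specialized_maps = [lambda obs: np.sqrt(obs), lambda obs: -np.sqrt(obs)]

def hypotheses(obs):
    """Each specialized map proposes one pose hypothesis for the observation."""
    return np.array([phi(obs) for phi in specialized_maps])

def forward_errors(obs):
    """Check every hypothesis by pushing it back through the forward map."""
    return np.abs(forward(hypotheses(obs)) - obs)

poses = hypotheses(4.0)        # two distinct poses explain the same observation
errors = forward_errors(4.0)
```

Both hypotheses reproduce the observation exactly, which is why a single inverse function cannot represent the mapping and several specialized ones are needed.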
8
Why Virtual Cameras?
(Figure panels: Scale, Translation, Rotation)

SMA generates pose hypotheses using Hu moments,
which are invariant to image-plane translation,
scale and rotation.
Thus, the cameras recovered are not the real cameras
but rather virtual ones.
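
The invariance claim can be checked numerically. The sketch below computes only the first two Hu moments of a binary silhouette (the slides do not say which moments or which implementation were used, so `hu_first_two` and the toy L-shaped mask are illustrative assumptions). On a pixel grid the scale invariance is approximate, while translation and 90-degree rotation are exact.

```python
import numpy as np

def hu_first_two(mask):
    """First two Hu moments (phi1, phi2) of a binary silhouette mask."""
    ys, xs = np.nonzero(mask)
    area = float(xs.size)                  # zeroth moment m00
    dx = xs - xs.mean()                    # centring removes translation
    dy = ys - ys.mean()
    # Normalised central moments of order 2: eta_pq = mu_pq / m00**2
    eta20 = np.sum(dx * dx) / area ** 2
    eta02 = np.sum(dy * dy) / area ** 2
    eta11 = np.sum(dx * dy) / area ** 2
    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02) ** 2 + 4.0 * eta11 ** 2
    return np.array([phi1, phi2])

# Toy L-shaped "silhouette" and three transformed copies (assumed test data).
mask = np.zeros((20, 20), dtype=int)
mask[2:12, 2:6] = 1
mask[9:12, 6:12] = 1
shifted = np.roll(mask, shift=(5, 6), axis=(0, 1))    # image-plane translation
rotated = np.rot90(mask)                              # 90-degree rotation
scaled = np.kron(mask, np.ones((2, 2), dtype=int))    # 2x upscaling

reference = hu_first_two(mask)
```

Because the features cannot distinguish these transformed silhouettes, the 2D pose hypotheses are only defined up to the same transformations, which is what makes the recovered cameras virtual.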

9
Step 2: From 2D Pose Hypotheses to 3D Pose and
Cameras
Input: all hypotheses from all cameras (H)
E-step: assign importance to the hypotheses from the cameras, P(y | H, X, W)
M-step: generalized SFM (update X, W)
Outputs: estimated pose (X), hypothesis covariances (Sk), estimated camera parameters (W: w1, w2, w3)
EM finds the most consistent representation from
the hypotheses
10
Proposed ML Approach

Camera pose hypotheses are independent given
the pose X and cameras W
Introduce map label (latent) variables y
Intractable, but can solve approximately via EM.
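
The equations for this slide did not survive in the transcript. One form consistent with the three statements above is sketched here as an assumption, not as the authors' exact expression: H_c denotes the hypotheses from camera c, w_c that camera's parameters, and y_c the latent map label for view c.

```latex
\begin{aligned}
p(H \mid X, W) &= \prod_{c=1}^{C} p(H_c \mid X, w_c), \\
p(H_c \mid X, w_c) &= \sum_{k} p(y_c = k)\, p(H_c \mid y_c = k, X, w_c), \\
(\hat{X}, \hat{W}) &= \arg\max_{X, W} \sum_{c=1}^{C} \log \sum_{k} p(y_c = k)\, p(H_c \mid y_c = k, X, w_c).
\end{aligned}
```

The sum inside the logarithm is what prevents a closed-form maximization, and it is the term that EM replaces with an expectation over the labels.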

11
EM Algorithm for Estimating 3D Body Pose from
Virtual Cameras: E-Step

E-Step
We use
where R is a computer graphics rendering function.

12
EM Algorithm for Estimating 3D Body Pose from
Virtual Cameras: M-Step

M-Step
Leads to a generalized structure-from-motion
problem
Alternate minimization between 3D pose and
cameras.
Initialize 3D pose using factorization.
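
The M-step bullets can be sketched as follows. This is a simplified illustration, not the generalized structure-from-motion used in the talk: it assumes affine cameras, centred 2D measurements stacked in a 2C x N matrix `Y`, and uniform weights (the hypothesis weights and covariances from the E-step are left out). The function names `factorize` and `alternate` are hypothetical.

```python
import numpy as np

def factorize(Y):
    """Rank-3 factorization Y ~ W @ X used to initialize cameras and pose."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    root = np.sqrt(s[:3])
    W = U[:, :3] * root            # stacked cameras, shape (2C, 3)
    X = root[:, None] * Vt[:3]     # 3D pose, shape (3, N)
    return W, X

def alternate(Y, W, X, n_iters=5):
    """Alternate least-squares updates of the pose X and the cameras W."""
    for _ in range(n_iters):
        X = np.linalg.lstsq(W, Y, rcond=None)[0]         # fix cameras, solve pose
        W = np.linalg.lstsq(X.T, Y.T, rcond=None)[0].T   # fix pose, solve cameras
    return W, X

# Synthetic check: 20 joints seen by 3 affine cameras, no noise.
rng = np.random.default_rng(0)
X_true = rng.normal(size=(3, 20))
X_true -= X_true.mean(axis=1, keepdims=True)   # centre the pose
W_true = rng.normal(size=(6, 3))
Y = W_true @ X_true

W0, X0 = factorize(Y)
W_est, X_est = alternate(Y, W0, X0)
residual = np.linalg.norm(Y - W_est @ X_est)
```

With uniform weights the factorization is already the least-squares optimum, so the alternation only starts to matter once per-hypothesis weights or constrained camera models are introduced. The recovered pose is defined up to an invertible 3 x 3 transform.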

13
Multiple-View SMA Experiments (3 cameras)
Inputs (First camera)
Estimates (From the first camera viewpoint)
Example 1
Example 2
14
Example 3D Reconstruction (synthetic inputs, not
in training set)
Left: estimate. Right: ground truth.
15
Quantitative Evaluation
(a) Optimal angular displacement between 3
cameras is 45 deg.
(b) Significant increase in
accuracy between 2 and 3 views.
16
Summary

Multiple-view geometry and probabilistic learning
in a single framework.
Probabilistic estimation of 3D articulated body
pose and virtual camera parameters.
No need for feature tracking (edges, corners,
etc.).
Camera calibration is not required.
Wider camera baseline allows better
reconstruction (usually a problem for structure
from motion).
Estimation of only a single angular displacement
parameter per virtual camera (instead of the
usual 11) improves quality of 3D body pose
estimates.

17
Future Directions

Probabilistic dependence of time (dynamical
systems)
Feature selection via sampling
Adapting model to shape variations (transfer)
Specific morphology of subject
Generative model for detection and segmentation
Self-calibrating environments with mobile cameras

18
(No Transcript)
19
Abstract
20
Key Practical Issues

Sufficient data to account for all configurations
given the model
Number of specialized functions
Differences in cue data (training vs testing)
Discriminative power of visual features
What mapping function form?

21
Improving EM by knowledge of Inverse Map

Redefine posterior, with softmax
Why is this good? More accurate estimate of
No longer have to assume form of
Comparison is done in input space
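
A minimal sketch of a softmax posterior over hypotheses, with the comparison done in input (feature) space. The `render` function is a hypothetical stand-in for the rendering function R, and the scale `sigma` is an assumed parameter; neither is specified in the transcript.

```python
import numpy as np

def render(pose):
    """Hypothetical stand-in for R: maps a pose vector to a feature vector."""
    return np.array([pose.sum(), (pose ** 2).sum()])

def softmax_posterior(features, hypotheses, sigma=1.0):
    """Weight each hypothesis by how well its rendering matches the features."""
    errors = np.array([np.sum((features - render(h)) ** 2) for h in hypotheses])
    logits = -errors / (2.0 * sigma ** 2)
    logits -= logits.max()             # shift for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

# Three pose hypotheses; the observation is rendered from the first one.
hyps = [np.array([1.0, 2.0]), np.array([0.0, 0.5]), np.array([3.0, 3.0])]
observed = render(np.array([1.0, 2.0]))
posterior = softmax_posterior(observed, hyps)
```

Because each hypothesis is pushed forward and compared with the observed features, no parametric form has to be assumed for the distribution over poses.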

22
The Camera Matrix P
Homogeneous
In our special case
23
What's so special about our special case?

Recall feature invariances: rotation,
translation, and scaling.
SMA 2D reconstruction inherits same invariances
(+/-)
Camera model (implicit) for reconstruction is
different from virtual camera model that captured
sequences
Due to invariances, camera parameters are known,
up to a single angular displacement.
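
To make the one-parameter camera concrete, here is a sketch under explicit assumptions: orthographic projection, and the angular displacement taken as a rotation about the vertical (y) axis. The transcript does not give the camera matrix, so `virtual_camera` is a hypothetical illustration rather than the matrix P from the slides.

```python
import numpy as np

def virtual_camera(theta):
    """2x3 orthographic camera rotated by theta (radians) about the y axis."""
    c, s = np.cos(theta), np.sin(theta)
    # First two rows of the rotation matrix R_y(theta).
    return np.array([[c, 0.0, s],
                     [0.0, 1.0, 0.0]])

joint = np.array([1.0, 2.0, 3.0])               # one 3D joint (x, y, z)
front = virtual_camera(0.0) @ joint             # frontal view keeps (x, y)
side = virtual_camera(np.pi / 2.0) @ joint      # side view sees (z, y)
```

Each virtual camera is then described by a single number, compared with the 11 degrees of freedom of a general projective camera.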

24
Computer Vision Related Work Taxonomy: Input and
Output Space
Y
Perona et al. 00
Sidenbladh et al. 00
Bregler, Malik 98
Barron, Kakadiaris 00
Gavrila, Davis 96
3D joints
Brand 99
Kakadiaris, Metaxas 96
Pentland, Horowitz 91
Rohr 93
Howe et al. 99
Goncalves et al. 95
Rehg, Morris 95, 97, 98
2D joints
Yamamoto et al. 91
Cham, Rehg 99
Ju et al. 96
Davis et al. 99
Wren et al. 97
Body parts
Fujiyoshi, Lipton 99
Polana, Nelson 96, 97
Black, Yacoob 97
Low-level motion model (e.g. actions)
Rosales, Sclaroff 98, 99
Black 99
Blake et al. 95
Hogg 83
Baumberg, Hogg 93, 94
Davis, Bobick 97
Isard, Black 98
Contours
Silhouette
X
Silhouette features
2D joints
Flow field
Region based
Contours
25
Geometric Constraints
N: number of joints (20). C: number of cameras (4).
Minimize the distance between the estimated 2D location and the 2D location from SMA inference
Know yij and Pi. Solve for Xj
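
With yij and Pi known, solving for Xj is a linear least-squares problem. The sketch below assumes affine cameras written as 2 x 4 matrices acting on (X, 1) and the objective sum over i of ||yij - Pi Xj||^2; both the camera form and the function name `triangulate` are assumptions for illustration, since the slide's equation is not in the transcript.

```python
import numpy as np

def triangulate(P_list, y_list):
    """Least-squares 3D point from its 2D locations in several affine cameras."""
    A = np.vstack([P[:, :3] for P in P_list])                           # (2C, 3)
    b = np.concatenate([y - P[:, 3] for P, y in zip(P_list, y_list)])   # (2C,)
    X, *_ = np.linalg.lstsq(A, b, rcond=None)
    return X

# Synthetic check with C = 4 cameras and one joint.
rng = np.random.default_rng(1)
P_list = [rng.normal(size=(2, 4)) for _ in range(4)]
X_true = np.array([0.3, -1.2, 2.0])
y_list = [P @ np.append(X_true, 1.0) for P in P_list]
X_est = triangulate(P_list, y_list)
```

The same call is repeated independently for each of the N joints, since the joints are coupled only through the cameras.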
26
Summary

Specialized Mappings Architecture: non-linear
supervised learning.
One-to-many relationships
Use of inverse map
Learning and Inference algorithms for SMA
Application to pose estimation (body-hands) from
single images
Tracking not needed, known disadvantages are
avoided (initialization, iterative methods,
sensitivity to quality of all frames, occlusion
handling, multiple viewpoints)
Certain visual ambiguities can be solved by
statistical modeling
Extension of SMA to multiple sources of
hypotheses
Non-iterative!! Depends almost exclusively on
speed of function evaluation O(MN).

27
Summary

Specialized Mappings Architecture for non-linear
supervised learning. Alternative for the more
general problem of multiple function
approximation from data
Certain visual ambiguities can be solved, even on
single images, by statistical modeling
Application to pose estimation (body-hands) from
single images
Tracking not needed, known disadvantages are
avoided (initialization, iterative methods,
sensitivity to quality of all frames, occlusion
handling, multiple viewpoints)
Structure from point correspondence
Given segmentation, linear inference time O(MN).
Non-iterative!! Depends almost exclusively on
speed of function evaluation

(http://www.cs.bu.edu/groups/ivc)
28
Extension: Multiple-View SMA

Automatic camera positioning (mobile cameras)
Place cameras
Watch articulated object
Estimate relative positions
Cameras can work jointly to determine 3D
structure
Previous approaches
Need feature correspondence
Too difficult for large baselines

29
Specialized Mappings Architecture (Simultaneous
Estimation of Partitions and Maps)
30
3D Body Pose through Virtual Cameras (Overview)
(Figure: frames from each camera mapped to 3D joints)
31
SMA Model (vs. usual tracking approach)
Pose
Pose
Visual Features
Tracking for Pose Estimation
32
Problem Definition

Given

Determine: camera locations and 3D structure

33
Problem Formulation (Probabilistic model
dependencies)
(Graphical model: 2D pose hypotheses H (H1, H2, ..., HC); hypotheses labels; 3D pose X; camera params W)
34
Problem Formulation II: Estimation of 3D Pose X
and Camera Relative Locations W

Proposed ML solution
Sets of camera pose hypotheses are independent
given the pose X and cameras W
Let us introduce latent variable y
Intractable, but can solve approximately using
alternating minimizations

35
Proposed ML Solution
3D Pose
Camera params
2D Pose hypotheses
Hypotheses Independence
36
(No Transcript)
37
Quantitative Evaluation I
38
Multiple-View Approach Step 2: From 2D Pose
Hypotheses to 3D Body Pose and Cameras
Input: all hypotheses from all cameras (H)
ML Estimation (EM)
Outputs: estimated pose (X), hypothesis covariances (Sk), estimated camera parameters (W: w1, w2, w3)
39
Motivation

Automatic camera positioning (mobile cameras)
Place cameras
Watch articulated object
Estimate relative positions
Cameras can work jointly to determine 3D
structure
Previous approaches
Need feature correspondence
Very hard problem for large baselines

40
Multiple-View SMA Experiments (3 cameras)
Inputs (First camera)
Estimates (From the first camera viewpoint)
41
Multiple-View SMA Experiments
Inputs (First camera)
Estimates (From the first camera viewpoint)
