Continuous-state Graphical Models for Object Localization, Pose Estimation and Tracking

Transcript and Presenter's Notes

Title: Continuous-state Graphical Models for Object Localization, Pose Estimation and Tracking

1
Continuous-state Graphical Models for Object
Localization, Pose Estimation and Tracking
Leonid Sigal
Department of Computer Science, Brown University
http://www.cs.brown.edu/people/ls/
2
Big picture

Computer Vision: Build tools that allow computers
to reason about the world based on visual inputs
(building computational models of the world)
Building models of objects that allow us to
reason about the position, configuration and
interactions between these objects (e.g. cars,
buildings, people)

3
Pose Estimation
Find the pose of the body
such that the model explains the image data.
4
Tracking
Time

Estimate pose for every time instance
Tracking can simply be pose estimation at every
frame, but it is very costly and ambiguous
Using temporal information is useful in
localizing the body
resolving pose ambiguities

5
Applications

Navigation: Automated vehicle navigation,
obstacle avoidance, robotics
Human Computer Interaction: Smart homes

Entertainment: Animation, Games
Clinical: Rehabilitation medicine
Security: Surveillance
Understanding: Gesture/Activity recognition

Matrix Trilogy, Warner Bros. Studios
Smart Rooms Project, MIT
SmartTer Project, EPFL
6
Don't believe me?
Within five years "you could use gesture
recognition to get rid of the remote control,"
body tracking (the whole body, not just the
hands) could drive demand for Intel's important
new generation of semiconductors
Justin Rattner, Intel CTO
7
Why is it hard?

Appearance/size/shape of people can vary

Occlusions
High dimensionality
Loss of depth information
Loose clothing
Motion blur .

8
Approach

Break up a very hard problem into smaller
manageable pieces

Use continuous-state graphical models to model
the person

Statistical Graphical Models 1,220,000
Sigal, Black, AMDO'06
Fashion Models 27,600,000
9
Contributions

Define a new and very rich model for modeling
people and reasoning about their
pose (loose-limbed body model)

Introduce hierarchical strategy for 3D reasoning

10
Related Work: Generative Approaches

Local stochastic search (top-down)
Part-based approaches (bottom-up)

Felzenszwalb & Huttenlocher, '00
Ramanan, Forsyth, Zisserman, '05
11
Loose-limbed Body Model Preview

12
Graphical Model (Toy Example)

Encode Conditional Independence between random
variables

0.4 0.3 0.0 0.1 0.2
13
Inference in a Graphical Model

Finding the most likely values for all unknown
variables (X1, X2, X3)
Brute force algorithm
by Hammersley-Clifford Theorem

p(X1 = bucket, X2 = bucket, X3 = bucket)
p(X1 = bucket, X2 = bucket, X3 = tree trunk)
p(X1 = face, X2 = thigh, X3 = calf)
p(X1 = calf, X2 = calf, X3 = calf)
Prior
Likelihood
14
Belief Propagation

What is BP?
Efficient algorithm for doing inference in
graphical models
For example, on a tree
Brute force algorithm: O(M^N)
BP is O(N M^2)
In this simple example:
Brute force: 5^3 = 125
BP: 3 x 5^2 = 75

Real life: 10^28 times faster
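The counts above can be checked on a toy chain X1 - X2 - X3 with five labels per variable. The following is a minimal sketch, not the presenter's code: the evidence table `phi` and the pairwise table `psi` are made-up numbers, and the chain uses max-product (the variant of BP that returns the most likely joint assignment).

```python
import itertools

LABELS = ["bucket", "tree trunk", "face", "thigh", "calf"]
M = len(LABELS)  # states per variable
N = 3            # variables in the chain X1 - X2 - X3

# Hypothetical local evidence phi[i][s] for variable i taking label s.
phi = [
    [0.05, 0.05, 0.70, 0.10, 0.10],  # X1 looks like a face
    [0.10, 0.10, 0.10, 0.40, 0.30],  # X2 looks like a thigh or a calf
    [0.10, 0.10, 0.05, 0.30, 0.45],  # X3 looks like a calf
]
# Hypothetical pairwise table: neighbours are discouraged from sharing a label.
psi = [[0.2 if s == t else 1.0 for t in range(M)] for s in range(M)]


def joint(x):
    """Unnormalized joint probability of one full assignment."""
    p = 1.0
    for i in range(N):
        p *= phi[i][x[i]]
    for i in range(N - 1):
        p *= psi[x[i]][x[i + 1]]
    return p


def brute_force():
    """Score every one of the M^N assignments."""
    best, best_p, evaluated = None, -1.0, 0
    for x in itertools.product(range(M), repeat=N):
        evaluated += 1
        p = joint(x)
        if p > best_p:
            best, best_p = x, p
    return best, evaluated


def max_product_chain():
    """Forward max-product pass along the chain, then backtracking."""
    pair_evals = 0
    score = list(phi[0])
    back = []
    for i in range(1, N):
        new_score, arg = [], []
        for t in range(M):
            cands = []
            for s in range(M):
                pair_evals += 1
                cands.append(score[s] * psi[s][t])
            s_best = max(range(M), key=cands.__getitem__)
            new_score.append(cands[s_best] * phi[i][t])
            arg.append(s_best)
        score = new_score
        back.append(arg)
    x = [max(range(M), key=score.__getitem__)]
    for arg in reversed(back):
        x.append(arg[x[-1]])
    x.reverse()
    return tuple(x), pair_evals


best_bf, n_bf = brute_force()
best_bp, n_bp = max_product_chain()
print([LABELS[s] for s in best_bp], n_bf, n_bp)
```

The chain has N - 1 = 2 edges, so the sketch performs 2 x 5^2 = 50 pairwise evaluations, which is within the N M^2 = 75 bound quoted on the slide.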

15
Step 1 of 2 Message Propagation

X1 → X2
I am sure I am a face, so I think you should be
some other part of the body

P(X2 = bucket) = 0.0
P(X2 = tree trunk) = 0.0
P(X2 = face) = 0.0
P(X2 = thigh) = 0.6
P(X2 = calf) = 0.4
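As a rough illustration of how such a message could arise, here is a minimal sum-product sketch. The compatibility table `psi_12` is an assumption chosen so that the "face" row reproduces the numbers on this slide; the slide itself does not give the table.

```python
LABELS = ["bucket", "tree trunk", "face", "thigh", "calf"]


def message(phi_i, psi, incoming=()):
    """Sum-product message from node i to node j.

    m_ij(x_j) = sum over x_i of psi[x_i][x_j] * phi_i[x_i] * (incoming messages at x_i),
    normalized to sum to one.
    """
    weight = list(phi_i)
    for m in incoming:
        weight = [w * v for w, v in zip(weight, m)]
    out = [
        sum(weight[s] * psi[s][t] for s in range(len(weight)))
        for t in range(len(psi[0]))
    ]
    z = sum(out)
    return [v / z for v in out]


# X1 is certain that it is a face.
phi_1 = [0.0, 0.0, 1.0, 0.0, 0.0]

# Hypothetical compatibility: row = label of X1, column = label of X2.
psi_12 = [[0.2] * 5 for _ in range(5)]
psi_12[2] = [0.0, 0.0, 0.0, 0.6, 0.4]  # "if I am a face, you are a thigh or a calf"

m_12 = message(phi_1, psi_12)
for label, p in zip(LABELS, m_12):
    print(f"P(X2 = {label}) = {p:.1f}")
```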
16
Step 2 of 2 Belief Estimation

Merge
Information from neighbors
Local Information
Distribution (belief) over X2
Most likely value for X2
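The merge step can be sketched as a pointwise product followed by normalization. In this sketch the message from X1 reuses the numbers from the previous slide, while the local evidence `phi_2` and the message from X3 are invented for illustration.

```python
LABELS = ["bucket", "tree trunk", "face", "thigh", "calf"]


def belief(phi, messages):
    """Belief at a node: local evidence times all incoming messages, normalized."""
    b = list(phi)
    for m in messages:
        b = [x * y for x, y in zip(b, m)]
    z = sum(b)
    return [x / z for x in b]


phi_2 = [0.1, 0.1, 0.1, 0.3, 0.4]       # hypothetical local information at X2
m_12 = [0.0, 0.0, 0.0, 0.6, 0.4]        # message from X1 (previous slide)
m_32 = [0.25, 0.25, 0.25, 0.20, 0.05]   # hypothetical message from X3

b_2 = belief(phi_2, [m_12, m_32])
best_label = LABELS[max(range(len(b_2)), key=b_2.__getitem__)]
print(b_2, best_label)
```

Here the local evidence alone slightly favours "calf", but the neighbours' messages shift the belief towards "thigh".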

17
Loopy-graphical Models

Belief Propagation can be used to get an
approximate solution
Exact solution is intractable
Inference on a graph
BP is O(NMC)

[Figure: loopy graphical model over hidden variables X1 to X5 with image observations I1 to I5]
18
Generative Approaches

Local stochastic search (top-down)
Part-based approaches (bottom-up)

Felzenszwalb & Huttenlocher, '00
Ramanan, Forsyth, Zisserman, '05
19
Tree-structured Body Model
X1 ∈ {s1, s2, ..., sM}
X = {X1, X2, ..., X10}
[Figure: tree-structured graphical model over body parts X1 to X10, joined by kinematic edges, each part linked to the image I]
Felzenszwalb & Huttenlocher, '00
20
State-space
Felzenszwalb & Huttenlocher, '00
21
State-space
Felzenszwalb & Huttenlocher, '00
22
State-space
This example: 13 x 10 positions x 8
rotations x 5 scales = 5,200
Felzenszwalb & Huttenlocher, '00
23
State-space
Real Case: 100 x 100 positions x 20
rotations x 5 scales = 1,000,000
Felzenszwalb & Huttenlocher, '00
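The two counts follow directly from multiplying the grid resolutions. The short sketch below reproduces them and also shows, as an added observation not on the slide, how many entries a dense pairwise table between two parts would need at the realistic resolution. The function name is illustrative.

```python
def state_space_size(positions_x, positions_y, rotations, scales):
    """Number of discrete states for one body part."""
    return positions_x * positions_y * rotations * scales


toy = state_space_size(13, 10, 8, 5)
real = state_space_size(100, 100, 20, 5)

# A dense table over all pairs of states of two connected parts has M^2 entries.
pairwise_entries = real ** 2

print(toy, real, pairwise_entries)
```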
24
Tree-structured Body Model
X1 ∈ {s1, s2, ..., sM}
X = {X1, X2, ..., X10}
[Figure: tree-structured graphical model over body parts X1 to X10, joined by kinematic edges]
Felzenszwalb & Huttenlocher, '00
25
Inference in Tree-structured Model
Prior
Likelihood
Xi
Xj

Prior
Body parts are connected at joints
Relative positions of Xi and Xj

26
Inference in Tree-structured Model
Prior
Likelihood

27
Inference in Tree-structured Model
Prior
Likelihood

28
Inference in Tree-structured Model
Prior
Likelihood

Color
Edges
29
Inference in Tree-structured Model
Prior
Likelihood
Inference in this model can be done using
standard Belief Propagation (BP)
(exact inference)
Message Passing
Belief
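The message passing and belief equations did not survive in this transcript. For reference, the standard discrete BP updates can be written as below; the symbols are assumptions rather than the slide's own notation: \psi_{ij} stands for the pairwise prior, \phi_i for the local likelihood, and A(i) for the neighbours of node i.

```latex
\begin{align}
m_{ij}(X_j) &= \sum_{X_i} \psi_{ij}(X_i, X_j)\, \phi_i(X_i)
               \prod_{k \in A(i) \setminus j} m_{ki}(X_i) \\
b_i(X_i)    &\propto \phi_i(X_i) \prod_{k \in A(i)} m_{ki}(X_i)
\end{align}
```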
30
Tree-structured model limitations
Prior
Likelihood

To make inference tractable, dynamic
programming is used
Only works for a tree-structured model
Requires a relatively coarse discretization
Requires a very simple form of the prior
O(M^2 N) → O(M N)
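The slide does not say how the M^2 factor is removed. One well-known way, assumed here purely for illustration, is a distance-transform style pass that works when the prior is very simple, for example a cost that grows linearly with the displacement between neighbouring states. The sketch below computes a min-sum (negative log) message on a 1D state grid both ways; the costs and the slope `c` are made up.

```python
def message_quadratic(cost, c):
    """Direct min-sum message: out[t] = min over s of cost[s] + c * |s - t|. O(M^2)."""
    M = len(cost)
    return [min(cost[s] + c * abs(s - t) for s in range(M)) for t in range(M)]


def message_linear(cost, c):
    """Same message in O(M) using one forward and one backward sweep.

    Only valid because the pairwise cost is c * |s - t|.
    """
    out = list(cost)
    for t in range(1, len(out)):           # propagate cheap states to the right
        out[t] = min(out[t], out[t - 1] + c)
    for t in range(len(out) - 2, -1, -1):  # propagate cheap states to the left
        out[t] = min(out[t], out[t + 1] + c)
    return out


costs = [7, 3, 9, 8, 1, 6, 9, 9]  # hypothetical per-state costs
slow = message_quadratic(costs, 2)
fast = message_linear(costs, 2)
print(slow, fast)
```

The two functions agree, but the second only works because of the restricted form of the prior, which is the limitation the slide points at.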

31
Comparison
32
Loose-limbed Body Model
X1 ∈ {s1, s2, ..., sM}
X1 ∈ R^5
X = {X1, X2, ..., X10}
[Figure: loose-limbed body model over body parts X1 to X10, joined by kinematic edges]
Sigal, Isard, Sigelman, Black, NIPS'03
33
Inference in Loose-Limbed Model
Prior
Likelihood
34
Inference in Loose-Limbed Model
Prior
Likelihood
Message Passing
Belief
35
Inference in Loose-Limbed Model
Prior
Likelihood
Inference in this model can be done using
standard Belief Propagation (BP)
Integration cannot be done analytically
Message Passing
Belief
36
Inference in Loose-Limbed Model

In tree-structured graphical models, exact
inference can be computed using BP
But, not in our case, where
Variables are continuous
Likelihoods (or priors) are not Gaussian
Graph contains loops
This forces the use of approximate BP inference
algorithms
PAMPAS [M. Isard, '03]
Non-Parametric BP [E. Sudderth, A. Ihler, W.
Freeman, A. Willsky, '03]
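To give a feel for why sampling helps when the message integral has no closed form, here is a minimal Monte Carlo sketch in one dimension. It is not PAMPAS or Non-Parametric BP; it only estimates a single continuous message by importance sampling from a uniform proposal and compares the result with a dense grid sum. The functions `phi` and `psi` and all constants are hypothetical.

```python
import math
import random


def phi(x1):
    """Hypothetical non-Gaussian (bimodal) local likelihood for X1."""
    return math.exp(-0.5 * (x1 - 2.0) ** 2) + math.exp(-0.5 * (x1 + 2.0) ** 2)


def psi(x1, x2, offset=1.0, sigma=0.5):
    """Hypothetical pairwise prior: X2 is expected near X1 + offset."""
    return math.exp(-0.5 * ((x2 - x1 - offset) / sigma) ** 2)


def mc_message(x2, samples, lo, hi):
    """Estimate the integral of psi(x1, x2) * phi(x1) over x1 from uniform samples on [lo, hi]."""
    width = hi - lo
    return width * sum(psi(x1, x2) * phi(x1) for x1 in samples) / len(samples)


def grid_message(x2, lo, hi, steps=4000):
    """Reference value from a midpoint Riemann sum on a fine grid."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x1 = lo + (k + 0.5) * h
        total += psi(x1, x2) * phi(x1)
    return h * total


LO, HI = -8.0, 8.0
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.uniform(LO, HI) for _ in range(20000)]

for x2 in (-1.0, 3.0):
    print(x2, mc_message(x2, samples, LO, HI), grid_message(x2, LO, HI))
```

In a real body model each state is multi-dimensional, so a dense grid is not an option and sample-based approximations of this general kind become necessary.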

[Figure: graphical model over body parts X1 to X10]
37
Loose-limbed Body Model
X1 ∈ R^5
X = {X1, X2, ..., X10}
[Figure: loose-limbed body model over body parts X1 to X10, joined by kinematic edges]
Sigal, Isard, Sigelman, Black, NIPS'03
38
How good are tree-structured models?

Model always prefers undesired hypothesis

39
Tree-structured approaches

Assume the likelihood factors over parts:
F( ) = F1( ) x ... x Fi( ) x ...
This does not hold when parts can occlude each other
40
Occlusion-sensitive Likelihoods

We introduce explicit occlusion modeling into the
likelihood function in the form of hidden
per-pixel binary variables

[Equation: the likelihood F( ) written as a product of per-part terms F1( ) x ...]
Ensures that we can factor the likelihood even
in the presence of occlusions
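A heavily simplified sketch of the idea follows. It is not the likelihood from the talk: pixels are just integer ids with a made-up score, each part covers a set of pixels at a given depth, and the binary visibility of a pixel for a part is 1 only if no nearer part covers it. The names `naive_score`, `visibility` and `occlusion_sensitive_score` are illustrative.

```python
def naive_score(parts, pixel_score):
    """Factored score that ignores occlusion: overlapping parts count a pixel twice."""
    return sum(pixel_score.get(p, 0.0) for part in parts for p in part["pixels"])


def visibility(parts):
    """For each part, the pixels it covers that no nearer part also covers."""
    visible = []
    for part in parts:
        occluded = set()
        for other in parts:
            if other is not part and other["depth"] < part["depth"]:
                occluded |= other["pixels"]
        visible.append(part["pixels"] - occluded)
    return visible


def occlusion_sensitive_score(parts, pixel_score):
    """Each pixel contributes only to the part for which it is visible."""
    return sum(pixel_score.get(p, 0.0) for vis in visibility(parts) for p in vis)


# Hypothetical image evidence: a strong left leg (pixels 0-3), a weaker right leg (6-9).
pixel_score = {p: 2.0 for p in range(0, 4)}
pixel_score.update({p: 1.0 for p in range(6, 10)})

left = {"name": "left leg", "pixels": set(range(0, 4)), "depth": 1}
true_hyp = [left, {"name": "right leg", "pixels": set(range(6, 10)), "depth": 2}]
bad_hyp = [left, {"name": "right leg", "pixels": set(range(0, 4)), "depth": 2}]

print(naive_score(true_hyp, pixel_score), naive_score(bad_hyp, pixel_score))
print(occlusion_sensitive_score(true_hyp, pixel_score),
      occlusion_sensitive_score(bad_hyp, pixel_score))
```

In this toy setup the naive score prefers the hypothesis where both legs explain the same strong evidence, while the occlusion-sensitive score prefers the true one.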
41
Occlusion-sensitive Likelihoods (for the torso)
Is there any part in front of torso?
Vi
F1( )
Vi
Is there any part behind torso?
42
Occlusion-sensitive model

Always prefers the true hypothesis

43
Occlusion-sensitive Loose-limbed Body Model
X1 ∈ R^5
X = {X1, X2, ..., X10}
[Figure: occlusion-sensitive loose-limbed body model over body parts X1 to X10, with kinematic and occlusion edges]
Sigal, Black, CVPR'06
44
2D Pose Estimation
[Figure: distribution and most likely sample for frames 2, 24 and 49, comparing Pictorial Structures, Loose-limbed (No Occlusions) and Loose-limbed (Occlusion-sensitive)]
45
2D Pose Estimation
[Figure: distribution and most likely sample for frames 2, 24 and 49, comparing Pictorial Structures, Loose-limbed (No Occlusions) and Loose-limbed (Occlusion-sensitive)]
46
2D Pose Estimation
47
Quantitative Evaluation

Synchronized marker-based motion capture and
multiocular video dataset
Currently downloaded by >60 groups around the
world

Workshops in NIPS'06, CVPR'07
48
Quantitative Comparison

All algorithms were run on the same data

Subject specific Motion specific
(*) Beyond Trees: Common Factor Models for 2D
Human Pose Recovery, Lan and Huttenlocher, ICCV
2005.
49
Inferring 2D pose

Occlusion-sensitive Loose-limbed body model
allows us to infer the 2D pose reliably (at about
50% overhead)
Even when motions are complex

Moving Camera
50
Summary so far
Occlusion-sensitive Loose-limbed body model
51
Hierarchical Graphical Model Structure
3D
2D
Image
52
Hierarchical Graphical Model Structure
3D
2D
Image
53
Hierarchical Graphical Model Structure
3D
2D
Image
54
Inferring 3D pose from 2D pose

We obtain estimates for the joints automatically
We learn direct probabilistic mapping

55
Inferring 3D pose from 2D pose
Mixture of Experts (MoE)
Sminchisescu et al, '05
Waterhouse et al, '96
56
Inferring 3D pose from 2D pose
We want to estimate a distribution/mapping p(3D
Pose | 2D Pose)
X ∈ R^n
2D Pose
p(Y | X)
Y ∈ R^m
3D Pose
Problem: p(Y | X) is a non-linear mapping, and not
one-to-one
57
Mixture of Experts (MoE)
We want to estimate a distribution/mapping p(3D
Pose | 2D Pose)
X ∈ R^n
2D Pose
p(Y | X)
Y ∈ R^m
3D Pose
Solution: p(Y | X) may be approximated by locally
linear mappings (experts)
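A minimal one-dimensional sketch of a mixture of experts is given below, assuming softmax gates and Gaussian noise around each linear expert. The two experts and their parameters are invented to mimic a forward/backward ambiguity, where the same 2D observation x is consistent with y = x and y = -x; nothing here is taken from the trained model in the talk.

```python
import math


def softmax(zs):
    """Numerically stable softmax."""
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    total = sum(es)
    return [e / total for e in es]


def gaussian(y, mean, std):
    """Density of a 1D normal distribution."""
    return math.exp(-0.5 * ((y - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


# Hypothetical experts: gate logit = w * x + b, prediction mean = a * x + c.
EXPERTS = [
    {"gate": (0.0, 0.0), "a": 1.0, "c": 0.0, "std": 0.1},
    {"gate": (0.0, 0.0), "a": -1.0, "c": 0.0, "std": 0.1},
]


def moe_components(x):
    """Return (weight, mean, std) for every expert at input x."""
    gates = softmax([w * x + b for (w, b) in (e["gate"] for e in EXPERTS)])
    return [(g, e["a"] * x + e["c"], e["std"]) for g, e in zip(gates, EXPERTS)]


def moe_density(y, x):
    """p(y | x) as a gate-weighted sum of the experts' Gaussian predictions."""
    return sum(g * gaussian(y, mean, std) for g, mean, std in moe_components(x))


components = moe_components(0.5)
print(components)
print(moe_density(0.5, 0.5), moe_density(-0.5, 0.5), moe_density(0.0, 0.5))
```

Because the output is a multi-modal distribution rather than a single value, both plausible 3D interpretations are retained instead of being averaged into an implausible one.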
58
How well does MoE model work?

View only: 22 mm
Pose only: 59 mm
Overall: 64 mm

59
Hierarchical 3D Pose Estimation from Single View
Monocular Images
[Figure: image, 2D pose estimation and 3D pose estimation for frames 10, 20 and 50, showing the distribution and the most likely sample]
60
Hierarchical 3D Pose Estimation from Single View
Monocular Images
2D Pose Estimation
3D Pose Estimation
61
Summary so far
Hidden Markov Model (HMM)
62
Hierarchical Graphical Model Structure
3D
2D
Image
t+1
t
t-1
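The effect of linking the model across time steps can be illustrated with a small discrete forward filter. This is only a sketch: the talk uses continuous states, whereas here the pose is reduced to two hypothetical labels, and the transition and likelihood numbers are made up.

```python
def normalize(v):
    """Scale a list of non-negative numbers so that it sums to one."""
    z = sum(v)
    return [x / z for x in v]


def forward_filter(prior, transition, likelihoods):
    """HMM forward filtering: returns the belief over states after each frame."""
    belief = normalize([p * l for p, l in zip(prior, likelihoods[0])])
    beliefs = [belief]
    n = len(prior)
    for lik in likelihoods[1:]:
        # Predict with the temporal prior, then weight by the frame's likelihood.
        predicted = [sum(belief[s] * transition[s][t] for s in range(n)) for t in range(n)]
        belief = normalize([p * l for p, l in zip(predicted, lik)])
        beliefs.append(belief)
    return beliefs


POSES = ["left leg forward", "right leg forward"]
prior = [0.5, 0.5]
transition = [[0.9, 0.1], [0.1, 0.9]]  # poses tend to persist between frames
likelihoods = [
    [0.9, 0.1],  # frame 0: the image is clear
    [0.5, 0.5],  # frame 1: the image is ambiguous
]

beliefs = forward_filter(prior, transition, likelihoods)
per_frame = normalize(likelihoods[1])  # what frame-by-frame estimation would give
print(beliefs[1], per_frame)
```

At the ambiguous frame, per-frame estimation cannot choose between the two poses, while the filtered belief still favours the pose seen in the previous frame.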
63
Benefits of tracking
[Figure: frames 49 and 50, comparing 2D Pose Estimation with 3D Pose Estimation + Tracking]
64
Other Application: Multiocular imagery
Sigal, Bhatia, Roth, Black, Isard, CVPR'04
65
Other Application: Vehicle Detection and
Tracking
Sigal, Zhu, Comaniciu, Black, IWCM'04
66
Contributions

Introduced loose-limbed body model
can deal with continuous-state estimation
can encode rich set of constraints (occlusions,
penetrations, action specific kinematics)
Introduced tractable inference approach for this
model
Used hierarchical representation and inference to
manage complexity of the problem
Quantitative evaluation of human pose estimation

67
Future Work: Better inference methods

Particle Message Passing does not deal well with
multiple modes in the distribution
Mixture Tracking
Inference approaches are relatively slow
Hybrid Monte Carlo filtering

Vermaak, Doucet, Perez, CVPR'03
Choo, Fleet, ICCV'01
68
Future Work: Learning model structure

Learning the model structure (useful for deriving
motion specific models)
Kernel Generalized Variance

Bach, Jordan, NIPS'03
Walking
Stretching
69
Future Work: Deeper Hierarchical Models
3D
2D
Image
70
Future Work: Deeper Hierarchical Models
3D
2D
Features
Image
71
Future Work: Deeper Hierarchical Models
Scene
3D
2D
Features
Natural Language Processing Visual Grammars
F. Han and S.-C. Zhu, ICCV'05
Image
72
Collaborators and Colleagues

Michael J. Black
- Alex Balan
- Stefan Roth
- Sidharth Bhatia
- Ben Sigelman

- Michael Isard

Dorin Comaniciu
Ying Zhu

Horst Haussecker
Trista Chen
Konstantin Radyushkin

73
Thank you !!!