Continuous-state Graphical Models for Object Localization, Pose Estimation and Tracking

1
Continuous-state Graphical Models for Object
Localization, Pose Estimation and Tracking
Leonid Sigal
Department of Computer Science, Brown University
http://www.cs.brown.edu/people/ls/
2
Big picture
  • Computer Vision: Build tools that allow computers
    to reason about the world based on visual inputs
  • (building computational models of the world)
  • Building models of objects that allow us to
    reason about the position, configuration and
    interactions between these objects (e.g. cars,
    buildings, people)

3
Pose Estimation
Find the pose of the body
such that the model explains the image data.
4
Tracking
Time
  • Estimate pose for every time instance
  • Tracking can simply be pose estimation at every
    frame, but it's very costly and ambiguous
  • Using temporal information is useful in
  • localizing the body
  • resolving pose ambiguities

5
Applications
  • Navigation: Automated vehicle navigation,
    obstacle avoidance, robotics
  • Human Computer Interaction: Smart homes
  • Entertainment: Animation, Games
  • Clinical: Rehabilitation medicine
  • Security: Surveillance
  • Understanding: Gesture/Activity recognition

Matrix Trilogy, Warner Bros. Studios
Smart Rooms Project, MIT
SmartTer Project, EPFL
6
Don't believe me?
Within five years "you could use gesture
recognition to get rid of the remote control,"
body tracking (the whole body, not just the
hands) could drive demand for Intel's important
new generation of semiconductors
Justin Rattner, Intel CTO
7
Why is it hard?
  • Appearance/size/shape of people can vary
  • Occlusions
  • High dimensionality
  • Loss of depth information
  • Loose clothing
  • Motion blur

8
Approach
  • Break up a very hard problem into smaller
    manageable pieces
  • Use continuous-state graphical models to model
    the person

Statistical Graphical Models 1,220,000
Sigal, Black, AMDO06
Fashion Models 27,600,000
9
Contributions
  • Define a new and very rich model for modeling
    people and reasoning about their
    pose (loose-limbed body model)
  • Introduce hierarchical strategy for 3D reasoning

10
Related Work: Generative Approaches
  • Local stochastic search (top-down)
  • Part-based approaches (bottom-up)

Felzenszwalb & Huttenlocher, 00
Ramanan, Forsyth, Zisserman, 05
11
Loose-limbed Body Model Preview

12
Graphical Model (Toy Example)
  • Encode Conditional Independence between random
    variables

0.4 0.3 0.0 0.1 0.2
13
Inference in a Graphical Model
  • Finding the most likely values for all unknown
    variables (X1, X2, X3)
  • Brute force algorithm
  • by Hammersley-Clifford Theorem

p(X1=bucket, X2=bucket, X3=bucket)
p(X1=bucket, X2=bucket, X3=tree trunk)
p(X1=face, X2=thigh, X3=calf)
p(X1=calf, X2=calf, X3=calf)
Prior
Likelihood
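As a hedged reading of the Prior and Likelihood labels on this slide, a pairwise Markov random field over the three toy variables would factor as shown here. The potential names \psi and \phi and the edge set E are notation assumed for this sketch; the slide's own equation did not survive in the transcript.

```latex
p(X_1, X_2, X_3 \mid I) \;\propto\;
  \underbrace{\prod_{(i,j) \in E} \psi_{ij}(X_i, X_j)}_{\text{prior}}
  \;\underbrace{\prod_{i=1}^{3} \phi_i(I \mid X_i)}_{\text{likelihood}}
```

Brute-force inference evaluates this product for every joint assignment of (X1, X2, X3) and keeps the best one.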
14
Belief Propagation
  • What is BP?
  • Efficient algorithm for doing inference in
    graphical models
  • For example, on a tree
  • Brute force algorithm: O(M^N)
  • BP is O(N M^2)
  • In this simple example
  • Brute force: 5^3 = 125
  • BP: 3 x 5^2 = 75
  • Real life: 10^28 times faster
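The operation counts on this slide can be checked directly, reading brute force as M^N joint configurations and tree BP as N x M^2 message terms. The "real life" setting below (N = 10 body parts, M = 5,200 states, the 13 x 10 x 8 x 5 grid from the state-space example in this talk) is an assumption; the slide does not say which values it used, but this choice lands on the same order of magnitude.

```python
def brute_force_cost(num_states: int, num_nodes: int) -> int:
    """Number of joint configurations an exhaustive search must score."""
    return num_states ** num_nodes


def bp_tree_cost(num_states: int, num_nodes: int) -> int:
    """Rough count of pairwise terms evaluated by BP on a tree."""
    return num_nodes * num_states ** 2


# Toy example from the slide: 3 variables, 5 labels each.
toy = (brute_force_cost(5, 3), bp_tree_cost(5, 3))

# Assumed "real life" setting: 10 parts, 5,200 discrete states per part.
M = 13 * 10 * 8 * 5
speedup = brute_force_cost(M, 10) // bp_tree_cost(M, 10)

print(toy)
print(f"speedup is about 10^{len(str(speedup)) - 1}")
```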

15
Step 1 of 2 Message Propagation
  • X1 → X2
  • I am sure I am a face, so I think you should be
    some other part of the body

P(X2 = bucket) = 0.0
P(X2 = tree trunk) = 0.0
P(X2 = face) = 0.0
P(X2 = thigh) = 0.6
P(X2 = calf) = 0.4
16
Step 2 of 2 Belief Estimation
  • Merge
  • Information from neighbors
  • Local Information
  • Distribution (belief) over X2
  • Most likely value for X2
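The two steps (message propagation, then belief estimation) can be sketched for a three-node chain X1 - X2 - X3 as follows. All potentials here are hypothetical: the pairwise table PSI was chosen so that the X1 to X2 message reproduces the 0.6 / 0.4 values quoted on the message-propagation slide, and the local evidence PHI is made up for illustration. The brute-force marginal is included only to check that sum-product BP gives the same answer on a tree.

```python
from itertools import product

LABELS = ["bucket", "tree trunk", "face", "thigh", "calf"]
M = len(LABELS)

# Hypothetical local evidence phi_i(x_i) for X1, X2, X3.
PHI = [
    [0.0, 0.0, 1.0, 0.0, 0.0],  # X1 is sure it is a face
    [0.2, 0.2, 0.2, 0.2, 0.2],  # X2 has no local preference
    [0.1, 0.1, 0.1, 0.2, 0.5],  # X3 leans towards calf
]

# Hypothetical pairwise compatibility psi(parent label, child label).
PSI = [
    [0.2, 0.2, 0.2, 0.2, 0.2],  # bucket
    [0.2, 0.2, 0.2, 0.2, 0.2],  # tree trunk
    [0.0, 0.0, 0.0, 0.6, 0.4],  # face
    [0.0, 0.0, 0.1, 0.1, 0.8],  # thigh
    [0.1, 0.1, 0.1, 0.5, 0.2],  # calf
]


def normalize(values):
    total = sum(values)
    return [v / total for v in values]


# Step 1: messages from the two leaves into X2.
m12 = [sum(PHI[0][a] * PSI[a][b] for a in range(M)) for b in range(M)]
m32 = [sum(PHI[2][c] * PSI[b][c] for c in range(M)) for b in range(M)]

# Step 2: belief at X2 merges local evidence with both incoming messages.
belief_x2 = normalize([PHI[1][b] * m12[b] * m32[b] for b in range(M)])
best_x2 = LABELS[belief_x2.index(max(belief_x2))]


def brute_force_marginal_x2():
    """Sum the joint over all 5^3 assignments (for checking only)."""
    marginal = [0.0] * M
    for a, b, c in product(range(M), repeat=3):
        marginal[b] += PHI[0][a] * PHI[1][b] * PHI[2][c] * PSI[a][b] * PSI[b][c]
    return normalize(marginal)
```

With these made-up numbers the belief at X2 puts all its mass on thigh and calf, and the brute-force marginal agrees with it.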

17
Loopy-graphical Models
  • Belief Propagation can be used to get an
    approximate solution
  • Exact solution is intractable
  • Inference on a graph
  • BP is O(NMC)

[Figure: graphical model over nodes X1 to X5 with image observations I1 to I5]
18
Generative Approaches
  • Local stochastic search (top-down)
  • Part-based approaches (bottom-up)

Felzenszwalb & Huttenlocher, 00
Ramanan, Forsyth, Zisserman, 05
19
Tree-structured Body Model
X1 ∈ {s1, s2, ..., sM}
X = {X1, X2, ..., X10}
[Figure: graph over body parts X1 to X10 with kinematic edges, each part linked to an image observation I]
Felzenszwalb & Huttenlocher, 00
20
State-space
Felzenszwalb & Huttenlocher, 00
21
State-space
Felzenszwalb & Huttenlocher, 00
22
State-space
This example: 13 x 10 positions x 8
rotations x 5 scales = 5,200
Felzenszwalb & Huttenlocher, 00
23
State-space
Real case: 100 x 100 positions x 20
rotations x 5 scales = 1,000,000
Felzenszwalb & Huttenlocher, 00
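The state counts on the last two slides are plain products of the grid dimensions. A short check, with the dense pairwise table size added as an illustration of why a naive O(M^2) term per edge becomes expensive (the function name is hypothetical):

```python
from math import prod


def state_space_size(*grid_dims: int) -> int:
    """Number of discrete states for one part: product of the grid sizes."""
    return prod(grid_dims)


toy_states = state_space_size(13, 10, 8, 5)       # positions x rotations x scales
real_states = state_space_size(100, 100, 20, 5)

# A dense table over every pair of states for a single edge of the model.
pairwise_entries = real_states ** 2

print(toy_states, real_states, pairwise_entries)
```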
24
Tree-structured Body Model
X1 ∈ {s1, s2, ..., sM}
X = {X1, X2, ..., X10}
[Figure: graph over body parts X1 to X10 with kinematic edges]
Felzenszwalb & Huttenlocher, 00
25
Inference in Tree-structured Model
Prior
Likelihood
Xi
Xj
  • Prior
  • Body parts are connected at joints
  • Relative positions of Xi and Xj

26
Inference in Tree-structured Model
Prior
Likelihood

27
Inference in Tree-structured Model
Prior
Likelihood

28
Inference in Tree-structured Model
Prior
Likelihood

Color
Edges
29
Inference in Tree-structured Model
Prior
Likelihood
Inference in this model can be done using
standard Belief Propagation (BP)
(exact inference)
Message Passing
Belief
30
Tree-structured model limitations
Prior
Likelihood
  • To make the inference tractable dynamic
    programming is used
  • Only works for a tree structured model
  • Requires a relatively coarse discretization
  • Requires very simple form of the prior
  • O(M^2 N) → O(M N)
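The slide does not say how the drop from quadratic to linear cost in M is obtained. One well-known route in the pictorial structures literature is the generalized distance transform, which needs the pairwise term to be a simple (for example quadratic) function of the relative position; that restriction matches the "very simple form of the prior" limitation listed on this slide. The sketch below is a one-dimensional version in the min-sum (negative log) domain, with a brute-force O(M^2) version for comparison. Treat it as an illustration under those assumptions, not as the method used in the talk.

```python
import math


def distance_transform_1d(f):
    """Return d[q] = min_p ((q - p)**2 + f[p]) for every q in O(len(f)) time."""
    n = len(f)
    v = [0] * n            # roots of the parabolas forming the lower envelope
    z = [0.0] * (n + 1)    # boundaries between consecutive envelope parabolas
    k = 0
    z[0], z[1] = -math.inf, math.inf
    for q in range(1, n):
        while True:
            p = v[k]
            # Where the parabola rooted at q meets the one rooted at p.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s > z[k]:
                break
            k -= 1         # parabola p is hidden by q, drop it
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf
    d = [0] * n
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        p = v[k]
        d[q] = (q - p) ** 2 + f[p]
    return d


def brute_force_1d(f):
    """Same quantity computed the O(M^2) way."""
    n = len(f)
    return [min((q - p) ** 2 + f[p] for p in range(n)) for q in range(n)]


costs = [4, 0, 9, 1, 7, 3, 8, 2]   # made-up per-state costs
fast = distance_transform_1d(costs)
slow = brute_force_1d(costs)
```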

31
Comparison
32
Loose-limbed Body Model
X1 ∈ {s1, s2, ..., sM}
X1 ∈ R^5
X = {X1, X2, ..., X10}
[Figure: graph over body parts X1 to X10 with kinematic edges]
Sigal, Isard, Sigelman, Black, NIPS03
33
Inference in Loose-Limbed Model
Prior
Likelihood
34
Inference in Loose-Limbed Model
Prior
Likelihood
Message Passing
Belief
35
Inference in Loose-Limbed Model
Prior
Likelihood
Inference in this model can be done using
standard Belief Propagation (BP)
Integration cannot be done analytically
Message Passing
Belief
36
Inference in Loose-Limbed Model
  • In tree-structured graphical models, exact
    inference can be computed using BP
  • But, not in our case, where
  • Variables are continuous
  • Likelihoods (or priors) are not Gaussian
  • Graph contains loops
  • This forces the use of approximate BP inference
    algorithms
  • PAMPAS (M. Isard, 03)
  • Non-Parametric BP (E. Sudderth, A. Ihler, W.
    Freeman, A. Willsky, 03)
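When states are continuous, a BP message is an integral over the sending node's state, and sample-based methods replace that integral with an average over samples. The sketch below shows only this underlying Monte Carlo idea in one dimension; it is not PAMPAS or Non-Parametric BP. Gaussians are used purely so that a closed-form answer exists to compare against, and all names and parameter values are assumptions.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mc_message(x_j, samples, psi):
    """Estimate m(x_j) = integral of psi(x_i, x_j) * phi(x_i) dx_i,
    using samples x_i drawn from the (normalized) local term phi."""
    return sum(psi(x_i, x_j) for x_i in samples) / len(samples)


random.seed(0)
MU, STD_LOCAL, STD_PAIR = 0.0, 1.0, 1.0

# Samples from phi(x_i) = N(x_i; MU, STD_LOCAL^2).
samples = [random.gauss(MU, STD_LOCAL) for _ in range(20000)]


def psi(x_i, x_j):
    """Pairwise term: x_j is expected to lie near x_i."""
    return gaussian_pdf(x_j, x_i, STD_PAIR)


estimate = mc_message(0.5, samples, psi)
# For Gaussians the integral is itself a Gaussian with summed variances.
exact = gaussian_pdf(0.5, MU, math.sqrt(STD_LOCAL ** 2 + STD_PAIR ** 2))
```

In the non-Gaussian case described on this slide no such closed form exists, which is why the sampled estimate is the only one available.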

[Figure: graph over body parts X1 to X10]
37
Loose-limbed Body Model
X1 ∈ R^5
X = {X1, X2, ..., X10}
[Figure: graph over body parts X1 to X10 with kinematic edges]
Sigal, Isard, Sigelman, Black, NIPS03
38
How good are tree-structured models?
  • Model always prefers the undesired hypothesis

39
Tree-structured approaches
  • Assume likelihood factors

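[Equation: the likelihood F( ) written as a product F1( ) x ... x Fi( ) of per-part terms; the image arguments were lost in extraction]
When Parts Can Occlude Each Other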
40
Occlusion-sensitive Likelihoods
  • We introduce explicit occlusion modeling into the
    likelihood function in the form of hidden
    per-pixel binary variables

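[Equation: the likelihood F( ) written as a product of per-part terms starting with F1( ); the image arguments were lost in extraction]
Ensures that we can factor the likelihood even
in the presence of occlusions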
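A toy illustration of why per-pixel visibility helps: if every pixel is assigned to at most one part, each part's likelihood term can be evaluated on its own pixels and the terms multiply without double counting. In the sketch below the visibility masks are computed from known depths in a 1D image; in the talk they are hidden variables to be inferred, so this is only an analogy. The function name, the interval encoding and the depth convention are all assumptions.

```python
def visibility_masks(parts, width):
    """parts: list of (start, end, depth) pixel intervals, end exclusive.
    Smaller depth means closer to the camera.
    Returns one 0/1 mask per part marking the pixels where it is in front."""
    masks = [[0] * width for _ in parts]
    for u in range(width):
        covering = [
            (depth, index)
            for index, (start, end, depth) in enumerate(parts)
            if start <= u < end
        ]
        if covering:
            _, front = min(covering)   # nearest part owns the pixel
            masks[front][u] = 1
    return masks


# A "torso" spanning pixels 0..5 and an "arm" in front of it spanning 4..8.
parts = [(0, 6, 2.0), (4, 9, 1.0)]
torso_mask, arm_mask = visibility_masks(parts, width=10)
```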
41
Occlusion-sensitive Likelihoods(for the torso)
Is there any part in front of torso?
Vi
F1( )
Vi
Is there any part behind torso?
42
Occlusion-sensitive model
  • Always prefers the true hypothesis

43
Occlusion-sensitive Loose-limbed Body Model
X1 ∈ R^5
X = {X1, X2, ..., X10}
[Figure: graph over body parts X1 to X10 with kinematic and occlusion edges]
Sigal, Black, CVPR06
44
2D Pose Estimation
[Figure: 2D pose estimates at Frames 2, 24 and 49 for Pictorial Structures, Loose-limbed (No Occlusions) and Loose-limbed (Occlusion-sensitive); panels show the distribution and the most likely sample]
45
2D Pose Estimation
[Figure: 2D pose estimates at Frames 2, 24 and 49 for Pictorial Structures, Loose-limbed (No Occlusions) and Loose-limbed (Occlusion-sensitive); panels show the distribution and the most likely sample]
46
2D Pose Estimation
47
Quantitative Evaluation
  • Synchronized marker based motion capture and
    multiocular video dataset
  • Currently downloaded by more than 60 groups around the
    world

Workshops in NIPS06, CVPR07
48
Quantitative Comparison
  • All algorithms were run on the same data

Subject specific Motion specific
(*) Beyond Trees: Common Factor Models for 2D
Human Pose Recovery, Lan and Huttenlocher, ICCV
2005.
49
Inferring 2D pose
  • Occlusion-sensitive Loose-limbed body model
    allows us to infer the 2D pose reliably (at about
    50% overhead)
  • Even when motions are complex

Moving Camera
50
Summary so far
Occlusion-sensitive Loose-limbed body model
51
Hierarchical Graphical Model Structure
3D
2D
Image
52
Hierarchical Graphical Model Structure
3D
2D
Image
53
Hierarchical Graphical Model Structure
3D
2D
Image
54
Inferring 3D pose from 2D pose
  • We obtain estimates for the joints automatically
  • We learn direct probabilistic mapping

55
Inferring 3D pose from 2D pose
Mixture of Experts (MoE)
Sminchisescu et al, 05
Waterhouse et al, 96
56
Inferring 3D pose from 2D pose
We want to estimate a distribution/mapping p(3D
Pose | 2D Pose)
X ∈ R^n
2D Pose
p(Y | X)
Y ∈ R^m
3D Pose
Problem: p(Y | X) is a non-linear mapping, and not
one-to-one
57
Mixture of Experts (MoE)
We want to estimate a distribution/mapping p(3D
Pose | 2D Pose)
X ∈ R^n
2D Pose
p(Y | X)
Y ∈ R^m
3D Pose
Solution: p(Y | X) may be approximated by locally
linear mappings (experts)
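A minimal sketch of the idea, assuming scalar X and Y, softmax gates and Gaussian experts: the conditional density is a gate-weighted sum of linear-Gaussian predictions, so a single 2D input can map to several distinct 3D hypotheses. The expert parameters below are hand-set for illustration, not learned, and the dictionary layout is hypothetical.

```python
import math


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_density(y, x, experts):
    """p(y | x) = sum_k g_k(x) * N(y; a_k * x + b_k, sigma_k^2)."""
    gates = softmax([e["gate_w"] * x + e["gate_b"] for e in experts])
    density = 0.0
    for gate, e in zip(gates, experts):
        mean = e["a"] * x + e["b"]           # this expert's linear prediction
        z = (y - mean) / e["sigma"]
        density += gate * math.exp(-0.5 * z * z) / (e["sigma"] * math.sqrt(2.0 * math.pi))
    return density


# Two experts that disagree: the same x maps to y = +x or y = -x.
EXPERTS = [
    {"a": 1.0, "b": 0.0, "sigma": 0.1, "gate_w": 0.0, "gate_b": 0.0},
    {"a": -1.0, "b": 0.0, "sigma": 0.1, "gate_w": 0.0, "gate_b": 0.0},
]

at_mode_a = moe_density(1.0, 1.0, EXPERTS)
at_mode_b = moe_density(-1.0, 1.0, EXPERTS)
between = moe_density(0.0, 1.0, EXPERTS)
```

The density is high at both y = 1 and y = -1 and essentially zero in between, which a single regression function could not represent.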
58
How well does MoE model work?
  • View only: 22 mm
  • Pose only: 59 mm
  • Overall: 64 mm

59
Hierarchical 3D Pose Estimation from Single View
Monocular Images
[Figure: Frames 10, 20 and 50; panels show the image, the 2D pose estimation and the 3D pose estimation, each as a distribution and its most likely sample]
60
Hierarchical 3D Pose Estimation from Single View
Monocular Images
2D Pose Estimation
3D Pose Estimation
61
Summary so far
Hidden Markov Model (HMM)
62
Hierarchical Graphical Model Structure
3D
2D
Image
t+1
t
t-1
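How linking time steps resolves pose ambiguities can be sketched with the forward (filtering) recursion of a discrete hidden Markov model. The talk works with continuous pose states; the two-state stand-in below, its transition matrix and its per-frame likelihoods are all made up for illustration.

```python
def forward_filter(prior, transition, likelihoods):
    """Return p(x_t | observations up to t) for each frame of a discrete HMM.
    transition[a][b] is the probability of moving from state a to state b."""
    beliefs = []
    belief = list(prior)
    for t, likelihood in enumerate(likelihoods):
        if t > 0:
            n = len(belief)
            # Predict: push the previous belief through the motion model.
            belief = [
                sum(belief[a] * transition[a][b] for a in range(n))
                for b in range(n)
            ]
        # Update: weight by how well each state explains the current frame.
        belief = [p * l for p, l in zip(belief, likelihood)]
        total = sum(belief)
        belief = [p / total for p in belief]
        beliefs.append(belief)
    return beliefs


# States: 0 = "left leg forward", 1 = "right leg forward".
PRIOR = [0.5, 0.5]
TRANSITION = [[0.9, 0.1], [0.1, 0.9]]      # poses change slowly between frames
LIKELIHOODS = [[0.9, 0.1], [0.5, 0.5]]     # frame 2 alone is ambiguous

beliefs = forward_filter(PRIOR, TRANSITION, LIKELIHOODS)
```

Frame 2 on its own cannot tell the two poses apart, but the filtered belief still favors state 0 because of the evidence carried over from frame 1.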
63
Benefits of tracking
[Figure: Frames 49 and 50, comparing 2D pose estimation, 3D pose estimation and tracking]
64
Other Applications: Multiocular imagery
Sigal, Bhatia, Roth, Black, Isard, CVPR04
65
Other Applications: Vehicle Detection and
Tracking
Sigal, Zhu, Comaniciu, Black, IWCM04
66
Contributions
  • Introduced loose-limbed body model
  • can deal with continuous-state estimation
  • can encode rich set of constraints (occlusions,
    penetrations, action specific kinematics)
  • Introduced tractable inference approach for this
    model
  • Used hierarchical representation and inference to
    manage complexity of the problem
  • Quantitative evaluation of human pose estimation

67
Future Work: Better inference methods
  • Particle Message Passing does not deal well with
    multiple modes in the distribution
  • Mixture Tracking
  • Inference approaches are relatively slow
  • Hybrid Monte Carlo filtering

Vermaak, Doucet, Perez, CVPR03
Choo, Fleet, ICCV01
68
Future Work: Learning model structure
  • Learning the model structure (useful for deriving
    motion specific models)
  • Kernel Generalized Variance

Bach, Jordan, NIPS03
Walking
Stretching
69
Future Work: Deeper Hierarchical Models
3D
2D
Image
70
Future Work: Deeper Hierarchical Models
3D
2D
Features
Image
71
Future Work: Deeper Hierarchical Models
Scene
3D
2D
Features
Natural Language Processing Visual Grammars
F. Han and S.-C. Zhu, ICCV05
Image
72
Collaborators and Colleagues
  • Michael J. Black
  • - Alex Balan
  • - Stefan Roth
  • - Sidharth Bhatia
  • - Ben Sigelman

- Michael Isard
  • Dorin Comaniciu
  • Ying Zhu
  • Horst Haussecker
  • Trista Chen
  • Konstantin Radyushkin

73
Thank you !!!
74
Contributions
  • Introduced loose-limbed body model
  • can deal with continuous-state estimation
  • can encode rich set of constraints (occlusions,
    penetrations, action specific kinematics)
  • Introduced tractable inference approach for this
    model
  • Used hierarchical representation and inference to
    manage complexity of the problem
  • Quantitative evaluation of human pose estimation