Title: GENERALIZED DISTANCE TRANSFORM
1. GENERALIZED DISTANCE TRANSFORM
- A linear time algorithm and its application in
fitting articulated body models
2. OUTLINE
- Distance Transform
- Generalized Distance Transform
- Linear time algorithm for Euclidean distance
- Other distances
- Application of GDT
- Efficient matching of articulated body models
3. DISTANCE TRANSFORM
Defined for a set of points P on a grid G, with P
a subset of G
[Figure: grid G with points p and q]
4. EXAMPLE
[Figure: example on grid G with points p and q]
5. EXAMPLES
- Chamfer
- Hausdorff
- Hough
- Often used in binary (edge) image matching
6. GENERALIZED DISTANCE TRANSFORM
Instead of the binary indicator function 1(q),
we can assign a soft membership f(q) of all grid
elements to P
f(q) is sampled on the grid G. f(q) does not have
to be a 2D image; it can represent any
D-dimensional, discrete space that encodes
spatial relationships through d(p,q)
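The formula on this slide did not survive extraction. A minimal sketch of the idea, assuming the definition D_f(q) = min over p of (d(p,q) + f(p)) from the Felzenszwalb and Huttenlocher report listed in the references, is given below. The function names `gdt_brute_force`, `indicator`, and `abs_dist` are illustrative only, and the grid is one-dimensional for brevity.

```python
import math


def gdt_brute_force(f, dist):
    """Quadratic-time generalized distance transform on a 1D grid 0..n-1.

    Assumed definition: D_f(q) = min over p of (dist(p, q) + f(p)).
    """
    n = len(f)
    return [min(dist(p, q) + f[p] for p in range(n)) for q in range(n)]


def indicator(points, n):
    """Binary case: cost 0 at grid points in P, infinity everywhere else."""
    return [0.0 if q in points else math.inf for q in range(n)]


def abs_dist(p, q):
    """One-dimensional distance between grid locations p and q."""
    return abs(p - q)


# Standard distance transform of P = {1, 5} on a grid of 7 cells.
binary_dt = gdt_brute_force(indicator({1, 5}, 7), abs_dist)

# Soft membership: every cell carries its own finite cost f(q).
soft_dt = gdt_brute_force([3.0, 0.5, 2.0, 4.0, 4.0, 0.0, 1.0], abs_dist)
```

With the indicator function the result is the ordinary distance to the nearest point of P; with finite costs every cell competes, which is the "soft membership" case.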
7. APPLICATIONS OF GDT
- Feature matching / tracking
- f(q) can represent a D-dimensional feature vector
at location q, and d(p,q) is a displacement in
the image space
- Dynamic Programming / stereo matching
- f(p) can represent the accumulated cost of reaching
state p, and d(p,q) is a transition cost to
move from state p to state q
- $f(q) = b(q) + \min_p \big( f(p) + d(p,q) \big)$
- Belief Propagation / MRFs
- Max product (negative log):
$m_{j \to i}(x_i) = \min_{x_j} \Big( \psi_j(x_j) + \psi_{ji}(x_j - x_i) + \sum_{k \in N(j) \setminus i} m_{k \to j}(x_j) \Big)$
8. WHY SO SLOW?
- Generalized DT computes for each grid point p the
distance to all other grid points q
- Its complexity is O(n^2) in the number of grid
locations n
- Intractable for problems with a large number of
discrete locations
9. MIN CONVOLUTION
Speed-up by seeing DT as Min-Convolution
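For reference, the min-convolution view can be written out as below. The operator symbol and the function g are notation assumed here, following the Felzenszwalb and Huttenlocher report listed in the references. The identification holds when the distance depends only on the displacement, that is, d(p,q) = g(q - p).

```latex
(f \otimes g)(q) = \min_{p} \big( f(p) + g(q - p) \big),
\qquad
D_f(q) = \min_{p} \big( f(p) + d(p, q) \big) = (f \otimes g)(q)
```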
10. LOWER ENVELOPE
- Min Convolution is the lower envelope of cones
placed at each p
- Example 1
- One Dimension
- Euclidean Distance
[Figure: cones placed at each p on a one-dimensional grid; axes q and f(q)]
Remember: in the case of standard distance
transforms, all cones would be rooted either at
zero (where there is a pixel) or at infinity (where
there is no pixel)
11. LOWER ENVELOPE
- Example 2
- One Dimension
- Squared Euclidean (the cones become parabolas)
- Once computed, the distance transform on the grid
can be sampled from the lower envelope in linear
time
12. COMPUTING THE LOWER ENVELOPE
Add parabola at first grid point
[Figure: first parabola over the q axis]
13. COMPUTING THE LOWER ENVELOPE
Add second parabola at second grid point, and
compute intersection with previous parabola
[Figure: parabola v1, new parabola at q, intersection s]
14. COMPUTING THE LOWER ENVELOPE
Insert height and intersection point in arrays v
and z
[Figure: parabolas v1 and v2 with intersection z2]
15. COMPUTING THE LOWER ENVELOPE
Add third parabola at third grid point, and
compute intersection with previous parabola
[Figure: parabolas v1 and v2 with intersection z2, new parabola at q, new intersection s]
16. COMPUTING THE LOWER ENVELOPE
Since the new intersection is to the right of the
previous intersection, insert height and
intersection point in arrays v and z
[Figure: parabolas v1, v2, v3 with intersections z2 and z3]
17. COMPUTING THE LOWER ENVELOPE
Now consider the case when the new intersection
is to the left of the previous intersection
[Figure: parabolas v1 and v2 with intersection z2, new parabola at q, new intersection s to the left of z2]
18. COMPUTING THE LOWER ENVELOPE
Delete previous parabola and its intersection
from arrays v and z, and compute intersection with
the last parabola in array v
[Figure: parabola v1, new parabola at q, intersection s]
19. COMPUTING THE LOWER ENVELOPE
Now insert height and intersection point in
arrays v and z
[Figure: parabolas v1 and v2 with intersection z2]
20. COMPUTATIONAL COMPLEXITY
- The algorithm has two steps
- 1) Compute Lower Envelope
- For each grid location
- One insertion for parabola and intersection point
- Each parabola and its intersection point is
deleted at most once over the whole pass
- Hence, O(n) for n grid locations
- 2) Sample from Lower Envelope
- O(n)
So, total complexity of O(n)!
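The two steps above can be sketched in Python as follows. This is a minimal sketch of the one-dimensional squared Euclidean case, written after the algorithm in the Felzenszwalb and Huttenlocher report listed in the references, not a copy of their code. The arrays `v` and `z` follow the slides; the function name `dt1d` is an assumption. It assumes every value of f is finite, so a large finite constant such as 1e20 (rather than infinity) would stand in for "no point here".

```python
import math


def dt1d(f):
    """Squared Euclidean distance transform of a sampled 1D function f.

    Returns d with d[q] = min over p of ((q - p)**2 + f[p]).
    Assumes all values in f are finite.
    """
    n = len(f)
    v = [0] * n            # grid locations of parabolas in the lower envelope
    z = [0.0] * (n + 1)    # boundaries between consecutive parabolas
    k = 0                  # index of the rightmost parabola in the envelope
    z[0], z[1] = -math.inf, math.inf

    # Step 1: build the lower envelope, one parabola per grid location.
    for q in range(1, n):
        while True:
            p = v[k]
            # Intersection of the parabola rooted at q with the one at p.
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s <= z[k]:
                k -= 1     # parabola p is hidden: delete it and retry
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf

    # Step 2: sample the envelope at every grid location.
    d = [0.0] * n
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        p = v[k]
        d[q] = (q - p) ** 2 + f[p]
    return d


example = dt1d([4.0, 1.0, 6.0, 0.0, 9.0, 2.0])
```

Each grid location is pushed onto `v` once and popped at most once, which is where the linear bound comes from even though a single iteration may pop several parabolas.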
21. ARBITRARY DIMENSIONS
- Consider 2D grid
- $D_f(x,y) = \min_{x'} \big( (x - x')^2 + D_{f|x'}(y) \big)$
- Any d-dimensional DT can be performed as d
one-dimensional distance transforms in O(dn) time
$D_{f|x'}$ is the one-dimensional DT along the column
indexed by x'
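A minimal sketch of the two-pass scheme for a 2D grid follows, assuming the squared Euclidean distance and finite costs. The helper `dt1d` is the one-dimensional lower-envelope transform, repeated here so the block is self-contained; `dt2d` and the `grid[y][x]` layout are assumptions made for illustration.

```python
import math


def dt1d(f):
    """1D squared Euclidean distance transform via the lower envelope."""
    n = len(f)
    v = [0] * n
    z = [0.0] * (n + 1)
    k = 0
    z[0], z[1] = -math.inf, math.inf
    for q in range(1, n):
        while True:
            p = v[k]
            s = ((f[q] + q * q) - (f[p] + p * p)) / (2 * q - 2 * p)
            if s <= z[k]:
                k -= 1
            else:
                break
        k += 1
        v[k] = q
        z[k] = s
        z[k + 1] = math.inf
    d = [0.0] * n
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]
    return d


def dt2d(grid):
    """2D transform of grid[y][x]: 1D pass down columns, then along rows."""
    rows, cols = len(grid), len(grid[0])
    # Pass 1: transform each column x independently; col_pass[x][y].
    col_pass = [dt1d([grid[y][x] for y in range(rows)]) for x in range(cols)]
    # Pass 2: transform each row of the column-pass result.
    return [dt1d([col_pass[x][y] for x in range(cols)]) for y in range(rows)]


result = dt2d([[5.0, 2.0, 7.0],
               [1.0, 9.0, 3.0],
               [8.0, 4.0, 0.0]])
```

Every cell is touched once per dimension, so the cost stays linear in the number of grid cells for a fixed number of dimensions.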
22. 2D EXAMPLE
23. OTHER DISTANCES
- So far only Euclidean distances shown
- Other distances realized as a combination of
linear, quadratic and box distances
- Min of any constant number of linear and
quadratic functions, with or without truncation
- E.g., multiple segments
- Gaussian approximation with four min convolutions
using box distances
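As one illustration of a non-quadratic case, the sketch below computes a one-dimensional min convolution with a linear distance c*|p - q| using a forward and a backward pass, with optional truncation. This is the standard two-pass scheme, shown under the assumption that the distance depends only on |p - q|; the function name `min_conv_linear` and its parameters `c` and `trunc` are hypothetical.

```python
def min_conv_linear(f, c=1.0, trunc=None):
    """Return d with d[q] = min over p of (min(c * |p - q|, trunc) + f[p]).

    If trunc is None the distance is not truncated.
    """
    d = list(f)
    n = len(d)
    # Forward pass: propagate costs from left to right.
    for q in range(1, n):
        d[q] = min(d[q], d[q - 1] + c)
    # Backward pass: propagate costs from right to left.
    for q in range(n - 2, -1, -1):
        d[q] = min(d[q], d[q + 1] + c)
    # Truncation: no result can exceed the global minimum plus the cap.
    if trunc is not None:
        cap = min(f) + trunc
        d = [min(x, cap) for x in d]
    return d


costs = [3.0, 0.5, 2.0, 4.0, 4.0, 0.0, 1.0]
linear = min_conv_linear(costs)
truncated = min_conv_linear(costs, trunc=1.2)
```

The truncation step relies on the identity min_p(min(c|p - q|, T) + f(p)) = min(min_p(c|p - q| + f(p)), min_p f(p) + T), so it adds only one more linear pass.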
24. ILLUSTRATIVE RESULTS
Borrowed from Dan Huttenlocher
- Image restoration using MRF formulation with
truncated quadratic clique potentials
- Simply not practical with conventional
techniques, message updates 256^2
- Fast quadratic min convolution technique makes
it feasible
- A multi-grid technique can speed up further
- Powerful formulation, largely abandoned for such
problems
25. Illustrative Results
Borrowed from Dan Huttenlocher
- Pose detection and object recognition
- Sites are parts of an articulated object such as
limbs of a person
- Labels are locations of each part in the image
- Millions of labels, conventional quadratic time
methods do not apply
- Compatibilities are spring-like
26. FITTING OF HUMAN BODY MODELS
27. THE GENERAL APPROACH
- Body parts model appearance
- Graph G = (V,E) models deformation of linked limbs,
with V the set of part vertices and E the set of edges
connecting vertices
- The best fit minimizes the sum of the match cost of
each limb and the deformation cost of the body structure
28. DYNAMIC PROGRAMMING
- If the graph has tree structure we can reformulate in
recursive form -> Dynamic Programming (DP)
- DP is appealing because it gives a global
solution (on a discretized search space)
- However, DP runs in polynomial time O(h^2 n), with
n the number of parts and h the number of
possible locations for each part
- h usually is huge, often hundreds of thousands
(x,y,s,θ)
- If each of (x,y,s,θ) has 20 discrete states, then
we have h = 20^4 = 160,000!
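A quick check of the arithmetic above, assuming 20 discrete states for each of the four parameters as stated on the slide:

```latex
h = 20^4 = 160{,}000, \qquad h^2 = 2.56 \times 10^{10}
```

So a quadratic-time pass over all location pairs for a single pair of connected parts already needs on the order of 2.56e10 evaluations, compared with about 1.6e5 for a pass that is linear in h.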
29. DP FOR TREE-STRUCTURED MODELS
- Match quality for leaf nodes
- Match quality for other nodes
- Best location for root node
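The formulas for these three quantities did not survive extraction. The sketch below assumes the usual recurrences: for a part j with parent i, B_j(l_i) = min over l_j of (m_j(l_j) + d(l_i, l_j) + sum of B_c(l_j) over the children c of j), and the root location minimizes m_r(l_r) plus the B values of its children. Locations are reduced to a small 1D grid, the deformation cost is a plain squared difference, and all names (`best_root_fit`, `match`, `children`, `deform`) are hypothetical.

```python
def best_root_fit(match, children, root, deform):
    """Return (best root location, total cost) for a tree-structured model.

    match[j][l]  : match cost of part j at location l
    children[j]  : list of child parts of j (missing key means leaf)
    deform(a, b) : deformation cost between parent at a and child at b
    """
    h = len(match[root])

    def node_cost(j):
        # Match cost of part j plus the best cost of every subtree below it.
        cost = list(match[j])
        for c in children.get(j, []):
            child = node_cost(c)
            for li in range(h):
                # Brute-force minimum over child locations: O(h) per li.
                cost[li] += min(child[lj] + deform(li, lj) for lj in range(h))
        return cost

    total = node_cost(root)
    best = min(range(h), key=total.__getitem__)
    return best, total[best]


match = {"torso": [2, 0, 3, 5], "arm": [4, 4, 0, 1], "leg": [0, 6, 6, 6]}
children = {"torso": ["arm", "leg"]}
fit = best_root_fit(match, children, "torso", lambda a, b: (a - b) ** 2)
```

The inner minimum is the quadratic-time step: it has the form of a generalized distance transform of the child's cost, which is what allows it to be swapped for a linear-time transform.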
30. MATCH COST AS DISTANCE TRANSFORM
- Recall Generalized Distance Transform
- Compare to match cost function
31. ORIGINAL BODY CONFIGURATION
- Locations of two connected parts
- Joint probability of both parts
given deformation constraints
32. TRANSFORMED BODY CONFIGURATION
- Project distribution over angles onto 2D unit
vector representation
- Now all parameters are in a grid and modeled as a
multivariate Gaussian with zero mean and
variances specified in the diagonal covariance matrix
Dij
- Distance in grid is given as the
Mahalanobis distance Dij over transformed joint
locations Tij(li) and Tji(lj)
33. SUMMARY
- Now linear instead of quadratic time to compute
match costs between child and parent limbs
- Did not prune away search space (still global
solution!)
- Search space only got a little bigger (about four
times) due to unit vector representation of limb
orientation
- 32 discrete angles represented in 11x11 grid
34. REFERENCES
- Daniel Huttenlocher
- http://www.cs.cornell.edu/dph/
- Pedro Felzenszwalb
- http://people.cs.uchicago.edu/pff/
- Distance Transforms of Sampled Functions. Pedro
F. Felzenszwalb and Daniel P. Huttenlocher.
Cornell Computing and Information Science
TR2004-1963.
- Pictorial Structures for Object Recognition. Pedro
F. Felzenszwalb and Daniel P. Huttenlocher.
Intl. Journal of Computer Vision, 61(1), pp.
55-79, January 2005.
35. OTHER REFERENCES
- Stereo Image Restoration
- Efficient Belief Propagation for Early
Vision. Pedro F. Felzenszwalb and Daniel P.
Huttenlocher. International Journal of Computer
Vision, Vol. 70, No. 1, October 2006.
- Higher Order Markov Random Fields
- Efficient Belief Propagation with Learned
Higher-Order Markov Random Fields. D. Huttenlocher,
X. Lan, S. Roth and M. Black. Proceedings of
ECCV, 2006.
- www.cs.ubc.ca/nando/nipsfast/slides/dt-nips04.pdf
- Image Segmentation
- Efficient Graph-Based Image Segmentation. Pedro
F. Felzenszwalb and Daniel P. Huttenlocher.
International Journal of Computer Vision, Volume
59, Number 2, September 2004.
36. Thanks!
37. MATCH COST AS DISTANCE TRANSFORM
- Distance p(x,y) in grid is given as Mahalanobis
distance Mij over model deformation parameters
$l_j = (x, y, s, \theta)^T$