Title: Segmentation by Model Fitting
1 Segmentation by Model Fitting
- Choose a parametric object (form) to represent
spatially distributed collections of tokens.
  - Lines
  - Circles, ellipses
  - Whatever
- Three interesting questions
  - Which form best represents these tokens?
  - How to map tokens to objects?
  - How many objects are there?
2 Hough Transform
The Notion: It is often simpler to transform
a problem to another domain, solve it there,
and come back. We've been doing this with
time- and frequency-domain concepts
(Fourier) all our lives. Hough Transforms
exploit the fact that a large analytic curve
may encompass many pixels in image space but
be characterized by only a few parameters.
3 Hough
Advantage: The Hough Transform can detect
lines or curves that are very broken (after
initial edge detection, for example).
Disadvantage: HTs can only detect lines or
curves that are analytically specifiable, or that
can be represented in a template-like form
(GHT, Ballard). Even for the GHT, the
implementation is a bit awkward, and you have
to know what you're looking for. So the
Hough Transform is primarily a
hypothesize-and-test tool.
4 HT Basic Line Finding
5 Consider a straight line. The HT maps all the
points on the line onto a single point in the
parameter space. (See drawing.) A line in
image space and a point in parameter space are
a Hough Transform pair. Also, a point
in image space and a line in parameter space are
a Hough Transform pair.
6 Each point in image space generates a line in
parameter space. The intersection of a
number of lines in that space indicates
support from many different image points for a
common line having the corresponding
parameters. Quantize the parameter space into
an array of accumulator cells, and let the
generation of a line be the increment of a
counter at every cell that lies along the line.
Peaks in the resulting accumulator
array occur at parameter locations
corresponding to lines with significant support.
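The voting procedure above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: it uses a slope-intercept (m, c) parameter space, and the function name, the list of candidate slopes, and the intercept step `c_step` are hypothetical choices, not part of the slides.

```python
from collections import Counter

def hough_lines_slope_intercept(points, slopes, c_step=1.0):
    """Vote in a quantized (m, c) space; names and step sizes are illustrative."""
    votes = Counter()
    for x, y in points:
        # One image point generates the parameter-space line c = y - m*x.
        for mi, m in enumerate(slopes):
            ci = round((y - m * x) / c_step)   # quantize the intercept
            votes[(mi, ci)] += 1               # increment that accumulator cell
    return votes

# Four tokens on y = 2x + 1 plus one stray token.
points = [(0, 1), (1, 3), (2, 5), (3, 7), (5, 0)]
slopes = [k * 0.5 for k in range(-8, 9)]       # candidate slopes -4.0 .. 4.0
votes = hough_lines_slope_intercept(points, slopes)
(mi, ci), count = votes.most_common(1)[0]      # peak of the accumulator
best_m, best_c = slopes[mi], ci * 1.0
```

Here the peak cell collects the four collinear tokens (m = 2, c = 1), while the stray token spreads its votes over cells that nobody else supports.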
7 [Figure: tokens and their voting array]
8 HT Rho-Theta Space [Figure: image space]
9 HT Rho-Theta [Figure: parameter space]
10 We can restrict the voting in the Hough
space, to make it more efficient and more
reliable, by using local edge orientation
information: each edge point then votes only for
lines near its own orientation. We can also use
gradient magnitude (or similar) to weight the
votes of edge points. Although it is easiest to
describe a slope-intercept Hough space, it is
often best to use rho-theta space,
$\rho = x\cos\theta + y\sin\theta$, to avoid the
problems that occur when the line is vertical
(the slope is unbounded).
In rho-theta space, the locus of points along
which an image point votes is a piece of a
sinusoid over $(0, \pi)$. (See drawing.)
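As a rough illustration of rho-theta voting, the sketch below lets each point vote along its sinusoid rho = x cos(theta) + y sin(theta) for theta sampled over [0, pi). The function name, the bin counts, and the optional `weights` argument (standing in for gradient magnitude) are assumptions made for this example.

```python
import math
from collections import Counter

def hough_lines_rho_theta(points, n_theta=180, rho_step=1.0, weights=None):
    """Accumulate (optionally weighted) votes over quantized (rho, theta) cells."""
    votes = Counter()
    for i, (x, y) in enumerate(points):
        w = 1.0 if weights is None else weights[i]   # e.g. gradient magnitude
        for t in range(n_theta):
            theta = math.pi * t / n_theta            # theta sampled in [0, pi)
            rho = x * math.cos(theta) + y * math.sin(theta)
            votes[(round(rho / rho_step), t)] += w
    return votes

# A vertical line x = 4: the slope is unbounded, but rho = 4, theta = 0 is fine.
points = [(4, y) for y in range(0, 100, 10)]
votes = hough_lines_rho_theta(points)
(rho_bin, t), count = votes.most_common(1)[0]
```

For this input the peak lands in the cell rho = 4, theta index 0, with all ten points supporting it. Restricting the loop over `t` to a few bins around each point's measured edge orientation would be one way to use local orientation information.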
11 A line in image space and a point in parameter
space are still a Hough Transform pair, but
now a point in image space and a sinusoidal
arc segment in parameter space are a Hough
Transform pair. In high aspect ratio images,
things can be tricky because closeness in
Hough space may not mean closeness in image
space.
12 [Figure: tokens and their voting array]
13 [Figure: tokens and their voting array]
14 The most populous bin gets fewer votes as we
perturb the points from a perfect line (left).
Randomly placed points can cause large spurious
vote totals in the most populous bin (right).
15 HT Circle Detection
We use the HT to detect circles (sets of
cocircular points) as follows. Equation of
a circle:
$$(x - a)^2 + (y - b)^2 = r^2$$
This is for a circle of
radius r with center (a, b). The parameter
space is three dimensional: (a, b, r). Each
point in image space gives rise to a locus of
voting points in the 3D Hough space that will
be a surface.
16 For a given radius, the locus of possible
circle centers for a given image point (x, y)
will itself be a circle of radius r centered on
the given point. Therefore, in the 3D Hough
space, the locus of possible parameter values
sweeps out the surface of an inverted cone
with axis parallel to the r axis and vertex at
(a, b, r) = (x, y, 0). Local estimates of
curvature (or even sign of curvature) and edge
strength can be used to improve the efficiency
of the voting and Hough space interrogation.
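The cone-shaped voting locus can be made concrete with a small sketch. For each candidate radius, an image point votes for every center lying on a circle of that radius around itself. The function name, the angular sampling `n_angles`, and the integer-cell quantization are assumptions for illustration only.

```python
import math
from collections import Counter

def hough_circles(points, radii, n_angles=360):
    """Vote in a quantized (a, b, r) space; one slice of the cone per radius."""
    votes = Counter()
    for x, y in points:
        for r in radii:
            cells = set()                      # a point votes once per cell
            for k in range(n_angles):
                phi = 2 * math.pi * k / n_angles
                a = round(x - r * math.cos(phi))
                b = round(y - r * math.sin(phi))
                cells.add((a, b))
            for a, b in cells:
                votes[(a, b, r)] += 1
    return votes

# Twelve integer points on the circle with center (10, 10) and radius 5.
offsets = [(5, 0), (0, 5), (3, 4), (4, 3)]
points = sorted({(10 + sx * dx, 10 + sy * dy)
                 for dx, dy in offsets for sx in (1, -1) for sy in (1, -1)})
votes = hough_circles(points, radii=range(3, 8))
peak, count = votes.most_common(1)[0]
```

With these tokens the accumulator peaks at (a, b, r) = (10, 10, 5), supported by all twelve points. If an estimate of the local edge direction were available, the inner loop over `phi` could be restricted to the few angles consistent with it.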
17 Generalized Hough Transform
What can we do when the curve we want to detect
is not easily described parametrically?
By this we mean that it cannot be captured in a
relatively small number of parameters.
Recall that the dimensionality of the Hough space
equals the number of parameters! The GHT
constructs a parametric description of an
arbitrary shape based on a learning process.
This parametric description is not, in general,
compact.
18 We will begin by assuming the size, shape, and
rotation (orientation) of the region are known
a priori (or that we want only to detect
instances of a given size and orientation).
The voting space is (equivalent to) image
space, 2D, in the case of known size and
rotation. We will see how to deal with
unknown orientation and size shortly,
with a 4D Hough space.
19 The list of (r, α) pairs for a given ø and
reference point constitutes a partial
characterization of the shape.
Reference point: an arbitrary reference point
inside the shape.
r: the length of the j-th line from the reference
point to the shape perimeter, intersecting at
a point of tangent angle ø.
ø: the angle of the (current) tangent(s) to the
perimeter.
α: the orientation of the j-th line segment.
20 By sweeping the tangent angle (ø) over the
range $(0, 2\pi)$ in some reasonable quantization
(!), we build what is called the R-table
(reference table) description of the shape.
Each pixel x (say, a detected edge point) with
local orientation ø provides evidence for
(votes for) reference points at the set of
locations indicated by the list in the
R-table for that tangent direction.
21 A vote is cast for each (r, α) pair in the
list for that ø value. The voting space is
isomorphic to image space.
Again, this assumes known size and orientation
for all appearances of the shape. After
all the edge points have voted for all of their
possible reference points, we interrogate the
voting space for significant local maxima.
These suggest possible detections of the
shape of interest.
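The R-table construction and the voting step for known size and orientation might look like the sketch below. All names (`phi_bin`, `build_r_table`, `ght_vote`, `rectangle_boundary`), the 10-degree quantization, and the convention of storing each entry as a (length, orientation) pair pointing from the boundary sample toward the reference point are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

N_BINS = 36  # assumed quantization of the tangent angle (10-degree bins)

def phi_bin(phi):
    """Index of the nearest tangent-angle bin."""
    return round(phi / (2 * math.pi) * N_BINS) % N_BINS

def rectangle_boundary(x0, y0, w, h):
    """Unit-spaced (x, y, phi) samples; phi is the direction of travel."""
    pts = [(x0 + i, y0, 0.0) for i in range(w)]                     # bottom
    pts += [(x0 + w, y0 + j, math.pi / 2) for j in range(h)]        # right
    pts += [(x0 + w - i, y0 + h, math.pi) for i in range(w)]        # top
    pts += [(x0, y0 + h - j, 3 * math.pi / 2) for j in range(h)]    # left
    return pts

def build_r_table(boundary, ref):
    """Learning step: list (r, alpha) entries under each tangent-angle bin."""
    table = defaultdict(list)
    for x, y, phi in boundary:
        dx, dy = ref[0] - x, ref[1] - y        # boundary sample -> reference
        table[phi_bin(phi)].append((math.hypot(dx, dy), math.atan2(dy, dx)))
    return table

def ght_vote(edges, table):
    """Each edge point votes for every reference location its bin allows."""
    votes = Counter()
    for x, y, phi in edges:
        for r, alpha in table.get(phi_bin(phi), []):
            xc = round(x + r * math.cos(alpha))
            yc = round(y + r * math.sin(alpha))
            votes[(xc, yc)] += 1
    return votes

# Learn a 4 x 2 rectangle with reference point (2, 1), then find a shifted copy.
table = build_r_table(rectangle_boundary(0, 0, 4, 2), ref=(2, 1))
votes = ght_vote(rectangle_boundary(10, 20, 4, 2), table)
peak, count = votes.most_common(1)[0]
```

In this toy case the accumulator, which has the same layout as image space, peaks at (12, 21), the reference point of the shifted rectangle, with one vote from each of its twelve boundary samples.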
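22 If we have not prenormalized for size (S) and
rotation (θ), then our voting space is four
dimensional, (x_c, y_c, S, θ), and the reference
location receiving the vote(s) for a given edge
point (x, y) and R-table entry (r, α), with α
measured from the edge point toward the
reference point, is
$$x_c = x + S\,r\cos(\alpha + \theta), \qquad y_c = y + S\,r\sin(\alpha + \theta)$$
Now, we interrogate the 4D accumulator array to
recover likely locations, scale, and
orientation for appearances of the shape.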
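A hedged sketch of the 4D case follows. It assumes each R-table entry is a (length, orientation) pair pointing from the boundary sample toward the reference point, that an edge point with tangent angle ø looks up the bin for ø minus the candidate rotation, and that scales and rotations are tried from small candidate lists. All function names and the L-shaped test model are hypothetical.

```python
import math
from collections import Counter, defaultdict

N_BINS = 36  # assumed tangent-angle quantization

def phi_bin(phi):
    return round(phi / (2 * math.pi) * N_BINS) % N_BINS

def trace_polygon(vertices):
    """Unit-spaced (x, y, phi) samples along an axis-aligned polygon."""
    samples = []
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        steps = abs(x1 - x0) + abs(y1 - y0)
        ux, uy = (x1 - x0) // steps, (y1 - y0) // steps
        phi = math.atan2(uy, ux) % (2 * math.pi)
        samples += [(x0 + ux * t, y0 + uy * t, phi) for t in range(steps)]
    return samples

def build_r_table(model, ref):
    table = defaultdict(list)
    for x, y, phi in model:
        dx, dy = ref[0] - x, ref[1] - y
        table[phi_bin(phi)].append((math.hypot(dx, dy), math.atan2(dy, dx)))
    return table

def transform(samples, ref, new_ref, s, theta):
    """Scale by s and rotate by theta about ref, then move ref to new_ref."""
    c, sn = math.cos(theta), math.sin(theta)
    return [(new_ref[0] + s * (c * (x - ref[0]) - sn * (y - ref[1])),
             new_ref[1] + s * (sn * (x - ref[0]) + c * (y - ref[1])),
             (phi + theta) % (2 * math.pi)) for x, y, phi in samples]

def ght_vote_4d(edges, table, scales, rotations):
    """Vote over (xc, yc, scale, rotation index)."""
    votes = Counter()
    for x, y, phi in edges:
        for ti, theta in enumerate(rotations):
            entries = table.get(phi_bin(phi - theta), [])  # undo the rotation
            for s in scales:
                for r, alpha in entries:
                    xc = round(x + s * r * math.cos(alpha + theta))
                    yc = round(y + s * r * math.sin(alpha + theta))
                    votes[(xc, yc, s, ti)] += 1
    return votes

# An L-shaped model (no rotational symmetry) with reference point (1, 1).
model = trace_polygon([(0, 0), (6, 0), (6, 2), (2, 2), (2, 6), (0, 6)])
table = build_r_table(model, ref=(1, 1))
edges = transform(model, (1, 1), (30, 40), s=2, theta=math.pi / 2)
rotations = [k * math.pi / 2 for k in range(4)]
votes = ght_vote_4d(edges, table, scales=[1, 2, 3], rotations=rotations)
peak, count = votes.most_common(1)[0]
```

For this input the peak cell is (30, 40, 2, 1): reference point (30, 40), scale 2, and rotation index 1 (a quarter turn), with all 24 boundary samples voting for it. A shape with rotational symmetry, such as a rectangle, would produce tied peaks at several rotations, which is why the test model here is an L.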
23 This is really a fancy form of template
matching, but one that is far more robust than a
straightforward template matching algorithm.
Selecting among multiple possible shapes
requires multiple R-tables and multiple voting
spaces. But so does looking for lines and
circles in the same image.