Title: Image segmentation using Eigenvectors
1. Image Segmentation using Eigenvectors
- Speaker: Sameer Agarwal
- Course: Learning and Vision Seminar
- Date: 09/10/2001
2.
- "Theoretically I might say there are 327 brightnesses and nuances of color. Do I have '327'? No. I have sky, house, and trees. It is impossible to achieve 327 as such. And yet even though such droll calculations are possible and implied (say, for the house 120, the trees 90 and the sky 117), I should at least have this arrangement and division of the total, and not, say, 127 and 100 and 100, or 150 and 177."
- Max Wertheimer, "Laws of Organization in Perceptual Forms" (1923)
3. What is Image Segmentation?
- Partitioning of an image into related regions.
4. Why do Image Segmentation?
- Image compression: identify distinct components within an image and use the most suitable compression algorithm for each component to get a higher compression ratio.
- Medical diagnosis: automatic segmentation of MRI images for identification of cancerous regions.
- Mapping and measurement: automatic analysis of remote-sensing data from satellites to identify and measure regions of interest, e.g. petroleum reserves.
5. How many groups?
Out of the various possible partitions, which is the correct one?
6. The Bayesian view
- Given prior knowledge about the structure of the data, choose the partition which is most probable.
- Problem: how do you specify a prior for knowledge that is composed of knowledge on multiple scales? e.g.:
  - Coherence
  - Symmetry
7. A simple implementation
- Assume that the image was generated by a mixture of multiple models.
- Segmentation is then done in two steps (a sketch follows below):
  - Estimate the parameters of the mixture model.
  - For each point, calculate the posterior probability of it belonging to each cluster, and assign it to the cluster with the maximum posterior.
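As an illustration, here is a minimal sketch of this two-step procedure, assuming per-pixel feature vectors (e.g. RGB values) and using scikit-learn's GaussianMixture for the EM step; the function name and component count are illustrative, not from the talk.

```python
# Minimal sketch: EM-based mixture estimation, then max-posterior assignment.
import numpy as np
from sklearn.mixture import GaussianMixture

def mixture_segment(features, n_components=3):
    """features: (n_pixels, n_dims) array of per-pixel features."""
    # Step 1: estimate the parameters of the mixture model with EM.
    gmm = GaussianMixture(n_components=n_components).fit(features)
    # Step 2: assign each point to the cluster with the maximum posterior.
    return gmm.predict(features)

# Example: cluster 1000 random RGB "pixels" into 3 components.
labels = mixture_segment(np.random.rand(1000, 3))
```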
8. Why doesn't it work?
- The model selection problem:
  - Number of components?
  - The structure of the components?
- The estimation problem turns into a hard optimization problem, with no guarantee of convergence to the global optimum.
9. Prior Work
- k-means
- Mixture models (expectation maximization)
- k-medoids
- k-harmonic means
- Self-organizing maps
- Neural gas
- Linkage-based graph methods
10. Outline of the talk
- The Gestalt approach to perceptual grouping
- Graph-theoretic formulation of the segmentation problem
- The normalized cut
- Experimental results
- Relation to other methods
- Conclusions
11. The Gestalt approach
- Gestalt: "a structure, configuration, or pattern of physical, biological, or psychological phenomena so integrated as to constitute a functional unit with properties not derivable by summation of its parts."
- "The whole is different from the sum of the parts."
12. The Gestalt Movement
- Founded by Max Wertheimer, Wolfgang Köhler and Kurt Koffka.
- Rejected structuralism and its assumptions of atomicity and empiricism.
- Adopted a holistic approach to perception.
13. An Example
Emergent properties of a configuration: the arrangement of several dots in a line gives rise to emergent properties, such as length, orientation and curvature, that are different from the properties of the dots that compose it.
14. Gestalt Cues
15. And the moral of the story is...
- Image segmentation based on low-level cues cannot, and should not, aim to produce a complete, final, correct segmentation.
- Instead, use low-level attributes like color and brightness to sequentially come up with hierarchical partitions.
- Mid- and high-level knowledge can then be used to confirm or select some partition for further attention.
16. A graph-theoretic approach
- A weighted undirected graph G = (V, E):
  - Nodes are points in the feature space.
  - The graph is fully connected.
  - The edge weight w(i, j) is a function of the similarity between nodes i and j (a sketch of one common choice follows below).
- Task: partition the set V into disjoint sets V_1, ..., V_n, such that the similarity among nodes within each V_i is high and the similarity across V_i and V_j is low.
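A minimal sketch of one common choice of edge weight, a Gaussian on the distance in feature space (similar in spirit to the weights used by Shi and Malik); the function name and the value of sigma are illustrative assumptions.

```python
import numpy as np

def affinity_matrix(points, sigma=1.0):
    """Fully connected graph: w(i, j) = exp(-||x_i - x_j||^2 / sigma^2)."""
    # Pairwise squared distances between all points in feature space.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / sigma ** 2)
```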
17. Issues
- What is a good partition?
- How can you compute such a partition efficiently?
18. Graph Cut
- G = (V, E).
- Sets A and B form a disjoint partition of V.
- $\mathrm{cut}(A, B) = \sum_{u \in A,\, v \in B} w(u, v)$
- Cut(A, B) is a measure of the similarity between the two groups (a direct translation into code follows below).
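A direct translation of this definition, assuming a dense affinity matrix W and a boolean mask marking the nodes of A:

```python
import numpy as np

def cut(W, mask):
    """cut(A, B): total weight of edges crossing the partition.
    mask[i] is True if node i is in A, False if it is in B."""
    return W[mask][:, ~mask].sum()
```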
19. The temptation
- Cut is a measure of association:
  $\mathrm{MinCut}(G) = \min_{A, B} \mathrm{cut}(A, B)$
- Minimizing it will give a partition with the maximum disassociation.
- Efficient polynomial-time algorithms exist to solve the MinCut problem.
- So why not use it?
20. The problem with MinCut
21. The Normalized Cut
- Given a partition (A, B) of the vertex set V:
  $\mathrm{Ncut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)}$
- Ncut(A, B) measures the similarity between the two groups, normalized by the volume each occupies in the whole graph (a sketch follows below).
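The corresponding computation for Ncut, under the same assumptions as the cut sketch above:

```python
import numpy as np

def ncut(W, mask):
    """Ncut(A, B) = cut(A, B)/assoc(A, V) + cut(A, B)/assoc(B, V)."""
    cut_ab = W[mask][:, ~mask].sum()   # weight crossing the partition
    assoc_av = W[mask].sum()           # total weight from A to all nodes
    assoc_bv = W[~mask].sum()          # total weight from B to all nodes
    return cut_ab / assoc_av + cut_ab / assoc_bv
```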
22. Matrix formulation
- Definitions:
  - D is an n x n diagonal matrix with entries $d(i) = \sum_j w(i, j)$.
  - W is an n x n symmetric matrix with $W(i, j) = w(i, j)$.
23. After some linear algebra we get...
- $\min_{A,B} \mathrm{Ncut}(A, B) = \min_y \frac{y^T (D - W) y}{y^T D y}$
- Subject to the constraints:
  - $y_i \in \{1, -b\}$
  - $y^T D \mathbf{1} = 0$
24. Real numbers to the rescue
- Relax the constraints on y and allow it to take real values.
- Claim: the real-valued MinNcut(G) can then be solved by solving the generalized eigenvalue problem
  $(D - W) y = \lambda D y$
  for the second smallest generalized eigenvector (a sketch follows below).
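A sketch of the relaxed problem using scipy's symmetric generalized eigensolver, which returns eigenvalues in ascending order; this assumes a dense W with strictly positive degrees.

```python
import numpy as np
from scipy.linalg import eigh

def relaxed_ncut_vector(W):
    """Solve (D - W) y = lambda * D y and return the second smallest
    generalized eigenvector, the relaxed Ncut indicator."""
    D = np.diag(W.sum(axis=1))
    vals, vecs = eigh(D - W, D)   # eigenvalues in ascending order
    return vecs[:, 1]             # column 1 = second smallest
```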
25. Proof
- Rewrite the equation as $D^{-1/2} (D - W) D^{-1/2} z = \lambda z$, where $z = D^{1/2} y$.
- Lemma 1: $z_0 = D^{1/2} \mathbf{1}$ is an eigenvector of the above eigensystem with eigenvalue 0.
26. Proof (contd.)
- Lemma 2: $D^{-1/2} (D - W) D^{-1/2}$ is a positive semi-definite matrix, since $(D - W)$ is known to be positive semi-definite.
- Lemma 3: $z_0$ is the smallest eigenvector of the eigensystem.
- Lemma 4: $z_1$ is perpendicular to $z_0$.
27. Proof (contd.)
- Lemma 5: let A be a real symmetric matrix. Under the constraint that x is orthogonal to the j-1 smallest eigenvectors $x_1, \ldots, x_{j-1}$, the quotient
  $\frac{x^T A x}{x^T x}$
  is minimized by the next smallest eigenvector $x_j$.
28. Finally...
- By Lemma 1, $y_0 = \mathbf{1}$ is an eigenvector of the eigensystem with eigenvalue 0.
- By Lemmas 2 and 3, it is the smallest eigenvector.
- Hence, by Lemma 5, the second smallest eigenvector $y_1$ will minimize the Ncut equation.
- By Lemmas 3 and 4: $z_1^T z_0 = 0 \iff y_1^T D \mathbf{1} = 0$, so the relaxed solution satisfies the second constraint.
29. What about the first constraint?
- The second smallest eigenvector is only an approximation to the optimal normalized cut, since the relaxed $y_1$ is not forced to take just two discrete values.
- $y_1$ minimizes $\frac{y^T (D - W) y}{y^T D y}$ subject to $y^T D \mathbf{1} = 0$.
- y will take similar values for nodes with high similarity.
30. The grouping algorithm
- Given an image, set up the weighted graph G = (V, E), with the weight on the edge connecting two nodes set to a measure of the similarity between the nodes.
- Solve $(D - W) x = \lambda D x$ for the eigenvectors with the smallest eigenvalues.
- Use the second smallest eigenvector to bipartition the graph (an end-to-end sketch follows below).
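An end-to-end sketch of the algorithm on a generic point set; the Gaussian affinity, the value of sigma, and the split at zero (the simplest of the options on the next slide) are all assumptions.

```python
import numpy as np
from scipy.linalg import eigh

def grouping(points, sigma=1.0):
    """One round of Ncut bipartitioning on a set of feature points."""
    # 1. Weighted graph: Gaussian similarity on pairwise distances.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d2 / sigma ** 2)
    # 2. Solve (D - W) x = lambda * D x.
    D = np.diag(W.sum(axis=1))
    _, vecs = eigh(D - W, D)
    # 3. Bipartition on the sign of the second smallest eigenvector.
    return vecs[:, 1] > 0
```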
31. Details...
- The eigenvector takes continuous values; how do we use it to segment the image?
  - Choose 0 as the splitting point.
  - Find the median of the eigenvector and use that as the splitting point.
  - Search amongst l evenly spaced points for the one which gives the best exact Ncut value (sketched below).
  - Impose a stability criterion on the eigenvector.
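A sketch of the third option; the number of candidate splitting points is a free parameter here.

```python
import numpy as np

def best_split(y, W, n_candidates=20):
    """Search evenly spaced splitting points on the eigenvector y for the
    partition with the smallest exact Ncut. Returns a boolean mask."""
    def ncut(mask):
        cut = W[mask][:, ~mask].sum()
        return cut / W[mask].sum() + cut / W[~mask].sum()
    thresholds = np.linspace(y.min(), y.max(), n_candidates + 2)[1:-1]
    # Keep only splits that leave both sides non-empty.
    masks = [y > t for t in thresholds if 0 < (y > t).sum() < len(y)]
    return min(masks, key=ncut)
```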
32. Stability?
- Since we allow the eigenvectors to take real values, some eigenvectors might take a smooth, continuous form.
- We want vectors with sharp discontinuities, indicating separation between regions.
- Measure the smoothness of the vector, and stop partitioning when the smoothness value falls below a threshold.
33. Details... (contd.)
- How do you partition images with multiple segments?
- Option 1: the higher-order eigenvectors contain information about sub-partitions. Keep splitting until Ncut exceeds some pre-specified value.
  - Problem: numerical error.
- Option 2: recursively run the algorithm on successive subgraphs.
  - Problem: computationally expensive, and the stability criterion might prevent correct partitioning.
34. Simultaneous p-way cut
- Use the first n eigenvectors as n-dimensional indicator vectors for each point. This is equivalent to embedding each point in an n-dimensional space.
- Perform k-means clustering in this new space to create p' > p clusters.
- Use the original 2-way Ncut or a greedy strategy to merge these p' partitions into p partitions (a sketch of the first two steps follows below).
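A sketch of the embedding and clustering steps, assuming scikit-learn's KMeans and taking p' = 2p for concreteness; the final merge down to p groups is omitted.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def pway_cut(W, p):
    """Embed each node using the smallest generalized eigenvectors and
    oversegment into p' > p clusters with k-means."""
    p_prime = 2 * p                   # illustrative choice of p'
    D = np.diag(W.sum(axis=1))
    _, vecs = eigh(D - W, D)          # eigenvalues in ascending order
    embedding = vecs[:, :p_prime]     # indicator vectors per point
    return KMeans(n_clusters=p_prime, n_init=10).fit_predict(embedding)
```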
35. How good is the approximation?
- The normalized Cheeger constant $h_G$ is defined as
  $h_G = \min_{A} \frac{\mathrm{cut}(A, \bar{A})}{\min(\mathrm{assoc}(A, V), \mathrm{assoc}(\bar{A}, V))}$
- We know that the second eigenvalue is bounded by the Cheeger inequality:
  $\frac{h_G^2}{2} < \lambda_2 \le 2 h_G$
- This is only a qualitative indication of the quality of the approximation; it does not say anything about how close the eigenvector is to the optimal Ncut vector.
36. Example I
37. Distance Matrix
38. The second generalized eigenvector
39. The first partition
40. The second generalized eigenvector
41. The second partition
42. The fourth generalized eigenvector
43. The third partition
44. Example II
45. The structure of the affinity matrix
46. Generalized eigenvalues
47. The first partition
48. The second partition
49. The third partition
50. The fourth partition
51. The fifth partition
52. The sixth partition
53. Complexity Issues
- Finding the eigenvectors of an n x n matrix is an O(n^3) operation.
- This is extremely expensive.
- One solution is to make the affinity matrix sparse by only considering nearby points. Efficient methods exist for finding the eigenvectors of sparse matrices (a sketch follows below).
- Even with the best methods, it is not possible to perform this task in real time.
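A sketch using scipy's sparse eigensolver on the equivalent normalized system from the proof (slide 25); asking ARPACK for the smallest-magnitude eigenvalues with which="SM" is the simplest, though not the fastest, approach.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def sparse_ncut_vector(W):
    """Second smallest generalized eigenvector of (D - W) y = lambda D y
    for sparse W, via D^(-1/2)(D - W)D^(-1/2) z = lambda z, y = D^(-1/2) z."""
    d = np.asarray(W.sum(axis=1)).ravel()
    D_inv_sqrt = sp.diags(1.0 / np.sqrt(d))
    L_norm = D_inv_sqrt @ (sp.diags(d) - W) @ D_inv_sqrt
    vals, vecs = eigsh(L_norm, k=2, which="SM")       # two smallest
    return D_inv_sqrt @ vecs[:, np.argmax(vals)]      # skip the trivial z0
```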
54. The Nyström method
- Belongie et al. made the observation that the affinity matrix has very low rank, i.e. the matrix has very few unique rows.
- Hence it is possible to approximate the eigenvectors of the whole affinity matrix by linearly interpolating the eigenvectors of a small, randomly sampled sub-matrix (a simplified sketch follows below).
- This method is fast enough to give real-time performance.
- It is also referred to as the Nyström method in operator theory.
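A simplified sketch of the Nyström extension; the complete method of Belongie and colleagues adds normalization and orthogonalization steps that are omitted here, and the sample size m is a free parameter.

```python
import numpy as np

def nystrom_eigvecs(W, m, k):
    """Approximate the top-k eigenvectors of W from a random m-point sample."""
    n = W.shape[0]
    idx = np.random.permutation(n)
    s, r = idx[:m], idx[m:]
    A = W[np.ix_(s, s)]                           # sampled block
    B = W[np.ix_(s, r)]                           # sample vs. the rest
    vals, U = np.linalg.eigh(A)                   # ascending order
    vals, U = vals[::-1][:k], U[:, ::-1][:, :k]   # keep the top k
    V = np.empty((n, k))
    V[s] = U
    V[r] = B.T @ U / vals   # Nystrom interpolation to unsampled points
    return V
```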
55. Cuts Galore
- The standard Cheeger constant
  $h(G) = \min_{A} \frac{\mathrm{cut}(A, \bar{A})}{\min(|A|, |\bar{A}|)}$
  defines the ratio cut (Hagen & Kahng).
- The Fiedler value is the solution to the problem
  $(D - W) x = \lambda x$
  which is known as the average cut.
56. Association or Disassociation?
- The normalized cut can be formulated as a minimization of the association between clusters OR as a maximization of the association within clusters.
57. Average Cut is NOT symmetric
- The average cut does not share the same relationship with its corresponding notion of normalized association: minimizing
  $\frac{\mathrm{cut}(A,B)}{|A|} + \frac{\mathrm{cut}(A,B)}{|B|}$
  is not equivalent to maximizing
  $\frac{\mathrm{assoc}(A,A)}{|A|} + \frac{\mathrm{assoc}(B,B)}{|B|}$
- The RHS gives rise to another kind of cut, which we refer to as the average association.
58. Relationship between Average, Ratio and Normalized Cuts
- Average association (biased towards finding clumps):
  $\frac{\mathrm{assoc}(A,A)}{|A|} + \frac{\mathrm{assoc}(B,B)}{|B|}$; continuous formulation $W x = \lambda x$.
- Normalized cut (balances both):
  $\frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{cut}(A,B)}{\mathrm{assoc}(B,V)} = 2 - \left(\frac{\mathrm{assoc}(A,A)}{\mathrm{assoc}(A,V)} + \frac{\mathrm{assoc}(B,B)}{\mathrm{assoc}(B,V)}\right)$; continuous formulation $(D - W) x = \lambda D x$.
- Average cut (biased towards finding splits):
  $\frac{\mathrm{cut}(A,B)}{|A|} + \frac{\mathrm{cut}(A,B)}{|B|}$; continuous formulation $(D - W) x = \lambda x$.
59. Perona and Freeman
- Construct the affinity matrix W for the graph G = (V, E).
- Find the eigenvector with the largest eigenvalue.
- Threshold it to get a partition of the nodes of G (a sketch follows below).
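A sketch of the PF procedure, assuming a dense affinity matrix and a threshold at zero:

```python
import numpy as np

def perona_freeman(W, threshold=0.0):
    """Threshold the largest eigenvector of the un-normalized affinity."""
    vals, vecs = np.linalg.eigh(W)   # eigenvalues in ascending order
    v1 = vecs[:, -1]                 # eigenvector of the largest eigenvalue
    return v1 > threshold
```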
60. Shi and Malik
- Construct the matrices W and D.
- Find the second smallest generalized eigenvector $y_1$ of $(D - W) y = \lambda D y$.
- Threshold $y_1$ to get a partitioning of the graph.
61. A closer look
- Define a new matrix N as $N = D^{-1/2} W D^{-1/2}$.
- Lemma: if v is an eigenvector of N with eigenvalue $\lambda$, then $D^{-1/2} v$ is a generalized eigenvector of $(D - W)$ with eigenvalue $1 - \lambda$. Also, $0 < \lambda < 1$.
- Hence Perona and Freeman use the largest eigenvector of the un-normalized affinity matrix, while Shi and Malik use the ratio of the first two eigenvectors of the normalized affinity matrix.
62. Scott and Longuet-Higgins
- Construct the matrix V whose columns are the k largest eigenvectors of W.
- Normalize the rows of V.
- Construct the matrix $Q = V V^T$.
- Segment points using Q: if i and j belong to the same cluster, $Q(i, j) \approx 1$; if they belong to different groups, $Q(i, j) \approx 0$ (a sketch follows below).
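A sketch of the SLH construction, assuming the number of clusters k is known:

```python
import numpy as np

def scott_longuet_higgins(W, k):
    """Row-normalize the k largest eigenvectors of W and form Q = V V^T.
    Q(i, j) is near 1 for same-cluster pairs and near 0 otherwise."""
    _, vecs = np.linalg.eigh(W)          # eigenvalues in ascending order
    V = vecs[:, -k:].copy()              # k largest eigenvectors as columns
    V /= np.linalg.norm(V, axis=1, keepdims=True)   # normalize each row
    return V @ V.T
```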
63. In an ideal world...
- Order the points by cluster and write the affinity matrix in block form, with within-cluster blocks A and B and a between-cluster block C. Ideally, A and B would be constant and C would be 0. Then W can be decomposed as
  $W = \begin{pmatrix} A & 0 \\ 0 & B \end{pmatrix}$
64. And that tells us...
- Let V be the n x 2 matrix whose columns are the first two eigenvectors of W. Then V = ODR, where D is a 2x2 diagonal matrix and R is a 2x2 rotation matrix. Now, if W(i, j) depends only on the memberships of i and j:
  - If $v_1$ is the indicator vector (first eigenvector of W) of the PF algorithm, then if i and j belong to the same cluster, $v(i) = v(j)$.
  - If $v_1$ is the indicator vector (second generalized eigenvector) of the SM algorithm, then if i and j belong to the same cluster, $v(i) = v(j)$.
  - If Q is the indicator matrix of the SLH method, then $Q(i, j) = 1$ if i and j belong to the same cluster, and 0 otherwise.
65. Non-constant Matrices
- Let A and B be arbitrary positive matrices, and let C = 0.
- Let v be the PF indicator vector. If $\lambda_1(A) > \lambda_1(B)$, then $v(i) > 0$ for all points belonging to the first cluster and $v(j) = 0$ for points belonging to the second cluster.
- Let v be the SM indicator vector; then $v(i) = v(j)$ if points i and j belong to the same cluster.
- If $\lambda_1(B) > \lambda_2(A)$ and $\lambda_1(A) > \lambda_2(B)$, then $Q(i, j) = 1$ if i and j belong to the same cluster, and 0 otherwise.
66. Conclusions
- The normalized cut presents a new optimality criterion for partitioning a graph into clusters.
- Ncut is a normalized measure of disassociation, and minimizing it is equivalent to maximizing association.
- The discrete problem corresponding to MinNcut is NP-complete.
- We solve an approximate version of the MinNcut problem by converting it into a generalized eigenvector problem.
67. Conclusions (contd.)
- There are a number of approaches which use the eigenvectors of matrices related to the affinity matrix of a graph.
- Three of these methods can be shown to be based on the top eigenvectors of the affinity matrix. They differ in two ways:
  1. Which eigenvectors to look at.
  2. Whether or not to normalize the matrix.
68. References
- Jianbo Shi and Jitendra Malik, "Normalized Cuts and Image Segmentation".
- Yair Weiss, "Segmentation using eigenvectors: a unifying view".
69. Acknowledgements
- Serge Belongie, for sharing hours of excitement and details of linear algebra and associated wonders.
- Ben Leong, for sharing his figures.
- And the music of Tool, for keeping me company.