Title: Overview of Clustering
Outline
- K-means for clustering
- The Expectation-Maximization (EM) algorithm for clustering
- Spectral clustering (if time permits)
Clustering
- Find the underlying structure of the given data points
(Figure: example scatter plot of data points; one axis is age)
Application (I): Search Result Clustering
Application (II): Navigation
Application (III): Google News
Application (IV): Visualization
Islands of Music (Pampalk et al., KDD '03)
Application (V): Image Compression
http://www.ece.neu.edu/groups/rpl/kmeans/
How to Find a Good Clustering?
- Minimize the sum of distances within clusters (see the objective below)
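Concretely, with centers μ1, …, μK, the within-cluster objective being minimized (a standard reconstruction; the slide's own formula is not preserved in this text) is:

```latex
J(C_1,\dots,C_K) \;=\; \sum_{j=1}^{K} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^2,
\qquad
\mu_j \;=\; \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i .
```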
How to Cluster Data Efficiently?
K-means for Clustering
- K-means:
- Start with a random guess of cluster centers
- Determine the membership of each data point
- Adjust the cluster centers
(The last two steps repeat until the centers stop moving.)
K-means
- Ask the user how many clusters they'd like (e.g., k = 5)
- Randomly guess k cluster center locations
- Each data point finds out which center it's closest to (thus each center owns a set of data points)
- Each center finds the centroid of the points it owns, and moves there
- Repeat until convergence (a minimal code sketch follows below)
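A minimal NumPy sketch of the four steps above (illustrative only; the function and variable names are my own):

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Plain k-means: X is an (n, d) array, k the number of clusters."""
    rng = np.random.default_rng(seed)
    # Start with a random guess of cluster centers (sample k data points).
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Membership step: each point finds the center it's closest to.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the centroid of the points it owns.
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break  # converged: centers stopped moving
        centers = new_centers
    return centers, labels

# Example usage on synthetic 2-D data:
X = np.vstack([np.random.randn(100, 2) + [5, 5],
               np.random.randn(100, 2) - [5, 5]])
centers, labels = kmeans(X, k=2)
```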
Any Computational Problem?
Improving K-means
- Group points by region, e.g., with a KD-tree or SR-tree
- Key difference:
- Find the closest center for each rectangle
- Assign all the points within a rectangle to one cluster
(A simplified code sketch follows below.)
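The full rectangle-pruning idea above is more involved; as a simplified stand-in, the sketch below only uses a KD-tree over the centers to batch the nearest-center queries of the membership step (SciPy's cKDTree is assumed to be available):

```python
import numpy as np
from scipy.spatial import cKDTree

def assign_with_kdtree(X, centers):
    """Membership step via a KD-tree built over the current centers.

    A simplification of the rectangle-based idea: instead of pruning
    whole cells of points at once, we answer all nearest-center
    queries through one tree, avoiding the full n-by-k distance matrix.
    """
    tree = cKDTree(centers)
    _, labels = tree.query(X, k=1)  # index of the nearest center for every point
    return labels
```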
A Gaussian Mixture Model for Clustering
- Assume that the data are generated from a mixture of Gaussian distributions
- For each Gaussian distribution:
- Center μj
- Variance σj² (ignored here, i.e., treated as known)
- For each data point:
- Determine its membership: which Gaussian generated it?
Learning a Gaussian Mixture (with known covariance)
- E-Step: estimate the expected membership E[zij] of each point in each Gaussian
- M-Step: re-estimate each center from the membership-weighted points
(The updates are spelled out below.)
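The update equations on these slides did not survive the text extraction; for a mixture of K spherical Gaussians with known, shared variance σ² and equal mixing weights (the usual assumptions in this setting), the standard steps are:

```latex
% E-step: expected membership of point x_i in Gaussian j
E[z_{ij}] \;=\;
  \frac{\exp\!\left(-\lVert x_i - \mu_j \rVert^2 / 2\sigma^2\right)}
       {\sum_{k=1}^{K} \exp\!\left(-\lVert x_i - \mu_k \rVert^2 / 2\sigma^2\right)}

% M-step: move each center to the membership-weighted mean
\mu_j \;=\; \frac{\sum_i E[z_{ij}]\, x_i}{\sum_i E[z_{ij}]}
```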
Gaussian Mixture Example
(Figures: the fitted mixture at the start and after iterations 1 through 6 and 20)
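A runnable sketch that produces iteration sequences like the ones pictured, under the same assumptions as above (spherical Gaussians, known shared variance, equal mixing weights):

```python
import numpy as np

def em_gmm_known_var(X, k, sigma=1.0, n_iters=20, seed=0):
    """EM for a mixture of k spherical Gaussians with known variance sigma^2."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=k, replace=False)]  # initial centers
    for _ in range(n_iters):
        # E-step: soft membership of every point in every Gaussian.
        sq = np.linalg.norm(X[:, None, :] - mu[None, :, :], axis=2) ** 2
        logp = -sq / (2 * sigma**2)
        p = np.exp(logp - logp.max(axis=1, keepdims=True))  # stabilized
        resp = p / p.sum(axis=1, keepdims=True)
        # M-step: each center becomes the responsibility-weighted mean.
        mu = (resp.T @ X) / resp.sum(axis=0)[:, None]
    return mu, resp
```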
Mixture Model for Document Clustering
- Introduce hidden variables zij: zij = 1 if document di is generated by the j-th language model θj (and 0 otherwise)
Learning a Mixture Model
- K: the number of language models
- E-Step: estimate p(zij = 1 | di) for every document and model
- M-Step: re-estimate each language model θj from its soft document assignments
(Standard updates are given below.)
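For a multinomial mixture over words, with c(w, di) the count of word w in document di, the standard EM updates (a reconstruction; the slide's equations were lost in extraction) are:

```latex
% E-step: posterior probability that model theta_j generated document d_i
p(z_{ij}=1 \mid d_i) \;=\;
  \frac{p(\theta_j) \prod_{w} p(w \mid \theta_j)^{c(w, d_i)}}
       {\sum_{k=1}^{K} p(\theta_k) \prod_{w} p(w \mid \theta_k)^{c(w, d_i)}}

% M-step: re-estimate word probabilities and priors from soft counts
p(w \mid \theta_j) \;\propto\; \sum_{i} p(z_{ij}=1 \mid d_i)\, c(w, d_i),
\qquad
p(\theta_j) \;\propto\; \sum_{i} p(z_{ij}=1 \mid d_i)
```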
Examples of Mixture Models
Other Mixture Models
- Probabilistic latent semantic indexing (PLSI)
- Latent Dirichlet Allocation (LDA)
Problems (I)
- Both k-means and mixture models need to compute cluster centers and an explicit distance measure
- Given an unusual distance measure, the cluster centers can be hard to compute
Problems (II)
- Both k-means and mixture models look for compact clustering structures
- In some cases, connected clustering structures are more desirable
Graph Partition
- MinCut: bipartition the graph with a minimal number of cut edges
(Figure: an example bipartition with CutSize = 2)
2-way Spectral Graph Partitioning
- Weight matrix W: wi,j is the weight between vertices i and j
- Membership vector q: qi = +1 if vertex i belongs to the first cluster, qi = -1 otherwise
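With this encoding, the cut size becomes a quadratic form in q (a standard identity; D is the diagonal degree matrix with dii = Σj wi,j):

```latex
\mathrm{CutSize}
\;=\; \frac{1}{4} \sum_{i,j} w_{ij} (q_i - q_j)^2
\;=\; \frac{1}{4}\, q^{\top} (D - W)\, q
```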
Solving the Optimization Problem
- Directly solving the above problem requires combinatorial search → exponential complexity
- How can we reduce the computational complexity?
Relaxation Approach
- Key difficulty: qi has to be either -1 or +1
- Relax qi to be any real number
- Impose the normalization constraint q^T q = n (so the trivial all-zero solution is excluded)
Relaxation Approach
- Solution: the second smallest eigenvector of D − W
Graph Laplacian
- L = D − W is a positive semi-definite matrix
- For any x, x^T L x = (1/2) Σi,j wi,j (xi − xj)^2 ≥ 0
- The minimum eigenvalue is λ1 = 0 (the eigenvector is the all-ones vector)
- The second smallest eigenvalue λ2 gives the best bipartition of the graph
Recovering Partitions
- Due to the relaxation, qi can be any real number (not just -1 and +1)
- How to construct a partition from the eigenvector? A common choice: split by sign (see the sketch below)
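A minimal sketch of the whole pipeline, using the sign split mentioned above (one common heuristic among several):

```python
import numpy as np

def spectral_bipartition(W):
    """2-way spectral partition from a symmetric weight matrix W."""
    D = np.diag(W.sum(axis=1))
    L = D - W                       # graph Laplacian
    vals, vecs = np.linalg.eigh(L)  # eigenvalues returned in ascending order
    q = vecs[:, 1]                  # second smallest eigenvector (Fiedler vector)
    return q >= 0                   # sign split recovers the two parts

# Example: two triangles joined by one weak edge; the weak edge gets cut.
W = np.zeros((6, 6))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)
W[2, 3] = W[3, 2] = 0.1
print(spectral_bipartition(W))  # e.g., [ True  True  True False False False ]
```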
Spectral Clustering
- Minimum cut does not balance the sizes of the two parts
Normalized Cut (Shi & Malik, 1997)
- Minimize the similarity between clusters while maximizing the similarity within clusters
Normalized Cut
- Relax q to real values under the constraint (in Shi & Malik's formulation, q^T D 1 = 0); see the formulation below
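The objective from Shi & Malik, with cut(A, B) the total weight of edges between A and B and assoc(A, V) the total weight from A to all vertices; its relaxation reduces to a generalized eigenproblem:

```latex
\mathrm{Ncut}(A, B)
\;=\; \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)}
\;+\; \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)},
\qquad
(D - W)\, q \;=\; \lambda\, D\, q
```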
Image Segmentation
Non-negative Matrix Factorization