Title: Clustering and Fuzzy Clustering
1 Clustering and Fuzzy Clustering
- CIS 6660.1 seminar
- Xin Wang
- Sep 29, 2005
2 Paper
- A chapter of the book by Witold Pedrycz,
  "Knowledge-Based Clustering", Wiley, 2005.
3 Outline
- Overview of clustering
- Hierarchical Clustering
- Objective function-based Clustering
- Hard Clustering
- Fuzzy clustering
- Extensions of fuzzy clustering
- Summary
4 What is clustering?
- Data (patterns) -> Clusters (classes)
- Desired properties
  - homogeneity within clusters
  - heterogeneity between clusters
5 Why clustering?
- improves data understanding
- reveals internal structure of data
- Useful for data analysis and interpretation
6 How to cluster
- Similarity measure classes
- Examples
  - distance, connectivity, intensity
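Distance is the most common similarity measure class named above. A minimal sketch (my own illustration, not code from the talk) of two standard distance functions between patterns:

```python
# Two common distance-based (dis)similarity measures between patterns.
# The sample patterns below are illustrative feature vectors.
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length patterns."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    """Manhattan (city-block) distance."""
    return sum(abs(a - b) for a, b in zip(x, y))

x, y = [0.958, 0.003], [0.003, 0.105]
print(euclidean(x, y))   # large distance -> low similarity
print(manhattan(x, y))
```

Patterns that are close under the chosen distance are treated as similar and end up in the same cluster.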
7 Categories of Clustering
- Hierarchical clustering
- Objective function-based Clustering
8 Hierarchical clustering
- Two modes
  - Bottom-up (agglomerative)
  - Top-down (divisive)
9 Hierarchical clustering (cont.)
[Figure: illustration with patterns labelled A and B grouped into clusters]
10 Objective function-based Clustering (hard clustering)
- N patterns in R^n, c clusters, a set of prototypes v1, v2, ..., vc
- Objective function

  Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik} \, \|x_k - v_i\|^2

  where \|x_k - v_i\|^2 is a certain distance between x_k and v_i, and U = [u_{ik}] is the partition matrix
- Minimize Q with respect to v1, v2, ..., vc and U
11 The definition of U
- If pattern k belongs to cluster i, then u_{ik} = 1
- else u_{ik} = 0
- The entries of U are binary
12 Constraints of U
- Each cluster is nontrivial: 0 < \sum_{k=1}^{N} u_{ik} < N (example: N = 8, c = 3)
- Each pattern belongs to a single cluster: \sum_{i=1}^{c} u_{ik} = 1
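The two constraints above can be checked mechanically. A minimal sketch (the example matrix is my own illustration for the slide's N = 8, c = 3 case, not data from the talk):

```python
def is_valid_hard_partition(U, N, c):
    """Check the hard-partition constraints for a c x N binary matrix U."""
    if len(U) != c or any(len(row) != N for row in U):
        return False
    binary = all(u in (0, 1) for row in U for u in row)
    # each pattern (column) belongs to exactly one cluster
    one_cluster = all(sum(col) == 1 for col in zip(*U))
    # each cluster (row) is nontrivial: neither empty nor all-inclusive
    nontrivial = all(0 < sum(row) < N for row in U)
    return binary and one_cluster and nontrivial

# Rows are clusters, columns are patterns (N = 8, c = 3).
U = [
    [1, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1],
]
print(is_valid_hard_partition(U, N=8, c=3))  # True
```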
13 Example: Classify cracked tiles (hard C-Means)

  475Hz   557Hz   Ok?
  -----   -----   ---
  0.958   0.003   Yes
  1.043   0.001   Yes
  1.907   0.003   Yes
  0.780   0.002   Yes
  0.579   0.001   Yes
  0.003   0.105   No
  0.001   1.748   No
  0.014   1.839   No
  0.007   1.021   No
  0.004   0.214   No

  Table 1: frequency intensities for ten tiles.

Tiles are made from clay moulded into the right shape, brushed, glazed, and baked. Unfortunately, the baking may produce invisible cracks. Operators can detect the cracks by hitting the tiles with a hammer, and in an automated system the response is recorded with a microphone, filtered, Fourier transformed, and normalised. A small set of data is given in Table 1 (adapted from MIT, 1997).
14
- Place two cluster centres (x) at random.
- Assign each data point (• and o) to the nearest cluster centre (x).
15
- Compute the new centre of each class.
- Move the crosses (x).
16 Iteration 2
17 Iteration 3
18 Iteration 4 (then stop, because no visible change)
- Each data point belongs to the cluster defined by the nearest centre.
19 The membership matrix U

  U = | 0.0000  1.0000 |
      | 0.0000  1.0000 |
      | 0.0000  1.0000 |
      | 0.0000  1.0000 |
      | 0.0000  1.0000 |
      | 1.0000  0.0000 |
      | 1.0000  0.0000 |
      | 1.0000  0.0000 |
      | 1.0000  0.0000 |
      | 1.0000  0.0000 |

- The first five data points (rows) belong to the second cluster (column).
- The last five data points (rows) belong to the first cluster (column).
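The iteration walked through above can be sketched compactly. This is an illustration of the hard C-Means procedure on the Table 1 data, not the original course code; the initial centres are fixed here rather than random so the run is reproducible:

```python
def hard_c_means(X, centres, iters=20):
    """Alternate nearest-centre assignment and centre recomputation."""
    for _ in range(iters):
        # assign each pattern to its nearest centre (squared Euclidean distance)
        labels = [min(range(len(centres)),
                      key=lambda i: sum((a - b) ** 2
                                        for a, b in zip(x, centres[i])))
                  for x in X]
        # move each centre to the mean of its members
        for i in range(len(centres)):
            members = [x for x, l in zip(X, labels) if l == i]
            if members:
                centres[i] = [sum(col) / len(members)
                              for col in zip(*members)]
    return labels, centres

# The ten tiles from Table 1: (475Hz, 557Hz) intensities.
X = [[0.958, 0.003], [1.043, 0.001], [1.907, 0.003], [0.780, 0.002],
     [0.579, 0.001], [0.003, 0.105], [0.001, 1.748], [0.014, 1.839],
     [0.007, 1.021], [0.004, 0.214]]
labels, centres = hard_c_means(X, centres=[[1.0, 0.0], [0.0, 1.0]])
print(labels)  # the five whole tiles land in one cluster, the five cracked ones in the other
```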
20 Challenge to hard clustering
21 Why fuzzy clustering?
- In real applications there is very often no sharp boundary between clusters.
- Clusters may not be well separated, because of noise or a lack of discriminatory power of the feature space in which the patterns are represented.
- Fuzzy clustering can deal with unsharp or overlapping cluster boundaries.
22 Hard vs. Fuzzy clustering
- Hard Clustering
  - crisp clusters
  - Each data point belongs to exactly one cluster.
- Fuzzy Clustering
  - Each data point belongs to more than one cluster.
  - membership grade, partial membership
23 Objective function-based Clustering (Fuzzy Clustering)
- N patterns in R^n, c clusters, a set of prototypes v1, v2, ..., vc
- Objective function

  Q = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \|x_k - v_i\|^2

  where \|x_k - v_i\|^2 is a certain distance between x_k and v_i, U = [u_{ik}] is the fuzzy partition matrix, and m > 1 is the fuzzification factor
24 Definition and Constraints of U
- U: a matrix with entries confined to the unit interval [0, 1]
- Constraints
  - The clusters are nontrivial: 0 < \sum_{k=1}^{N} u_{ik} < N
  - The total membership grades of each pattern sum to 1: \sum_{i=1}^{c} u_{ik} = 1
25 Fuzzy C-Means (FCM) algorithm description
- Introduced by Dunn in 1974, improved by Bezdek in 1981
- Step 1 (initialization phase)
  - (a) Select values of c (the number of clusters), m (the fuzzification factor), and e (the termination criterion).
  - (b) Choose the distance function.
  - (c) Initialize (randomly) the partition matrix.
26 Fuzzy C-Means (FCM) algorithm description (cont.)
- Step 2 (main iteration loop)
  - Compute the prototypes of the clusters.
  - Compute the partition matrix.
- Step 3
  - If the stopping criterion is met, i.e. \max_{i,k} |u_{ik}' - u_{ik}| \le e, then stop.
  - Else, go to Step 2.
27 Computing the partition matrix and prototypes
- For each pattern t = 1, 2, ..., N, the augmented function

  V = \sum_{i=1}^{c} u_{it}^{m} \, d_{it}^{2} - \lambda \left( \sum_{i=1}^{c} u_{it} - 1 \right)

  with \lambda denoting a Lagrange multiplier
- Setting the derivatives to zero yields the updates

  u_{st} = \frac{1}{\sum_{j=1}^{c} (d_{st}/d_{jt})^{2/(m-1)}}, \qquad v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} x_k}{\sum_{k=1}^{N} u_{ik}^{m}}
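One FCM iteration built from these standard update formulas can be sketched as follows (function names, the m = 2 default, and the toy data are my own illustration):

```python
import math

def update_memberships(X, V, m=2.0):
    """u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)); rows of U are clusters."""
    c, N = len(V), len(X)
    U = [[0.0] * N for _ in range(c)]
    for k, x in enumerate(X):
        d = [math.dist(x, v) for v in V]
        zeros = [i for i, di in enumerate(d) if di == 0.0]
        if zeros:  # pattern coincides with a prototype: crisp membership
            U[zeros[0]][k] = 1.0
        else:
            for i in range(c):
                U[i][k] = 1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0))
                                    for j in range(c))
    return U

def update_prototypes(X, U, m=2.0):
    """v_i = sum_k u_ik^m x_k / sum_k u_ik^m."""
    dims = range(len(X[0]))
    V = []
    for row in U:
        w = [u ** m for u in row]
        s = sum(w)
        V.append([sum(wk * xk[d] for wk, xk in zip(w, X)) / s for d in dims])
    return V

X = [[0.0], [1.0], [3.0], [4.0]]   # four 1-D patterns in two loose groups
V = [[0.5], [3.5]]                 # two prototypes
U = update_memberships(X, V)
V = update_prototypes(X, U)
print(V)  # prototypes stay near the groups {0, 1} and {3, 4}
```

Each column of U sums to 1, matching the constraint enforced by the Lagrange multiplier.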
28 Example for FCM
- Each data point belongs to two clusters to different degrees.
29
- Place two cluster centres.
- Assign a fuzzy membership to each data point depending on distance.
30
- Compute the new centre of each class.
- Move the crosses (x).
31 Iteration 2
32 Iteration 5
33 Iteration 10
34 Iteration 13 (then stop, because no visible change)
- Each data point belongs to the two clusters to a degree.
35 The membership matrix U

  U = | 0.0025  0.9975 |
      | 0.0091  0.9909 |
      | 0.0129  0.9871 |
      | 0.0001  0.9999 |
      | 0.0107  0.9893 |
      | 0.9393  0.0607 |
      | 0.9638  0.0362 |
      | 0.9574  0.0426 |
      | 0.9906  0.0094 |
      | 0.9807  0.0193 |

- The first five data points (rows) belong mostly to the second cluster (column).
- The last five data points (rows) belong mostly to the first cluster (column).
36 Cluster Validity
- What is the optimal number of clusters?
- Partition Index: takes values in [1/c, 1]
- Partition Entropy: takes values in [0, ln(c)]
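A sketch of these two indices under their standard definitions (assumed here: the partition index as PC = (1/N) \sum_{i,k} u_{ik}^2 and partition entropy as PE = -(1/N) \sum_{i,k} u_{ik} ln u_{ik}); crisper partitions push PC toward 1 and PE toward 0:

```python
import math

def partition_coefficient(U):
    """PC = (1/N) sum_ik u_ik^2, in [1/c, 1]; rows of U are clusters."""
    N = len(U[0])
    return sum(u * u for row in U for u in row) / N

def partition_entropy(U):
    """PE = -(1/N) sum_ik u_ik ln u_ik, in [0, ln c]."""
    N = len(U[0])
    return -sum(u * math.log(u) for row in U for u in row if u > 0.0) / N

crisp = [[1.0, 0.0], [0.0, 1.0]]   # hard partition: PC = 1, PE = 0
fuzzy = [[0.5, 0.5], [0.5, 0.5]]   # fuzziest partition: PC = 1/c, PE = ln c
print(partition_coefficient(crisp), partition_entropy(crisp))
print(partition_coefficient(fuzzy), partition_entropy(fuzzy))
```

Comparing the indices across runs with different c suggests the number of clusters that best fits the data.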
37 Extensions of fuzzy clustering
- Fuzzy C-Varieties (Bezdek et al., 1981)
  - prototypes extended from points to r-dimensional varieties
- Possibilistic Clustering (Krishnapuram 1993, Keller 1996)
  - drop the unity constraint
- Noise Clustering (Ohashi 1984, Dave 1991)
  - localize the noise and place it in a single auxiliary cluster (ending up with c+1 clusters)
38 Summary
- Hierarchical and objective function-based clustering