Title: Objective Function Based Fuzzy Clustering
1. Objective Function Based Fuzzy Clustering
- Haimin Zhang
- Advisor: Professor Dong-Guk Shin
- Department of Computer Science and Engineering
- University of Connecticut
- May 2005
2. Presentation Outline
- Introduction
- Fuzzy clustering
- Fuzzy C-Means and its problems
- Improved Algorithms
  - Gustafson-Kessel algorithm
  - Fuzzy C-Varieties algorithm
  - Extended GK with volume prototypes
- Determining the number of clusters: cluster merging
- Research Lines
3. Introduction to Fuzzy Clustering
- Motivation
- Why do we need clustering?
- Why do we need fuzzy clustering?
4. Introduction to Fuzzy Clustering
- Input: unlabeled dataset $X = \{x_1, \dots, x_n\}$
  - $n$ is the number of data points
  - $x_k \in \mathbb{R}^p$, where $p$ is the number of features of each data point
- Output
  - Membership matrix $U$ ($c \times n$), where $c$ is the number of clusters
  - Prototype matrix $V$ ($c \times p$), where each row is one cluster's prototype (cluster center)
- Goal
  - Maximize the similarity within clusters
  - Minimize the similarity between clusters
5. One Example
[Figure: membership matrix for an example with c = 4 clusters]
6. Objective Function Based Fuzzy Clustering Methods
- An objective function measures the overall dissimilarity within clusters.
- By minimizing the objective function, we obtain a partition that is optimal with respect to that function.
- The Fuzzy C-Means algorithm is the most popular objective function based fuzzy clustering method. It is also the common base for most of the newly developed objective function based fuzzy clustering methods.
7. Fuzzy C-Means Clustering
- Objective function
- Alternating minimization procedure
- Termination criterion
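A minimal sketch of the three steps above, assuming the standard formulation from Bezdek (reference 3): the objective $J_m = \sum_{i=1}^{c}\sum_{k=1}^{n} u_{ik}^m \lVert x_k - v_i \rVert^2$ with fuzzifier $m > 1$, centers updated as membership-weighted means, memberships updated from relative squared distances, and iteration stopped when the largest membership change drops below a tolerance. The names `fcm`, `eps`, `max_iter`, and `seed` are illustrative and not taken from the slides.

```python
import random

def fcm(points, c, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Fuzzy C-Means by alternating minimization (illustrative sketch).

    points: list of n feature vectors (lists of p floats)
    returns (U, V): U is c x n memberships, V is c x p cluster centers
    """
    rng = random.Random(seed)
    n, p = len(points), len(points[0])
    # Random initial memberships; each column is normalized to sum to 1.
    U = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for k in range(n):
        s = sum(U[i][k] for i in range(c))
        for i in range(c):
            U[i][k] /= s
    V = []
    for _ in range(max_iter):
        # Step 1: update centers as membership-weighted means.
        V = []
        for i in range(c):
            w = [U[i][k] ** m for k in range(n)]
            total = sum(w)
            V.append([sum(w[k] * points[k][j] for k in range(n)) / total
                      for j in range(p)])
        # Step 2: update memberships from squared distances to the centers.
        new_U = [[0.0] * n for _ in range(c)]
        for k in range(n):
            d2 = [sum((points[k][j] - V[i][j]) ** 2 for j in range(p))
                  for i in range(c)]
            zeros = [i for i in range(c) if d2[i] == 0.0]
            if zeros:
                # The point coincides with one or more centers: share membership.
                for i in zeros:
                    new_U[i][k] = 1.0 / len(zeros)
            else:
                inv = [d2[i] ** (-1.0 / (m - 1.0)) for i in range(c)]
                s = sum(inv)
                for i in range(c):
                    new_U[i][k] = inv[i] / s
        # Step 3: stop when the largest membership change is below eps.
        change = max(abs(new_U[i][k] - U[i][k])
                     for i in range(c) for k in range(n))
        U = new_U
        if change < eps:
            break
    return U, V
```

Calling `fcm(points, 2)` on two well-separated groups of points returns memberships whose columns each sum to 1, with every point's larger membership in the cluster of its own group.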
8. Problems in the FCM Method
9. Problems in the FCM Method
- Clusters have non-spherical shapes.
[Figure: two example datasets, ellipsoids and linear varieties (lines); panels labeled N = 5000, c = 2 and N = 2000, c = 2]
10. Problems in the FCM Method
- Clusters have different sizes and densities.
[Figure: two example datasets, each with N = 2000, c = 2]
11. Problems in the FCM Method
- The number of clusters has to be determined.
12. Gustafson-Kessel Algorithm
- The objective function becomes the FCM objective with a separate distance norm for each cluster.
- The norm-inducing matrix of cluster i, which is computed from the covariance matrix of cluster i, gives the shape of that cluster. Using it in the distance calculation reduces the distance along the long axis and magnifies the distance along the short axis, which transforms the cluster shape into a sphere of the same volume.
- The Gustafson-Kessel algorithm performs well for clusters with ellipsoidal shapes, but it has the same problem as FCM when clusters have different sizes and densities.
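A minimal 2-D sketch of the distance calculation described above, assuming the usual Gustafson-Kessel form (reference 5), in which the norm-inducing matrix is $A_i = (\rho_i \det F_i)^{1/p} F_i^{-1}$, with $F_i$ the covariance matrix of cluster i and $\rho_i$ a fixed cluster volume. The symbols and the function name `gk_distance_sq` are illustrative and not taken from the slide, and the covariance matrix is passed in rather than estimated from memberships.

```python
def gk_distance_sq(x, center, cov, rho=1.0):
    """Squared Gustafson-Kessel distance in 2-D (illustrative sketch).

    cov is the 2x2 covariance matrix F of the cluster. The norm-inducing
    matrix is A = (rho * det(F)) ** (1/2) * inverse(F), so det(A) = rho.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        raise ValueError("covariance matrix must have a positive determinant")
    scale = (rho * det) ** 0.5  # exponent 1/p with p = 2
    # Inverse of the 2x2 matrix, multiplied by the volume-normalizing scale.
    A = [[scale * d / det, -scale * b / det],
         [-scale * c / det, scale * a / det]]
    dx, dy = x[0] - center[0], x[1] - center[1]
    return dx * (A[0][0] * dx + A[0][1] * dy) + dy * (A[1][0] * dx + A[1][1] * dy)
```

With `cov = [[4.0, 0.0], [0.0, 1.0]]` (long axis along x), a point 2 units from the center along the long axis gets a squared distance of 2.0 and a point 2 units away along the short axis gets 8.0, whereas the squared Euclidean distance is 4.0 in both cases.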
13. Gustafson-Kessel Algorithm
14. Fuzzy C-Varieties Algorithm
- The objective function becomes the FCM objective with the distance to a linear variety in place of the distance to a point.
- Each cluster i has a set of vectors that give its principal scatter directions. The squared distances along these directions are subtracted from the squared Euclidean distance, so the distance along the principal scatter directions is ignored and the center of the cluster becomes a linear variety (a line in the 2-dimensional case).
- The Fuzzy c-Varieties algorithm can efficiently detect linear variety substructures in the data. However, it tends to grab points from other clusters along its principal directions, because it ignores the distance along those directions.
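A minimal sketch of the distance described above, assuming the principal scatter directions are given as orthonormal vectors. The function name `fcv_distance_sq` and its arguments are illustrative and not taken from the slide.

```python
def fcv_distance_sq(x, center, directions):
    """Squared distance from x to the linear variety that passes through
    `center` and is spanned by the orthonormal `directions` (sketch)."""
    diff = [xi - vi for xi, vi in zip(x, center)]
    # Start from the squared Euclidean distance to the center.
    d2 = sum(t * t for t in diff)
    for s in directions:
        # Remove the component that lies along each scatter direction.
        proj = sum(t * si for t, si in zip(diff, s))
        d2 -= proj * proj
    return max(d2, 0.0)  # guard against tiny negative rounding errors
```

For a cluster centered at the origin with the single direction `[1.0, 0.0]`, the point `[100.0, 0.0]` has distance 0 even though it is far from the center, which illustrates why the algorithm can grab points from other clusters that lie along its principal directions.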
15. Fuzzy C-Varieties Algorithm
[Figure: results of Fuzzy c-Means and Fuzzy c-Varieties; N = 2000, c = 2]
16. Clustering with Volume Prototypes
- Point prototype
  - No 0/1 membership degree is assigned, no matter how close a point is to the center of a cluster.
- Volume prototype
  - When a data point is very close to a cluster center, it can be considered to belong fully to that cluster.
- Clustering with volume prototypes can take cluster sizes into account, which improves performance when clusters have different sizes.
- Clustering with volume prototypes can discard the effect on a cluster of data points that are far away from it, which improves performance when clusters have different densities.
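A minimal sketch of how a volume prototype can yield 0/1 memberships, assuming each cluster is a hypersphere with a center and a radius, that the distance to a cluster is measured from the surface of its volume (zero inside it), and that points outside every volume fall back to the usual FCM membership formula. The function name `volume_prototype_memberships` and this Euclidean simplification are assumptions for illustration, not the exact rules of reference 1.

```python
import math

def volume_prototype_memberships(x, centers, radii, m=2.0):
    """Membership of point x in each cluster with volume prototypes (sketch)."""
    # Distance to the volume: zero for points inside the hypersphere.
    dists = [max(0.0, math.dist(x, v) - r) for v, r in zip(centers, radii)]
    inside = [i for i, d in enumerate(dists) if d == 0.0]
    if inside:
        # Full membership, shared equally if several volumes contain x.
        return [1.0 / len(inside) if i in inside else 0.0
                for i in range(len(centers))]
    # Outside every volume: FCM-style memberships from the reduced distances.
    inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
    total = sum(inv)
    return [w / total for w in inv]
```

With centers `[0, 0]` and `[10, 0]` and radii 2 and 1, the point `[1, 0]` lies inside the first volume and gets memberships `[1.0, 0.0]`, while the point `[5, 0]` lies outside both and gets approximately `[0.64, 0.36]`.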
17. Extended GK with Volume Prototypes
- Objective function
- Membership degree assignment rules
18. Extended GK with Volume Prototypes
[Figure: panels titled Fuzzy c-Means and Fuzzy c-Varieties; N = 2000, c = 2]
19. Number of Clusters: Cluster Merging I
- Merging by closeness
- The closeness of two clusters is calculated from the ratio between their radii and the distance between them.
- Under one condition on this ratio, the two clusters are considered completely separated.
- Under another condition, the two clusters are merged to form a new cluster.
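A minimal sketch of a radius-to-distance closeness test. It assumes the ratio is the sum of the two radii divided by the distance between the centers, and it uses a hypothetical merge threshold of 1.0. Neither the exact ratio nor the thresholds on the slide are reproduced here, and the names `closeness` and `should_merge` are illustrative.

```python
import math

def closeness(center_i, radius_i, center_j, radius_j):
    """Ratio of the summed radii to the center distance (assumed form)."""
    d = math.dist(center_i, center_j)
    if d == 0.0:
        return math.inf  # coincident centers are as close as possible
    return (radius_i + radius_j) / d

def should_merge(center_i, radius_i, center_j, radius_j, threshold=1.0):
    """Merge when the closeness reaches a hypothetical threshold."""
    return closeness(center_i, radius_i, center_j, radius_j) >= threshold
```

Under this assumed definition, a ratio below 1 means the two hyperspheres do not overlap. Because only radii and center distances enter the ratio, the test treats every cluster as a sphere.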
20. Number of Clusters: Cluster Merging I
- This method is compatible with the objective function. However, it is only suitable for spherical clusters.
- Idea for improvement
  - Map the distance and radius onto each direction, calculate the similarity ratio in each direction, and take their (weighted) average.
21. Number of Clusters: Cluster Merging II
- Merging by similarity
- This method does not depend on cluster shapes and sizes. However, the similarity value may be affected by data points that are far away from both clusters, especially when the two clusters have lower density than the others.
- Idea for improvement
  - Drop from the calculation the data points whose membership degrees are below a threshold for both clusters.
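A minimal sketch of the improvement idea. It assumes a Jaccard-style fuzzy similarity (sum of element-wise minima over sum of element-wise maxima of the two membership rows), which is a stand-in and not necessarily the measure used in reference 2. The function name `fuzzy_similarity` and the `threshold` parameter are illustrative.

```python
def fuzzy_similarity(u_i, u_j, threshold=0.0):
    """Similarity of two clusters from their membership rows (sketch).

    Points whose membership is below `threshold` in both clusters are
    skipped, so points far from both clusters do not affect the value.
    """
    num = 0.0
    den = 0.0
    for a, b in zip(u_i, u_j):
        if a < threshold and b < threshold:
            continue  # far from both clusters: leave out of the calculation
        num += min(a, b)
        den += max(a, b)
    return num / den if den > 0.0 else 0.0
```

With the default `threshold=0.0` no point is dropped, so the same function also gives the unfiltered similarity for comparison.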
22. Future Research Lines
- Extend other fuzzy clustering algorithms with volume prototypes (Gath-Geva algorithm, fuzzy c-shells, fuzzy c-rings, etc.).
- Develop a method to decide the optimal fuzzifier m, or find new transform functions that can introduce fuzziness into the model.
- Incorporate the distribution of the data into fuzzy clustering methods.
- Parallel algorithms and implementations.
- Apply fuzzy clustering to microarray data analysis.
23. References
- 1. U. Kaymak and M. Setnes, "Fuzzy clustering with volume prototypes and adaptive cluster merging," IEEE Transactions on Fuzzy Systems, vol. 10, no. 6, Dec. 2002.
- 2. X. Xiong, K. L. Chan, and K. L. Tan, "Similarity-driven cluster merging method for unsupervised fuzzy clustering," in Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, ACM International Conference Proceeding Series, 2004.
- 3. J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms. New York: Plenum, 1981.
- 4. R. N. Dave, "Use of the adaptive fuzzy clustering algorithm to detect lines in digital images," Intell. Robots Comput. Vision VIII, vol. 1192, pt. 2, pp. 600-611, Nov. 1989.
- 5. D. E. Gustafson and W. C. Kessel, "Fuzzy clustering with a fuzzy covariance matrix," in Proc. IEEE Conf. Decision Contr., San Diego, CA, 1979.
- 6. F. Klawonn and F. Höppner, "What is fuzzy about fuzzy clustering? Understanding and improving the concept of the fuzzifier," in M. R. Berthold, H.-J. Lenz, E. Bradley, R. Kruse, C. Borgelt (eds.), Advances in Intelligent Data Analysis V. Springer, Berlin, 2003, pp. 254-264.
24. Questions and Comments
Thanks!