K-means*: Clustering by Gradual Data Transformation presentation

1
K-means*: Clustering by Gradual Data Transformation

Mikko Malinen and Pasi Fränti

Speech and Image Processing Unit
School of Computing
University of Eastern Finland

2
K-means clustering

Gradual transformation of data

Data
3
K-means clustering

Iterate between two steps
1. Assignment step
Assign the points to the nearest centroids
2. Update step
Update the location of centroids
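
The two steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the presenters' implementation: the function names, the `max_iter` cap, and the choice to leave an empty cluster's centroid where it was are all assumptions made for the example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Assignment step: index of the nearest centroid for every point."""
    return [min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points]

def update(points, labels, centroids):
    """Update step: move each centroid to the mean of its assigned points."""
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [p for p, l in zip(points, labels) if l == j]
        if members:
            new_centroids.append(
                tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            # Assumption: an empty cluster keeps its previous centroid.
            new_centroids.append(tuple(old))
    return new_centroids

def kmeans(points, centroids, max_iter=100):
    """Alternate the two steps until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        new_centroids = update(points, labels, centroids)
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assign(points, centroids)

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, labels = kmeans(points, [(0, 0), (10, 10)])
print(centroids, labels)
```

The result depends on the initial centroids passed in, which is why k-means can stop at a poor local optimum.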

4
K-means clustering
5
Example of clustering (s2 dataset)
6
0 done
7
10 done
8
20 done
9
30 done
10
40 done
11
50 done
12
60 done
13
70 done
14
80 done
15
90 done
16
100 done
17

Empty clusters problem

18
Time Complexity

Initialization
Data set transform
Empty clusters removal
K-means
Algorithm total
19
Time Complexity
Fixed k-means

Initialization
Data set transform
Empty clusters removal
K-means
Algorithm total
20
Datasets
Dataset   d    n      k
s1        2    5000   15
s2        2    5000   15
s3        2    5000   15
s4        2    5000   15
bridge    16   4096   256
missa     16   6480   256
house     3    34000  256
thyroid   5    215    2
iris      4    150    2
wine      13   178    3
21
Mean square error
Dataset   k-means   proposed   GKM     optimal
s1        1.85      1.01       0.89    0.89
s2        1.94      1.52       1.33    1.33
s3        1.97      1.71       1.69    1.69
s4        1.69      1.63       1.57    1.57
bridge    168.2     164.7      164.1   160.7
missa     5.33      5.15       5.34    5.12
house     9.88      9.48       5.94    5.86
thyroid   6.97      6.92       1.52    1.52
iris      3.70      3.70       2.02    2.02
wine      1.92      1.90       0.88    0.88
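
The slides do not state how the mean square error is normalized, so the following is only a sketch under an assumption: the total squared distance from each point to its cluster centroid, divided by the number of points, with an optional further division by the dimensionality d. The function name and the `per_dimension` flag are hypothetical.

```python
def mean_square_error(points, centroids, labels, per_dimension=False):
    """Average squared distance from each point to its assigned centroid.

    Assumption: normalized by the number of points n, and additionally by
    the dimensionality d when per_dimension is True.
    """
    total = sum(
        sum((a - b) ** 2 for a, b in zip(p, centroids[l]))
        for p, l in zip(points, labels)
    )
    denominator = len(points) * (len(points[0]) if per_dimension else 1)
    return total / denominator

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids = [(0.0, 0.5), (10.0, 10.5)]
labels = [0, 0, 1, 1]
print(mean_square_error(points, centroids, labels))
print(mean_square_error(points, centroids, labels, per_dimension=True))
```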
22
Mean square error vs. number of steps
23
Mean square error vs. number of steps
24
Mean square error vs. number of steps
25
Mean square error vs. number of steps
26
Mean square error vs. number of steps
27
Mean square error vs. number of steps
28
Mean square error vs. number of steps
29
Number of incorrect clusters
All correct
proposed 36 k-means 14
30
Number of incorrect clusters
1 incorrect
proposed 64 k-means 38
31
Number of incorrect clusters
2 incorrect
proposed 0 k-means 34
32
Number of incorrect clusters
3 incorrect
proposed 0 k-means 10
33
Summary

We have presented a clustering method based on gradual transformation of data and k-means.
Instead of fitting the model to data, we fit the data to a model.
The proposed method gives a mean square error lower than or equal to that of k-means on all ten test datasets.
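
The idea of fitting the data to a model can be sketched as follows. The slides do not spell out the procedure, so every detail here is an assumption made for illustration: the data starts collapsed onto a given set of model points (a perfectly clustered artificial dataset), is moved back to its original position in equal linear steps, and k-means is rerun after each step from the previous centroids. The names `gradual_kmeans`, `model`, and `steps` are hypothetical, and the empty-cluster removal step listed on the time complexity slides is not reproduced.

```python
def sq_dist(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centroids):
    return min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))

def kmeans(points, centroids, max_iter=100):
    """Plain k-means started from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        labels = [nearest(p, centroids) for p in points]
        new = []
        for j, old in enumerate(centroids):
            members = [p for p, l in zip(points, labels) if l == j]
            # Assumption: an empty cluster simply keeps its old centroid.
            new.append(tuple(sum(c) / len(members) for c in zip(*members))
                       if members else old)
        if new == centroids:
            break
        centroids = new
    return centroids

def gradual_kmeans(data, model, steps=10):
    """Assumed scheme: move the data from the model back to its original
    position in `steps` increments, rerunning k-means after each one."""
    # Artificial start: every point sits exactly on its nearest model point.
    start = [model[nearest(p, model)] for p in data]
    centroids = [tuple(m) for m in model]
    for s in range(1, steps + 1):
        t = s / steps  # fraction of the way back to the original data
        moved = [tuple((1 - t) * a + t * b for a, b in zip(x0, x1))
                 for x0, x1 in zip(start, data)]
        centroids = kmeans(moved, centroids)
    return centroids

data = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(gradual_kmeans(data, model=[(0, 0), (10, 10)], steps=10))
```

At the last step the transformed data coincides with the original data, so the returned centroids are a k-means solution for the original dataset.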
