K-means*: Clustering by Gradual Data Transformation


Slides: 34
Provided by: Mikk97

Transcript and Presenter's Notes

Title: K-means*: Clustering by Gradual Data Transformation


1
K-means*: Clustering by Gradual Data Transformation
  • Mikko Malinen and Pasi Fränti
  • Speech and Image Processing Unit
  • School of Computing
  • University of Eastern Finland

2
K-means clustering
  • Gradual transformation of data

Data
3
K-means clustering
  • Iterate between two steps:
  • 1. Assignment step: assign each point to its nearest centroid
  • 2. Update step: move each centroid to the mean of its assigned points
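
The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration of standard k-means, not the presenters' code; the function name `kmeans`, the `max_iter` parameter, the stopping rule, and the toy data are assumptions made for the example.

```python
import numpy as np

def kmeans(points, centroids, max_iter=100):
    """Plain k-means: alternate assignment and update until labels stop changing."""
    centroids = np.array(centroids, dtype=float)
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments are stable, so the centroids will not move either
        labels = new_labels
        # Update step: move each centroid to the mean of the points assigned to it.
        for j in range(len(centroids)):
            members = points[labels == j]
            if len(members) > 0:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Toy data (hypothetical): two obvious groups in 2-D.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centroids, labels = kmeans(points, centroids=[[0.0, 0.0], [1.0, 1.0]])
```

The result depends on the initial centroids, which is why k-means can end in a poor local optimum.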

4
K-means clustering
5
Example of clustering (s2 dataset)
6-16
Frames labeled 0 done, 10 done, 20 done, ..., 100 done (steps of 10)
17
  • Empty clusters problem

18
Time Complexity

Initialization
Data set transform
Empty clusters removal
K-means
Algorithm total
19
Time Complexity
Fixed k-means

Initialization
Data set transform
Empty clusters removal
K-means
Algorithm total
20
Datasets
Dataset   d    n       k
s1        2    5000    15
s2        2    5000    15
s3        2    5000    15
s4        2    5000    15
bridge    16   4096    256
missa     16   6480    256
house     3    34000   256
thyroid   5    215     2
iris      4    150     2
wine      13   178     3
21
Mean square error
Dataset   k-means   proposed   GKM     optimal
s1        1.85      1.01       0.89    0.89
s2        1.94      1.52       1.33    1.33
s3        1.97      1.71       1.69    1.69
s4        1.69      1.63       1.57    1.57
bridge    168.2     164.7      164.1   160.7
missa     5.33      5.15       5.34    5.12
house     9.88      9.48       5.94    5.86
thyroid   6.97      6.92       1.52    1.52
iris      3.70      3.70       2.02    2.02
wine      1.92      1.90       0.88    0.88
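
To make the comparison easier to read, the relative reduction in mean square error of the proposed method over k-means can be computed directly from the two columns of the table. The snippet below is only arithmetic on the reported values; the variable names are chosen for the example.

```python
# (k-means MSE, proposed MSE) copied from the table above.
mse = {
    "s1": (1.85, 1.01),
    "s2": (1.94, 1.52),
    "s3": (1.97, 1.71),
    "s4": (1.69, 1.63),
    "bridge": (168.2, 164.7),
    "missa": (5.33, 5.15),
    "house": (9.88, 9.48),
    "thyroid": (6.97, 6.92),
    "iris": (3.70, 3.70),
    "wine": (1.92, 1.90),
}

# Relative reduction in percent: 100 * (k-means - proposed) / k-means.
improvement = {name: 100.0 * (km - prop) / km for name, (km, prop) in mse.items()}

for name, pct in improvement.items():
    print(f"{name:8s} {pct:5.1f} %")
```

The reduction is largest on the synthetic s1 and s2 sets and zero on iris, where both methods report 3.70.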
22
Mean square error vs. number of steps (plots, slides 22-28)
29-32
Number of incorrect clusters
Incorrect clusters   proposed   k-means
0 (all correct)      36         14
1                    64         38
2                    0          34
3                    0          10
33
Summary
  • We have presented a clustering method based on
    gradual transformation of data and k-means.
    Instead of fitting the model to data, we fit the
    data to a model.
  • In the experiments, the proposed method gives a mean square
    error lower than or equal to that of k-means on every dataset tested.
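
The idea of fitting the data to a model can be illustrated with a rough sketch. The transcript does not give the details of the transformation, so everything specific here is an assumption: the artificial data places every point exactly on one of k locations, points are paired with locations at random, the data moves back to its original position by linear interpolation, and a few k-means iterations run after each step starting from the previous centroids. The names `gradual_kmeans_sketch`, `steps`, and `kmeans_iters` are hypothetical, and the empty-cluster handling is a simple placeholder rather than the removal step mentioned on the slides.

```python
import numpy as np

def assign(points, centroids):
    # Nearest-centroid index for every point.
    d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

def update(points, labels, centroids):
    # Mean of each cluster; an empty cluster keeps its old centroid (assumption).
    new = centroids.copy()
    for j in range(len(centroids)):
        members = points[labels == j]
        if len(members) > 0:
            new[j] = members.mean(axis=0)
    return new

def gradual_kmeans_sketch(data, k, steps=10, kmeans_iters=5, seed=0):
    rng = np.random.default_rng(seed)
    n = len(data)
    # Assumption: the k artificial cluster locations are drawn from the data.
    targets = data[rng.choice(n, size=k, replace=False)]
    # Assumption: random pairing of points to locations, in near-equal groups.
    groups = rng.permutation(n) % k
    artificial = targets[groups]   # every point sits exactly on its location
    centroids = targets.copy()     # the clustering of the artificial data is known
    labels = groups
    for s in range(1, steps + 1):
        t = s / steps
        # Move the data a little closer to its original position.
        current = (1.0 - t) * artificial + t * data
        # Re-run k-means, starting from the centroids of the previous step.
        for _ in range(kmeans_iters):
            labels = assign(current, centroids)
            centroids = update(current, labels, centroids)
    return centroids, labels

# Hypothetical test data: two Gaussian blobs in 2-D.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
                  rng.normal(10.0, 0.5, size=(50, 2))])
centroids, labels = gradual_kmeans_sketch(data, k=2)
```

At the last step the interpolated data equals the original data, so the returned centroids refer to the original points.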