Cluster analysis - PowerPoint PPT Presentation

About This Presentation
Title:

Cluster analysis

Description:

Minimizing the within-cluster variance is equivalent to maximizing the between ... Euclidean distances not appropriate for elliptical clusters ...

Slides: 40
Provided by: Com144

Transcript and Presenter's Notes

Title: Cluster analysis


1
Cluster analysis
2
  • Partition Methods
  • Divide data into disjoint clusters
  • Hierarchical Methods
  • Build a hierarchy of the observations and deduce
    the clusters from it.

3
K-means
4
Criteria
5
Same criteria with multivariate data
6
Justifying the criteria
  • Anova decomposition of the variance.
  • Univariate

SST = SSW + SSB
Multivariate
Since the total variance is fixed for a given data set,
minimizing the within-cluster variance is equivalent to
maximizing the between-cluster variance (the difference
between clusters).
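
The univariate decomposition can be checked numerically. The sketch below is only an illustration: the two clusters and their values are made-up assumptions, not data from the slides. It computes the total, within-cluster and between-cluster sums of squares and shows that the first equals the sum of the other two.

```python
# Univariate check of the ANOVA decomposition: total = within + between.
# The data and the grouping below are illustrative assumptions.
clusters = {
    "a": [1.0, 2.0, 3.0],
    "b": [8.0, 9.0, 10.0, 11.0],
}


def mean(values):
    return sum(values) / len(values)


all_values = [x for group in clusters.values() for x in group]
grand_mean = mean(all_values)

# Total sum of squares: spread of every point around the grand mean.
sst = sum((x - grand_mean) ** 2 for x in all_values)

# Within-cluster sum of squares: spread of points around their own cluster mean.
ssw = sum((x - mean(g)) ** 2 for g in clusters.values() for x in g)

# Between-cluster sum of squares: spread of the cluster means around the
# grand mean, weighted by cluster size.
ssb = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in clusters.values())

print("total:", sst)
print("within + between:", ssw + ssb)
```

Because the total does not depend on how the points are grouped, any partition that lowers the within-cluster term must raise the between-cluster term by the same amount.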
7
K-means algorithm
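
The body of this slide was not transcribed. As a hedged illustration only, the sketch below shows the standard alternating scheme usually meant by "k-means" (assign each point to its nearest center, then recompute each center as the mean of its points). The function name `k_means`, the random initialisation from the observations, and the toy data are assumptions, not taken from the presentation.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    """Hypothetical minimal k-means: returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct observations
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:  # no point changed cluster: converged
            break
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = centroid(members)
    return labels, centers


# Toy data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centers = k_means(points, k=2)
print(labels)
print(centers)
```

Each iteration can only lower (or leave unchanged) the within-cluster sum of squares, which is why the loop stops after a finite number of steps; the result can still depend on the starting centers.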
8
Number of clusters
9
Consequences of standardization
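
The slide body is missing from the transcript. One consequence that is easy to demonstrate is that Euclidean distances, and therefore the clusters built on them, depend on the units of each variable. The sketch below uses made-up rows of (income, age), which are assumptions for illustration, and compares the nearest neighbour of the first row before and after standardizing each column to mean 0 and standard deviation 1.

```python
from statistics import mean, pstdev


def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def standardize(rows):
    """Rescale every column to mean 0 and (population) standard deviation 1."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) for c in columns]
    return [
        tuple((x - m) / s for x, m, s in zip(row, mus, sigmas))
        for row in rows
    ]


def nearest(rows, i):
    """Index of the row closest to row i (excluding i itself)."""
    others = [j for j in range(len(rows)) if j != i]
    return min(others, key=lambda j: euclidean(rows[i], rows[j]))


# Hypothetical observations: (income, age). Income has a far larger scale.
rows = [
    (20000.0, 25.0),
    (21000.0, 65.0),
    (30000.0, 27.0),
    (40000.0, 45.0),
]

print("nearest to row 0, raw units:   ", nearest(rows, 0))
print("nearest to row 0, standardized:", nearest(standardize(rows), 0))
```

In raw units the income column dominates the distance, so row 0 is matched with the row of similar income; after standardization both variables carry comparable weight and the row of similar age becomes the closest one.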
10
Ruspini example
11
(No Transcript)
12
(No Transcript)
13
(No Transcript)
14
(No Transcript)
15
Problems of k-means
  • Very sensitive to outliers
  • Euclidean distances not appropriate for elliptical
    clusters
  • It does not give the number of clusters.
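
The sensitivity to outliers follows from the fact that each cluster center is a mean. A minimal sketch with made-up numbers (an assumption for illustration) shows how far a single extreme value drags the mean; the median is printed only as a contrast and is not part of k-means.

```python
from statistics import mean, median

# Hypothetical one-dimensional cluster, then the same cluster plus one outlier.
cluster = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = cluster + [100.0]

print("mean without / with outlier:  ", mean(cluster), mean(with_outlier))
print("median without / with outlier:", median(cluster), median(with_outlier))
```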

16
Hierarchical Algorithms
17
Agglomerative algorithms
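
The slide body was not transcribed. As a hedged sketch of the usual agglomerative scheme (start with every observation as its own cluster, then repeatedly merge the two closest clusters), the code below records each merge and the distance at which it happens. The function names, the choice of nearest-neighbour distance as the default, and the toy data are assumptions for illustration.

```python
def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def single_linkage(a, b):
    """Nearest-neighbour distance between two clusters (lists of points)."""
    return min(euclidean(p, q) for p in a for q in b)


def agglomerate(points, linkage=single_linkage):
    """Naive agglomerative clustering; returns a list of (left, right, distance)."""
    clusters = [[p] for p in points]  # every observation starts alone
    merges = []
    while len(clusters) > 1:
        # Compare every pair of current clusters and keep the closest one.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges


# Toy one-dimensional data.
points = [(0.0,), (1.0,), (5.0,), (6.5,)]
merges = agglomerate(points)
for left, right, d in merges:
    print(left, "+", right, "at distance", d)
```

The recorded merge distances are the heights at which branches join in a dendrogram, so cutting the tree at a chosen height yields a partition.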
18
Nearest neighbour distance
19
Farthest neighbour distance
20
Average distance
21
Centroid method distance
22
Ward's method distance
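
The formulas on slides 18 to 22 were not transcribed. The sketch below gives the textbook definitions of the five between-cluster distances named in those titles, under the assumption that the presentation uses the standard ones: minimum, maximum and mean of the pairwise distances, distance between centroids, and, for Ward's method, the increase in within-cluster sum of squares caused by the merge. Function names and data are hypothetical.

```python
from itertools import product


def euclidean(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def centroid(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def nearest_neighbour(a, b):
    """Single linkage: smallest pairwise distance."""
    return min(euclidean(p, q) for p, q in product(a, b))


def farthest_neighbour(a, b):
    """Complete linkage: largest pairwise distance."""
    return max(euclidean(p, q) for p, q in product(a, b))


def average_distance(a, b):
    """Average linkage: mean of all pairwise distances."""
    return sum(euclidean(p, q) for p, q in product(a, b)) / (len(a) * len(b))


def centroid_distance(a, b):
    """Centroid method: distance between the two cluster means."""
    return euclidean(centroid(a), centroid(b))


def ward_distance(a, b):
    """Ward: growth of the within-cluster sum of squares if a and b merge."""
    n_a, n_b = len(a), len(b)
    return n_a * n_b / (n_a + n_b) * centroid_distance(a, b) ** 2


# Two hypothetical clusters of two points each.
a = [(0.0, 0.0), (0.0, 2.0)]
b = [(3.0, 0.0), (3.0, 2.0)]

for name, fn in [
    ("nearest neighbour", nearest_neighbour),
    ("farthest neighbour", farthest_neighbour),
    ("average", average_distance),
    ("centroid", centroid_distance),
    ("Ward", ward_distance),
]:
    print(name, fn(a, b))
```

Any of these functions can serve as the rule for choosing which pair of clusters to merge next; they generally produce different hierarchies on the same data.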
23
Dendrograms
24
Example
25
(No Transcript)
26
(No Transcript)
27
(No Transcript)
28
(No Transcript)
29
(No Transcript)
30
(No Transcript)
31
(No Transcript)
32
Problems of hierarchical clustering
  • Slow if n is large: each step requires up to n(n-1)/2
    comparisons.
  • Euclidean distances not always appropriate
  • If n is large, the dendrogram is difficult to interpret
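
To give a sense of scale for the n(n-1)/2 figure, the short sketch below counts the distinct pairs among n items for a few values of n. The chosen values of n are arbitrary.

```python
from itertools import combinations


def pair_count(n):
    """Number of unordered pairs among n items, n(n-1)/2."""
    return n * (n - 1) // 2


# Cross-check the formula against an explicit enumeration for a small n.
small_n = 5
enumerated = len(list(combinations(range(small_n), 2)))
print(small_n, pair_count(small_n), enumerated)

# The count grows quadratically with n.
for n in (100, 1000, 10000):
    print(n, pair_count(n))
```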

33
Clustering by variables
34
(No Transcript)
35
Distances between quantitative variables
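
The formula on this slide was not transcribed. One common choice when clustering variables rather than observations is a distance based on the correlation coefficient, for example 1 - r², which treats strongly correlated variables (positively or negatively) as close. The sketch below assumes that choice; it may not be the one used in the presentation, and the data are made up.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def correlation_distance(x, y):
    """Assumed distance between variables: 1 - r^2."""
    r = pearson(x, y)
    return 1.0 - r * r


# Hypothetical variables measured on the same four observations.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]    # exact linear function of x
z = [1.0, -1.0, 1.0, -1.0]  # only weakly related to x

print(correlation_distance(x, y))
print(correlation_distance(x, z))
```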
36
Distances between qualitative variables
37
Similarity between attributes
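
The slide body is missing. For binary (presence/absence) attributes, two widely used similarity measures are the simple matching coefficient and the Jaccard coefficient; the sketch below assumes these are the kind of measure intended, which the transcript does not confirm. The vectors are made up, and the functions assume non-empty inputs with at least one 1 for Jaccard.

```python
def match_counts(u, v):
    """Return (both 1, only u is 1, only v is 1, both 0) for two 0/1 vectors."""
    both = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    only_u = sum(1 for a, b in zip(u, v) if a == 1 and b == 0)
    only_v = sum(1 for a, b in zip(u, v) if a == 0 and b == 1)
    neither = sum(1 for a, b in zip(u, v) if a == 0 and b == 0)
    return both, only_u, only_v, neither


def simple_matching(u, v):
    """Share of positions where the two vectors agree (1-1 or 0-0)."""
    both, only_u, only_v, neither = match_counts(u, v)
    return (both + neither) / (both + only_u + only_v + neither)


def jaccard(u, v):
    """Share of 1-1 matches among positions where at least one vector is 1."""
    both, only_u, only_v, _ = match_counts(u, v)
    return both / (both + only_u + only_v)


# Hypothetical binary attributes recorded on six observations.
u = [1, 1, 0, 0, 1, 0]
v = [1, 0, 0, 0, 1, 1]

print(simple_matching(u, v))
print(jaccard(u, v))
```

The two measures differ in whether joint absences count as agreement, which matters when 0 simply means "not observed".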
38
(No Transcript)
39
(No Transcript)