Title: Pattern Recognition: Statistical and Neural
Nanjing University of Science and Technology
Pattern Recognition: Statistical and Neural
Lonnie C. Ludeman
Lecture 27, Nov 9, 2005
Lecture 27 Topics
- K-Means Clustering Algorithm Details
- K-Means Step-by-Step Example
- ISODATA Algorithm: Overview
- Agglomerative Hierarchical Clustering: Algorithm Description
K-Means Clustering Algorithm
Basic Procedure:
1. Randomly select K cluster centers from the pattern space.
2. Distribute the set of patterns to the cluster centers using minimum distance.
3. Compute new cluster centers for each cluster.
4. Continue this process until the cluster centers do not change.
Flow Diagram for K-Means Algorithm
Step 1: Initialization
Choose K initial cluster centers M1(1), M2(1), ..., MK(1).
Method 1: First K samples
Method 2: K data samples selected randomly
Method 3: K random vectors
Set m = 1 and go to Step 2.
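The three initialization methods can be sketched in Python as follows (a minimal sketch; the function name, the list-of-tuples data layout, and the `seed` parameter are my own choices, not from the slides):

```python
import random

def init_centers(patterns, K, method=2, seed=None):
    """Choose K initial cluster centers M1(1), ..., MK(1)."""
    rng = random.Random(seed)
    if method == 1:          # Method 1: first K samples
        return [list(x) for x in patterns[:K]]
    if method == 2:          # Method 2: K data samples selected randomly
        return [list(x) for x in rng.sample(patterns, K)]
    # Method 3: K random vectors, drawn here inside the data's bounding box
    dims = list(zip(*patterns))
    return [[rng.uniform(min(d), max(d)) for d in dims] for _ in range(K)]
```

For example, `init_centers(data, K=2, method=1)` returns the first two samples as the initial centers.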
Step 2: Determine New Clusters
Using the cluster centers, distribute the pattern vectors by minimum distance.
Method 1: Use Euclidean distance
Method 2: Use other distance measures
Assign sample xj to cluster Clk if ||xj - Mk(m)|| <= ||xj - Mi(m)|| for all i = 1, 2, ..., K.
Go to Step 3.
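The minimum-distance assignment rule of Step 2 can be sketched as follows (Euclidean distance, Method 1; ties go to the lower-indexed center, one reasonable convention):

```python
import math

def assign_clusters(patterns, centers):
    """Step 2: assign each sample to the nearest cluster center
    (Euclidean distance; the first center wins on a tie)."""
    def dist(x, m):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, m)))
    labels = []
    for x in patterns:
        d = [dist(x, m) for m in centers]
        labels.append(d.index(min(d)))   # index of the closest center
    return labels
```

Swapping `dist` for another measure (Method 2, e.g. city-block distance) changes only the inner function.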
Step 3: Compute New Cluster Centers
Using the new cluster assignment Clk(m), k = 1, 2, ..., K, compute the new cluster centers Mk(m+1), k = 1, 2, ..., K, using
Mk(m+1) = (1/Nk) * sum of xj over all xj in Clk(m)
where Nk, k = 1, 2, ..., K, is the number of pattern vectors in Clk(m).
Go to Step 4.
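The mean update of Step 3 can be sketched as follows (keeping the old center when a cluster receives no samples is an assumed convention, not stated on the slides):

```python
def update_centers(patterns, labels, K, old_centers):
    """Step 3: Mk(m+1) = (1/Nk) * sum of samples assigned to Clk(m).
    An empty cluster keeps its old center (one common convention)."""
    centers = []
    for k in range(K):
        members = [x for x, lbl in zip(patterns, labels) if lbl == k]
        if not members:
            centers.append(list(old_centers[k]))
            continue
        Nk = len(members)  # number of pattern vectors in Clk(m)
        centers.append([sum(coords) / Nk for coords in zip(*members)])
    return centers
```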
Step 4: Check for Convergence
Using the cluster centers from Step 3, check for convergence. Convergence occurs if the means do not change, i.e. Mk(m+1) = Mk(m) for all k.
If convergence occurs, clustering is complete and the results are given.
If there is no convergence, go to Step 5.
Step 5: Check for Maximum Number of Iterations
Define MAXIT as the maximum number of iterations that is acceptable.
If m >= MAXIT, then display "no convergence" and stop.
If m < MAXIT, then set m = m + 1 (increment m) and return to Step 2.
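Steps 1 through 5 combine into a single loop; a minimal self-contained sketch (the function and parameter names are my own, and random initialization, Method 2, is assumed):

```python
import math
import random

def kmeans(patterns, K, max_it=100, seed=0):
    """Steps 1-5 in one loop: initialize, assign, recompute,
    and stop when the centers no longer change or m reaches MAXIT."""
    rng = random.Random(seed)
    centers = [list(x) for x in rng.sample(patterns, K)]   # Step 1
    for m in range(1, max_it + 1):                         # Step 5 bound
        # Step 2: minimum-distance (Euclidean) assignment
        labels = [min(range(K), key=lambda k: math.dist(x, centers[k]))
                  for x in patterns]
        # Step 3: new centers as cluster means (empty cluster keeps its center)
        new = []
        for k in range(K):
            members = [x for x, lbl in zip(patterns, labels) if lbl == k]
            new.append([sum(c) / len(members) for c in zip(*members)]
                       if members else centers[k])
        if new == centers:                                 # Step 4: converged
            return labels, centers
        centers = new
    return labels, centers    # reached MAXIT without convergence
```

On well-separated data the loop typically converges in a handful of iterations, matching the behavior traced in the example that follows.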
Example: K-Means Clustering Algorithm
Given the following set of pattern vectors: [the sample vectors appeared in a table on the original slide]
[Figure: plot of the data points in the given set of samples]
Do the following: [the tasks appeared on the original slide; parts (a) and (b) below cluster the data into 2 and 3 classes]
(a) Solution: 2-class case
Initial cluster centers
[Figure: plot of the data points with the initial cluster centers marked]
Initial Cluster Centers
[Table: distances from all samples to the two initial cluster centers; each sample is labeled Cl1 or Cl2 by minimum distance. With a tie, select randomly.]
First Cluster Assignment
[Figure: plot of the data points, each marked as closest to x1 or x2]
Compute the new cluster centers.
New Cluster Centers
[Figure: plot of the data points with the updated cluster centers]
[Table: distances from all samples to the cluster centers M1(2) and M2(2); each sample is labeled Cl1 or Cl2 by minimum distance.]
Second Cluster Assignment
New Clusters
[Figure: plot showing the old cluster centers and the new clusters formed around M1(2) and M2(2)]
Compute New Cluster Centers
Cluster Centers
[Figure: plot of the data points with the new clusters around the updated centers M1(3) and M2(3)]
[Table: distances from all samples to the cluster centers M1(3) and M2(3); each sample is labeled Cl1 or Cl2 by minimum distance.]
Compute the new cluster centers.
(b) Solution: 3-class case
Select initial cluster centers.
First cluster assignment using distances from the pattern vectors to the initial cluster centers.
Compute the new cluster centers.
Second cluster assignment using distances from the pattern vectors to the cluster centers.
At the next step we have convergence, as the cluster centers do not change; thus the final cluster assignment becomes:
Final 3-Class Clusters
[Figure: plot of the data points showing the three final clusters Cl1, Cl2, Cl3 and the final cluster centers]
ISODATA Algorithm (Iterative Self-Organizing Data Analysis Technique A)
Performs clustering of unclassified quantitative data with an unknown number of clusters. Similar to K-Means, but with the ability to merge and split clusters, thus giving flexibility in the number of clusters.
ISODATA Parameters That Need to Be Specified
[List: the original slide itemized the required parameters, including thresholds governing how clusters are split and merged at each step.]
Requires more specified information than the K-Means algorithm.
ISODATA Algorithm
[Figure: flow diagram of the ISODATA algorithm, ending in the final clustering]
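The distinguishing merge/split adjustment of ISODATA can be sketched roughly as follows. This is a simplified illustration, not the exact formulation from the lecture: the thresholds `theta_split` and `theta_merge`, the split-along-one-axis rule, and the midpoint merge are all assumed conventions.

```python
import math

def split_and_merge(centers, clusters, theta_split=2.0, theta_merge=1.0):
    """One ISODATA-style adjustment pass after a K-Means assignment:
    split clusters with large spread, merge centers that are too close.
    `clusters` is a list of lists of sample vectors, one per center."""
    new_centers = []
    for m, pts in zip(centers, clusters):
        # largest per-dimension standard deviation of the cluster
        if len(pts) > 1:
            spread = max(
                math.sqrt(sum((x[d] - m[d]) ** 2 for x in pts) / len(pts))
                for d in range(len(m)))
        else:
            spread = 0.0
        if spread > theta_split:   # split: two centers offset along axis 0
            new_centers.append([m[0] - spread, *m[1:]])
            new_centers.append([m[0] + spread, *m[1:]])
        else:
            new_centers.append(list(m))
    # merge any pair of centers closer than theta_merge (replace by midpoint)
    merged, used = [], set()
    for i, a in enumerate(new_centers):
        if i in used:
            continue
        for j in range(i + 1, len(new_centers)):
            if j not in used and math.dist(a, new_centers[j]) < theta_merge:
                a = [(p + q) / 2 for p, q in zip(a, new_centers[j])]
                used.add(j)
                break
        merged.append(a)
    return merged
```

In a full ISODATA run this pass alternates with the K-Means assign/update steps, so the number of clusters can grow or shrink between iterations.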
Hierarchical Clustering
Approach 1: Agglomerative, which combines groups at each level.
Approach 2: Divisive, which splits groups at each level.
We will present only agglomerative hierarchical clustering, as it is the most used.
Agglomerative Hierarchical Clustering
Consider a set S of patterns to be clustered:
S = { x1, x2, ..., xk, ..., xN }
Define Level N by
S1(N) = { x1 }
S2(N) = { x2 }
...
SN(N) = { xN }
Clusters at Level N are the individual pattern vectors.
Define Level N - 1 to be the N - 1 clusters formed by merging two of the Level N clusters by the following process.
Compute the distances between all the clusters at Level N and merge the two with the smallest distance (resolving ties randomly) to give the Level N - 1 clusters
S1(N-1), S2(N-1), ..., SN-1(N-1)
Clusters at Level N - 1 result from this merging.
The process of merging two clusters at each step is performed sequentially until Level 1 is reached. Level 1 is a single cluster containing all samples:
S1(1) = { x1, x2, ..., xk, ..., xN }
Thus hierarchical clustering provides cluster assignments for every number of clusters from N down to 1.
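The merge loop above can be sketched as follows. Single linkage (minimum pairwise distance between clusters) is one common choice of inter-cluster distance; the slides do not fix a particular one. Ties here go to the first pair found rather than being broken randomly.

```python
import math

def agglomerate(patterns):
    """Merge the two closest clusters (single linkage) one step at a
    time, recording the clusters at every level from N down to 1."""
    clusters = [[x] for x in patterns]      # Level N: one cluster per sample
    levels = {len(clusters): [c[:] for c in clusters]}
    while len(clusters) > 1:
        # find the pair of clusters with the smallest inter-cluster distance
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge the two clusters
        del clusters[j]
        levels[len(clusters)] = [c[:] for c in clusters]
    return levels    # levels[k] is the clustering with k clusters
```

Because every level is recorded, `levels[k]` gives the k-cluster assignment directly, which is exactly the "all numbers of clusters from N to 1" property stated above.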
Definition
A dendrogram is a tree-like structure that illustrates the merging of clusters at each step of the hierarchical approach.
A typical dendrogram appears on the next slide.
Typical Dendrogram
[Figure: a typical dendrogram]
Summary: Lecture 27
- Presented the K-Means clustering algorithm in detail
- Showed an example of clustering using the K-Means algorithm (step by step)
- Briefly discussed the ISODATA algorithm
- Introduced the agglomerative hierarchical clustering algorithm
End of Lecture 27