1
Clustering and Visual Data Analysis
  • Ata Kaban
  • The University of Birmingham

2
The Clustering Problem
Unsupervised Learning
Data (input) → Interesting structure (output)
  • Interesting
  • contains essential characteristics
  • discards unessential details
  • provides a summary of the data (e.g. to
    visualise on the screen)
  • compact
  • interpretable for humans
  • etc.

We formalise this with an objective function that
expresses our notion of interestingness for this
data.
3
One reason for clustering of data
  • Here is some data
  • Assume you transmit the coordinates of points
    drawn randomly from this data set
  • You are only allowed to send a small number
    (say 2 or 3) of bits per point
  • So it will be a lossy transmission
  • Loss = sum of squared errors between the
    original and the decoded coordinates
  • Which encoder / decoder will lose the least
    information? (see the sketch below)
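
As a rough illustration of the encoder/decoder
idea, here is a Python sketch (the data, K = 4,
and all variable names are invented for this
example): with 4 codewords each point costs 2
bits to send, namely the index of its nearest
codeword, and is decoded as that codeword.

  import numpy as np

  rng = np.random.default_rng(0)
  X = rng.normal(size=(100, 2))               # points to transmit
  centres = X[rng.choice(len(X), 4, replace=False)]  # 4 codewords -> 2 bits/point

  codes = np.argmin(((X[:, None] - centres[None]) ** 2).sum(-1), axis=1)  # encode
  decoded = centres[codes]                    # decode each 2-bit code
  loss = ((X - decoded) ** 2).sum()           # sum of squared errors
  print(loss)

This loss is exactly the distortion that K-means
optimises, as formalised on slide 10.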

4–9
(No transcript: figure-only slides)
10
Formalising
  • What objective does K-means optimise?
  • Given an encoder function ENC: R^T → {1, ..., K}
  • (T is the dimension of the data, K is the
    number of clusters)
  • Given a decoder function DEC: {1, ..., K} → R^T
  • DISTORTION = Σ_n ||x_n − DEC(ENC(x_n))||²
  • where DEC(k) = µ_k are the centers of the
    clusters, k = 1, ..., K
  • So, DISTORTION = Σ_n ||x_n − µ_ENC(x_n)||², where
    n goes from 1 to N, the number of points (see
    the small example below)
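
In code, the distortion is a few NumPy lines (a
toy example; the numbers are made up):

  import numpy as np

  X = np.array([[0.0, 0.0], [0.1, 0.2],
                [3.9, 4.0], [4.0, 4.1]])      # N = 4 points in R^2
  mu = np.array([[0.05, 0.1], [3.95, 4.05]])  # K = 2 centers

  enc = np.argmin(((X[:, None] - mu[None]) ** 2).sum(-1), axis=1)  # nearest center
  distortion = ((X - mu[enc]) ** 2).sum()     # Σ_n ||x_n − µ_ENC(x_n)||²
  print(enc, distortion)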

11
The minimal distortion
  • DISTORTION = Σ_n ||x_n − µ_ENC(x_n)||²
  • Suppose this is minimised.
  • What properties must µ_1, ..., µ_K satisfy at
    the minimum?
  • 1) each point x_n must be encoded by its nearest
    center, otherwise DISTORTION could be reduced by
    replacing ENC(x_n) with the nearest center of x_n
  • 2) each µ_k must be the centroid of its own
    points (see the note below)
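
A quick check of property 2 (a step the slide
leaves implicit): holding the assignments fixed,
the part of DISTORTION involving µ_k is
Σ_{n: ENC(x_n)=k} ||x_n − µ_k||². Its gradient with
respect to µ_k is −2 Σ_{n: ENC(x_n)=k} (x_n − µ_k),
and setting this to zero gives
µ_k = (1/N_k) Σ_{n: ENC(x_n)=k} x_n, i.e. the
centroid of the N_k points encoded by k.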

12
  • If N is the known number of points and K the
    desired number of clusters, the K-means
    algorithm is (a runnable sketch follows below)
  • Begin
  •   initialise µ_1, µ_2, ..., µ_K (randomly
      selected)
  •   do  classify the N samples according to the
        nearest µ_i
  •       recompute µ_i
  •   until no change in µ_i
  •   return µ_1, µ_2, ..., µ_K
  • End
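
A minimal NumPy version of this procedure (the
function name and the empty-cluster fallback are
my own choices, not from the slides):

  import numpy as np

  def kmeans(X, K, max_iter=100, seed=0):
      rng = np.random.default_rng(seed)
      mu = X[rng.choice(len(X), K, replace=False)]  # random initialisation
      for _ in range(max_iter):
          # classify each sample according to the nearest center
          enc = np.argmin(((X[:, None] - mu[None]) ** 2).sum(-1), axis=1)
          # recompute each center as the centroid of its points
          # (keep the old center if a cluster has become empty)
          new_mu = np.array([X[enc == k].mean(axis=0) if np.any(enc == k)
                             else mu[k] for k in range(K)])
          if np.allclose(new_mu, mu):               # until no change in µ_i
              break
          mu = new_mu
      return mu, enc

Because the result depends on the random
initialisation, K-means can get stuck in a poor
local minimum; a common remedy is to run it
several times with different seeds and keep the
run with the lowest distortion.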

13–15
(No transcript: figure-only slides)
16
Other forms of clustering
  • Often, clusters are not disjoint: a cluster may
    have subclusters, which in turn have
    sub-subclusters.
  • → Hierarchical clustering

17
  • Given any two samples x and x′, they will be
    grouped together at some level, and if they are
    grouped at level k, they remain grouped for all
    higher levels
  • Hierarchical clustering → a tree representation
    called a dendrogram

18
  • The similarity values may help to determine
    whether the groupings are natural or forced;
    but if they are evenly distributed, no
    information can be gained
  • Another representation is based on sets, e.g.
    Venn diagrams

19
  • Hierarchical clustering can be divided into
    agglomerative and divisive approaches.
  • Agglomerative (bottom-up, clumping): start with
    N singleton clusters and form the sequence by
    merging clusters
  • Divisive (top-down, splitting): start with all
    of the samples in one cluster and form the
    sequence by successively splitting clusters

20
  • Agglomerative hierarchical clustering
  • The procedure terminates when the specified
    number of clusters has been obtained, and
    returns the clusters as sets of points, rather
    than a mean or representative vector for each
    cluster (see the sketch below)
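
A minimal sketch, assuming SciPy is available
(the toy data and the choice of average linkage
are mine): linkage builds the bottom-up merge
sequence (the dendrogram), and fcluster cuts it
at the desired number of clusters, returning the
clusters as sets of points.

  import numpy as np
  from scipy.cluster.hierarchy import linkage, fcluster

  X = np.random.default_rng(0).normal(size=(20, 2))  # toy data
  Z = linkage(X, method="average")                   # merge sequence
  labels = fcluster(Z, t=3, criterion="maxclust")    # stop at 3 clusters
  clusters = [np.flatnonzero(labels == c) for c in np.unique(labels)]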

21
The problem of the number of clusters
  • Typically, the number of clusters is known.
  • When it's not, we face a hard problem called
    model selection. There are several ways to
    proceed.
  • A common approach is to repeat the clustering
    with K = 1, K = 2, K = 3, etc., as sketched
    below.
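
One common heuristic along these lines is the
"elbow" method (a sketch assuming scikit-learn is
available): run K-means for K = 1, 2, 3, ... and
look for the K after which the distortion stops
dropping sharply.

  import numpy as np
  from sklearn.cluster import KMeans

  X = np.random.default_rng(0).normal(size=(200, 2))  # toy data
  for k in range(1, 7):
      km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
      print(k, km.inertia_)   # inertia_ is the K-means distortion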

22
What did we learn today?
  • Data clustering
  • K-means algorithm in detail
  • How K-means can get stuck and how to take care of
    that
  • An outline of hierarchical clustering methods

23
Pattern Classification
Find out more here! Pattern Classification (2nd
ed) by R. O. Duda, P. E. Hart and D. G. Stork,
John Wiley & Sons, 2000