Title: Geometric Clustering and the Information Bottleneck
1 Geometric Clustering and the Information Bottleneck
- Key concepts of the NIPS 2003 paper by Susanne Still, William Bialek, and Leon Bottou
- With background from "The Information Bottleneck Method" by Naftali Tishby, Fernando C. Pereira, and William Bialek
Presented by Mark V. Albert
2 Introduction to RDT
Rate Distortion Theory
Diagram: $X \to \tilde{X}$, with $p(\tilde{x}|x)$ and $p(\tilde{x})$
Minimize $I(X; \tilde{X})$ subject to a constraint on the distortion
Solved: Blahut-Arimoto algorithm
Difficulty: need a distortion function
What features are important?
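The Blahut-Arimoto iteration named on this slide can be sketched in a few lines. This is a minimal illustration, not taken from the talk: the function name `blahut_arimoto`, the distortion matrix `d`, and the trade-off parameter `beta` (the Lagrange multiplier for the distortion constraint, in nats) are assumptions introduced here, and the source distribution is assumed to be strictly positive.

```python
import numpy as np

def blahut_arimoto(p_x, d, beta, n_iter=500, tol=1e-12):
    """Sketch of Blahut-Arimoto for the rate-distortion problem.

    p_x  : source distribution, shape (n,), assumed strictly positive
    d    : distortion matrix d[x, x_tilde], shape (n, m) (hypothetical input)
    beta : trade-off parameter (> 0)
    Returns p(x_tilde|x), the rate in bits, and the expected distortion.
    """
    p_x = np.asarray(p_x, dtype=float)
    d = np.asarray(d, dtype=float)
    m = d.shape[1]
    q_t = np.full(m, 1.0 / m)  # initial guess for the marginal p(x_tilde)
    for _ in range(n_iter):
        # Encoder update: p(x_tilde|x) proportional to p(x_tilde) * exp(-beta * d)
        w = q_t[None, :] * np.exp(-beta * d)
        q_cond = w / w.sum(axis=1, keepdims=True)
        # Marginal update: p(x_tilde) = sum_x p(x) p(x_tilde|x)
        q_new = p_x @ q_cond
        converged = np.max(np.abs(q_new - q_t)) < tol
        q_t = q_new
        if converged:
            break
    joint = p_x[:, None] * q_cond
    mask = joint > 0  # skip zero-probability terms in the logarithm
    q_t_full = np.broadcast_to(q_t, q_cond.shape)
    rate = np.sum(joint[mask] * np.log2(q_cond[mask] / q_t_full[mask]))
    distortion = np.sum(joint * d)
    return q_cond, rate, distortion
```

For a uniform binary source with Hamming distortion and `beta = ln 3`, this sketch returns an expected distortion of 0.25 and a rate of about 0.189 bits, which agrees with the known closed form R(D) = 1 - H(D) for that source. Each value of `beta` traces out one point on the rate-distortion curve.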
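3 The Information Bottleneck method
Variables: $X$ (signal), $\tilde{X}$ (compressed representation), $Y$ (relevance variable)
$\min_{p(\tilde{x}|x)} \; I(X; \tilde{X}) - \beta I(Y; \tilde{X})$
Solved: general iterative method for $p(\tilde{x}|x)$, $p(y|\tilde{x})$, and $p(\tilde{x})$
Wide variety of applications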
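The general iterative method can be illustrated with a short sketch of the self-consistent IB updates for discrete variables: the encoder p(x~|x) is set proportional to p(x~) exp(-beta * KL[p(y|x) || p(y|x~)]), and the marginal p(x~) and decoder p(y|x~) are then recomputed from it. The function name `information_bottleneck`, the random initialization, the fixed iteration count, and the small constant `eps` are all assumptions made for this illustration, and p(x) is assumed strictly positive.

```python
import numpy as np

def information_bottleneck(p_xy, n_clusters, beta, n_iter=200, seed=0):
    """Sketch of the iterative IB equations for a discrete joint p(x, y).

    p_xy : joint distribution, shape (n_x, n_y), every row with nonzero mass
    Returns the encoder p(t|x), the marginal p(t), and the decoder p(y|t),
    where t stands for the compressed variable x_tilde.
    """
    eps = 1e-12  # guards logarithms and divisions against zeros (assumption)
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1)
    p_y_x = p_xy / p_x[:, None]  # p(y|x)

    # Random soft assignment as a starting point
    rng = np.random.default_rng(seed)
    q_t_x = rng.random((p_xy.shape[0], n_clusters))
    q_t_x /= q_t_x.sum(axis=1, keepdims=True)

    # sum_y p(y|x) log p(y|x), which does not change between iterations
    h_term = np.sum(p_y_x * np.log(p_y_x + eps), axis=1)

    def marginal_and_decoder(enc):
        q_t = p_x @ enc  # p(t) = sum_x p(x) p(t|x)
        # p(y|t) = sum_x p(y|x) p(t|x) p(x) / p(t)
        q_y_t = (enc * p_x[:, None]).T @ p_y_x / (q_t[:, None] + eps)
        return q_t, q_y_t

    for _ in range(n_iter):
        q_t, q_y_t = marginal_and_decoder(q_t_x)
        # kl[x, t] = KL[p(y|x) || p(y|t)]
        kl = h_term[:, None] - p_y_x @ np.log(q_y_t + eps).T
        # p(t|x) proportional to p(t) exp(-beta * kl), normalized stably
        logits = np.log(q_t + eps)[None, :] - beta * kl
        logits -= logits.max(axis=1, keepdims=True)
        q_t_x = np.exp(logits)
        q_t_x /= q_t_x.sum(axis=1, keepdims=True)

    q_t, q_y_t = marginal_and_decoder(q_t_x)
    return q_t_x, q_t, q_y_t
```

Small values of `beta` favor compression and tend to merge everything into one cluster, while large values favor preserving information about Y and push the assignments toward hard clusters. The iteration only finds a local optimum, so results can depend on the initialization.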
4 IB method and K-means
Variables: $i$ (data point index), $x$ (position), $c$ (cluster)
$\max \; I(x,c) - \lambda I(c,i)$
IB method finds iterative equations for $p(c|i)$, $p(c)$, and $p(x|c)$
With $\lambda < 1$ and $n \to \infty$ this is equivalent to k-means clustering
Empirical results (% convergence to global optimum):
Low dimensional: K-means 75.8%, IB derived 100%
Four high dimensional Gaussian clusters: K-means 37.8%, IB derived 78-81%
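The iterative equations mentioned on this slide can be written out by substituting the clustering variables into the general IB solution. This is a sketch under stated assumptions: it uses the standard self-consistent IB equations of Tishby, Pereira, and Bialek, with the data index $i$ as the compressed variable, the cluster $c$ as its representation, the position $x$ as the relevance variable, and the trade-off parameter written as $\lambda$ (so that $\beta = 1/\lambda$). $Z(i,\lambda)$ is a normalization constant and $D_{KL}$ is the Kullback-Leibler divergence. The paper's own notation, and any smoothing it applies to $p(x|i)$, may differ.

```latex
\begin{align}
p(c|i) &= \frac{p(c)}{Z(i,\lambda)}
          \exp\!\left[-\frac{1}{\lambda}\,
          D_{KL}\big(p(x|i)\,\|\,p(x|c)\big)\right] \\
p(x|c) &= \frac{1}{p(c)} \sum_i p(x|i)\, p(c|i)\, p(i) \\
p(c)   &= \sum_i p(c|i)\, p(i)
\end{align}
```

Read this way, $p(x|c)$ plays the role of a cluster centroid, and a smaller $\lambda$ sharpens the assignments $p(c|i)$ toward hard cluster membership.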
5 IB method applications
Geometric Clustering (diagram labels: X position, indices, clusters)
6 IB method applications
Geometric Clustering
Semantic Clustering (diagram labels: verbs, nouns, context)
Pereira, Tishby, Lee. Distributional Clustering of English Words
7 IB method applications
Geometric Clustering
Semantic Clustering
Document Categorization (diagram labels: words, document, category; document, words)
Slonim, Tishby. Document Clustering using Word Clusters via the Information Bottleneck Method
8 IB method applications
Geometric Clustering
Semantic Clustering
Document Categorization
Neural Coding (diagram labels: spike trains, stimuli, stimulus features)
9 IB method applications
Geometric Clustering
Semantic Clustering
Document Categorization
Neural Coding
Spectral Analysis (diagram labels: galaxy, spectra, galaxy clusters)
Slonim, Somerville, Tishby, Lahav. Objective Classification of Galaxy Spectra using the Information Bottleneck Method
10 IB method applications
Geometric Clustering
Semantic Clustering
Document Categorization
Neural Coding
Spectral Analysis
??? (diagram labels: ?, ?, ?)