Title: Transformation-invariant clustering using the EM algorithm
1. Transformation-invariant clustering using the EM algorithm
Brendan Frey and Nebojsa Jojic
IEEE Trans on PAMI, 25(1) 2003
2. Goal
- unsupervised learning of image structure, regardless of transformation
- probabilistic description of the data
- clustering as density modeling: grouping similar images together
Invariance
- manifold in data space
- all points on manifold equivalent
- complex even for basic transformations
- how to approximate?
3. Approximating the Invariance Manifold
- a discrete set of points on the manifold
- sparse matrices Ti map the canonical feature z into the transformed feature x (observed)
- cast as a Gaussian probability model
- all possible transformations T enumerated
4. This is what it would look like for...
- a 2x3 image with pixel-shift translations (wrap-around); a code sketch of these matrices follows below
[Diagram: the canonical feature z is mapped by the six translation matrices T1...T6 to the observed image x]
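As an aside, here is a minimal sketch (not from the paper) of how such sparse shift matrices could be built; NumPy/SciPy, the row-major flattening, and the helper name shift_matrix are assumptions made for illustration.

import numpy as np
from scipy.sparse import csr_matrix

def shift_matrix(h, w, dy, dx):
    # Sparse permutation matrix that cyclically shifts a flattened
    # h-by-w image by (dy, dx) pixels with wrap-around.
    n = h * w
    rows, cols = [], []
    for y in range(h):
        for x in range(w):
            src = y * w + x                          # pixel index in canonical z
            dst = ((y + dy) % h) * w + (x + dx) % w  # pixel index in shifted x
            rows.append(dst)
            cols.append(src)
    return csr_matrix((np.ones(n), (rows, cols)), shape=(n, n))

# The six wrap-around translations of a 2x3 image (T1...T6 on the slide).
h, w = 2, 3
Ts = [shift_matrix(h, w, dy, dx) for dy in range(h) for dx in range(w)]

z = np.arange(h * w, dtype=float)  # toy canonical feature
x = Ts[4] @ z                      # one transformed (observed) feature

Each Ti has exactly one nonzero per row, which is what makes enumerating all transformations tractable.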
5. The full statistical model
- for one feature (one cluster):
  - data, given the latent representation
  - joint of all variables (written out below)
  - Gaussian post-transformation noise Ψ
  - Gaussian pre-transformation noise Φ
- for multiple features (clusters): a mixture model
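Written out, the generative model these bullets describe is (the mixture weights \pi_c and transformation prior \rho_T are notation assumed here, not copied from the slide):

p(c) = \pi_c
p(T) = \rho_T
p(z \mid c) = \mathcal{N}(z;\ \mu_c,\ \Phi_c)
p(x \mid z, T) = \mathcal{N}(x;\ Tz,\ \Psi)
p(x, z, T, c) = \pi_c\, \rho_T\, \mathcal{N}(z;\ \mu_c,\ \Phi_c)\, \mathcal{N}(x;\ Tz,\ \Psi)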
6. The full statistical model
- each feature has a canonical mean and a canonical variance
- an image contains one of the canonical features (mixture model) that has undergone one transformation; marginalizing the latent feature gives the likelihood sketched below
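Integrating out the canonical feature z is a standard linear-Gaussian step (this equation is reconstructed, not taken from the slide):

p(x) = \sum_c \pi_c \sum_T \rho_T\, \mathcal{N}\!\left(x;\ T\mu_c,\ T\Phi_c T^\top + \Psi\right)

Each cluster/transformation pair contributes one Gaussian component, so the model behaves like an ordinary mixture with C x |T| components that share parameters.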
7. Inference
- the posterior over z, given x, T and c, is proportional to the product of the two Gaussians above and is itself Gaussian (sketched below)
- marginals of the posterior are used for inferring T, c and z
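A sketch of that posterior, reconstructed from the linear-Gaussian model above (the symbols \Omega and \eta are assumed names for the posterior covariance and mean):

p(z \mid x, T, c) = \mathcal{N}(z;\ \eta_{T,c},\ \Omega_{T,c})
\Omega_{T,c} = \left(\Phi_c^{-1} + T^\top \Psi^{-1} T\right)^{-1}
\eta_{T,c} = \Omega_{T,c}\left(\Phi_c^{-1}\mu_c + T^\top \Psi^{-1} x\right)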
8. Adapting the rest of the parameters
- all parameters are learned with EM (a code sketch follows below)
- E-step: assume the parameters are known, infer P(z, T, c | x)
- M-step: update the parameters
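A minimal NumPy sketch of one EM iteration for a simplified variant of the model, in which pre- and post-transformation noise are folded into a single diagonal variance per cluster and the transformations are wrap-around pixel shifts; the function name em_step and the uniform prior over shifts are assumptions, and these are not the paper's exact update equations.

import numpy as np

def em_step(X, pi, mu, var, shifts):
    # X: (N, H, W) images; pi: (C,); mu, var: (C, H*W); shifts: list of (dy, dx).
    N, H, W = X.shape
    C, D = mu.shape
    Tn = len(shifts)
    # Evaluating N(x; T mu, T var) equals evaluating N(T^{-1} x; mu, var)
    # when T is a wrap-around shift, so back-shift every image by every shift.
    Xs = np.stack([np.roll(X, (-dy, -dx), axis=(1, 2)).reshape(N, D)
                   for (dy, dx) in shifts])                  # (Tn, N, D)
    # E-step: responsibilities over (cluster, transformation) pairs.
    log_joint = np.empty((N, C, Tn))
    for c in range(C):
        quad = ((Xs - mu[c]) ** 2 / var[c]).sum(axis=2)      # (Tn, N)
        log_norm = 0.5 * np.log(2 * np.pi * var[c]).sum()
        log_joint[:, c, :] = np.log(pi[c]) - np.log(Tn) - 0.5 * quad.T - log_norm
    log_Z = np.logaddexp.reduce(log_joint.reshape(N, -1), axis=1)
    resp = np.exp(log_joint - log_Z[:, None, None])          # (N, C, Tn)
    # M-step: weighted averages of the back-shifted images.
    Nc = resp.sum(axis=(0, 2)) + 1e-12
    w = resp.transpose(1, 2, 0)                              # (C, Tn, N)
    pi_new = Nc / N
    mu_new = np.einsum('ctn,tnd->cd', w, Xs) / Nc[:, None]
    Ex2 = np.einsum('ctn,tnd->cd', w, Xs ** 2) / Nc[:, None]
    var_new = Ex2 - mu_new ** 2 + 1e-6                       # keep variances positive
    return pi_new, mu_new, var_new, log_Z.sum()              # data log-likelihood

Iterating em_step to convergence recovers cluster means only up to a global shift, since any common shift of a cluster mean yields the same likelihood.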
9. Experiments
[Figures: recovering 4 clusters with the transformation-invariant model, and the 4 clusters learned without the transformation model]
10. Pre/post-transformation noise
11. Pre/post-transformation noise
[Figures: learned mean µ and variance Φ for a single Gaussian model of the image, for the transformation-invariant model with no post-transformation noise, and for the transformation-invariant model with post-transformation noise Ψ]
12. Conclusions
- fast (uses sparse matrices, FFT); see the sketch below
- incorporates pre- and post-transformation noise
- works on artificial data, clustering simple image sets, and cleaning up somewhat contrived examples
- can be extended to make use of time-series data and to account for more transformations
- poor transformation model:
  - fixed, pre-specified transformations
  - must be sparse
- poor feature model:
  - Gaussian representation of structure
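The FFT remark most likely refers to the fact that, for wrap-around pixel shifts, the squared errors ||x - Tµ||^2 for every shift T can be computed at once with a circular cross-correlation. A minimal NumPy sketch of that trick (an assumption about how the speed-up is realized, not code from the paper; shift_distances is a made-up helper name):

import numpy as np

def shift_distances(x, mu):
    # Returns an (H, W) array whose (dy, dx) entry is
    # ||x - roll(mu, (dy, dx))||^2, i.e. the squared error for that shift.
    # Uses ||x - T mu||^2 = ||x||^2 + ||mu||^2 - 2 * crosscorr(x, mu)[dy, dx].
    cross = np.real(np.fft.ifft2(np.fft.fft2(x) * np.conj(np.fft.fft2(mu))))
    return (x ** 2).sum() + (mu ** 2).sum() - 2 * cross

With isotropic noise these distances convert directly into per-shift log-likelihoods, so scoring all H*W shifts costs O(HW log HW) rather than O((HW)^2).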