Title: Joint Estimation of Image Clusters and Image Transformations

1. Joint Estimation of Image Clusters and Image Transformations
- Brendan J. Frey
- Computer Science, University of Waterloo, Canada
- Beckman Institute and ECE, University of Illinois at Urbana-Champaign
- Nebojsa Jojic
- Beckman Institute, University of Illinois at Urbana-Champaign
2. We'd like to cluster images, but the unknown subjects have unknown positions
3. The unknown subjects have
- unknown positions
- unknown rotations
- unknown scales
- unknown levels of shearing
- ...
4. One approach
Images → Normalization (manual labor) → Normalized images → Pattern Analysis
5. Another approach
6. Yet another approach
Images → Extract transformation-invariant features → Transformation-invariant data → Pattern Analysis
- Difficult to work with
- May hide useful features
7. Our approach
Images → Joint Normalization and Pattern Analysis
8. What transforming an image does in the vector space of pixel intensities
- A continuous transformation moves an image along a continuous curve in pixel space
- Our clustering algorithm should assign images near this nonlinear manifold to the same cluster
9. Tractable approaches to modeling the transformation manifold
- Linear approximation: good locally, bad globally
- Finite-set approximation: good globally, bad locally
10. Related work
- Generative models
  - Local invariance: PCA (Turk, Moghaddam, Pentland 96); factor analysis (Hinton, Revow, Dayan, Ghahramani 96; Frey, Colmenarez, Huang 98)
  - Layered motion (Black, Jepson, Wang, Adelson, Weiss 93-98)
  - Learning discrete representations of generative manifolds: generative topographic maps (Bishop, Svensen, Williams 98)
- Discriminative models
  - Local invariance: tangent distance, tangent prop (Simard, Le Cun, Denker, Victorri 92-93)
  - Global invariance: convolutional neural networks (Le Cun, Bottou, Bengio, Haffner 98)
11. Generative density modeling
- The goal is to find a probability model that
  - reflects the structure we want to extract
  - can randomly generate plausible images
  - represents the data using parameters
- ML estimation is used to find the parameters
- We can use class-conditional likelihoods p(image|class) for recognition, detection, ...
12. Mixture of Gaussians
The probability that an image comes from cluster c ∈ {1, 2, ...} is P(c) = π_c
13. Mixture of Gaussians
The probability of pixel intensities z given that the image is from cluster c is p(z|c) = N(z; μ_c, Φ_c)
14. Mixture of Gaussians
- Parameters π_c, μ_c and Φ_c represent the data
- For input z, the cluster responsibilities are P(c|z) = p(z|c)P(c) / Σ_c' p(z|c')P(c')
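The responsibility formula above can be sketched in NumPy (a sketch, not the authors' MATLAB scripts; diagonal covariances Φ_c are assumed, and the toy means, variances and priors below are illustrative):

```python
import numpy as np

def log_gaussian_diag(z, mu, phi):
    # log N(z; mu, diag(phi)) for a diagonal-covariance Gaussian
    return -0.5 * np.sum(np.log(2 * np.pi * phi) + (z - mu) ** 2 / phi)

def responsibilities(z, pis, mus, phis):
    # P(c|z) = p(z|c) P(c) / sum_c' p(z|c') P(c'), computed in the log domain
    logp = np.array([np.log(pis[c]) + log_gaussian_diag(z, mus[c], phis[c])
                     for c in range(len(pis))])
    logp -= logp.max()          # subtract max for numerical stability
    p = np.exp(logp)
    return p / p.sum()

# toy example: two clusters of 4-pixel "images" (illustrative numbers)
mus = [np.zeros(4), np.ones(4)]
phis = [np.full(4, 0.1), np.full(4, 0.1)]
r = responsibilities(np.array([0.9, 1.1, 1.0, 0.8]), [0.5, 0.5], mus, phis)
# r is strongly dominated by the second cluster
```

Working in the log domain avoids underflow when the vectors have as many dimensions as pixels in a real image.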
15. Example: Hand-crafted model
[Graphical model: cluster c with prior P(c) = π_c generates image z from p(z|c) = N(z; μ_c, Φ_c)]
16-21. Example: Simulation
Sampling from the model: first draw a cluster c from P(c) = π_c (c = 1 in one run, c = 2 in another), then draw an image z from p(z|c) = N(z; μ_c, Φ_c)
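The two-stage sampling in the simulation slides can be sketched as follows (a hedged sketch; the cluster means and variances are made up for illustration):

```python
import numpy as np

def sample_mog(pis, mus, phis, rng):
    # draw a cluster c from P(c) = pi_c, then an image z from N(z; mu_c, diag(phi_c))
    c = rng.choice(len(pis), p=pis)
    z = mus[c] + np.sqrt(phis[c]) * rng.standard_normal(mus[c].shape)
    return c, z

rng = np.random.default_rng(0)
mus = [np.zeros(4), 10.0 * np.ones(4)]      # two well-separated toy clusters
phis = [np.full(4, 0.01), np.full(4, 0.01)]
c, z = sample_mog([0.5, 0.5], mus, phis, rng)
# z lies near the mean of whichever cluster was drawn
```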
22. Example: Inference
[Images from the data set are presented as z]
23. Example: Inference
For one image z from the data set: P(c=1|z) = 0.99, P(c=2|z) = 0.01
24. Example: Inference
For another image z: P(c=1|z) = 0.02, P(c=2|z) = 0.98
25. Example: Learning - E step
Initial parameters: π_1 = 0.5, μ_1, Φ_1 and π_2 = 0.5, μ_2, Φ_2
26-29. Example: Learning - E step
For each image z in the data set, compute the responsibilities, e.g.:
- P(c=1|z) = 0.52, P(c=2|z) = 0.48
- P(c=1|z) = 0.51, P(c=2|z) = 0.49
- P(c=1|z) = 0.48, P(c=2|z) = 0.52
- P(c=1|z) = 0.43, P(c=2|z) = 0.57
30-31. Example: Learning - M step
- Set μ_1 to the P(c=1|z)-weighted average of z
- Set μ_2 to the P(c=2|z)-weighted average of z
32-33. Example: Learning - M step
- Set Φ_1 to the P(c=1|z)-weighted average of diag((z-μ_1)(z-μ_1)^T)
- Set Φ_2 to the P(c=2|z)-weighted average of diag((z-μ_2)(z-μ_2)^T)
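The weighted averages of the M step can be sketched in NumPy (a sketch; Z stacks the images as rows, R holds the responsibilities computed in the E step, and the numbers are illustrative):

```python
import numpy as np

def m_step(Z, R):
    # Z: (n_images, n_pixels) data; R: (n_images, n_clusters) responsibilities P(c|z)
    Nc = R.sum(axis=0)                   # effective number of images per cluster
    pis = Nc / R.shape[0]                # pi_c = average of P(c|z)
    mus = (R.T @ Z) / Nc[:, None]        # mu_c = P(c|z)-weighted average of z
    # phi_c = P(c|z)-weighted average of diag((z - mu_c)(z - mu_c)^T)
    phis = np.stack([(R[:, c, None] * (Z - mus[c]) ** 2).sum(axis=0) / Nc[c]
                     for c in range(R.shape[1])])
    return pis, mus, phis

# toy example with hard responsibilities (illustrative numbers)
Z = np.array([[0.0, 0.0], [0.2, -0.2], [1.0, 1.0], [0.8, 1.2]])
R = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
pis, mus, phis = m_step(Z, R)
```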
34. Example: After iterating EM...
[Learned model: cluster c generates image z]
35. Adding transformation as a discrete latent variable
- Say there are N pixels
- We assume we are given a set of sparse N × N transformation-generating matrices G_1, ..., G_l, ..., G_L
- These generate transformed points G_l z from a latent point z
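A concrete instance of such a generating matrix is a shift: each row of G has a single 1, so G is sparse and G z is the shifted image (a sketch with circular boundary handling, which the slides do not specify):

```python
import numpy as np

def shift_matrix(n_rows, n_cols, dy, dx):
    # N x N matrix (N = n_rows * n_cols) that circularly shifts a
    # row-major vectorized image by (dy, dx); one 1 per row, so it is sparse
    N = n_rows * n_cols
    G = np.zeros((N, N))
    for r in range(n_rows):
        for c in range(n_cols):
            src = r * n_cols + c
            dst = ((r + dy) % n_rows) * n_cols + (c + dx) % n_cols
            G[dst, src] = 1.0
    return G

z = np.arange(6.0)              # a 2 x 3 image [[0,1,2],[3,4,5]], vectorized
G = shift_matrix(2, 3, 0, 1)    # shift right by one pixel
x = G @ z                       # [[2,0,1],[5,3,4]] vectorized
```

In practice one would store G in a sparse format; the dense matrix here just makes the structure visible.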
36. Transformed Mixture of Gaussians
The probability that the image comes from cluster c ∈ {1, 2, ...} is P(c) = π_c
37. Transformed Mixture of Gaussians
The probability of the latent image z for cluster c is p(z|c) = N(z; μ_c, Φ_c)
38. Transformed Mixture of Gaussians
The probability of transformation l ∈ {1, 2, ...} is P(l) = ρ_l
39. Transformed Mixture of Gaussians
The probability of the observed image x is p(x|z,l) = N(x; G_l z, Ψ)
40. Transformed Mixture of Gaussians
- ρ_l, π_c, μ_c and Φ_c represent the data
- The cluster/transformation responsibilities P(c,l|x) are quite easy to compute
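Marginalizing out z gives p(x|c,l) = N(x; G_l μ_c, G_l Φ_c G_l^T + Ψ), so the joint responsibilities can be sketched directly (a sketch; the 2-pixel toy model and the pixel-swap "transformation" are stand-ins for real shifts):

```python
import numpy as np

def log_gauss(x, mu, cov):
    # log N(x; mu, cov) for a full-covariance Gaussian
    d = x - mu
    _, logdet = np.linalg.slogdet(2 * np.pi * cov)
    return -0.5 * (logdet + d @ np.linalg.solve(cov, d))

def joint_posterior(x, pis, rhos, mus, phis, Gs, psi):
    # P(c,l|x) proportional to pi_c rho_l N(x; G_l mu_c, G_l diag(phi_c) G_l^T + diag(psi))
    C, L = len(pis), len(Gs)
    logp = np.empty((C, L))
    for c in range(C):
        for l in range(L):
            cov = Gs[l] @ np.diag(phis[c]) @ Gs[l].T + np.diag(psi)
            logp[c, l] = (np.log(pis[c]) + np.log(rhos[l])
                          + log_gauss(x, Gs[l] @ mus[c], cov))
    p = np.exp(logp - logp.max())
    return p / p.sum()

# toy model: 2-pixel images, identity vs. pixel-swap transformation
Gs = [np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]])]
mus = [np.array([1.0, 0.0]), np.array([0.0, 0.0])]
phis = [np.full(2, 0.01)] * 2
P = joint_posterior(np.array([0.0, 1.0]), [0.5, 0.5], [0.5, 0.5],
                    mus, phis, Gs, np.full(2, 0.01))
# the swapped version of cluster 1's mean explains x best, so P[0, 1] dominates
```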
41. Example: Hand-crafted model
G_1 = shift left and up, G_2 = I, G_3 = shift right and up
π_1 = 0.6, π_2 = 0.4
l ∈ {1, 2, 3}, ρ_1 = ρ_2 = ρ_3 = 0.33
42-51. Example: Simulation
Sampling from the model (G_1 = shift left and up, G_2 = I, G_3 = shift right and up): draw a cluster c (c = 1 in one run, c = 2 in another), draw a latent image z from p(z|c), draw a transformation l (here l = 1, then l = 3), and draw the observed image x from N(x; G_l z, Ψ)
52. ML estimation of a Transformed Mixture of Gaussians using EM
- E step: compute P(l|x), P(c|x) and p(z|c,x) for each x in the data
- M step: set
  - π_c = average of P(c|x)
  - ρ_l = average of P(l|x)
  - μ_c = average mean of p(z|c,x)
  - Φ_c = average variance of p(z|c,x)
  - Ψ = average variance of x - G_l z given x
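A much-simplified EM iteration can be sketched under stated assumptions: the G_l are permutation matrices (so G_l^{-1} = G_l^T), Φ_c and Ψ are folded into one fixed isotropic variance, and only π, ρ and μ are updated. This is a toy sketch of the algorithm's shape, not the authors' full update equations:

```python
import numpy as np

def em_iteration(X, pis, rhos, mus, Gs, var):
    # One EM sweep for a simplified TMG (assumptions: permutation G_l,
    # fixed isotropic variance `var`, so p(z|c,l,x) concentrates at G_l^T x)
    n, C, L = X.shape[0], len(pis), len(Gs)
    R = np.empty((n, C, L))
    for i in range(n):
        for c in range(C):
            for l in range(L):
                d = X[i] - Gs[l] @ mus[c]
                R[i, c, l] = np.log(pis[c]) + np.log(rhos[l]) - 0.5 * d @ d / var
        R[i] = np.exp(R[i] - R[i].max())   # normalize P(c,l|x) per image
        R[i] /= R[i].sum()
    pis = R.sum(axis=(0, 2)) / n           # pi_c  = average of P(c|x)
    rhos = R.sum(axis=(0, 1)) / n          # rho_l = average of P(l|x)
    # mu_c = responsibility-weighted average of back-transformed images G_l^T x
    mus = np.stack([
        sum(R[i, c, l] * (Gs[l].T @ X[i]) for i in range(n) for l in range(L))
        / R[:, c, :].sum()
        for c in range(C)])
    return pis, rhos, mus

# toy data: one latent pattern seen under two transformations (identity, swap)
Gs = [np.eye(2), np.array([[0.0, 1.0], [1.0, 0.0]])]
X = np.array([[1.0, 0.0], [0.0, 1.0]])
pis, rhos, mus = em_iteration(X, [1.0], [0.5, 0.5], np.array([[0.8, 0.2]]), Gs, 0.1)
# one iteration sharpens the single cluster mean toward the pattern [1, 0]
```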
53. A Tough Toy Problem
- 4 different shapes
- 25 possible locations
- cluttered background
- fixed distraction
- 100 clusters
- 200 training cases
54. Mixture of Gaussians: 20 iterations of EM; mean and first 5 principal components shown
Transformed Mixture of Gaussians: 5 horizontal shifts × 5 vertical shifts, 20 iterations of EM
55. Face Clustering
- Examples from 400 outdoor images of 2 people (44 × 28 pixels)
56. Mixture of Gaussians
15 iterations of EM (MATLAB takes 1 minute)
Cluster means for c = 1, 2, 3, 4
57. Transformed mixture of Gaussians
- 11 horizontal shifts × 11 vertical shifts
- 4 clusters
- Each cluster has 1 mean and 1 variance for each latent pixel
- 1 variance for each observed pixel
- Training: 15 iterations of EM (MATLAB script takes 10 sec/image)
58. Transformed mixture of Gaussians
Initialization. Cluster means for c = 1, 2, 3, 4
59-75. Transformed mixture of Gaussians
Cluster means for c = 1, 2, 3, 4 after 1, 2, ..., 15, 20 and 30 iterations of EM
76. Mixture of Gaussians
30 iterations of EM. Cluster means for c = 1, 2, 3, 4
77. Modeling Written Digits
78. A TMG that Captures Writing Angle
[Figure: learned means arranged as clusters × transformations]
- P(l|x) identifies the writing angle in image x
79. Wrap-up
- MATLAB scripts available at www.cs.uwaterloo.ca/frey
- Other domains: audio, bioinformatics, ...
- Other latent image models p(z):
  - factor analysis (prob PCA) (ICCV99)
  - mixtures of factor analyzers (NIPS99)
  - time series (CVPR00)
- Automatic video clustering
- Fast variational inference and learning