Title: SpaRClus: Spatial Relationship Pattern-Based Hierarchical Clustering
1SpaRClus: Spatial Relationship Pattern-Based Hierarchical Clustering
Sangkyum Kim, Xin Jin, Jiawei Han
Dept. of Computer Science, Univ. of Illinois at Urbana-Champaign
- SDM08 (April 2008, Atlanta, GA)
2Outline
- Background
- Preliminary
- SpIBag
- SpaRClus
- Experimental Results
- Conclusion
3Text Annotation
4Color, Shape, Texture
Enough?
5Bag of Words
- Image: A Bag of Words
- Word
- Feature Detection
- Feature Representation
- Codebook Generation
- Apply Document Clustering Algorithms
Enough?
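The bag-of-words steps above can be sketched roughly as follows. This is only an illustration: it assumes features are already detected and represented as 2-D descriptors, and it uses a tiny hand-written codebook (in practice a codebook is usually learned, for example by clustering descriptors). The names `nearest_codeword`, `bag_of_words`, `codebook`, and `image_descriptors` are hypothetical and do not come from the slides.

```python
from collections import Counter
import math

def nearest_codeword(descriptor, codebook):
    """Return the index of the codeword closest to the descriptor (Euclidean)."""
    return min(range(len(codebook)),
               key=lambda i: math.dist(descriptor, codebook[i]))

def bag_of_words(descriptors, codebook):
    """Quantize each descriptor to a visual word and count the words."""
    counts = Counter(nearest_codeword(d, codebook) for d in descriptors)
    return [counts.get(i, 0) for i in range(len(codebook))]

# Hypothetical codebook of three visual words and one image's descriptors.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
image_descriptors = [(0.1, -0.2), (0.9, 1.2), (4.8, 5.1), (5.2, 4.9)]

histogram = bag_of_words(image_descriptors, codebook)
print(histogram)
```

Each image becomes a word-count vector, so any document clustering algorithm can then be applied to the vectors. Note that the positions of the words in the image are discarded.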
6Frequent Pattern Mining
- Frequent Pattern: a pattern (a set of items, subsequences, substructures, etc.) that occurs frequently in a data set
- Frequent Item Set Mining Algorithms
  - Apriori, FP-growth, ...
- Frequent Item Sets
  - Minimum Support: 2
  - {Apple}, {Beer}, {Coffee}, {Diaper}
  - {Coffee, Diaper}, {Beer, Diaper}, {Apple, Beer}, {Apple, Diaper}
  - {Apple, Beer, Diaper}
7Apriori
Apriori Property: All nonempty subsets of a frequent item set must also be frequent.
Frequent Item Sets
- 1st Scan: {A}, {B}, {C}, {D}
- 2nd Scan: {C, D}, {B, D}, {A, B}, {A, D}
- 3rd Scan: {A, B, D}
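A minimal sketch of the level-wise Apriori search is given below. The four transactions are an assumption: they are reconstructed from the ID lists on the Hierarchical Graph slide (10: C, D; 20: A, B, D; 30: A, B, C, D; 40: A, B). The function name `apriori` and its parameters are illustrative, not taken from the slides.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset: support} for every frequent item set."""
    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items]   # candidate 1-item sets
    frequent = {}
    k = 1
    while current:
        # One scan of the data per level: count and filter the candidates.
        level = {c: support(c) for c in current}
        level = {c: s for c, s in level.items() if s >= min_support}
        frequent.update(level)
        # Join step: combine frequent k-sets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (Apriori property): all (k-1)-subsets must be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

# Assumed transactions, reconstructed from the Hierarchical Graph slide.
transactions = [
    frozenset("CD"),    # transaction 10
    frozenset("ABD"),   # transaction 20
    frozenset("ABCD"),  # transaction 30
    frozenset("AB"),    # transaction 40
]
frequent = apriori(transactions, min_support=2)
for itemset, count in sorted(frequent.items(),
                             key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With minimum support 2, the candidates {A, C, D} and {B, C, D} are pruned before the third scan because {A, C} and {B, C} are infrequent.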
8Hierarchical Graph
Root: 10, 20, 30, 40
Level 1:
- A: 20, 30, 40
- B: 20, 30, 40
- C: 10, 30
- D: 10, 20, 30
Level 2:
- A,B: 20, 30, 40
- A,D: 20, 30
- B,D: 20, 30
- C,D: 10, 30
Level 3:
- A,B,D: 20, 30
9From Frequent Itemsets to Semantically Meaningful Visual Patterns
KDD 2007: Junsong Yuan, Ying Wu, Ming Yang (EECS, Northwestern Univ.)
10Frequent Pattern Mining in Images
[Figure: example images with detected items labeled A through E]
- Frequent Item Bags
  - Minimum Support: 2
  - {A}, {B}, {C}, {D}
  - {C, D}, {B, D}, {A, B}, {A, D}
  - {A, B, D}
Enough?
11Question
- How can we do image clustering that persists over scaling, translation, and rotation transformations?
- Solution
  - Bag of Items
  - Spatial Information
  - But HOW???
(a) original pattern
(b) rotated pattern
(c) scaled pattern
(d) translated pattern
12Spatial Pattern
- Define a 3-pattern p as
  - $p = (\langle a_1, a_2, a_3 \rangle, \theta, r)$, where
  - $r = d(c, a_3) / d(a_1, a_2)$
- Define a (spatial) pattern as
  - a set of 3-patterns
[Figure: example spatial patterns over items A, B, C, and D]
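The sketch below illustrates why a 3-pattern of this form can persist over scaling, rotation, and translation. It rests on assumptions the slide does not state: a1, a2, a3 are taken as 2-D item positions, c as the midpoint of a1 and a2, and the angle component as the signed angle from the vector a1→a2 to the vector c→a3. The function names `three_pattern` and `transform` are hypothetical.

```python
import math

def three_pattern(a1, a2, a3):
    """Compute (theta, r) for three 2-D item positions.

    Assumptions (not from the slide): c is the midpoint of a1 and a2,
    and theta is the signed angle from vector a1->a2 to vector c->a3.
    """
    c = ((a1[0] + a2[0]) / 2, (a1[1] + a2[1]) / 2)
    u = (a2[0] - a1[0], a2[1] - a1[1])
    v = (a3[0] - c[0], a3[1] - c[1])
    cross = u[0] * v[1] - u[1] * v[0]
    dot = u[0] * v[0] + u[1] * v[1]
    theta = math.atan2(cross, dot)
    r = math.dist(c, a3) / math.dist(a1, a2)   # r = d(c, a3) / d(a1, a2)
    return theta, r

def transform(p, scale, angle, shift):
    """Scale, rotate (radians), then translate a 2-D point."""
    x = scale * (p[0] * math.cos(angle) - p[1] * math.sin(angle)) + shift[0]
    y = scale * (p[0] * math.sin(angle) + p[1] * math.cos(angle)) + shift[1]
    return (x, y)

points = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
moved = [transform(p, 2.5, 0.7, (4.0, -1.0)) for p in points]

original = three_pattern(*points)
after = three_pattern(*moved)
print(original, after)   # the two (theta, r) pairs agree up to rounding
```

Because r is a ratio of distances and the angle is measured between two vectors of the same figure, both values are unchanged when the whole figure is scaled, rotated, or shifted.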
13 3-pattern
- Basic unit of a spatial pattern
- Need an approximation to group 3-patterns
- Group similar 3-patterns
- Need to have same item bags
- $\theta$ and $r$ should be within given thresholds
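The grouping rule above could be checked with a predicate like the one below. It is a sketch under assumptions: each 3-pattern is represented as a tuple of (items, angle, ratio), item bags are compared as multisets, and the threshold names `angle_tol` and `ratio_tol` and their values are hypothetical.

```python
def similar(p, q, angle_tol, ratio_tol):
    """Return True if two 3-patterns may be grouped together.

    Each pattern is (items, angle, ratio). The tolerances are hypothetical
    stand-ins for the "given thresholds" mentioned on the slide.
    """
    items_p, angle_p, ratio_p = p
    items_q, angle_q, ratio_q = q
    same_bag = sorted(items_p) == sorted(items_q)   # same item bag (multiset)
    return (same_bag
            and abs(angle_p - angle_q) <= angle_tol
            and abs(ratio_p - ratio_q) <= ratio_tol)

p = (("A", "B", "C"), 1.57, 1.50)
q = (("A", "B", "C"), 1.60, 1.45)   # same items, slightly different geometry
s = (("A", "B", "D"), 1.57, 1.50)   # same geometry, different items

print(similar(p, q, angle_tol=0.1, ratio_tol=0.1))
print(similar(p, s, angle_tol=0.1, ratio_tol=0.1))
```

Larger tolerances group more 3-patterns together, at the cost of treating less similar layouts as the same pattern.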
14SpIBag
- SpIBag (Spatial Item Bag Mining)
- Find frequent spatial patterns
- Each image is made up of 3-patterns
- Apply Apriori algorithm
- Special considerations on Joining step
15Hierarchical Graph
Root: I1, I2, I3, I4
Level 1:
- p1: I2, I3, I4
- p2: I2, I3, I4
- p3: I1, I3
- p4: I1, I2, I3
Level 2:
- p1,p2: I2, I3, I4
- p1,p4: I2, I3
- p2,p4: I2, I3
- p3,p4: I1, I3
16Pruning
- Define an entropy function E to measure the tightness of a cluster C
- Do not join further with a cluster C whose entropy is small
17Hierarchical Graph
Root: I1, I2, I3, I4
- p1: I2, I3, I4; E(p1) = 0.2566
- p2: I2, I3, I4; E(p2) = 0.2566
- p3: I1, I3; E(p3) = 0.2158
- p4: I1, I2, I3; E(p4) = 0.5632
Given entropy threshold 0.5, we get 4 leaves of the graph.
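The pruning decision in this example can be traced with a few lines of code. This is one possible reading of the rule, not the authors' implementation: it assumes "small" means strictly below the threshold, and that a join needs at least two clusters that are still allowed to join. The entropy function itself is not defined on the slides, so the values are simply copied from the example.

```python
# Entropy values copied from the example; the threshold is the one given.
entropy = {"p1": 0.2566, "p2": 0.2566, "p3": 0.2158, "p4": 0.5632}
threshold = 0.5

# Assumption: a cluster with entropy below the threshold is tight enough
# and is not joined further.
stopped = sorted(p for p, e in entropy.items() if e < threshold)
joinable = sorted(p for p, e in entropy.items() if e >= threshold)

# Assumption: a join needs two joinable clusters. Only p4 remains joinable,
# so no level-2 node is created and all four patterns stay leaves.
can_join = len(joinable) >= 2
leaves = sorted(entropy) if not can_join else stopped

print(stopped, joinable, leaves)
```

Under this reading, p1, p2, and p3 stop because their entropy is below 0.5, and p4 has no partner left, which matches the 4 leaves reported for the example.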
18SpaRClus
- SpaRClus (Spatial Relationship Pattern-Based Hierarchical Clustering)
- Apply SpIBag with pruning
- Merge leaves of the graph
  - Use the same entropy function to see the tightness of two clusters
- Make clusters disjoint
  - Use the score function of an image to a cluster
19SpaRClus
Leaves:
- p1: I2, I3, I4; E(p1) = 0.2566
- p2: I2, I3, I4; E(p2) = 0.2566
- p3: I1, I3; E(p3) = 0.2158
- p4: I1, I2, I3; E(p4) = 0.5632
Merge:
- p1, p2: I2, I3, I4
- p3: I1, I3
- p4: I1, I2, I3
Disjoint:
- I2, I3, I4
- I1
20Experiments
21Experiments (Kitchen Plan Images)
22Conclusion
- How can we do image clustering that persists over scaling, translation, and rotation transformations?
  - SpaRClus could solve this problem.
- Proposed a new definition of a spatial pattern in an image.
- Proposed score functions based on spatial patterns.
- Applied a frequent item bag mining algorithm.
- But
  - Improvements are needed to apply it to real images.
23Questions?