Title: From Wavelet Sparse Coding
1From Wavelet Sparse Coding To Visual Pattern Modeling
Ying Nian Wu, UCLA Department of Statistics
Joint work with Zhangzhang Si, Haifeng Gong, and Song-Chun Zhu
2- Outline
- Wavelet sparse coding
- Active basis model
- Experiments
Reproducibility: http://www.stat.ucla.edu/ywu/ActiveBasis (Matlab/C code, data)
3Gabor wavelets (Daugman, 1985; Olshausen and Field, 1996)
Localized sine and cosine waves, propagate along
shorter axis
Model for simple cells in primary visual cortex
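As an illustration of the "localized sine and cosine waves" described on this slide, here is a minimal Python sketch of a Gabor pair. It is not the Matlab/C code from the project page. The function name `gabor_pair`, the parameter names, and the default values are all assumptions chosen for the example.

```python
import numpy as np

def gabor_pair(size=17, wavelength=8.0, theta=0.0,
               sigma_along=3.0, sigma_across=6.0):
    """Return (cosine, sine) Gabor filters on a size x size grid.

    The wave propagates along the direction theta, which is the shorter
    axis of the elliptical Gaussian envelope (sigma_along < sigma_across).
    All defaults are illustrative assumptions.
    """
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    # Rotate coordinates: u runs along the propagation direction, v across it.
    u = x * np.cos(theta) + y * np.sin(theta)
    v = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-0.5 * ((u / sigma_along) ** 2 + (v / sigma_across) ** 2))
    phase = 2.0 * np.pi * u / wavelength
    cos_part = envelope * np.cos(phase)
    sin_part = envelope * np.sin(phase)
    # Remove the mean of the even filter so it ignores constant intensity,
    # then scale both filters to unit L2 norm.
    cos_part -= cos_part.mean()
    cos_part /= np.linalg.norm(cos_part)
    sin_part /= np.linalg.norm(sin_part)
    return cos_part, sin_part

cos_filter, sin_filter = gabor_pair(theta=np.pi / 4)
```

The even (cosine) and odd (sine) filters share one envelope, so the pair responds to a local wave regardless of its phase.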
4Gabor wavelets
Operation: local Fourier transform
Local spectrum: local maxima → edge points
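The idea that local maxima of the local spectrum mark edge points can be sketched in one dimension. The example below is a simplified illustration, not the authors' implementation. The function `local_energy_1d`, its parameters, and the synthetic step signal are assumptions.

```python
import numpy as np

def local_energy_1d(signal, wavelength=8.0, sigma=3.0, half=9):
    """Squared magnitude of the local Fourier coefficient at one frequency."""
    t = np.arange(-half, half + 1, dtype=float)
    envelope = np.exp(-0.5 * (t / sigma) ** 2)
    even = envelope * np.cos(2.0 * np.pi * t / wavelength)
    odd = envelope * np.sin(2.0 * np.pi * t / wavelength)
    even -= even.mean()              # no response to constant intensity
    even /= np.linalg.norm(even)
    odd /= np.linalg.norm(odd)
    # Convolving with the flipped filter is correlation; mode="same" keeps
    # the output aligned with the input samples.
    r_even = np.convolve(signal, even[::-1], mode="same")
    r_odd = np.convolve(signal, odd[::-1], mode="same")
    return r_even ** 2 + r_odd ** 2

# Synthetic step edge between samples 39 and 40.
signal = np.concatenate([np.zeros(40), np.ones(40)])
energy = local_energy_1d(signal)
# Ignore the borders, where zero padding creates artificial edges.
edge_estimate = 10 + int(np.argmax(energy[10:70]))
```

In flat regions the energy is essentially zero, and it peaks at the step, which is the sense in which a local maximum indicates an edge point.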
5Representation: sparse coding (Olshausen and Field, 1996)
raw intensities → strokes
6Matching pursuit (Mallat and Zhang, 1993)
Explaining-away inhibition in step 2
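The slide's step listing did not survive in this transcript, so the following is a generic sketch of matching pursuit. Treating "step 1" as selection and "step 2" as residual update is an assumption. The function name `matching_pursuit` and the random dictionary are illustrative only.

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy sparse coding of a signal.

    dictionary: (n_samples, n_atoms) array whose columns have unit L2 norm.
    Returns the selected atom indices, their coefficients, and the residual.
    """
    residual = np.asarray(signal, dtype=float).copy()
    selected, coeffs = [], []
    for _ in range(n_iter):
        # Selection: pick the atom with the largest absolute response.
        responses = dictionary.T @ residual
        best = int(np.argmax(np.abs(responses)))
        c = float(responses[best])
        # Explaining away: subtract the chosen atom's contribution, which
        # lowers the responses of atoms that overlap with it.
        residual -= c * dictionary[:, best]
        selected.append(best)
        coeffs.append(c)
    return selected, coeffs, residual

rng = np.random.default_rng(0)
dictionary = rng.normal(size=(16, 40))           # overcomplete: 40 atoms in 16 dims
dictionary /= np.linalg.norm(dictionary, axis=0)
signal = rng.normal(size=16)
selected, coeffs, residual = matching_pursuit(signal, dictionary, n_iter=5)
```

Because each atom has unit norm, every iteration moves exactly `c ** 2` of energy from the residual into the coded part.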
7Shared matching pursuit
Multiple images Fixed scale
8Common template
Each basis element is represented by a bar of the
same length, orientation, and location
9Shape deformations
Fixed scale
Category specific deformable template
Active basis
Image specific deformed template
10Shared matching pursuit
Local maximization in step 1: complex cells (Riesenhuber and Poggio, 1999)
11Active basis
12Active basis
Two different scales
13Active basis
Putting multiple scales together
17Orthogonalize
orthogonal
Non-overlapping in spatial or frequency domain
(in practice, allow small overlap)
18Statistical modeling
orthogonal
Strong edges in background
Conditional independence of coefficients
Exponential family model
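The formulas on this slide are missing from the transcript. One way to write a model with these properties is given below. This form is an assumption, not a reconstruction of the slide. The symbols $q$ (background density), $\lambda_i$ (per-element weights), $h$ (a transformation of the response), and $Z$ (the normalizing constant) are all introduced here for illustration.

```latex
\[
\frac{p(I \mid B_1,\dots,B_n)}{q(I)}
  = \prod_{i=1}^{n} \frac{\exp\{\lambda_i\, h(r_i)\}}{Z(\lambda_i)},
\qquad r_i = |\langle I, B_i \rangle|^2,
\]
\[
Z(\lambda) = \mathrm{E}_q\!\left[\exp\{\lambda\, h(r)\}\right],
\qquad
\log \frac{p(I \mid B_1,\dots,B_n)}{q(I)}
  = \sum_{i=1}^{n} \bigl[\lambda_i\, h(r_i) - \log Z(\lambda_i)\bigr].
\]
```

Under this form, the product over $i$ expresses the conditional independence of the coefficients, and the log-likelihood ratio on the last line plays the role of a template matching score.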
19Shared sketch pursuit
Template matching score
20Decreasing order in log-likelihood ratio
(template matching score)
24Detection by sum-max maps
26SUM-MAX maps (bottom-up/top-down)
SUM2 operator: what cell?
Local maximization: complex cells (Riesenhuber and Poggio, 1999)
Gabor wavelets: simple cells (Olshausen and Field, 1996)
27Template matching by SUM-MAX
SUM2 map at optimal resolution
Multiple resolutions
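The bottom-up pass of SUM-MAX template matching can be sketched in one dimension at a single resolution. This is an illustrative sketch under assumptions. SUM1 is taken to be precomputed Gabor energies, and each template element is a hypothetical tuple `(offset, orientation, weight, log_z)`. The function names `max1` and `sum2` follow the slide's map names but are not taken from the released code.

```python
import numpy as np

def max1(sum1, max_shift):
    """MAX1: per orientation and position, the max of SUM1 within +/- max_shift."""
    n_pos = sum1.shape[1]
    out = np.empty_like(sum1)
    for x in range(n_pos):
        lo, hi = max(0, x - max_shift), min(n_pos, x + max_shift + 1)
        out[:, x] = sum1[:, lo:hi].max(axis=1)
    return out

def sum2(max1_map, template):
    """SUM2: template matching score at every candidate template position."""
    n_pos = max1_map.shape[1]
    score = np.full(n_pos, -np.inf)   # positions where the template does not fit
    for x in range(n_pos):
        total, inside = 0.0, True
        for offset, orient, weight, log_z in template:
            p = x + offset
            if not 0 <= p < n_pos:
                inside = False
                break
            total += weight * max1_map[orient, p] - log_z
        if inside:
            score[x] = total
    return score

# SUM1 maps for 2 orientations over 30 positions (assumed precomputed energies).
sum1 = np.zeros((2, 30))
sum1[0, 13] = 4.0                     # element 0, shifted by +1 from position 12
sum1[1, 16] = 3.0                     # element 1, shifted by -1 from position 17
template = [(0, 0, 1.0, 0.1), (5, 1, 1.0, 0.1)]
score = sum2(max1(sum1, max_shift=1), template)
detected = int(np.argmax(score))
```

The top-down pass would then revisit the MAX1 windows at the detected position to recover which shifted element produced each maximum. For multiple resolutions, the same computation would be repeated per resolution and the best SUM2 score kept.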
31Geometric transformation
Scaling, rotation, change of aspect ratio
33Classification
Freund and Schapire, 1995; Viola and Jones, 2004
37Learning from non-aligned training images
38Learning from non-aligned training images
Given the bounding box of one training image
39No bounding box given for any training image
40Weizmann horse images
50Learning part templates or visual words
52Learning moving template from video sequence
PETS data set
53EM/K-means clustering
54EM/K-means clustering
55EM/K-means clustering
59Learning local representatives
63MNIST data set
64Including Weizmann data set and INRIA data set
65Active bases as part-templates
Split bike template to detect and sketch tandem
bike
66Is there a tandem bike here?
Is there a wheel nearby?
Is there a wheel here?
Is there an edge nearby?
Is there an edge here?
Soft scoring instead of hard decision
67Where to split the bike template?
68Large deformations
Parts to account for large deformations
69Sparse coding
- Data = sparse coding + residual
- Shape data = template + residual → active basis
- Image data = image primitives + residual → Olshausen-Field
- Abstraction and generalization (residual not 0)
- Assign geometric attributes to image primitives
- Shape modeling without preprocessed shape data
- What are the shape primitives or shape Gabors for generic images? Example: wheels