1
Part 1: Bag-of-words models
by Li Fei-Fei (UIUC)
2
Related works
  • Early bag-of-words models: mostly texture
    recognition
  • Cula et al. 2001; Leung et al. 2001; Schmid
    2001; Varma et al. 2002, 2003; Lazebnik et al.
    2003
  • Hierarchical Bayesian models for documents
    (pLSA, LDA, etc.)
  • Hofmann 1999; Blei et al. 2004; Teh et al. 2004
  • Object categorization
  • Dorko et al. 2004; Csurka et al. 2003; Sivic
    et al. 2005; Sudderth et al. 2005
  • Natural scene categorization
  • Fei-Fei et al. 2005

3
(No Transcript)
4
Analogy to documents
Of all the sensory impressions proceeding to the
brain, the visual experiences are the dominant
ones. Our perception of the world around us is
based essentially on the messages that reach the
brain from our eyes. For a long time it was
thought that the retinal image was transmitted
point by point to visual centers in the brain;
the cerebral cortex was a movie screen, so to
speak, upon which the image in the eye was
projected. Through the discoveries of Hubel and
Wiesel we now know that behind the origin of the
visual perception in the brain there is a
considerably more complicated course of events.
By following the visual impulses along their path
to the various cell layers of the optical cortex,
Hubel and Wiesel have been able to demonstrate
that the message about the image falling on the
retina undergoes a step-wise analysis in a system
of nerve cells stored in columns. In this system
each cell has its specific function and is
responsible for a specific detail in the pattern
of the retinal image.
5
(No Transcript)
6
(No Transcript)
7
Representation
(Diagram of the three steps: 1. feature detection
and representation; 2. codewords dictionary
formation; 3. image representation)
8
1. Feature detection and representation
9
1. Feature detection and representation
  • Regular grid
  • Vogel et al. 2003
  • Fei-Fei et al. 2005

10
1. Feature detection and representation
  • Regular grid
  • Vogel et al. 2003
  • Fei-Fei et al. 2005
  • Interest point detector
  • Csurka et al. 2004
  • Fei-Fei et al. 2005
  • Sivic et al. 2005

11
1. Feature detection and representation
  • Regular grid
  • Vogel et al. 2003
  • Fei-Fei et al. 2005
  • Interest point detector
  • Csurka et al. 2004
  • Fei-Fei et al. 2005
  • Sivic et al. 2005
  • Other methods
  • Random sampling (Ullman et al. 2002)
  • Segmentation based patches (Barnard et al. 2003)

12
1. Feature detection and representation
Detect patches (Mikolajczyk and Schmid '02; Matas
et al. '02; Sivic et al. '03)
Normalize patch
Compute SIFT descriptor (Lowe '99)
Slide credit: Josef Sivic
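
As a sketch of this step, patch detection and SIFT
description can be done with OpenCV; the built-in
DoG detector stands in for the cited affine
detectors, and the image path is a placeholder:

    import cv2

    # Detect patches and compute a 128-D SIFT descriptor each.
    # OpenCV's DoG detector stands in for the cited detectors
    # (Mikolajczyk and Schmid '02; Matas et al. '02).
    img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(img, None)
    print(descriptors.shape)  # (number of patches, 128)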
13
1. Feature detection and representation
14
2. Codewords dictionary formation
15
2. Codewords dictionary formation
Vector quantization
Slide credit: Josef Sivic
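
Vector quantization here is typically k-means over
the descriptors of all training images; a minimal
sketch, where the dictionary size K = 300 and the
random stand-in descriptors are illustrative
assumptions:

    import numpy as np
    from sklearn.cluster import KMeans

    # In practice these come from step 1; random vectors
    # stand in so the sketch runs as-is.
    all_descriptors = np.random.rand(10000, 128).astype(np.float32)

    K = 300  # dictionary size: an illustrative choice
    kmeans = KMeans(n_clusters=K, n_init=10).fit(all_descriptors)
    codebook = kmeans.cluster_centers_  # (K, 128): one row per codeword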
16
2. Codewords dictionary formation
Fei-Fei et al. 2005
17
Image patch examples of codewords
Sivic et al. 2005
18
3. Image representation
(Histogram over codewords: x-axis codewords,
y-axis frequency)
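
Continuing the sketch above: each patch descriptor
is assigned to its nearest codeword, and the image
becomes a normalized K-bin histogram (the test
descriptors are again stand-ins):

    # Nearest codeword per patch, then count occurrences.
    image_descriptors = np.random.rand(500, 128).astype(np.float32)
    labels = kmeans.predict(image_descriptors)
    hist = np.bincount(labels, minlength=K).astype(float)
    hist /= hist.sum()  # normalized codeword frequencies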
19
Representation
(Diagram of the three steps: 1. feature detection
and representation; 2. codewords dictionary
formation; 3. image representation)
20
Learning and Recognition
category models (and/or) classifiers
21
2 case studies
  • Naïve Bayes classifier
  • Csurka et al. 2004
  • Hierarchical Bayesian text models (pLSA and LDA)
  • Background: Hofmann 2001; Blei et al. 2004
  • Object categorization: Sivic et al. 2005;
    Sudderth et al. 2005
  • Natural scene categorization: Fei-Fei et al.
    2005

22
First, some notation (see the code sketch below)
  • wn: each patch in an image, encoded as a unit
    basis vector over the codeword dictionary
  • wn = [0, 0, ..., 1, ..., 0, 0]^T
  • w: the collection of all N patches in an image
  • w = [w1, w2, ..., wN]
  • dj: the jth image in an image collection
  • c: category of the image
  • z: theme or topic of the patch

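In code, this notation maps onto arrays as follows
(a tiny sketch with made-up sizes K = 5, N = 3):

    import numpy as np

    K = 5  # dictionary size (made up for the sketch)

    def one_hot(index, size):
        v = np.zeros(size)
        v[index] = 1.0
        return v

    # w_n: one patch as a unit basis vector over K codewords;
    # w: an image as the collection of its N patches.
    w = np.stack([one_hot(i, K) for i in (2, 0, 2)])  # (N, K)
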
23
Case 1: the Naïve Bayes model
(Graphical model: a category node c generating the
patch nodes w, inside a plate of size N)
Csurka et al. 2004
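
The Naïve Bayes decision rule is c* = argmax_c
P(c) prod_n P(wn | c): patches are assumed
independent given the category. A minimal sketch;
the uniform prior and the Laplace smoothing are
assumptions, not details taken from Csurka et al.:

    import numpy as np

    def train_nb(counts_by_class, alpha=1.0):
        """counts_by_class: (C, K) codeword counts per category.
        Laplace smoothing (alpha) avoids zero probabilities."""
        counts = np.asarray(counts_by_class, dtype=float) + alpha
        log_pw_c = np.log(counts / counts.sum(axis=1, keepdims=True))
        log_prior = np.log(np.full(len(counts), 1.0 / len(counts)))
        return log_prior, log_pw_c

    def classify_nb(hist, log_prior, log_pw_c):
        """hist: K-vector of codeword counts of a test image."""
        return int(np.argmax(log_prior + log_pw_c @ hist))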
24
Csurka et al. 2004
25
Csurka et al. 2004
26
Case 2: Hierarchical Bayesian text models
Probabilistic Latent Semantic Analysis (pLSA)
Hofmann, 2001
Latent Dirichlet Allocation (LDA)
Blei et al., 2001
27
Case 2: Hierarchical Bayesian text models
Probabilistic Latent Semantic Analysis (pLSA)
Sivic et al. ICCV 2005
28
Case 2: Hierarchical Bayesian text models
Latent Dirichlet Allocation (LDA)
Fei-Fei et al. ICCV 2005
29
Case 2: the pLSA model
30
Case 2: the pLSA model
Slide credit: Josef Sivic
31
Case 2: Recognition using pLSA
Slide credit: Josef Sivic
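
For recognition, the learned P(w|z) is held fixed
and only the theme mixture P(z|d) of the unseen
image is fitted, by the same EM iterations; a
sketch, where p_w_z comes from the training step
on the next slide:

    import numpy as np

    def plsa_fold_in(counts, p_w_z, n_iters=50):
        """counts: (M,) codeword counts of the new image;
        p_w_z: (M, Z) learned P(w|z), held fixed."""
        Z = p_w_z.shape[1]
        p_z_d = np.full(Z, 1.0 / Z)
        for _ in range(n_iters):
            post = p_w_z * p_z_d                             # P(w|z) P(z|d)
            post /= post.sum(axis=1, keepdims=True) + 1e-12  # P(z|w,d)
            p_z_d = (counts[:, None] * post).sum(axis=0)
            p_z_d /= p_z_d.sum()
        return p_z_d  # theme distribution P(z|d_new)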
32
Case 2: Learning the pLSA parameters
Maximize the likelihood of the data using EM,
where n(wi, dj) is the observed count of codeword
i in image j; M = number of codewords, N = number
of images.
Slide credit: Josef Sivic
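
The EM updates alternate an E-step, P(z|w,d)
proportional to P(w|z) P(z|d), with an M-step that
re-estimates P(w|z) and P(z|d) from the expected
counts. A dense-matrix sketch of these standard
updates (initialization and iteration count are
arbitrary choices; this is not the course demo
code):

    import numpy as np

    def plsa_em(n_wd, Z, n_iters=100, seed=0):
        """n_wd: (M, N) counts of codeword i in image j.
        Returns P(w|z), shape (M, Z), and P(z|d), shape (Z, N)."""
        rng = np.random.default_rng(seed)
        M, N = n_wd.shape
        p_w_z = rng.random((M, Z)); p_w_z /= p_w_z.sum(axis=0)
        p_z_d = rng.random((Z, N)); p_z_d /= p_z_d.sum(axis=0)
        for _ in range(n_iters):
            # E-step: P(z | w_i, d_j), shape (M, N, Z)
            post = p_w_z[:, None, :] * p_z_d.T[None, :, :]
            post /= post.sum(axis=2, keepdims=True) + 1e-12
            # M-step: expected counts, then renormalize
            nz = n_wd[:, :, None] * post
            p_w_z = nz.sum(axis=1)
            p_w_z /= p_w_z.sum(axis=0, keepdims=True)
            p_z_d = nz.sum(axis=0).T
            p_z_d /= p_z_d.sum(axis=0, keepdims=True)
        return p_w_z, p_z_d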
33
Demo
  • Course website

34
Task: face detection, with no labeling
35
Demo: feature detection
  • Output of a crude feature detector
  • Find edges
  • Draw points randomly from the edge set
  • Draw from a uniform distribution to get the
    scale (sketched below)

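The crude detector described above could look like
this; the Canny thresholds, patch count, and scale
range are illustrative assumptions, and the image
path is a placeholder:

    import cv2
    import numpy as np

    rng = np.random.default_rng(0)
    img = cv2.imread("face.jpg", cv2.IMREAD_GRAYSCALE)  # placeholder
    edges = cv2.Canny(img, 100, 200)        # 1. find edges
    ys, xs = np.nonzero(edges)              # edge pixel coordinates
    idx = rng.choice(len(xs), size=200, replace=False)  # 2. random points
    scales = rng.uniform(10, 30, size=200)  # 3. uniform scale per patch
    patches = list(zip(xs[idx], ys[idx], scales))       # (x, y, scale)
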
36
Demo: learnt parameters
  • Learning the model: do_plsa(config_file_1)
  • Evaluate and visualize the model:
    do_plsa_evaluation(config_file_1)

Codeword distributions per theme (topic)
Theme distributions per image
37
Demo: recognition examples
38
Demo: categorization results
  • Performance of each theme

39
Demo: naïve Bayes
  • Learning the model: do_naive_bayes(config_file_2)
  • Evaluate and visualize the model:
    do_naive_bayes_evaluation(config_file_2)

40
Learning and Recognition
category models (and/or) classifiers
41
Invariance issues
  • Scale and rotation
  • Implicit in the detectors and descriptors

Kadir and Brady, 2003
42
Invariance issues
  • Scale and rotation
  • Occlusion
  • Implicit in the models
  • Codeword distribution: small variations
  • (In theory) Theme (z) distribution: different
    occlusion patterns

43
Invariance issues
  • Scale and rotation
  • Occlusion
  • Translation
  • Encode (relative) location information

Sudderth et al. 2005
44
Invariance issues
  • Scale and rotation
  • Occlusion
  • Translation
  • View point (in theory)
  • Codewords: detector and descriptor
  • Theme distributions: different view points

Fergus et al. 2005
45
Model properties
  • Intuitive
  • Analogy to documents

46
Model properties
  • Intuitive
  • Analogy to documents
  • Analogy to human vision

Olshausen and Field, 2004; Fei-Fei and Perona,
2005
47
Model properties
  • Intuitive
  • (Could use) generative models
  • Convenient for weakly-supervised or
    unsupervised training
  • Prior information
  • Hierarchical Bayesian framework

Sivic et al., 2005; Sudderth et al., 2005
48
Model properties
  • Intuitive
  • (Could use) generative models
  • Learning and recognition are relatively fast
  • Compared to other methods

49
Weaknesses of the model
  • No rigorous geometric information about the
    object components
  • It's intuitive to most of us that objects are
    made of parts, yet the model carries no such
    information
  • Not extensively tested yet for
  • View point invariance
  • Scale invariance
  • Segmentation and localization are unclear