Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework - PowerPoint PPT Presentation

Slides: 65

Provided by: value126

Learn more at: http://vision.stanford.edu

Transcript and Presenter's Notes

1
Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework
Fei-Fei Li (publishes under L. Fei-Fei), Computer Science Dept. and Psychology Dept., Princeton University
2
Li-Jia Li, PhD candidate, Computer Science Dept., Stanford University
3
City Travel
Pagoda
Sunrise Sunshine Sun
4
Classification
City Travel
Total Scene Understanding
U
Segmentation
Pagoda
Annotation
Sunrise Sunshine Sun
5
Application
6
Classification
Annotation
Segmentation
Mutually beneficial!
7
Classification
Annotation
Segmentation
class Polo
Athlete Horse Grass Trees Sky Saddle
Horse
8
Classification
Annotation
Segmentation
class Polo
Sky
Tree
Athlete
Athlete Horse Grass Trees Sky Saddle
Horse
Grass
9
Classification
Annotation
Segmentation
class Polo
Athlete Horse Grass Trees Sky Saddle
Horse
10
Classification
Annotation
Segmentation
class Polo
Related Work
Oliva et al. 01; Lazebnik et al. 06; Weber et al. 00; Fergus et al. 03; Fei-Fei et al. 03; Felzenszwalb et al. 04; Fei-Fei et al. 05; Sivic et al. 05; Bosch et al. 06
11
Classification
Annotation
Segmentation
Athlete Horse Grass Trees Sky Saddle
Related Work
Blei et al 03
Duygulu et al 02
Alipr (Li et al 03)
Gupta et al 08
Barnard et al 03
12
Classification
Annotation
Segmentation
Horse
Related Work
Cao & Fei-Fei 07; Russell et al. 06; Wang et al. 07; Todorovic et al. 06; Sali et al. 99; Winn et al. 05; Kumar et al. 05; Shi & Malik 00; Felzenszwalb & Huttenlocher 04
13
Annotation
Classification
Classification
Segmentation
Annotation
Segmentation
Sky
Tree
Athlete
Horse
Grass
Class Polo
Class Polo
Related Work
Tu et al. 03
Li & Fei-Fei 07
Heitz et al. 08
14
Outline

Model
Classification
Learning
Segmentation
Annotation

Recognition Experiment
15
Athlete Horse Grass Trees Sky Saddle
Graphical model (figure): C, S, O, T, X, R, Z, Ar, NF, Nr, Nt, D
16
class Polo
C
Text
Visual
Athlete Horse Grass Trees Sky Saddle
D
Visual Component
Joint distribution of the random variables
Text Component
17
class Polo
C
Text
Visual
O
D
Text Component
18
class Polo
C
Text
Visual
O
R
Color Location Texture Shape
NF
D
Text Component
19
class Polo
C
Text
Visual
O
X
R
Ar
NF
D
Text Component
20
class Polo
C
Text
Visual
Athlete Horse Grass Trees Sky Saddle
O
X
R
Z
Ar
NF
Nr
Nt
D
Connector variable
Text Component
21
class Polo
C
Switch variable
Text
Visible
Not visible
Visual
S
Athlete Horse Grass Trees Sky Saddle
Athlete Horse Grass Trees Sky Saddle
O
X
R
Z
Ar
NF
Nr
Nt
D
Connector variable
22
class Polo
C
Switch variable
Text
Visible
Not visible
Visual
S
Athlete Horse Grass Trees Sky Saddle
Horse
O
T
X
R
Z
Ar
NF
Nr
Nt
D
Connector variable
23
1. Visual
Spatial-LTM: Single global region descriptor
Corr-LDA: Describes the image by blobs (regions)
Our Model: Local descriptors and multiple global region descriptors
(Cao & Fei-Fei, Spatial-LTM, 2007)
2. Text
Spatial-LTM: Does not model text
Corr-LDA: Words are generated from pixel-level visual info
Our Model: Visual vs. non-visual switch
(Blei & Jordan, Corr-LDA, 2003)
3. Object distribution
Corr-LDA, Spatial-LTM: Image-based multinomial
Our Model: Class-dependent multinomial (top-down strength of class)
Graphical model (figure): C, S, O, T, X, R, Z, NF, Ar, Nr, Nt, D
Our Model
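The comparison on this slide can be read as a generative story. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `sample_image_tags` and the parameter arrays `eta` (class-dependent object multinomials), `theta` (per-object word distributions), `beta` (a distribution over non-visual words) and `p_visible` (switch probability) are hypothetical names introduced for illustration. It shows how a class C picks objects O top-down, and how a switch S decides whether a tag T is tied to a region through a connector Z or drawn as a non-visual word.

```python
import numpy as np

def sample_image_tags(c, eta, p_visible, theta, beta, n_regions, n_tags, rng):
    """Toy generative sketch (assumed parameterization, not the original model code).

    c         : scene class index (C)
    eta       : (n_classes, n_objects) class-dependent multinomials over objects
    p_visible : probability that a tag is visual (switch S = visible)
    theta     : (n_objects, n_words) word distribution for each object
    beta      : (n_words,) distribution over non-visual words
    """
    # Top-down: each region's object label O depends on the class C.
    objects = rng.choice(eta.shape[1], size=n_regions, p=eta[c])
    tags = []
    for _ in range(n_tags):
        visible = rng.random() < p_visible           # switch variable S
        if visible:
            z = rng.integers(n_regions)              # connector Z picks a region
            word = rng.choice(theta.shape[1], p=theta[objects[z]])
        else:
            z = -1                                   # no region is attached
            word = rng.choice(beta.shape[0], p=beta)
        tags.append((int(visible), int(z), int(word)))
    return objects, tags
```

Choosing the connector uniformly over regions is an assumption made here for simplicity; the slides do not state how Z is distributed.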
24
Outline
Model

Learning

Recognition Experiment
25
Learning
Exact inference is intractable!
Relationship of the random variables
26
Collapsed Gibbs Sampling
(R. Neal, 2000)
Top-down force
Bottom-up force from visual information
Bottom-up force from text information
Relationship of the random variables
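One way the three forces named on this slide might combine in a single collapsed Gibbs update is sketched below. This is an illustration under assumptions, not the update used in the talk: it assumes Dirichlet-multinomial count tables with symmetric smoothing, one quantized visual feature and one tag per region, and the names `object_conditional`, `resample_object`, `n_class_obj`, `n_obj_feat`, `n_obj_tag`, `alpha`, `gamma` and `lam` are all hypothetical.

```python
import numpy as np

def object_conditional(c, feature, tag, n_class_obj, n_obj_feat, n_obj_tag,
                       alpha=1.0, gamma=1.0, lam=1.0):
    """Conditional distribution over object labels O for one region.

    The count tables must already exclude the region being resampled,
    which is what makes the sampler "collapsed".
    n_class_obj : (n_classes, n_objects) class/object co-occurrence counts
    n_obj_feat  : (n_objects, n_features) object/visual-feature counts
    n_obj_tag   : (n_objects, n_words) object/tag counts
    """
    # Top-down force: how often this class has produced each object.
    top_down = n_class_obj[c] + alpha
    # Bottom-up force from visual information (smoothed feature likelihood).
    visual = (n_obj_feat[:, feature] + gamma) / (
        n_obj_feat.sum(axis=1) + gamma * n_obj_feat.shape[1])
    # Bottom-up force from text information (smoothed tag likelihood).
    text = (n_obj_tag[:, tag] + lam) / (
        n_obj_tag.sum(axis=1) + lam * n_obj_tag.shape[1])
    p = top_down * visual * text
    return p / p.sum()

def resample_object(rng, *args, **kwargs):
    """Draw a new object label from the conditional above."""
    p = object_conditional(*args, **kwargs)
    return int(rng.choice(len(p), p=p))
```

In a full sampler the counts for the current region would be decremented before this call and incremented again with the newly drawn label.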
27
There is no object-text correspondence
Scene/Event images from the Internet
28
Our model builds the correspondence
Scene/Event images from the Internet
Graphical model (figure): C, S, O, T, X, R, Z, Ar, NF, Nr, Nt, D
29
However, a big obstacle is that many objects always co-occur
Scene/Event images from the Internet
?
Athlete Horse Grass Ball
?
?
Athlete Horse Grass Trees Sky Saddle
30
One solution: a good initialization of O
C
Scene/Event images from the Internet
S
O
T
X
R
Z
Nr
NF
Ar
Nt
Athlete Horse Grass Trees Sky Saddle
31
Initializing O: obtain Internet images for each O
Scene/Event images from the Internet
Horse
32
Initializing O: obtain Internet images for each O
Scene/Event images from the Internet
Object images
33
Initializing O: train an object detector for each O
Object images
Event/Scene images
Scene/Event images
Any object detection/segmentation algorithm
Graphical model (figure): C, S, O, T, X, R, Z, Ar, NF, Nr, Nt, D
34
Initializing O: train an object detector for each O
Object images
Event/Scene images
Scene/Event images
Any object detection/segmentation algorithm
Graphical model (figure): C, S, O, T, X, R, Z, Ar, NF, Nr, Nt, D
35
Initialize O in the scene image by the trained
object detectors
Object images
Event/Scene images
Scene/Event images
Any object detection/segmentation algorithm
Black-box object detection/segmentation
Graphical model (figure): C, S, O, T, X, R, Z, Ar, NF, Nr, Nt, D
36
Initialize O in the scene image by the trained
object detectors
Object images
Event/Scene images
Cao & Fei-Fei, 2007
Scene/Event images
?
C
Black box object detection segmentation
Black box object detection segmentation
O
R
X
Ar
Nr

Black box object detection segmentation
C
S
O
T
X
R
Z
Ar
NF
Nr
Nt
D
Our Model
37
Auto-semi-supervised learning
Scene/Event images
Small number of initialized images
Large number of uninitialized images

Graphical model (figure): C, S, O, T, X, R, Z, Ar, NF, Nr, Nt, D
Our Model
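The split between initialized and uninitialized images can be sketched as a starting state for the sampler. This is a hedged illustration only: the function `initialize_objects`, the `detector_labels` mapping and the choice of uniformly random labels for uninitialized images are assumptions made here, since the slides do not say how the uninitialized images start out.

```python
import numpy as np

def initialize_objects(detector_labels, regions_per_image, n_objects, rng):
    """Build starting object labels O for every region of every image.

    detector_labels   : dict mapping image index -> per-region object labels
                        from a black-box detector (the small initialized set)
    regions_per_image : list with the number of regions in each image
    n_objects         : size of the object vocabulary
    """
    init = []
    for i, n_regions in enumerate(regions_per_image):
        if i in detector_labels:
            # Initialized image: keep the labels proposed by the detector.
            labels = np.asarray(detector_labels[i], dtype=int)
            if labels.shape[0] != n_regions:
                raise ValueError(f"image {i}: expected {n_regions} labels")
        else:
            # Uninitialized image: assumed random start, refined during learning.
            labels = rng.integers(n_objects, size=n_regions)
        init.append(labels)
    return init
```

A typical call would pass detector output for a handful of images and leave the rest of the collection out of `detector_labels`.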
38
Auto-semi-supervised learning: Cao & Fei-Fei vs. Our Model
Missing
Cao & Fei-Fei, 2007
Scene/Event images
?
C
sky
O
sailboat
R
X
water
Ar
Nr
C
S
O
T
X
R
Z
Ar
NF
Nr
Nt
D
Our Model
39
Learning: challenges and solutions
Challenges | Solutions
Intractable coupling | Collapsed Gibbs sampling
Large amount of data | Internet images
Co-occurring objects and words | Automatically initialize O
40
Outline
Model
Learning

Small number of automatically initialized images
Large number of uninitialized images

Recognition Experiment

Dataset
Learned Model
Results

41
8 Event/Scene Classes
Badminton
Bocce
Croquet
Polo
Remark: Tags are not used during testing
42
8 Event/Scene Classes
Rock climbing
Rowing
Sailing
Snowboarding
43
Learned model O
Graphical model (figure): C, S, O, T, X, R, Z, Ar, NF, Nr, Nt, D
44
Learned model O
45
Learned model O
46
Learned model O
47
Learned model O
48
Learned model R
Athlete
C
S
Grass
O
T
X
R
Z
Horse
Ar
NF
Nr
Nt
D
49
Learned model S
Graphical model (figure): C, S, O, T, X, R, Z, Ar, NF, Nr, Nt, D
50
Classification
Annotation
Segmentation
class Polo
Sky
Tree
Athlete
Athlete Horse Grass Trees Sky Saddle
Horse
Grass
51
Classification
Annotation
Segmentation
8-way classification: 54%
52
Classification
Annotation
Segmentation
Influence of Unlabeled images in learning
Effect of noise in tags
53
Effect of multiple features
Classification
Annotation
Segmentation
54
Classification
Annotation
Segmentation
Alipr (Li et al. 03)
Corr-LDA (Blei et al. 03)
55
Classification
Annotation
Segmentation
56
Effect of top-down class context
Model w/o top-down class
Full Model
Horse
57
Effect of top-down class context
Model w/o top-down class
Full Model
Horse
58
Effect of top-down class context
Model w/o top-down class
Full Model
59
Effect of top-down class context
60
Effect of top-down class context
61
Effect of top-down class context
62
Effect of top-down class context
63
Learning
Model

Small number of automatically initialized images
Large number of uninitialized images
Recognition Experiment

64
Future Work
Top down context
Object vs Object
65
Thanks to
Prof. Silvio Savarese, Juan Carlos Niebles, Chong Wang, Barry Chai, Min Sun, Bangpeng Yao, Hao Su, Jia Deng, the anonymous reviewers, and you
