Grammar of Image - PowerPoint PPT Presentation

Slides: 63

Provided by: Zhao8

Learn more at: http://www.cs.cmu.edu

Transcript and Presenter's Notes

1
Grammar of Image

Zhaoyin Jia, 03-30-2009

2
Problems

Enormous amount of vision knowledge
Computational complexity
Semantic gap

Classification, Recognition
3
Task of image parsing
4
Objectives in this paper

Framework for vision
And-Or Graph
Algorithm for this framework
Top-down/bottom-up computation
Generalization from small samples
Use Monte Carlo simulation to synthesize more
configurations
Fill the semantic gap

5
Grammar

Language: co-occurrence of symbols is more than chance
Image: parallelism, T-junctions

CONSTANTINOPLE
6
Formulation of grammar

Start symbol S
Non-terminal nodes V_N

Production rules R
Terminal nodes V_T

8
Formulation of grammar

Start symbol S
Non-terminal nodes V_N

Production rules R
Terminal nodes V_T

S → NP VP
VP → VP PP
VP → V NP
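
The four components on this slide can be written down directly as a data structure. The following is a minimal Python sketch, not taken from the talk: only the three rules for S and VP come from the slide, while the `Grammar` class, the `leftmost_derive` helper, and the lexical rules (the words "I", "saw", "stars", "with") are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    start: str               # start symbol S
    nonterminals: frozenset  # V_N
    terminals: frozenset     # V_T
    rules: tuple             # R, as (lhs, rhs) pairs


RULES = (
    ("S", ("NP", "VP")),   # from the slide
    ("VP", ("VP", "PP")),  # from the slide
    ("VP", ("V", "NP")),   # from the slide
    ("NP", ("I",)),        # hypothetical lexical rules below
    ("NP", ("stars",)),
    ("V", ("saw",)),
    ("PP", ("P", "NP")),
    ("P", ("with",)),
)

G = Grammar(
    start="S",
    nonterminals=frozenset({"S", "NP", "VP", "PP", "V", "P"}),
    terminals=frozenset({"I", "stars", "saw", "with"}),
    rules=RULES,
)


def leftmost_derive(g, rule_indices):
    """Rewrite the leftmost non-terminal with each listed rule, starting from S."""
    form = [g.start]
    for idx in rule_indices:
        lhs, rhs = g.rules[idx]
        pos = next(i for i, sym in enumerate(form) if sym in g.nonterminals)
        if form[pos] != lhs:
            raise ValueError(f"rule {idx} rewrites {lhs}, but leftmost is {form[pos]}")
        form[pos:pos + 1] = rhs
    return form


sentence = leftmost_derive(G, [0, 3, 2, 5, 4])
```

Each rule application replaces one non-terminal with the right-hand side of a rule; the derivation ends when only terminal nodes remain.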

11
Image grammar

Start symbol S
Production rules
Non-terminal nodes V_N
Terminal nodes V_T

13
Overlapping parts/Ambiguity

Similar color, occlusion, etc.

14
Stochastic Context Free Grammar

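For each non-terminal node in V_N, we have production rules
with a probability associated with each one
Probability of a parse tree: the product of the probabilities of the rules it uses
Probability of a sentence: the sum of the probabilities of its parse trees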
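
As a worked example of these two quantities, here is a small Python sketch. All rule probabilities are made-up numbers, and the rules for NP and PP are assumptions added so that one sentence has two parse trees (the prepositional phrase attaches either to the verb phrase or to the noun phrase). Word-level rules are omitted, so N, V and P are treated as terminals.

```python
from math import prod, isclose

# Hypothetical rule probabilities; for each left-hand side they sum to 1.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.0,
    ("VP", ("VP", "PP")): 0.3,
    ("VP", ("V", "NP")): 0.7,
    ("NP", ("NP", "PP")): 0.2,
    ("NP", ("N",)): 0.8,
    ("PP", ("P", "NP")): 1.0,
}


def tree_probability(rules_used):
    """Probability of a parse tree: product of the probabilities of its rules."""
    return prod(RULE_PROB[r] for r in rules_used)


def sentence_probability(parse_trees):
    """Probability of a sentence: sum over all of its parse trees."""
    return sum(tree_probability(t) for t in parse_trees)


# Two parse trees for the tag sequence "N V N P N".
TREE_VP_ATTACH = [
    ("S", ("NP", "VP")), ("NP", ("N",)), ("VP", ("VP", "PP")),
    ("VP", ("V", "NP")), ("NP", ("N",)), ("PP", ("P", "NP")), ("NP", ("N",)),
]
TREE_NP_ATTACH = [
    ("S", ("NP", "VP")), ("NP", ("N",)), ("VP", ("V", "NP")),
    ("NP", ("NP", "PP")), ("NP", ("N",)), ("PP", ("P", "NP")), ("NP", ("N",)),
]

p_vp = tree_probability(TREE_VP_ATTACH)
p_np = tree_probability(TREE_NP_ATTACH)
p_sentence = sentence_probability([TREE_VP_ATTACH, TREE_NP_ATTACH])
```

With these numbers the verb-phrase attachment is the more probable reading, which is how a stochastic grammar ranks ambiguous parses.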

15
Stochastic Grammar with Context

Left to right: bi-gram model (Markov chain)
for a sentence with n words
Non-local relations: tree model
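
The bi-gram model factorizes the probability of a sentence into a first-word probability and a chain of word-to-word transitions. A minimal Python sketch, assuming toy probability tables (the words and numbers are illustrative only):

```python
from math import isclose


def bigram_probability(words, initial, transition):
    """p(w_1..w_n) = p(w_1) * product over i of p(w_{i+1} | w_i)."""
    p = initial[words[0]]
    for prev, cur in zip(words, words[1:]):
        p *= transition[prev][cur]
    return p


# Hypothetical tables for a three-word vocabulary.
INITIAL = {"the": 1.0}
TRANSITION = {
    "the": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 1.0},
    "dog": {"sat": 1.0},
}

p = bigram_probability(["the", "cat", "sat"], INITIAL, TRANSITION)
```

Each word depends only on its left neighbor, which is the context that a pure context-free grammar lacks.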

16
New issues in Image Grammar

Loss of left-to-right order: region adjacency
graph

17
New issues in Image Grammar

Scaling changes which nodes are terminals in the parse tree

18
New issues in Image Grammar

Switch between texture and structure

19
Building the image grammar

Visual vocabulary
primitives, sketch graph, textons
Relations and configurations
co-occurrence, attached, hinged, supported,
occluded
And-Or graph representation
embedding the image grammar
Learning/testing the parse graph
find the possible inference

20
Database

Lotus Hill Institute Dataset
636,748 images, 3,927,130 Physical Objects
A few hundred are free

Benjamin Yao, Xiong Yang, and Song-Chun Zhu,
Introduction to a large scale general purpose
ground truth dataset: methodology, annotation
tool, and benchmarks. EMMCVPR, 2007
http://www.imageparsing.com/
21
Free Data
http://yoshi.cs.ucla.edu/yao/data/

6 categories, 145 subsets
Manmade Object: 75
Nature Object: 40
Objects in Scene: 6
Transportation: 9
UCLA Aerial Image: 5
UIUC Sport Activity: 10
Outline segmentation of the object

22
Free Data
Segmentation of a scene (street)

23
Free Data
Physical parts of the object

24
Visual Vocabulary

The Lego Land
Language

25
Visual Vocabulary

Function of image primitives
a) geometric transformation
b) appearance
Bonds between primitives

26
Visual Vocabulary

Sketch and Texture

S. C. Zhu, Y. N. Wu, and D. B. Mumford, Minimax
entropy principle and its applications to texture
modeling, Neural Computation, vol. 9, no. 8, pp.
1627-1660, November 1997
27
Primal sketch model
Sketch graph
Input image
Texture pixels
C. E. Guo, S. C. Zhu, and Y. N. Wu, Primal
sketch: Integrating texture and structure, in
Proceedings of International Conference on
Computer Vision, 2003.
28
Primal sketch model
C. E. Guo, S. C. Zhu, and Y. N. Wu, Primal
sketch: Integrating texture and structure, in
Proceedings of International Conference on
Computer Vision, 2003.
29
High level visual vocabulary

Cloth: collar, left/right sleeves, hands

H. Chen, Z. J. Xu, Z. Q. Liu, and S. C. Zhu,
Composite templates for cloth modeling and
sketching, in Proceedings of IEEE Conference on
Computer Vision and Pattern Recognition, New
York, June 2006
30
Relations and configurations

Definition of relation
bonds
relations
structure, compatibility
Three types of relations
Bonds and connections
Joints and junctions
Object interactions/semantics
Definition of configurations

31
Relations

Bonds and connections
connects primitives into bigger graphs
intensity/color compatibility

32
Relations

Joints and junctions

33
Relations

Object interactions

34
Configuration

Spatial layout of entities at a certain level
Primal sketch → parts → object → scene

35
Reconfigurable graphs

Treat bonds as random variables: address nodes

36
Inference of the configuration

Start from the primal sketch of the image
Detect the T-junctions
Use simulated annealing to infer connections according to the Gestalt laws

Red dot: connected region. Black line: known
edge. Green line: inferred connection.
R. X. Gao and S. C. Zhu, From primal sketch to
2.1D sketch, Technical Report, Lotus Hill
Institute, 2006
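
To illustrate the simulated annealing step in general terms, here is a toy Python sketch. It is not the method of Gao and Zhu: the energy function, the cost values, and the names `energy` and `anneal` are assumptions chosen for illustration. Each candidate connection between two T-junctions is a binary variable; a connection costs more when the junctions align poorly, a junction left unconnected pays a fixed cost, and a junction used by more than one connection pays a conflict penalty.

```python
import math
import random


def energy(state, candidates, n_junctions, unmatched_cost=1.0, conflict_cost=5.0):
    """Hypothetical energy: connection costs plus unmatched and conflict penalties."""
    degree = [0] * n_junctions
    total = 0.0
    for on, (i, j, cost) in zip(state, candidates):
        if on:
            total += cost
            degree[i] += 1
            degree[j] += 1
    for d in degree:
        if d == 0:
            total += unmatched_cost
        elif d > 1:
            total += conflict_cost * (d - 1)
    return total


def anneal(candidates, n_junctions, steps=2000, t0=2.0, cooling=0.995, seed=0):
    """Flip one connection at a time; accept worse states with Metropolis probability."""
    rng = random.Random(seed)
    state = [False] * len(candidates)
    cur = energy(state, candidates, n_junctions)
    best, best_e = list(state), cur
    t = t0
    for _ in range(steps):
        k = rng.randrange(len(candidates))
        state[k] = not state[k]
        new = energy(state, candidates, n_junctions)
        if new <= cur or rng.random() < math.exp((cur - new) / t):
            cur = new
            if cur < best_e:
                best, best_e = list(state), cur
        else:
            state[k] = not state[k]  # reject: undo the flip
        t *= cooling  # lower the temperature
    return best, best_e


# (junction_i, junction_j, misalignment cost): made-up values.
CANDIDATES = [(0, 1, 0.2), (2, 3, 0.3), (0, 2, 0.9), (1, 3, 0.8)]
best_state, best_energy = anneal(CANDIDATES, n_junctions=4)
```

At high temperature the search accepts many uphill moves and explores freely; as the temperature drops it settles into a low-energy set of connections.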
37
Reconfigurable graphs
Layer extraction
Inferred connection
Source image
T-junction
Ru-Xin Gao, Tian-Fu Wu, Song-Chun Zhu, and Nong
Sang, Bayesian Inference for Layer
Representation with Mixed Markov Random Field
38
Reconfigurable graphs
R. X. Gao and S. C. Zhu, From primal sketch to
2.1D sketch, Technical Report, Lotus Hill
Institute, 2006
39
And-Or Graph

Parse graph of the image
pt: parse tree over the vocabulary; E: relations
Inferring the parse graph

Z. J. Xu, L. Lin, T. F. Wu, and S. C. Zhu,
Recursive top-down/bottom-up algorithm for
object recognition, Technical Report, Lotus Hill
Research Institute, 2007.
40
And-Or Graph

Contains all the valid parse graphs
And-node, Or-node, leaf node
Relations between the children of an And-node
Parse tree: assigns a label to each Or-node

Z. J. Xu, L. Lin, T. F. Wu, and S. C. Zhu,
Recursive top-down/bottom-up algorithm for
object recognition, Technical Report, Lotus Hill
Research Institute, 2007.
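
A small Python sketch of how an And-Or graph contains many parse trees: an And-node expands into all of its children, an Or-node selects exactly one child, and a leaf node is a terminal. The `Node` class and the clock example (frame and hands) are assumptions for illustration, and the relations between children of an And-node are left out.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Node:
    name: str
    kind: str  # "and", "or", or "leaf"
    children: list = field(default_factory=list)


def parse_trees(node):
    """Enumerate every parse tree under `node` as (name, subtrees) tuples."""
    if node.kind == "leaf":
        return [(node.name, ())]
    if node.kind == "or":
        # An Or-node picks exactly one child.
        return [(node.name, (sub,))
                for child in node.children
                for sub in parse_trees(child)]
    # An And-node needs one subtree from each child.
    combos = product(*(parse_trees(c) for c in node.children))
    return [(node.name, tuple(c)) for c in combos]


def leaves(tree):
    """Terminal configuration of a parse tree, left to right."""
    name, subs = tree
    if not subs:
        return [name]
    return [leaf for s in subs for leaf in leaves(s)]


# Hypothetical toy graph: a clock is a frame AND hands; each is an Or-node.
frame = Node("frame", "or", [Node("round", "leaf"), Node("square", "leaf")])
hands = Node("hands", "or", [Node("two_hands", "leaf"), Node("three_hands", "leaf")])
clock = Node("clock", "and", [frame, hands])

trees = parse_trees(clock)
configurations = [leaves(t) for t in trees]
```

Two Or-nodes with two alternatives each already yield four configurations, which is how a compact graph generalizes beyond the examples it was built from.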
41
And-Or Graph

Definition
image primitives
relations at all levels
probability model defined on the And-Or
graph
valid configurations of terminal nodes

42
Stochastic Model on And-Or graph

Terminal (leaf) node
And-Or node
Set of links
Switch variable at Or-node
Attributes of primitives

43
Stochastic Model on And-Or graph

SCFG: weighs the frequency of the children of
Or-nodes
44
Stochastic Model on And-Or graph

Weighs the local compatibility of primitives
(geometric and appearance)
45
Stochastic Model on And-Or graph

Spatial and appearance relations between primitives (parts
or objects)
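
The three kinds of terms in the stochastic model (Or-node switch frequencies, local compatibility of primitives, and relations along links) can be combined into one energy, with the probability of a parse graph proportional to exp(-energy). The Python sketch below only illustrates that structure; the function name, the argument layout, and the toy potentials are assumptions, not the formulation used in the paper.

```python
import math


def parse_graph_energy(or_choices, node_attrs, links, switch_freq, unary, pairwise):
    """Sum of three hypothetical energy terms for one parse graph."""
    total = 0.0
    # SCFG term: -log frequency of the child chosen at each Or-node.
    for or_node, child in or_choices.items():
        total += -math.log(switch_freq[or_node][child])
    # Local term: compatibility of each primitive's own attributes.
    for node, attr in node_attrs.items():
        total += unary(node, attr)
    # Relation term: compatibility between linked primitives.
    for i, j in links:
        total += pairwise(node_attrs[i], node_attrs[j])
    return total


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Toy parse graph: one Or-node choice, two primitives with 2-D positions, one link.
energy_value = parse_graph_energy(
    or_choices={"frame": "round"},
    node_attrs={"round": (0.0, 0.0), "hands": (0.1, 0.0)},
    links=[("round", "hands")],
    switch_freq={"frame": {"round": 0.5, "square": 0.5}},
    unary=lambda node, attr: 0.0,
    pairwise=squared_distance,
)
unnormalized_probability = math.exp(-energy_value)
```

Dropping the second and third terms leaves only the switch frequencies, which is the plain stochastic context-free grammar case.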
47
Learning And-Or Graph

Learning the vocabulary Δ and the hierarchic
And-Or graph
Learning the relation set R, given Δ
Learning the parameters Θ, given R and Δ

Discussed in the paper
48
Learning And-Or Graph
Observation
Learning model

Learning and pursuing the relation set R
Start from a stochastic context-free grammar (a)
Learn the relations that maximally reduce the KL
divergence to the observation (b-e)

J. Porway, Z. Y. Yao, and S. C. Zhu, Learning an
And-Or graph for modeling and recognizing object
categories, Technical Report, Department of
Statistics, 2007
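
The pursuit idea can be sketched as a greedy ranking: for each candidate relation, compare the histogram of that relation in the observed data with the histogram produced by the current model, and add the relation with the largest KL divergence. The Python sketch below is a simplification under stated assumptions: the relation names and histograms are invented, and the model is not re-fitted after each selection, which a full pursuit procedure would do.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete histograms of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)


def pursue_relations(observed, model, max_relations=3, min_gain=1e-3):
    """Greedily select relations whose observed statistics the model explains worst."""
    selected = []
    remaining = sorted(observed)
    while remaining and len(selected) < max_relations:
        gains = {r: kl_divergence(observed[r], model[r]) for r in remaining}
        best = max(remaining, key=lambda r: gains[r])
        if gains[best] < min_gain:
            break  # no remaining relation reduces the divergence enough
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical 3-bin histograms; the initial model treats every relation as uniform.
OBSERVED = {
    "relative_position": [0.7, 0.2, 0.1],
    "relative_scale": [0.4, 0.3, 0.3],
    "relative_angle": [0.34, 0.33, 0.33],
}
MODEL = {name: [1 / 3, 1 / 3, 1 / 3] for name in OBSERVED}

chosen = pursue_relations(OBSERVED, MODEL)
```

Relations whose observed statistics already match the model contribute almost nothing and are left out, which keeps the learned relation set small.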
49
Learning And-Or Graph

Learning graph parameters
Approximating to
Similar to texture synthesis

S. C. Zhu, Y. N. Wu, and D. B. Mumford, Minimax
entropy principle and its applications to texture
modeling, Neural Computation, vol. 9, no. 8, pp.
1627-1660, November 1997
50
Case I: Rectangle

Nodes: rectangles
Two vanishing points, four edge directions
Rules

F. Han and S. C. Zhu, Bottom-up/top-down image
parsing by attribute graph grammar. Proceedings
of International Conference on Computer Vision,
Beijing, China, 2005.
51
Case I: Rectangle

Get the primal sketch of the scene
Find the strong rectangle candidates (bottom-up, red)
Weigh (score) the different hypotheses (top-down,
blue)
The weight is the compatibility of the image
(primal sketch) with the proposed rectangle
Accept the best one
Repeat the previous three steps until all the weights are
small (negative)

F. Han and S. C. Zhu, Bottom-up/top-down image
parsing by attribute graph grammar. Proceedings
of International Conference on Computer Vision,
Beijing, China, 2005.
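
The propose, score, accept loop above can be sketched as a greedy procedure. This Python sketch is not Han and Zhu's algorithm: the `support` score stands in for the compatibility with the primal sketch, and the overlap penalty is an assumption used to show how accepting one rectangle changes the weights of the others.

```python
def overlap_area(a, b):
    """Overlap of two axis-aligned rectangles given as (x0, y0, x1, y1)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)


def weight(hypothesis, accepted):
    """Hypothetical weight: image support minus overlap with accepted rectangles."""
    rect, support = hypothesis
    penalty = sum(overlap_area(rect, r) for r, _ in accepted)
    return support - penalty


def greedy_parse(proposals, weight_fn, threshold=0.0):
    """Rescore all candidates, accept the best, repeat until no weight exceeds threshold."""
    accepted = []
    remaining = list(proposals)  # bottom-up candidates
    while remaining:
        scored = [(weight_fn(h, accepted), h) for h in remaining]  # top-down scoring
        best_w, best_h = max(scored, key=lambda s: s[0])
        if best_w <= threshold:
            break
        accepted.append(best_h)
        remaining.remove(best_h)
    return accepted


# (rectangle, support): made-up bottom-up proposals.
A = ((0, 0, 4, 4), 10.0)
B = ((1, 1, 5, 5), 8.0)  # overlaps A heavily
C = ((6, 0, 8, 2), 3.0)

result = greedy_parse([A, B, C], weight)
```

After A is accepted, B's weight turns negative because it explains the same image region, so the loop accepts C and then stops.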
52
Case I: Rectangle

Inference process

53
Case I: Rectangle
F. Han and S. C. Zhu, Bottom-up/top-down image
parsing by attribute graph grammar. Proceedings
of International Conference on Computer Vision,
Beijing, China, 2005.
54
Case II: Human Cloth

Use And-Or graph to generate a matching model
Vocabulary (training dataset)

Matching using the And-Or graph
55
Case II: Human Cloth

The And-Or Graph

Novel Configuration

Inference process

Top-down: refine the matching using the relations
Localize the face, then estimate the parts of the body
Bottom-up: a coarse matching of the parts
H. Chen, Z. J. Xu, Z. Q. Liu, and S. C. Zhu,
Composite templates for cloth modeling and
sketching, in Proceedings of IEEE Conference on
Computer Vision and Pattern Recognition, New
York, June 2006.
57
Case II: Human Cloth

Inference result

Hands are not exactly the same: find the best
match in the dataset
H. Chen, Z. J. Xu, Z. Q. Liu, and S. C. Zhu,
Composite templates for cloth modeling and
sketching, in Proceedings of IEEE Conference on
Computer Vision and Pattern Recognition, New
York, June 2006.
59
Case III: Recognition
Z. J. Xu, L. Lin, T. F. Wu, and S. C. Zhu,
Recursive top-down/bottom-up algorithm for object
recognition, Technical Report, Lotus Hill
Research Institute, 2007.
60
Conclusion