Title: Parsing Images with Context/Content Sensitive Grammars Eran Borenstein, Stuart Geman, Ya Jin, Wei Zhang
1 Parsing Images with Context/Content Sensitive Grammars
Eran Borenstein, Stuart Geman, Ya Jin, Wei Zhang
2 - Structured Representation in Neural Systems
- Vision is Hard
- Why is Vision Hard?
- Hierarchies of Reusable Parts
- Demonstration System: Reading License Plates
- Generalization: Face Detection
3 Artificial Intelligence
engineer everything, learn nothing
engineer nothing, learn everything
4 Natural Intelligence
simulation and semantics
- Hierarchy and Reusability
ventral visual pathway, linguistics, compositionality
5 - Structured Representation in Neural Systems
- Vision is Hard
- Why is Vision Hard?
- Hierarchies of Reusable Parts
- Demonstration System: Reading License Plates
- Generalization: Face Detection
6 License plate images from Logan Airport
Machines still can't reliably read license plates
7 Wafer IDs
Machines can't read fixed-font, fixed-scale characters as well as humans
8 Super Bowl
Machines can't find the bad guys at the Super Bowl
9 - Structured Representation in Neural Systems
- Vision is Hard
- Why is Vision Hard?
- Hierarchies of Reusable Parts
- Demonstration System: Reading License Plates
- Generalization: Face Detection
10 Instantiation
Vision is content sensitive
11 Clutter
Background is structured, and made of the same stuff!
12 - Structured Representation in Neural Systems
- Vision is Hard
- Why is Vision Hard?
- Hierarchies of Reusable Parts
- Demonstration System: Reading License Plates
- Generalization: Face Detection
13 Hierarchy of Reusable Parts
e.g. animals, trees, rocks
e.g. contours, intermediate objects
Bricks
e.g. linelets, curvelets, T-junctions
e.g. discontinuities, gradient
14-20 Hierarchy of Disjunctions of Conjunctions
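One way to read "disjunctions of conjunctions" is that each brick is an OR over alternative sets of children, and each alternative is an AND over the child bricks it requires. The sketch below is an illustrative assumption, not the authors' implementation: the names `Brick`, `alternatives`, and `is_active` are hypothetical, and terminal bricks are simply switched on by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Brick:
    """A node in a hypothetical hierarchy of reusable parts."""
    name: str
    # Each alternative is a list of child bricks that must ALL be active (AND).
    # The brick is active if ANY alternative is satisfied (OR).
    alternatives: list = field(default_factory=list)
    # Terminal bricks have no alternatives; they are switched on directly.
    on: bool = False

    def is_active(self) -> bool:
        if not self.alternatives:
            return self.on
        return any(all(child.is_active() for child in alt)
                   for alt in self.alternatives)


# Terminal bricks (hypothetical low-level detections).
stroke_a = Brick("stroke_a", on=True)
stroke_b = Brick("stroke_b", on=True)
stroke_c = Brick("stroke_c", on=False)

# A part can be built from (a AND b) OR (c alone).
part = Brick("part", alternatives=[[stroke_a, stroke_b], [stroke_c]])
# The same part is reused by two different higher-level bricks.
obj_1 = Brick("obj_1", alternatives=[[part]])
obj_2 = Brick("obj_2", alternatives=[[part, stroke_c]])

print(obj_1.is_active())  # True
print(obj_2.is_active())  # False
```

Reusing `part` in both `obj_1` and `obj_2` is the point of the design: parts are defined once and shared across higher-level bricks.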
21-22 Interpretations and Probabilities
Interpretation
23 Interpretations and Probabilities
GRAPHICAL MODEL (Markov) × LIKELIHOOD RATIO (non-Markov)
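This slide pairs a Markov graphical model with a non-Markov likelihood-ratio term. A minimal sketch follows, assuming the two combine multiplicatively (so their logarithms add). Every number is made up for illustration, and `log_prob_interpretation`, `markov_terms`, and `ratio_terms` are hypothetical names rather than the authors' notation.

```python
import math


def log_prob_interpretation(markov_terms, ratio_terms):
    """Unnormalized log-score of one interpretation (illustrative only).

    markov_terms: probabilities of each brick's chosen state under the
        Markov "backbone" model.
    ratio_terms: (p_composed, p_null) pairs, one per active composition,
        comparing how likely the arrangement of its parts is under the
        object model versus under a null model.
    """
    backbone = sum(math.log(p) for p in markov_terms)
    correction = sum(math.log(pc / p0) for pc, p0 in ratio_terms)
    return backbone + correction


# Two candidate interpretations with the same Markov backbone.
backbone = [0.1, 0.5, 0.5]
well_aligned = log_prob_interpretation(backbone, [(0.8, 0.2)])
poorly_aligned = log_prob_interpretation(backbone, [(0.1, 0.2)])

# The likelihood ratio separates them even though the backbone cannot.
print(well_aligned > poorly_aligned)  # True
```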
24 Generative (Bayesian) Model
25 - Structured Representation in Neural Systems
- Vision is Hard
- Why is Vision Hard?
- Hierarchies of Reusable Parts
- Demonstration System: Reading License Plates
- Generalization: Face Detection
26 Test set: 385 images, mostly from Logan Airport
Courtesy of Visics Corporation
27 Architecture
license plates
license numbers (3 digits + 3 letters, 4 digits + 2 letters)
plate boundaries, strings (2 letters, 3 digits, 3 letters, 4 digits)
generic letter, generic number, L-junctions of sides
characters, plate sides
parts of characters, parts of plate sides
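The architecture reuses a small set of string types to build the two license-number formats. As a rough illustration only, the sketch below encodes those formats with regular expressions. It assumes digits precede letters, as the ordering on the slide suggests, and the names `STRINGS`, `LICENSE_NUMBERS`, and `parse_license_number` are hypothetical.

```python
import re

# Reusable string types named on the slide.
STRINGS = {
    "2 letters": r"[A-Z]{2}",
    "3 letters": r"[A-Z]{3}",
    "3 digits": r"[0-9]{3}",
    "4 digits": r"[0-9]{4}",
}

# License numbers are compositions of the reusable strings
# (assumed order: digits first, then letters).
LICENSE_NUMBERS = {
    "3 digits + 3 letters": ("3 digits", "3 letters"),
    "4 digits + 2 letters": ("4 digits", "2 letters"),
}


def parse_license_number(text):
    """Return the name of the matching format, or None."""
    compact = text.replace(" ", "").upper()
    for name, parts in LICENSE_NUMBERS.items():
        pattern = "".join(STRINGS[p] for p in parts)
        if re.fullmatch(pattern, compact):
            return name
    return None


print(parse_license_number("123 ABC"))   # 3 digits + 3 letters
print(parse_license_number("1234 AB"))   # 4 digits + 2 letters
print(parse_license_number("12 ABCD"))   # None
```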
28 Image interpretation
Original Image
Top object
Top 10 objects
Top 25 objects
29 Image interpretation
Test image
30 Performance
- Six plates read with mistakes (>98% of plates read correctly)
- Approx. 99.5% of characters read correctly
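The plate-level figure follows from the test-set size given earlier (385 images). A quick check, assuming one plate per test image, which the slides do not state explicitly:

```python
total_plates = 385       # test-set size from the slides (assumes one plate per image)
plates_with_errors = 6   # plates read with at least one mistake

plate_accuracy = (total_plates - plates_with_errors) / total_plates
print(f"{plate_accuracy:.1%}")  # 98.4%
```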
31 Efficient discrimination: Markov versus content-sensitive distribution
Original image
Zoomed license region
Top object under Markov distribution
Top object under content-sensitive distribution
32 Efficient discrimination: testing objects against their parts
Test image
9 active "8" bricks under whole model
1 active "8" brick under parts model
33 Summary
Vision is Content Sensitive
Non-Markovian probability models
Background is Structured, and Made of the Same Stuff
Objects come equipped with their own background models
34 - Structured Representation in Neural Systems
- Vision is Hard
- Why is Vision Hard?
- Hierarchies of Reusable Parts
- Demonstration System: Reading License Plates
- Generalization: Face Detection
35 From Plates to Face Detection
36 Face Hierarchy
38 Sampling from Data Model
39 Sampling faces from the distribution
40 - PATTERN SYNTHESIS
- PATTERN RECOGNITION
Ulf Grenander