Title: Stochastic Grammars: Overview
1 Stochastic Grammars Overview
- Representation: Stochastic grammar
  - Terminals: object interactions
  - Context-sensitive due to internal scene models
- Domain: Towers of Hanoi
  - Requires activities with strong temporal constraints
- Contributions
  - Showed recognition and decomposition with very weak appearance models
  - Demonstrated usefulness of feedback from high- to low-level reasoning components
  - Extended SCFG with parameters and abstract scene models
2 Expectation Grammars (CVPR 2003)
- Analyze video of a person physically solving the Towers of Hanoi task
  - Recognize valid activity
  - Identify each move
  - Segment objects
  - Detect distracters / noise
3 System Overview
4 Low-Level Vision
- Foreground/background segmentation
  - Automatic shadow removal
  - Classification based on chromaticity and brightness differences
- Background Model
  - Per-pixel RGB means
  - Fixed mapping from chromaticity difference (CD) and brightness difference (BD) to foreground probability
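The per-pixel classification described on this slide can be sketched as follows. This is a minimal illustration, assuming the pixel is compared with its background RGB mean by first scaling the mean to match brightness and then measuring the leftover colour distance. The function names, the thresholds, and the shape of the mapping to a foreground probability are all hypothetical and are not the values used in the system.

```python
import math

def brightness_and_chromaticity(pixel, mean):
    """Compare an RGB pixel with its background mean.

    Returns (bd, cd): bd is the scale factor that best matches the
    background colour to the pixel (1.0 means same brightness), and cd is
    the colour distance that remains after that brightness scaling.
    """
    dot = sum(p * m for p, m in zip(pixel, mean))
    norm_sq = sum(m * m for m in mean)
    if norm_sq == 0:
        # Black background: brightness scaling is undefined, so treat the
        # whole pixel value as a colour difference.
        return 1.0, math.sqrt(sum(p * p for p in pixel))
    bd = dot / norm_sq
    cd = math.sqrt(sum((p - bd * m) ** 2 for p, m in zip(pixel, mean)))
    return bd, cd

def foreground_probability(bd, cd, cd_scale=30.0, shadow_low=0.5):
    """Hypothetical fixed mapping from (bd, cd) to a value in [0, 1]."""
    colour_term = min(1.0, cd / cd_scale)
    # Darker than any plausible shadow is also treated as foreground.
    dark_term = 1.0 if bd < shadow_low else 0.0
    return max(colour_term, dark_term)

def classify(pixel, mean, fg_thresh=0.5, shadow_high=0.95):
    """Label a pixel as 'foreground', 'shadow', or 'background'."""
    bd, cd = brightness_and_chromaticity(pixel, mean)
    if foreground_probability(bd, cd) >= fg_thresh:
        return "foreground"
    if bd < shadow_high:
        # Same colour as the background, only dimmer.
        return "shadow"
    return "background"
```

Separating brightness from chromaticity is what makes shadow removal possible in this sketch: a shadow dims a pixel without changing its colour much, so it shows up as a low brightness factor with a small colour distance.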
5 ToH Low-Level Vision
[Figure panels: Raw Video, Background Model, Foreground Components, Foreground and shadow detection]
6 Low-Level Features
- Explanation-based symbols
- Blob interaction events
  - merge, split, enter, exit, tracked, noise
  - Future work: hidden, revealed, blob-part, coalesce
- All possible explanations generated
- Inconsistent explanations heuristically pruned
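The generate-then-prune idea can be sketched as below. This is an illustration only: the event labels come from the slide, but the rule for which labels are candidates, the input format (a list of blob counts per frame), and both pruning heuristics are hypothetical assumptions.

```python
from itertools import product

def candidate_labels(prev_count, curr_count):
    """Labels that could explain a change in the number of blobs (assumed rule)."""
    if curr_count > prev_count:
        return ["split", "enter", "noise"]
    if curr_count < prev_count:
        return ["merge", "exit", "noise"]
    return ["tracked"]

def generate_explanations(blob_counts):
    """Every combination of candidate labels, one label per frame transition."""
    per_step = [
        candidate_labels(a, b) for a, b in zip(blob_counts, blob_counts[1:])
    ]
    return [list(combo) for combo in product(*per_step)]

def is_consistent(explanation, max_noise=1):
    """Hypothetical heuristics: nothing exits before it entered, little noise."""
    entered = 0
    for label in explanation:
        if label == "enter":
            entered += 1
        elif label == "exit":
            if entered == 0:
                return False
            entered -= 1
    return explanation.count("noise") <= max_noise

def prune(explanations):
    """Keep only the explanations that pass the consistency heuristics."""
    return [e for e in explanations if is_consistent(e)]
```

For blob counts `[1, 2, 1]` this produces nine candidate explanations, of which six survive pruning; `["split", "exit"]` is dropped because nothing entered the scene before the exit.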
7 Expectation Grammars
ToH -> Setup, enter(hand), Solve, exit(hand)
Setup -> TowerPlaced, exit(hand)
TowerPlaced -> enter(hand, red, green, blue), Put_1(red, green, blue)
Solve -> state(InitialTower), MakeMoves, state(FinalTower)
MakeMoves -> Move(block) [0.1] | Move(block), MakeMoves [0.9]
Move -> Move_1-2 | Move_1-3 | Move_2-1 | Move_2-3 | Move_3-1 | Move_3-2
Move_1-2 -> Grab_1, Put_2
Move_1-3 -> Grab_1, Put_3
Move_2-1 -> Grab_2, Put_1
Move_2-3 -> Grab_2, Put_3
Move_3-1 -> Grab_3, Put_1
Move_3-2 -> Grab_3, Put_2
Grab_1 -> touch_1, remove_1(hand,) | touch_1(), remove_last_1()
Grab_2 -> touch_2, remove_2(hand,) | touch_2(), remove_last_2()
Grab_3 -> touch_3, remove_3(hand,) | touch_3(), remove_last_3()
Put_1 -> release_1() | touch_1, release_1
Put_2 -> release_2() | touch_2, release_2
Put_3 -> release_3() | touch_3, release_3
- Representation
  - Stochastic grammar
  - Parser augmented with parameters and internal scene model
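As a small worked example of how the rule probabilities combine, the sketch below computes the probability that MakeMoves expands into exactly n Move symbols, using the 0.1 and 0.9 values attached to the MakeMoves rule in the grammar. The dictionary layout and function names are illustrative assumptions, and the calculation ignores the choice among the six Move alternatives, whose probabilities are not given here.

```python
# Probabilities of the two MakeMoves alternatives, as given in the grammar.
MAKE_MOVES_RULES = {
    ("Move",): 0.1,              # make one move and stop
    ("Move", "MakeMoves"): 0.9,  # make one move and continue
}

def make_moves_probability(n_moves):
    """Probability that MakeMoves expands into exactly n_moves Move symbols."""
    if n_moves < 1:
        return 0.0
    p_continue = MAKE_MOVES_RULES[("Move", "MakeMoves")]
    p_stop = MAKE_MOVES_RULES[("Move",)]
    # The recursive rule fires n_moves - 1 times, then the stopping rule once.
    return p_continue ** (n_moves - 1) * p_stop

def minimum_moves(n_discs):
    """Length of the shortest Towers of Hanoi solution for n_discs discs."""
    return 2 ** n_discs - 1

# Three blocks (red, green, blue) need at least 7 moves.
p_shortest = make_moves_probability(minimum_moves(3))  # 0.9**6 * 0.1
```

The lengths follow a geometric distribution, so longer move sequences are always less likely than shorter ones under these two rule probabilities.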
8 Forming the Symbol Stream
- Domain-independent blob interactions converted to terminals of grammar via heuristic domain knowledge
  - Examples: merge (x 0.33) -> touch_1; split (x 0.50) -> remove_2
- Grammar rule can only fire if internal scene model is consistent with terminal
  - Example: can't remove_2 if no discs on peg 2 (B)
  - Example: can't move a disc onto a smaller disc (C)
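Both steps on this slide can be sketched together: turning a blob event into a terminal, and checking a move against an internal scene model. This is a minimal sketch under stated assumptions. The peg positions 0.33 and 0.50 follow the two examples on the slide, while the position 0.67 for peg 3, the event-to-terminal table, and the `SceneModel` class are hypothetical.

```python
# x-positions of the pegs; the value for peg 3 is an assumption.
PEG_X = {1: 0.33, 2: 0.50, 3: 0.67}

# Assumed mapping, covering only the two examples from the slide.
EVENT_TO_TERMINAL = {"merge": "touch", "split": "remove"}

def to_terminal(event, x):
    """Convert a blob event at horizontal position x into a grammar terminal."""
    peg = min(PEG_X, key=lambda p: abs(PEG_X[p] - x))
    return f"{EVENT_TO_TERMINAL[event]}_{peg}"

class SceneModel:
    """Pegs as stacks of disc sizes, bottom first; a larger number is a larger disc."""

    def __init__(self, n_discs=3):
        self.pegs = {1: list(range(n_discs, 0, -1)), 2: [], 3: []}

    def can_remove(self, peg):
        """A disc can only be removed from a peg that holds one."""
        return bool(self.pegs[peg])

    def can_put(self, disc, peg):
        """A disc may not be placed on top of a smaller disc."""
        stack = self.pegs[peg]
        return not stack or stack[-1] > disc

    def move(self, src, dst):
        """Apply the move if it is consistent; otherwise leave the state unchanged."""
        if not self.can_remove(src):
            return False
        disc = self.pegs[src][-1]
        if not self.can_put(disc, dst):
            return False
        self.pegs[dst].append(self.pegs[src].pop())
        return True
```

In a parser, a check like `can_remove` or `can_put` would gate whether a rule may fire on a given terminal, which is how the scene model makes the grammar context-sensitive.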
9 ToH Example Frames
Explicit noise detection
Objects recognized by behavior, not appearance
10 ToH Example Frames
Detection of distracter objects
Grammar can fill in for occluded observations
11 Finding the Most Likely Parse
- Terminals and rules are probabilistic
- Each parse has a total probability
  - Computed by Earley-Stolcke algorithm
  - Probabilistic penalty for insertion and deletion errors
- Highest probability parse chosen as best interpretation of video
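The scoring idea can be illustrated with a simplified sketch. It does not implement the Earley-Stolcke algorithm; it only scores candidate parses that are assumed to have been enumerated already, multiplying rule and terminal probabilities and applying a fixed penalty per insertion or deletion. The function names, the input format, and the penalty values are hypothetical.

```python
import math

def parse_log_probability(rule_probs, terminal_probs, n_insertions=0,
                          n_deletions=0, insertion_penalty=0.05,
                          deletion_penalty=0.05):
    """Log of the total probability of one candidate parse."""
    factors = list(rule_probs) + list(terminal_probs)
    # Each insertion or deletion error multiplies in a small penalty.
    factors += [insertion_penalty] * n_insertions
    factors += [deletion_penalty] * n_deletions
    if any(f <= 0.0 for f in factors):
        return float("-inf")
    # Summing logs avoids underflow when many small factors are combined.
    return sum(math.log(f) for f in factors)

def best_parse(candidates):
    """Name of the highest-probability parse.

    candidates maps a parse name to the keyword arguments of
    parse_log_probability.
    """
    return max(
        candidates,
        key=lambda name: parse_log_probability(**candidates[name]),
    )

candidates = {
    "clean": {"rule_probs": [0.9, 0.1], "terminal_probs": [0.8, 0.7]},
    "with_insertion": {
        "rule_probs": [0.9, 0.1],
        "terminal_probs": [0.8, 0.9],
        "n_insertions": 1,
    },
}
winner = best_parse(candidates)
```

Here the parse with an insertion error has stronger terminal evidence but still loses, because the penalty outweighs the gain.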
12 Expectation Grammars Summary
Semantic Reasoning: Stochastic Parser
Feedback
Sensory Input: Video
Pre-conceptual Reasoning: Object IDs
Action: Report Best Interpretation
Memory: Parse Tree
Pre-processing: Blobs, Interaction Events
Given Knowledge: Grammar, Scene Model Rules
Learning: None (Bg)
13 Contributions
- Showed activity recognition and decomposition without appearance models
- Demonstrated usefulness of feedback from high-level, long-term interpretations to low-level, short-term decisions
- Extended SCFG representational power with parameters and abstract scene models
14 Lessons
- Efficient error recovery is important for realistic domains
- All sources of information should be included (e.g., appearance models)
- Concurrency and partial ordering are common, and thus should be easily representable
- Temporal constraints are not the only kind of action relationship (e.g., causal, statistical)
15 Representational Issues
- Extend temporal relations
- Concurrency
- Partial-ordering
- Quantitative relationships
- Causal (not just temporal) relationships
- Parameterized activities