Stochastic Grammars: Overview

Slides: 15

Provided by: DavidM490

Learn more at: https://cpl.cc.gatech.edu

Transcript and Presenter's Notes

1
Stochastic Grammars: Overview

Representation: Stochastic grammar
Terminals: object interactions
Context-sensitive due to internal scene models
Domain: Towers of Hanoi
Requires activities with strong temporal constraints
Contributions
Showed recognition and decomposition with very weak appearance models
Demonstrated usefulness of feedback from high- to low-level reasoning components
Extended SCFG: parameters and abstract scene models

2
Expectation Grammars (CVPR 2003)

Analyze video of a person physically solving the
Towers of Hanoi task
Recognize valid activity
Identify each move
Segment objects
Detect distracters / noise

3
System Overview
4
Low-Level Vision

Foreground/background segmentation
Automatic shadow removal
Classification based on chromaticity and brightness differences
Background Model
Per-pixel RGB means
Fixed mapping from CD and BD to foreground probability
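
The classification step above can be sketched as follows. This is a minimal illustration, not the presenters' implementation: it assumes BD is the scale factor that best aligns a pixel with its background mean (minus one) and CD is the remaining colour residual. The thresholds and the probability mapping (`cd_thresh`, `bd_tol`, `cd_scale`, `bd_scale`) are hypothetical values chosen only for the example.

```python
import math

def brightness_chromaticity_diff(pixel, mean):
    """Return (BD, CD) for one RGB pixel against its per-pixel background mean.

    Assumed definitions (not from the slides): alpha is the least-squares
    scale that maps the background mean onto the pixel; BD = alpha - 1,
    CD = length of the residual colour vector.
    """
    dot = sum(p * m for p, m in zip(pixel, mean))
    norm_sq = sum(m * m for m in mean)
    alpha = dot / norm_sq if norm_sq else 0.0
    bd = alpha - 1.0
    cd = math.sqrt(sum((p - alpha * m) ** 2 for p, m in zip(pixel, mean)))
    return bd, cd

def classify(pixel, mean, cd_thresh=15.0, bd_tol=0.1):
    """Label a pixel as foreground, shadow, or background (hypothetical thresholds)."""
    bd, cd = brightness_chromaticity_diff(pixel, mean)
    if cd > cd_thresh:
        return "foreground"   # colour differs from the background
    if bd < -bd_tol:
        return "shadow"       # same colour, noticeably darker
    return "background"       # same colour, similar or higher brightness

def foreground_probability(pixel, mean, cd_scale=15.0, bd_scale=0.5):
    """Hypothetical fixed mapping from (CD, BD) to a foreground probability."""
    bd, cd = brightness_chromaticity_diff(pixel, mean)
    score = cd / cd_scale + max(0.0, bd) / bd_scale  # darkening alone adds nothing
    return 1.0 - math.exp(-score)

background_mean = (100, 100, 100)
print(classify((50, 50, 50), background_mean))    # darker, same colour
print(classify((200, 20, 20), background_mean))   # strongly red
print(classify((100, 100, 100), background_mean)) # unchanged
```

Treating darkening as shadow rather than foreground is what lets shadows be removed automatically; a pixel only counts as foreground when its colour direction changes.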

5
ToH Low-Level Vision
Raw Video
Background Model
Foreground Components
Foreground and shadow detection
6
Low-Level Features

Explanation-based symbols
Blob interaction events: merge, split, enter, exit, tracked, noise
Future work: hidden, revealed, blob-part, coalesce
All possible explanations generated
Inconsistent explanations heuristically pruned
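
The generate-then-prune idea can be illustrated with a small sketch. It is simplified to a single blob, and the consistency rule (a blob must enter before it can be tracked, merged, split, or exit) is a hypothetical heuristic invented for the example; the slides do not list the actual pruning heuristics.

```python
from itertools import product

def consistent(explanation):
    """Hypothetical heuristic: a blob must be present before it can interact or exit."""
    present = False
    for event in explanation:
        if event == "enter":
            if present:
                return False      # cannot enter twice without exiting
            present = True
        elif event == "exit":
            if not present:
                return False      # cannot exit before entering
            present = False
        elif event in ("merge", "split", "tracked") and not present:
            return False          # cannot interact while absent
        # "noise" is always allowed
    return True

def explain(candidates_per_step):
    """Generate every combination of candidate labels, then prune inconsistent ones."""
    all_explanations = product(*candidates_per_step)
    return [e for e in all_explanations if consistent(e)]

# Each observation has several candidate labels (illustrative data).
observations = [("enter", "noise"), ("tracked", "exit"), ("exit", "noise")]
for explanation in explain(observations):
    print(explanation)
```

Exhaustive generation grows exponentially with the number of ambiguous observations, which is why pruning inconsistent explanations early matters.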

7
Expectation Grammars
ToH -> Setup, enter(hand), Solve, exit(hand)
Setup -> TowerPlaced, exit(hand)
TowerPlaced -> enter(hand, red, green, blue), Put_1(red, green, blue)
Solve -> state(InitialTower), MakeMoves, state(FinalTower)
MakeMoves -> Move(block) [0.1] | Move(block), MakeMoves [0.9]
Move -> Move_1-2 | Move_1-3 | Move_2-1 | Move_2-3 | Move_3-1 | Move_3-2
Move_1-2 -> Grab_1, Put_2
Move_1-3 -> Grab_1, Put_3
Move_2-1 -> Grab_2, Put_1
Move_2-3 -> Grab_2, Put_3
Move_3-1 -> Grab_3, Put_1
Move_3-2 -> Grab_3, Put_2
Grab_1 -> touch_1, remove_1(hand,) | touch_1(), remove_last_1()
Grab_2 -> touch_2, remove_2(hand,) | touch_2(), remove_last_2()
Grab_3 -> touch_3, remove_3(hand,) | touch_3(), remove_last_3()
Put_1 -> release_1() | touch_1, release_1
Put_2 -> release_2() | touch_2, release_2
Put_3 -> release_3() | touch_3, release_3

Representation: Stochastic grammar
Parser augmented with parameters and internal
scene model
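
As a worked illustration of the rule probabilities, the recursive MakeMoves rule (0.1 to stop after one more move, 0.9 to continue) implies a geometric prior over the number of moves. The sketch below computes only that prior; it ignores the choice among the six Move alternatives and the terminal likelihoods, and the function names are made up for the example.

```python
P_STOP = 0.1      # MakeMoves -> Move(block)
P_CONTINUE = 0.9  # MakeMoves -> Move(block), MakeMoves

def prob_num_moves(n):
    """Prior probability that MakeMoves expands into exactly n moves."""
    if n < 1:
        return 0.0  # the grammar requires at least one move
    return P_CONTINUE ** (n - 1) * P_STOP

def expected_num_moves():
    """Mean of the geometric distribution implied by the rule."""
    return 1.0 / P_STOP

# Seven moves is the shortest solution for a three-disc tower.
print(prob_num_moves(7))
print(expected_num_moves())
```

With these weights the grammar expects about ten moves, so it tolerates solutions somewhat longer than the seven-move optimum.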

8
Forming the Symbol Stream

Domain-independent blob interactions converted to terminals of the grammar
via heuristic domain knowledge
Examples: merge (x 0.33) → touch_1
split (x 0.50) → remove_2
A grammar rule can only fire if the internal scene model is consistent
with the terminal
Examples: can't remove_2 if no discs on peg 2 (B)
Can't move a disc to be on top of a smaller disc (C)
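
The two consistency examples above can be sketched as a small scene model that vetoes terminals. The class and method names are hypothetical, and the three-disc starting tower on peg 1 is an assumption; the slides do not show how the scene model is implemented.

```python
class SceneModel:
    """Minimal Towers of Hanoi state used to veto inconsistent terminals."""

    def __init__(self, num_discs=3):
        # Assumed start: full tower on peg 1, largest disc (highest number) at the bottom.
        self.pegs = {1: list(range(num_discs, 0, -1)), 2: [], 3: []}
        self.held = None  # disc currently in the hand, if any

    def can_remove(self, peg):
        # Cannot remove from an empty peg, or while already holding a disc.
        return self.held is None and bool(self.pegs[peg])

    def can_release(self, peg):
        # Cannot place a disc on top of a smaller disc.
        if self.held is None:
            return False
        stack = self.pegs[peg]
        return not stack or stack[-1] > self.held

    def apply(self, terminal, peg):
        """Update the state and return True if the terminal is consistent, else False."""
        if terminal == "remove" and self.can_remove(peg):
            self.held = self.pegs[peg].pop()
            return True
        if terminal == "release" and self.can_release(peg):
            self.pegs[peg].append(self.held)
            self.held = None
            return True
        return False  # the grammar rule is not allowed to fire

scene = SceneModel()
print(scene.apply("remove", 2))   # peg 2 is empty, so remove_2 is rejected
print(scene.apply("remove", 1))   # smallest disc lifted from peg 1
print(scene.apply("release", 2))  # placed on the empty peg 2
```

Because the check depends on the history of earlier terminals, the grammar behaves as context-sensitive even though its rules are written in context-free form.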

9
ToH Example Frames
Explicit noise detection
Objects recognized by behavior, not appearance
10
ToH Example Frames
Detection of distracter objects
Grammar can fill in for occluded observations
11
Finding the Most Likely Parse

Terminals and rules are probabilistic
Each parse has a total probability
Computed by Earley-Stolcke algorithm
Probabilistic penalty for insertion and deletion
errors
Highest probability parse chosen as best
interpretation of video
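
The scoring described above can be illustrated by comparing already-enumerated candidate parses. This is not the Earley-Stolcke algorithm, which computes these probabilities incrementally during parsing; it only shows how rule probabilities, terminal likelihoods, and error penalties combine. The penalty values `p_insert` and `p_delete` and the candidate data are hypothetical.

```python
import math

def parse_log_prob(rule_probs, terminal_likelihoods,
                   insertions=0, deletions=0,
                   p_insert=0.05, p_delete=0.05):
    """Log-probability of one parse: rules x terminals x error penalties."""
    logp = sum(math.log(p) for p in rule_probs)
    logp += sum(math.log(p) for p in terminal_likelihoods)
    logp += insertions * math.log(p_insert)  # penalty per spurious symbol
    logp += deletions * math.log(p_delete)   # penalty per missing symbol
    return logp

def best_parse(candidates):
    """Return the name of the highest-probability candidate parse."""
    return max(candidates, key=lambda name: parse_log_prob(**candidates[name]))

candidates = {
    "clean": {"rule_probs": [0.9, 0.1], "terminal_likelihoods": [0.8, 0.7]},
    "with_insertion": {"rule_probs": [0.1], "terminal_likelihoods": [0.9],
                       "insertions": 1},
}
print(best_parse(candidates))
```

Working in log space avoids numerical underflow when long videos produce parses with hundreds of terminals.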

12
Expectation Grammars Summary
Semantic Reasoning: Stochastic Parser
Feedback
Sensory Input: Video
Pre-conceptual Reasoning: Object IDs
Action: Report Best Interpretation
Memory: Parse Tree
Pre-processing: Blobs, Interaction Events
Given Knowledge: Grammar, Scene Model Rules
Learning: None (Bg)
13
Contributions

Showed activity recognition and decomposition
without appearance models
Demonstrated usefulness of feedback from
high-level, long-term interpretations to
low-level, short-term decisions
Extended SCFG representational power with
parameters and abstract scene models

14
Lessons

Efficient error recovery is important for realistic
domains
All sources of information should be included
(e.g., appearance models)
Concurrency and partial-ordering are common, thus
should be easily representable
Temporal constraints are not the only kind of
action relationship (e.g., causal, statistical)

15
Representational Issues