Toward Learning Mixture-of-Parts Pictorial Structures

1
Toward Learning Mixture-of-Parts Pictorial
Structures
  • Robin Hess and Alan Fern

School of Electrical Engineering and Computer
Science Oregon State University
2
Talk Objectives
  • Overview of the OSU Digital Scout Project
  • Describe the problem of initial formation labelling
  • Representational and inference challenges
  • Mixture-of-Parts Pictorial Structures
  • Model definition
  • Inference
  • Opportunities for learning
  • Parameters and structure
  • Speedup Learning
  • Active Learning
  • Transfer Learning

3
The OSU Digital Scout Project
Objective: compute semantic interpretations of
football video
High-level interpretation of play
Raw video
  • Professional/college teams spend many hours
    attaching semantic tags to video for DB access
  • We want to make this process much more automatic
  • Support computer-assisted strategic analysis of
    opponents

Previous work: S. Intille. Visual Recognition
of Multi-Agent Action. PhD Thesis, MIT, 1999.
4
Raw Video Data
  • Obtained several games' worth of home-field video
    from the OSU football team
  • One video file per play
  • Exact same video used by coaches
  • Video shot from a single fixed location at the top of
    Reser Stadium
  • Camera is constantly panning and zooming

5
Registered Video Data
  • Semantic interpretation requires registration of
    video data to football field coordinates
  • Developed a robust registration approach (Hess &
    Fern, CVPR 2007)

planar homography
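As a minimal sketch of what registration via a planar homography provides, the snippet below maps an image pixel to field coordinates with a 3x3 matrix. The matrix values and the function name `apply_homography` are made up for illustration; the talk does not give the estimated homographies or the registration code.

```python
def apply_homography(H, x, y):
    """Map image pixel (x, y) to field coordinates using a 3x3 homography H."""
    # Homogeneous multiplication: [u, v, w]^T = H [x, y, 1]^T
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide out the homogeneous coordinate to get planar field coordinates
    return u / w, v / w


# Hypothetical homography (pure scale and shift), for illustration only
H_example = [[0.5, 0.0, 5.0],
             [0.0, 0.5, 2.0],
             [0.0, 0.0, 1.0]]
field_xy = apply_homography(H_example, 100, 50)
```

Because the camera pans and zooms, a separate homography would be needed for each frame; this sketch covers only the per-frame mapping.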
6
Problem: Formation Labelling
  • We consider a subproblem of full play
    interpretation
  • Given: initial registered video frame of a play
  • Output: offensive formation
  • Types and locations of the 11 offensive players

Player locations and types
Thousands of possible formations
7
Challenges in Formation Labelling
  • Player appearances nearly identical
  • Appearance not useful for inferring player type
  • Difficult to robustly segment individual players
  • Part-detector-style approaches are difficult to
    apply

8
Challenges in Formation Labelling
Different formations can differ in subtle ways
9
Problem Constraints
  • A number of hard constraints imposed by rule book
  • Exactly 11 players
  • Exactly 7 players on line and 4 players behind
    line
  • Exactly 1 quarterback and 1 center
  • Location of center is at midfield or hash line
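The rule-book constraints above can be read as a legality test on a candidate set of player types. The sketch below is one way to write that test. The grouping of types into line, quarterback, and center categories is passed in as arguments, and labels such as `LT`, `RT`, `RTE`, `HB`, `FB`, and `WR` are hypothetical placeholders rather than the talk's actual 34-type taxonomy. The constraint on the center's field location is not checked here.

```python
def is_legal_part_set(types, line_types, qb_types, center_types):
    """Check the hard counting constraints on a list of 11 player types."""
    if len(types) != 11:
        return False
    on_line = sum(1 for t in types if t in line_types)
    behind_line = len(types) - on_line
    if on_line != 7 or behind_line != 4:
        return False
    if sum(1 for t in types if t in qb_types) != 1:
        return False
    if sum(1 for t in types if t in center_types) != 1:
        return False
    return True


# Hypothetical type groupings, for illustration only
LINE_TYPES = {"C", "LG", "RG", "LT", "RT", "LTE", "RTE"}
QB_TYPES = {"QB", "QBS"}
CENTER_TYPES = {"C"}

formation = ["C", "LG", "RG", "LT", "RT", "LTE", "RTE", "QB", "HB", "FB", "WR"]
legal = is_legal_part_set(formation, LINE_TYPES, QB_TYPES, CENTER_TYPES)
```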

10
Problem Constraints
  • Soft constraints on relative spatial locations of
    players
  • Constraints strongly depend on the set of player
    types

11
Previous Attempt
S. Intille. Visual Recognition of Multi-Agent
Action. PhD Thesis, MIT, 1999.
  • Intille used a KB of hard constraints to cast the
    task as a SAT-like problem
  • Constraints: near, to the left of, a bit of
    vertical space between, etc.
  • Simplified the problem by hand-labelling the field
    locations of the 11 players
  • Only tried to infer player types
  • The approach could not be made to work well and
    was abandoned in that work

12
Structured Output Representations
  • Infer type and location for each of the 11 players
  • $t_i \in \{\mathrm{QBS}, \mathrm{QB}, \mathrm{C}, \mathrm{LG}, \mathrm{RG}, \mathrm{LTE}, \ldots\}$ (34 types)
  • $l_i \in \{(0,0), (0,1), \ldots, (n,m)\}$ (pixel location)
  • Our representation must capture
  • Hard joint constraints among types
  • Soft joint constraints among locations
    conditioned on types and image data

22 output variables
  • Possible to encode constraints via standard
    discrete factor-graph models (e.g. CRFs, weighted
    CSPs, ILP, etc.)
  • Such encodings appear problematic with respect to
    off-the-shelf inference techniques (?)
  • Domains of variables are huge (many values)
  • Large factors (e.g. exactly 7 line-type
    players)
  • Location constraints are inherently numeric

13
Pictorial Structures
  • Offensive formations can be viewed as multi-part
    articulated objects (parts correspond to players)
  • Pictorial structure models have been successful
    for multi-part objects in computer vision
  • Local part appearance models
  • Deformable connections
  • Joint estimation of part locations

Node values are part locations
Simply pairwise graphical models
Figure courtesy of Fischler & Elschlager
15
  • When the edge structure forms a tree, can use DP to
    compute the MAP estimate in $O(nh^2)$ time
  • $n$ = number of parts, $h$ = number of pixels
  • $h^2$ is often impractical
  • If, in addition, $d_{ij}(\cdot, \cdot)$ is a Mahalanobis
    distance, the computation can be done in $O(nh)$ time!
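The quadratic-time dynamic program can be sketched as min-sum message passing on the tree. The transcript does not include the cost function, so the sketch assumes the usual pictorial-structure form: a per-part match cost `unary[i][l]` plus a pairwise deformation cost `pair_cost(i, j, l_i, l_j)` on each tree edge. All names are illustrative. This is the plain $O(nh^2)$ version and does not include the distance-transform speedup for Mahalanobis costs.

```python
def tree_map(parent, unary, pair_cost):
    """Min-cost assignment of locations to parts on a tree.

    parent[i] is the parent index of part i (None for the root),
    unary[i][l] is the match cost of part i at location l, and
    pair_cost(i, c, lp, lc) is the deformation cost of edge (i, c).
    """
    n, h = len(parent), len(unary[0])
    children = [[] for _ in range(n)]
    root = None
    for i, p in enumerate(parent):
        if p is None:
            root = i
        else:
            children[p].append(i)

    best_child_loc = {}  # best_child_loc[c][lp]: best location of c given parent at lp

    def upward(i):
        # Cost of the best placement of the subtree rooted at i, per location of i
        cost = list(unary[i])
        for c in children[i]:
            sub = upward(c)
            args = []
            for lp in range(h):
                vals = [sub[lc] + pair_cost(i, c, lp, lc) for lc in range(h)]
                k = min(range(h), key=vals.__getitem__)
                cost[lp] += vals[k]
                args.append(k)
            best_child_loc[c] = args
        return cost

    root_cost = upward(root)
    loc = [None] * n
    loc[root] = min(range(h), key=root_cost.__getitem__)
    stack = [root]
    while stack:  # backtrack from the root to recover each part's location
        i = stack.pop()
        for c in children[i]:
            loc[c] = best_child_loc[c][loc[i]]
            stack.append(c)
    return root_cost[loc[root]], loc
```

The inner loop over `lc` for every `lp` is where the $h^2$ factor comes from. That loop is what a distance transform would replace when the pairwise cost is a Mahalanobis distance.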

16
Pictorial Structures for Football
  • For a fixed set of player types, locations can be
    well approximated by pictorial structure
  • But part sets (i.e. player types) vary across
    plays
  • Can't use standard pictorial structures for our
    problem
  • Can we still leverage benefits of pictorial
    structures?

17
Mixture-of-Parts Pictorial Structures (MoPPS)
  • Captures constraints on legal part sets via pv
  • Captures spatial constraints among parts via f

18
MoPPS Inference
  • Find MAP estimate of most likely set of parts and
    their locations
  • Worst case: evaluate the pictorial structure of each
    legal part set
  • Requires over an hour of processing for our
    problem
  • Need a structured MoPPS representation that can
    be exploited for fast inference
  • We use a MoPPS Tree

19
MoPPS Tree Representation
  • Pictorial structure for a legal part set is the
    projection of the global tree onto the part set
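The transcript does not define the projection operation. One plausible reading, sketched below as an assumption, is that each part in the legal set attaches to its nearest ancestor in the global tree that also belongs to the set. The function name and the small example tree are hypothetical.

```python
def project_tree(parent, part_set):
    """Restrict a global tree to part_set.

    parent maps each part to its parent in the global tree (None for the root).
    Each kept part is attached to its nearest kept ancestor, or to None if no
    ancestor is kept.
    """
    projected = {}
    for part in part_set:
        ancestor = parent.get(part)
        # Skip over ancestors that are absent from this part set
        while ancestor is not None and ancestor not in part_set:
            ancestor = parent.get(ancestor)
        projected[part] = ancestor
    return projected


# Hypothetical global tree over a few player types
global_parent = {"C": None, "QB": "C", "FB": "QB", "HB": "FB", "LG": "C"}
projected = project_tree(global_parent, {"C", "QB", "HB"})
```

Under this reading the result is a single tree only when the global root belongs to every legal part set. That would hold if the root is a part that every formation contains, such as the center.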

20
MoPPS Tree for Football
  • 34 parts in model (one for each possible player
    type)
  • Includes local observation models
  • Includes pairwise spatial constraints
  • Also provides constraints for evaluating legal
    part sets

21
MoPPS Tree Inference
  • Becomes combinatorial optimization over legal
    part sets
  • We use Branch-and-Bound Search (BBS)

22
Branch-and-Bound Search
  • Search nodes are part sets
  • Internal nodes represent sets of legal part sets
  • Leaves are legal part sets
  • While a solution has not been found
  • Expand the least node according to the ordering relation
  • Compute upper and lower bounds
  • Prune any dominated node
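The search loop above can be sketched as a generic best-first branch-and-bound. The toy problem used to exercise it (choosing `k` items of minimum total cost) stands in for the far larger search over legal part sets. Its lower bound is the cost of the parts chosen so far, which is valid because costs are non-negative, and its upper bound comes from a greedy completion. All function names are illustrative, not taken from the talk.

```python
import heapq


def branch_and_bound(root, expand, is_leaf, lower_bound, upper_bound):
    """Best-first branch-and-bound for cost minimization.

    lower_bound(node) must never exceed the cost of any leaf below node.
    upper_bound(node) returns (cost, leaf) for some legal leaf below node;
    for a leaf it must return that leaf's exact cost.
    """
    best_cost, best_leaf = upper_bound(root)
    tie = 0  # tie-breaker so the heap never compares nodes directly
    frontier = [(lower_bound(root), tie, root)]
    while frontier:
        lb, _, node = heapq.heappop(frontier)
        if lb >= best_cost:
            break  # every remaining node is dominated by the incumbent
        if is_leaf(node):
            continue
        for child in expand(node):
            cost, leaf = upper_bound(child)
            if cost < best_cost:
                best_cost, best_leaf = cost, leaf
            child_lb = lower_bound(child)
            if child_lb < best_cost and not is_leaf(child):
                tie += 1
                heapq.heappush(frontier, (child_lb, tie, child))
    return best_cost, best_leaf


def select_k_cheapest(costs, k):
    """Toy instance: pick exactly k items with minimum total (non-negative) cost."""
    n = len(costs)

    def expand(node):
        i, chosen = node
        kids = [(i + 1, chosen + (i,))]           # include item i
        if n - (i + 1) >= k - len(chosen):
            kids.append((i + 1, chosen))          # skip item i if k is still reachable
        return kids

    def is_leaf(node):
        return len(node[1]) == k

    def lower_bound(node):
        return sum(costs[j] for j in node[1])     # monotone: adding items never lowers cost

    def upper_bound(node):
        i, chosen = node
        need = k - len(chosen)
        completed = chosen + tuple(range(i, i + need))  # greedy legal completion
        return sum(costs[j] for j in completed), completed

    return branch_and_bound((0, ()), expand, is_leaf, lower_bound, upper_bound)
```

A call such as `select_k_cheapest([4, 1, 3, 2, 5], 2)` returns the best total cost together with the chosen indices. In the football setting, the lower bound would instead be a pictorial-structure match on the projected tree, and the upper bound would be the cost of a heuristically completed part set.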

23
Lower Bound Computations
  • Monotonicity: adding to a set of parts will never
    result in reduced cost
  • Simply compute pictorial structure match of tree
    projected on parts in search node
  • Can improve on this by adding cost for missing
    parts

24
Upper Bound Computations
  • Match entire MoPPS tree to image data
  • Use as a heuristic for quickly finding legal
    completion of current part set
  • Cost of completion is upper bound

25
MoPPS Tree Parameters for Football
  • 34 parts, 3200 legal formations
  • 16 basic player types plus subtypes
  • Connections modeled as a Gaussian over the ideal
    location relative to the parent player
  • Parameters manually set using training images
  • Observation model uses two independent components
  • One based on a background model
  • One based on color histogramming
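A Gaussian over a child's location relative to its parent gives a squared Mahalanobis distance as its negative log-likelihood, up to an additive constant. The sketch below assumes an axis-aligned (diagonal) covariance and uses made-up offsets and standard deviations. The talk's hand-set parameters are not given in the transcript.

```python
def connection_cost(child_xy, parent_xy, ideal_offset, sigma):
    """Negative log-likelihood (up to a constant) of a child location.

    Assumes a Gaussian with diagonal covariance centered at
    parent_xy + ideal_offset, with per-axis standard deviations sigma.
    """
    dx = (child_xy[0] - parent_xy[0] - ideal_offset[0]) / sigma[0]
    dy = (child_xy[1] - parent_xy[1] - ideal_offset[1]) / sigma[1]
    return 0.5 * (dx * dx + dy * dy)


# Hypothetical values: child ideally sits 3 units to the right of its parent
at_ideal = connection_cost((13.0, 5.0), (10.0, 5.0), (3.0, 0.0), (2.0, 1.0))
one_sigma_off = connection_cost((15.0, 5.0), (10.0, 5.0), (3.0, 0.0), (2.0, 1.0))
```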

26
Background Model
  • Register lots of video to field model
  • Learn kernel density estimate of color at each
    pixel
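A per-pixel kernel density estimate of background color can be sketched as follows. The Gaussian kernel, the fixed bandwidth, and the sample colors are assumptions made for illustration; the talk does not specify the kernel or the color space.

```python
import math


def kde_color_density(samples, color, bandwidth):
    """Gaussian-kernel density estimate of `color` at a single pixel.

    samples: background colors observed at this pixel across registered frames.
    """
    d = len(color)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (d / 2.0)
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(s, color))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(samples) * norm)


# Hypothetical grass-colored background samples (RGB) at one field pixel
grass_samples = [(30, 120, 40), (34, 118, 44), (28, 125, 38)]
p_grass = kde_color_density(grass_samples, (31, 121, 41), bandwidth=10.0)
p_white = kde_color_density(grass_samples, (250, 250, 250), bandwidth=10.0)
```

A pixel whose observed color has low density under this model is unlikely to be background, which is the kind of evidence a player observation model could use.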

29
Results
30
Anytime Behavior: % Correct
  • Exhaustive search requires close to an hour
  • Greedy search is fast but achieves only 80%
    accuracy
  • Mean-squared location error less than a yard

31
Directions: Learning MoPPS Models
  • Successfully hand-coded a MoPPS model
  • Was quite time consuming to get parameters right
  • Motivates supervised structure and parameter
    learning
  • MoPPS model takes an average of 4 minutes per play
  • Still too slow for weekly volume of game video
  • Motivates speedup learning
  • MoPPS model will sometimes need to be
    relearned/adapted to different sets of video
  • Want to reduce labelling effort
  • Motivates active and transfer learning

32
Structure and Parameter Learning
  • Goal: learn structure and parameters of MoPPS
    tree from labelled data
  • Assume hard constraints on legal part sets
    provided
  • There are algorithms for learning the structure
    of pictorial structures
  • Can easily modify to learn MoPPS tree
  • Easy to combine with generative parameter learning

33
Structure and Parameter Learning
  • Issue: pure generative parameter learning will
    likely not be sufficient
  • The hand-coded model incorporates reward terms to
    make up for deficiencies in the generative
    observation model
  • Suggests augmenting the generative model with
    discriminatively trained components
  • Issue: inference time of 4 minutes makes most
    generative training methods quite expensive
  • Suggests using approaches that do not perform
    full joint inference for each parameter update

34
Speedup Learning
  • How can we speed up branch-and-bound search?
  • There are a number of interesting settings
  • Setting 1
  • Given: a MoPPS model and upper/lower bound functions
  • Learn effective search-space operators
  • Setting 2
  • Given: a MoPPS model and search space
  • Learn more accurate upper/lower bound functions
  • Setting 3
  • Given: a MoPPS model, search space, and possibly
    bounds
  • Learn an effective priority-queue ranking
    function

35
Active Model Calibration
  • Want to minimize labelling effort for new video
    set
  • Active learning and/or semi-supervised learning
  • Want to leverage experience with previous videos
  • Transfer learning
  • How can we combine these two paradigms for
    label-efficient active model calibration?
  • User interface is also critical
  • Very rough idea
  • Assume fixed model structure
  • Learn prior on parameters from previous data sets
  • Use prior for regularization and example
    selection

36
Summary and Future Work
  • New structured output challenge problem
  • We will provide labelled data set
  • Can off-the-shelf structured learning approaches
    work?
  • Suggests investigating lesser studied directions
  • Speedup learning
  • Active calibration
  • On the horizon
  • Applying to defensive formations
  • Full temporal play interpretation
  • Mining strategic knowledge
  • Strategic planning

37
The Digital Scout Project
http://eecs.oregonstate.edu/football