Title: Temporal Segmentation of Video Objects for Hierarchical Object-Based Motion Description
1Temporal Segmentation of Video Objects for
Hierarchical Object-Based Motion Description
Zhuofu Xiao zxiaoa_at_sfu.ca June 26, 2002
2Topics
- Introduction
- Hierarchical Object-Based Motion Description
- Temporal Segmentation and Description of Object Motion
- Identification and Description of Object Interaction
- Applications and Experimental Results
- Conclusions
- References
3Introduction
- This paper describes a hierarchical approach to object-based motion description of video in terms of object motions and object-to-object interactions.
- Describe object motion by elementary motion units (EMU), action units (AU), elementary reaction units (ERU), and interaction units (IU)
- Use dominant affine motion parameters to segment the lifespan of a video object into EMUs
4Introduction
- In most of the existing systems the approach is
- Segment a video into shots
- Select key frames
- Characterize object properties by
- Spatial features (color, texture, shape, etc.)
- Temporal features (object motion, variation of object shape)
- Other time-variant features
- Object interactions or temporal segments of object motion
5Introduction
- Elementary Motion Units (EMU)
- A set of consecutive frames within which the dominant motion of the object can be represented by a single parametric model.
- Elementary Reaction Units (ERU)
- A set of consecutive frames within which two video objects have a predefined type of interaction.
- Action Units (AU) and Interaction Units (IU)
- An AU is a time-ordered sequence of EMUs.
- An IU is a time-ordered sequence of ERUs.
6Hierarchical Object-Based Motion Description
- Low-level motion is too complex to describe at the segment level
- So divide the segment into smaller temporal units: EMUs and ERUs
- Low-level motion exhibits a strong correlation between pairs of frames
- So we assign a parametric (affine) motion model to each EMU
- And an interaction type to each ERU
- Interaction types are based on object boundaries, object position, and object motions
7Hierarchical Object-Based Motion Description
- Humans interpret and describe motions at the semantic level
- Action Unit
- A time-ordered sequence of EMUs carrying a semantic meaning
- Interaction Unit
- A time-ordered sequence of ERUs corresponding to a semantic-level interaction
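The containment relation between the low-level and semantic-level units can be sketched as a small data structure. This is only an illustration: the class names `Unit` and `SemanticUnit`, the `label` and `meaning` fields, and the rule that units may not overlap are assumptions, not part of the presented method.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Unit:
    """A low-level unit (EMU or ERU) covering frames begin..end inclusive."""
    begin: int
    end: int
    label: str  # hypothetical: a motion-model name or an interaction type


@dataclass
class SemanticUnit:
    """An AU (built from EMUs) or an IU (built from ERUs)."""
    meaning: str  # hypothetical free-text semantic label
    units: List[Unit] = field(default_factory=list)

    def append(self, unit: Unit) -> None:
        # Enforce time ordering; forbidding overlap is an assumption.
        if self.units and unit.begin <= self.units[-1].end:
            raise ValueError("units must be time-ordered and non-overlapping")
        self.units.append(unit)

    def span(self) -> Tuple[int, int]:
        """Return the first and last frame covered by the whole sequence."""
        return self.units[0].begin, self.units[-1].end


au = SemanticUnit("walk to the door")
au.append(Unit(0, 40, "translation"))
au.append(Unit(41, 55, "rotation"))
print(au.span())
```

The same container serves for both AUs and IUs because the two definitions differ only in the kind of low-level unit they order.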
8Hierarchical Object-Based Motion Description
- Object-based segment
- A selected occurrence of a set of objects between a begin frame and an end frame
- Foreground objects and background object(s)
- Low-level and high-level description of object motion and interaction
11Hierarchical Object-Based Motion Description
- Overview of the method to compute a description of a video
- Detect or select occurrences of the objects of interest in the video
- Partition the life-span segment of the object into EMUs and ERUs, and compute the appropriate descriptors
- The EMUs and ERUs are grouped into AUs and IUs
13Temporal Segmentation And Description Of Object
motion
- Low-Level Object Motion Description: Elementary Motion Units
- Parametric Model Fitting Between Successive Pairs of Frames
- Computation of Dissimilarity measure
- Computation of the Representative Motion Model
- Computation and Indexing of Background motion
14Temporal Segmentation And Description Of Object
motion
- Parametric Model Fitting Between Successive Pairs of Frames
- Compute an affine motion model describing object motion between each adjacent pair of frames
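As an illustration of fitting a six-parameter affine model between two frames, the sketch below solves the least-squares problem from point correspondences. It assumes the correspondences are already available (for example, from a motion field inside the object mask); the slides do not say how the model is actually estimated, and the names `fit_affine` and `solve3` are hypothetical.

```python
def solve3(m, b):
    """Solve the 3x3 system m x = b by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / a[r][r]
    return x


def fit_affine(src, dst):
    """Least-squares affine parameters [a1..a6] mapping src points to dst.

    Model: x' = a1*x + a2*y + a3,  y' = a4*x + a5*y + a6.
    """
    if len(src) != len(dst) or len(src) < 3:
        raise ValueError("need at least three point pairs")
    m = [[0.0] * 3 for _ in range(3)]  # normal-equation matrix
    bx = [0.0] * 3                     # right-hand side for x'
    by = [0.0] * 3                     # right-hand side for y'
    for (x, y), (xp, yp) in zip(src, dst):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                m[i][j] += row[i] * row[j]
            bx[i] += row[i] * xp
            by[i] += row[i] * yp
    return solve3(m, bx) + solve3(m, by)


# Four points translated by (2, 3) between two frames.
src = [(0, 0), (1, 0), (0, 1), (1, 1)]
dst = [(2, 3), (3, 3), (2, 4), (3, 4)]
params = fit_affine(src, dst)
```

Both coordinate equations share the same normal-equation matrix, so it is accumulated once and solved twice.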
15Temporal Segmentation And Description Of Object
motion
- Parametric Model Fitting Between Successive Pairs of Frames
- Compute a confidence measure
16Temporal Segmentation And Description Of Object
motion
- Computation of Dissimilarity measure
17Temporal Segmentation And Description Of Object
motion
- Computation of Dissimilarity measure
- Dissimilarity threshold Q
- DSIM > Q: divide the EMU in the middle
- DSIM < Q: merge the two EMUs
- Confidence measure CONb
- For manual checking
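The split and merge rules can be sketched as follows. The slides do not define DSIM, so the `dsim` function here is a stand-in (largest L1 distance of a frame-pair model from the segment mean), and the function names are hypothetical. Entry `i` of `params` is assumed to hold the affine parameters fitted between frames `i` and `i + 1`.

```python
def dsim(params):
    """Stand-in dissimilarity: largest L1 distance of a model from the mean."""
    n = len(params)
    mean = [sum(p[k] for p in params) / n for k in range(len(params[0]))]
    return max(sum(abs(a - b) for a, b in zip(p, mean)) for p in params)


def split(params, begin, end, q):
    """Recursively halve [begin, end) while its dissimilarity exceeds q."""
    if end - begin <= 1 or dsim(params[begin:end]) <= q:
        return [(begin, end)]
    mid = (begin + end) // 2
    return split(params, begin, mid, q) + split(params, mid, end, q)


def merge(params, segments, q):
    """Merge neighbouring segments whose union has dissimilarity below q."""
    merged = [segments[0]]
    for begin, end in segments[1:]:
        prev_begin, _ = merged[-1]
        if dsim(params[prev_begin:end]) < q:
            merged[-1] = (prev_begin, end)
        else:
            merged.append((begin, end))
    return merged


def segment_emus(params, q):
    """Return EMUs as half-open (begin, end) index ranges over params."""
    return merge(params, split(params, 0, len(params), q), q)


# Five frame pairs with one motion, then three with a clearly different one.
params = [[0.0]] * 5 + [[10.0]] * 3
emus = segment_emus(params, q=1.0)
```

Splitting in the middle can cut a homogeneous run in two, which is why the merge pass follows: it restores boundaries that the halving placed too early.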
18Temporal Segmentation And Description Of Object
motion
- Computation of the Representative Motion Model
- One of the affine models within the EMU
- Robust to outliers within the EMU
- Associated with a confidence measure CONR
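One selection rule that satisfies both properties (the result is one of the existing models, and it is robust to outliers) is the vector median. Whether the presented method uses exactly this rule is an assumption; the sketch below only illustrates the idea, and `vector_median` is a hypothetical name.

```python
def vector_median(models):
    """Return the model with the smallest summed L2 distance to all others."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    return min(models, key=lambda m: sum(dist(m, other) for other in models))


# Three similar frame-pair models and one outlier (a1..a6 per model).
models = [
    (1.0, 0.0, 0.0, 0.0, 1.0, 0.0),
    (1.0, 0.0, 0.1, 0.0, 1.0, 0.0),
    (1.0, 0.0, 0.2, 0.0, 1.0, 0.0),
    (5.0, 5.0, 5.0, 5.0, 5.0, 5.0),
]
representative = vector_median(models)
```

Unlike a component-wise mean, the result is always one of the input models, so a single badly estimated frame pair cannot pull the representative away from the cluster.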
19Temporal Segmentation And Description Of Object
motion
- EMU E is described by
- The begin, middle, and end frame numbers
- The representative dominant affine motion parameters
- The trajectory of the object centroid
- A representative visual thumbnail
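A minimal sketch of assembling such a descriptor is given below. The field names, the choice of the middle frame as the thumbnail source, and the representation of an object mask as a list of pixel coordinates are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class EMUDescriptor:
    begin: int
    middle: int
    end: int
    affine: Tuple[float, ...]  # representative affine parameters
    trajectory: List[Point]    # one object centroid per frame
    thumbnail_frame: int       # hypothetical: frame used for the thumbnail


def centroid(pixels: Sequence[Point]) -> Point:
    """Mean position of the object's pixels in one frame."""
    n = len(pixels)
    return (sum(x for x, _ in pixels) / n, sum(y for _, y in pixels) / n)


def describe_emu(begin, end, masks, affine):
    """Build a descriptor; masks[i] holds the pixels of frame begin + i."""
    if len(masks) != end - begin + 1:
        raise ValueError("need one mask per frame in begin..end")
    middle = (begin + end) // 2
    return EMUDescriptor(begin, middle, end, tuple(affine),
                         [centroid(m) for m in masks], thumbnail_frame=middle)


masks = [[(0, 0), (2, 0)], [(1, 0), (3, 0)], [(2, 0), (4, 0)]]
emu = describe_emu(10, 12, masks, (1, 0, 1, 0, 1, 0))
```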
20Temporal Segmentation And Description Of Object
motion
- Computation and Indexing of Background motion
- To recover the absolute motion of a foreground object, we must first perform camera motion compensation
- A parametric motion model is selected to represent background motion
- We used a variation of the automatic dominant camera motion annotation method to achieve this
21Temporal Segmentation And Description Of Object
motion
- Nonsingular components dominate
- Camera rotation: variance of the magnitude of the motion vectors is greater than a threshold
- Camera translation: variance is smaller than the threshold
- Singular components dominate
- 2D affine model of the background motion
- Matrix A is projected along
- Z-rotation: projection along I2 > I1
- Z-translation: projection along I2 < I1
22Identification and Description Of Object
interaction
- Low-level interaction types
- Object Boundaries
- Coexistence, Physical Contact, Occlusion
- Object Position
- Directional Relations (north, north-east, above)
- Topological Relations (equal, inside)
- Object Motions
- Approach, Diverge, Stationary
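For the motion-based types, one plausible per-frame test compares the distance between the two object centroids in consecutive frames. This rule, the tolerance `eps`, and the reading of "stationary" as "separation unchanged" are assumptions; the slides only name the three types.

```python
import math


def motion_interaction(a_prev, b_prev, a_curr, b_curr, eps=0.5):
    """Classify the relative motion of two objects between two frames.

    Each argument is an (x, y) centroid; eps is a hypothetical tolerance
    in pixels below which the separation counts as unchanged.
    """
    d_prev = math.dist(a_prev, b_prev)
    d_curr = math.dist(a_curr, b_curr)
    if d_curr < d_prev - eps:
        return "approach"
    if d_curr > d_prev + eps:
        return "diverge"
    return "stationary"


label = motion_interaction((0, 0), (10, 0), (1, 0), (9, 0))
```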
23Identification and Description Of Object
interaction
- ERU Identification
- Identify the type of low-level interaction between every pair of objects at each frame
- All consecutive frames with the same type are merged into a final ERU segment
- ERU Description
- An ERU is described by two object identifiers, start and end frame numbers, the interaction type, and an interaction-specific descriptor
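The merging step amounts to run-length grouping of per-frame labels. The sketch below assumes the per-frame interaction type for one object pair has already been determined; the `ERU` record and `identify_erus` are hypothetical names, and the interaction-specific descriptor is omitted because the slides do not specify it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ERU:
    object_a: str
    object_b: str
    start: int        # first frame of the unit
    end: int          # last frame of the unit (inclusive)
    interaction: str  # low-level interaction type


def identify_erus(object_a, object_b, labels, first_frame=0) -> List[ERU]:
    """Merge consecutive frames that share an interaction type into ERUs."""
    erus = []
    start = first_frame
    for i, label in enumerate(labels):
        is_last = i == len(labels) - 1
        if is_last or labels[i + 1] != label:
            # The run of identical labels ends at this frame.
            erus.append(ERU(object_a, object_b, start, first_frame + i, label))
            start = first_frame + i + 1
    return erus


labels = ["approach", "approach", "contact", "contact", "contact", "diverge"]
erus = identify_erus("child", "ball", labels, first_frame=100)
```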
24Applications And Experimental Results
- Segment a video sequence into EMUs and ERUs
- EMU indexing and retrieval
- An object motion/interaction description graph
25Applications And Experimental Results
- Seven indoor sequences
- Children sequence
- Playboy sequence
- results
26Conclusions
- We described an object-based video description hierarchy
- Provided automatic algorithms that extract the low-level elements
- Demonstrated examples of automatic segmentation and EMU retrieval
27References
- L. S. Shapiro, "Affine Analysis of Image Sequences", Cambridge Univ. Press, Cambridge, U.K., 1995.
- J. Y. A. Wang, E. H. Adelson, "Representing images with layers", IEEE Trans. Image Processing, vol. 3, pp. 625-628, Sept. 1994.
- Y. Altunbasak, P. E. Eren, A. M. Tekalp, "Region-based parametric motion segmentation using color information", Graph. Models Image Processing, vol. 60, no. 1, pp. 13-23, 1998.
- C. S. Regazzoni, A. Teschioni, "A new approach to vector median filtering based on space filling curves", IEEE Trans. Image Processing, vol. 6, pp. 1025-1037, July 1997.
- J. Astola, P. Haavisto, Y. Neuvo, "Vector median filters", Proc. IEEE, vol. 78, no. 4, pp. 678-689, 1990.
- G. Sudhir, J. C. M. Lee, "Video annotation by motion interpretation using optical flow streams", J. Vis. Commun. Image Represent., vol. 7, no. 4, pp. 354-368, 1996.
- Y. Fu, A. T. Erdem, A. M. Tekalp, "Tracking visible boundary of objects using occlusion adaptive motion snake", IEEE Trans. Image Processing, vol. 9, pp. 2051-2060, Dec. 2000.