Title: MPEG-7 Visual part of eXperimentation Model
1MPEG-7 Visual part of eXperimentation Model
- Presented by Moustafa A. Hammad.
2Introduction
- MPEG-7: Multimedia Content Description Interface.
- A quick review
- Feature, Descriptor, Descriptor Value, Description Scheme, Description Definition Language (DDL)
- Visual Elements (Color, Spatial Structure, Shape, Motion)
- Representation of descriptors using DDL (the MPEG-7 DDL was approved in October 1999)
3Topics of Discussion
- Visual Elements Descriptors
- Color
- Color Space
- Dominant Color
- Color Histogram
- Spatial Structure
- Grid Layout
- Shape
- Object Bounding Box
- Motion
- Camera Motion
- Object Motion Trajectory
4Color
- Color Space
- Examples
- RGB, YCrCb, HSV, or a linear transformation matrix with reference to RGB
- Syntax
- Color_Space_Descriptor {
- enum description_color_space { rgb, ycrcb, hsv, linear_matrix }
- if (description_color_space == linear_matrix)
- int color_trans_mat[3][3]
- }
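The linear_matrix case can be illustrated with a small sketch that applies a 3x3 matrix to an RGB triple. The function name `apply_color_trans_mat` is hypothetical, and the matrix values are the widely used BT.601 luma/chroma coefficients, chosen only as an example. The descriptor stores integers, and the scaling it uses is not given on the slide, so floats are used here.

```python
def apply_color_trans_mat(rgb, mat):
    """Multiply a 3x3 matrix by an (R, G, B) column vector."""
    return tuple(sum(mat[r][c] * rgb[c] for c in range(3)) for r in range(3))


# Illustrative matrix only: BT.601 RGB -> (Y, Cb, Cr), without offsets.
BT601 = [
    [0.299, 0.587, 0.114],
    [-0.168736, -0.331264, 0.5],
    [0.5, -0.418688, -0.081312],
]

# Pure white should map to full luma and (almost exactly) zero chroma.
y, cb, cr = apply_color_trans_mat((255, 255, 255), BT601)
```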
5Color (Cont.)
- Dominant Color (DC)
- Specifies a set of dominant colors in a shaped region.
- Syntax
- Dominant_Color_D {
- int Dominant_Colors_Number
- struct Dominant_Color DCs[Dominant_Colors_Number]
- int Confidence_Measure
- }
- struct Dominant_Color {
- int Color_Value[Color_Space_Dimension]
- int Percentage
- }
- Extraction: use the color histogram to get the dominant colors (non-normative part)
- Application: content-based retrieval.
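Because the extraction is non-normative, any method may be used. The sketch below is one simple possibility, assuming the dominant colors are just the most populated histogram bins. The function name `dominant_colors` and the `bin_colors` lookup table are hypothetical, and Confidence_Measure is left out because the slide does not say how it is computed.

```python
def dominant_colors(histogram, bin_colors, count):
    """Pick the `count` most populated bins and report integer percentages."""
    total = sum(histogram)
    # Bin indices ordered from most to least populated.
    ranked = sorted(range(len(histogram)), key=lambda i: histogram[i], reverse=True)
    return [
        {"Color_Value": bin_colors[i], "Percentage": round(100 * histogram[i] / total)}
        for i in ranked[:count]
    ]


hist = [10, 60, 30]                               # pixel counts per bin
colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]  # representative color of each bin
dcs = dominant_colors(hist, colors, 2)
```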
6Color (Cont.)
- Color Histogram
- Denotes the percentage of each color in an image object.
- Syntax
- Color_Histogram_Descriptor {
- int histogram_norm_factor
- int number_histogram_bins
- int histogram_value[number_histogram_bins]
- }
[Figure: histogram plot, Histogram_value against Histogram_bin]
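A minimal sketch of filling this descriptor from a list of 8-bit RGB pixels follows. It assumes uniform quantization per channel and assumes that histogram_norm_factor is the value all bins sum to (up to rounding). Neither choice is stated on the slide, and the function name `color_histogram` is hypothetical.

```python
def color_histogram(pixels, bins_per_channel=2, norm_factor=100):
    """Quantize 8-bit RGB pixels uniformly and scale counts to norm_factor."""
    b = bins_per_channel
    counts = [0] * (b ** 3)
    for r, g, bl in pixels:
        # Map each channel value 0..255 to a bin index 0..b-1.
        idx = (r * b // 256) * b * b + (g * b // 256) * b + (bl * b // 256)
        counts[idx] += 1
    values = [c * norm_factor // len(pixels) for c in counts]
    return {
        "histogram_norm_factor": norm_factor,
        "number_histogram_bins": b ** 3,
        "histogram_value": values,
    }


# Three red pixels and one blue pixel.
h = color_histogram([(255, 0, 0)] * 3 + [(0, 0, 255)])
```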
7Spatial Structure
- Grid Layout
- Grid_Layout
- int PartNumberH
- int PartNumberV
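The slide does not define the two fields. The sketch below assumes PartNumberH and PartNumberV are the number of partitions along the horizontal and vertical directions, so the image is split into PartNumberH x PartNumberV cells that can each carry their own descriptor. The function name `grid_cells` is hypothetical.

```python
def grid_cells(width, height, part_number_h, part_number_v):
    """Return (left, top, right, bottom) pixel rectangles, row by row."""
    cells = []
    for v in range(part_number_v):
        top = v * height // part_number_v
        bottom = (v + 1) * height // part_number_v
        for h in range(part_number_h):
            left = h * width // part_number_h
            right = (h + 1) * width // part_number_h
            cells.append((left, top, right, bottom))
    return cells


cells = grid_cells(640, 480, 4, 2)
```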
8Shape
- Object Bounding Box
- Bounding_Box {
- enum LengthUnits // picture height, meters
- double BoxHeight
- double BoxWidth
- double BoxDepth
- double FractionalOccupancy
- boolean Is3D
- if (HasCompositionInfo) {
- double BoxCentreH
- double BoxCentreV
- double ?
- }
- if (Is3D) {
- double BoxCentreD
- double ?
- }
- }
DAR = h/w
9Shape (Cont.)
- Extraction for 2D objects (non-normative part)
- Matching/query process (non-normative part)
- Find the 2D objects whose aspect ratio is (=, >, <) a certain value or within a certain range.
- Find the 2D or 3D objects whose sizes are similar to the one in this object, or (=, >, <) a certain value, or within a certain range.
- Find the objects that are positioned near an (x, y) or (x, y, z) location in the picture/3D world.
- Find the objects whose height/width/depth is (=, >, <) a certain value or within a certain range.
[Figure: 2D extraction block diagram. Blocks and labels: Image; Segmentation; Segmentation map; ObjectID of interest; Identifying samples belonging to object of interest; Bitmap of the object of interest; Bounding Box Estimation; Aspect Ratio]
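The 2D extraction can be sketched as follows, starting from the bitmap of the object of interest. This is only an illustration under stated assumptions: the box is axis-aligned, lengths are expressed in picture-height units, FractionalOccupancy is taken to be the share of the box covered by the object, and the aspect ratio is height over width. The function name `bounding_box` is hypothetical.

```python
def bounding_box(mask):
    """mask: list of rows of 0/1 values marking the object of interest."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        raise ValueError("object not present in mask")
    h_px = rows[-1] - rows[0] + 1
    w_px = cols[-1] - cols[0] + 1
    area = sum(sum(row) for row in mask)
    picture_height = len(mask)
    return {
        "BoxHeight": h_px / picture_height,
        "BoxWidth": w_px / picture_height,
        "FractionalOccupancy": area / (h_px * w_px),  # assumed definition
        "AspectRatio": h_px / w_px,                   # assumed h/w
    }


mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
box = bounding_box(mask)
```

A query such as "aspect ratio within a certain range" then reduces to a comparison on `box["AspectRatio"]`.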
10Motion
- Camera Motion
- Camera operations
- 8 well-known operations: the operations shown in the figures, plus zooming (change of the focal length) and Fixed
- Extract
- Camera motion parameter information
- Sub-shots
- Frames characterized by the type(s) of camera motion
- Mixture or non-mixture
- Sub-shots are the building blocks of this descriptor
11Motion (Cont.)
- Fractional_Presence
- float TRACK_LEFT
- float TRACK_RIGHT
- float BOOM_DOWN
- float BOOM_UP
- float DOLLY_FORWARD
- float DOLLY_BACKWARD
- float PAN_LEFT
- float PAN_RIGHT
- float TILT_UP
- float TILT_DOWN
- float ROLL_CLOCKWISE
- float ROLL_ANTICLOCKWISE
- float ZOOM_IN
- float ZOOM_OUT
- float FIXED
- Syntax
- Camera_Motion_Descriptor {
- int Num_Segment_Description
- int Description_Mode
- Segmented_Camera_Motion Info[Num_Segment_Description]
- }
- Segmented_Camera_Motion {
- TimeStamp start_time
- float duration
- Fractional_Presence presence
- Amount_of_Motion speeds
- float FOE_FOC_Horizontal_Position
- float FOE_FOC_Vertical_Position
- }
- Amount_of_Motion {
- float TRACK_LEFT
- float TRACK_RIGHT
- float BOOM_DOWN
- float BOOM_UP
- float DOLLY_FORWARD
- float DOLLY_BACKWARD
- float PAN_LEFT
- float PAN_RIGHT
- float TILT_UP
- float TILT_DOWN
- float ROLL_CLOCKWISE
- float ROLL_ANTICLOCKWISE
- float ZOOM_IN
- float ZOOM_OUT
- }
Example: a shot represented in mixture mode, with a duration of 40 time units
Shot
Num_Segment_Description = 1
Description_Mode = 1 // mixture mode
Segmented_Camera_Motion Info[1]
Start_time = 0
Duration = 40
Presence = 0.25-0.25 0.32 - - - 0.25 - - - - 0.2 0.2
<rest of attributes>
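To make the Fractional_Presence fields concrete, the sketch below assumes each value is the fraction of the segment's frames in which that motion type occurs. In mixture mode several types can be active in the same frame, so the values need not sum to 1. The per-frame labels are invented input, and the function name `fractional_presence` is hypothetical.

```python
MOTION_TYPES = [
    "TRACK_LEFT", "TRACK_RIGHT", "BOOM_DOWN", "BOOM_UP",
    "DOLLY_FORWARD", "DOLLY_BACKWARD", "PAN_LEFT", "PAN_RIGHT",
    "TILT_UP", "TILT_DOWN", "ROLL_CLOCKWISE", "ROLL_ANTICLOCKWISE",
    "ZOOM_IN", "ZOOM_OUT", "FIXED",
]


def fractional_presence(frame_labels):
    """frame_labels: one set of motion-type names per frame of the segment."""
    n = len(frame_labels)
    return {m: sum(m in labels for labels in frame_labels) / n for m in MOTION_TYPES}


# A 40-frame segment: pan left, then pan left while zooming in, then fixed.
frames = [{"PAN_LEFT"}] * 10 + [{"PAN_LEFT", "ZOOM_IN"}] * 10 + [{"FIXED"}] * 20
presence = fractional_presence(frames)
```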
12Motion (Cont.)
- Object Motion Trajectory
- Spatio-temporal localization of one representative point of the object (such as a centroid).
Object_Motion_Trajectory {
int Camera_follows_object
enum Coord_system { local, world }
if (Coord_system == world) {
boolean local_to_world_parameters_known
if (local_to_world_parameters_known)
world_coord_info world_params
}
enum spatial_units { picture_height, meters }
boolean Coords_are_3D
int N_key_points
double Key_point_t[N_key_points]
double Key_point_x[N_key_points]
double Key_point_y[N_key_points]
if (Coords_are_3D)
double Key_point_z[N_key_points]
boolean Object_Always_Visible
if (!Object_Always_Visible)
Object_is_visible[N_key_points - 1]
boolean Use_default_interp_only
if (!Use_default_interp_only) {
if (Coords_are_3D)
Interval_3D_info Intervals_3D[N_key_points - 1]
else
Interval_2D_info Intervals_3D[N_key_points - 1]
}
}
Interval_3D_info {
interpolation_function_info x_function
interpolation_function_info y_function
interpolation_function_info z_function
}
Interval_2D_info {
interpolation_function_info x_function
interpolation_function_info y_function
}
Interpolation_function_info {
int Function_ID
if (Function_ID > 0)
list of parameters_values
}
$f(t) = f_a + v_a (t - t_a)$
$f(t) = f_a + v_a (t - t_a) + \tfrac{1}{2} a_a (t - t_a)^2$
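The interpolation functions appear to be first- and second-order polynomials in time that start from a key point at time ta. The sketch below assumes the usual constant-velocity and constant-acceleration forms, with fa, va and aa the position, speed and acceleration at ta. The function name `interpolate` is hypothetical.

```python
def interpolate(t, ta, fa, va, aa=0.0):
    """Evaluate fa + va*(t - ta) + 0.5*aa*(t - ta)**2.

    With aa = 0 this reduces to the first-order (linear) form.
    """
    dt = t - ta
    return fa + va * dt + 0.5 * aa * dt * dt


x_linear = interpolate(2.0, 0.0, 1.0, 3.0)          # 1 + 3*2
x_quadratic = interpolate(2.0, 0.0, 1.0, 3.0, 2.0)  # 1 + 3*2 + 0.5*2*4
```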
13Motion (Cont.)
- Extraction (non-normative part)
- Input: object binary segmentation mask sequence.
- Output: global motion information.
- Process
- Instantiate the key-point time instants in the description (one per frame)
- Calculate the mass center of the mask at each frame.
- Calculate speed and acceleration (z information may be deduced from the size variation)
- Choose the interpolation function
- If the object is moving, optionally subtract the global motion from the object motion if you want the trajectory in the scene reference and not in the camera reference.
- We define an object followed by the camera as
- An unmoving object at a position near the image center
- An object having irregular displacements of small amplitude around a position near the image center
- Matching
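- The distance between two trajectory descriptors $D$ and $D'$ is
- $d(D, D') = \sum_i \left[ \alpha \frac{(x_i - x'_i)^2 + (y_i - y'_i)^2 + (z_i - z'_i)^2}{\Delta t_i} + \beta \frac{(v_{x_i} - v'_{x_i})^2 + (v_{y_i} - v'_{y_i})^2 + (v_{z_i} - v'_{z_i})^2}{\Delta t_i} + \gamma \frac{(a_{x_i} - a'_{x_i})^2 + (a_{y_i} - a'_{y_i})^2 + (a_{z_i} - a'_{z_i})^2}{\Delta t_i} \right]$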
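The distance can be sketched in code under one reading of the formula: a sum over key points of squared position, speed and acceleration differences, each divided by the interval duration and scaled by a weight. The weight names `alpha`, `beta`, `gamma` and the dictionary layout of a trajectory are assumptions made for this example only.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def trajectory_distance(traj_a, traj_b, dts, alpha=1.0, beta=1.0, gamma=1.0):
    """traj_*: dicts with 'pos', 'vel', 'acc' lists of (x, y, z); dts: durations."""
    total = 0.0
    for i, dt in enumerate(dts):
        total += alpha * sq_dist(traj_a["pos"][i], traj_b["pos"][i]) / dt
        total += beta * sq_dist(traj_a["vel"][i], traj_b["vel"][i]) / dt
        total += gamma * sq_dist(traj_a["acc"][i], traj_b["acc"][i]) / dt
    return total


a = {"pos": [(0, 0, 0), (1, 0, 0)], "vel": [(1, 0, 0), (1, 0, 0)], "acc": [(0, 0, 0)] * 2}
b = {"pos": [(0, 0, 0), (1, 2, 0)], "vel": [(1, 0, 0), (1, 2, 0)], "acc": [(0, 0, 0)] * 2}
d_ab = trajectory_distance(a, b, dts=[1.0, 2.0])
```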
14Motion (Cont.)
- Example query
- Find the video frames in which object k is moving to the right (left, up, down)
- Solution
- For all pairs of frames contained within the existence interval for object k, compute the motion vector
- v(t2) = pos(t2) - pos(t1)
- where pos(t) is the object's spatial coordinate at time t.
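A minimal sketch of this query follows. It simplifies "all pairs of frames" to consecutive key points and assumes image coordinates in which x grows to the right and y grows downward. The function name `frames_moving` and the direction table are hypothetical.

```python
# Unit vectors for each query direction (image coordinates, y pointing down).
DIRECTIONS = {"right": (1, 0), "left": (-1, 0), "up": (0, -1), "down": (0, 1)}


def frames_moving(positions, times, direction):
    """Return the times t2 at which v(t2) = pos(t2) - pos(t1) points along direction."""
    dx, dy = DIRECTIONS[direction]
    hits = []
    for i in range(1, len(positions)):
        vx = positions[i][0] - positions[i - 1][0]
        vy = positions[i][1] - positions[i - 1][1]
        if vx * dx + vy * dy > 0:  # positive component along the query direction
            hits.append(times[i])
    return hits


positions = [(0, 0), (2, 0), (2, 1), (1, 1)]
times = [0, 1, 2, 3]
right_frames = frames_moving(positions, times, "right")
```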
15Descriptor representation using MPEG-7 proposed DDL
- <DType name='speeds'>
- <attribute name='TRACK_LEFT' type='real'/>
- <attribute name='TRACK_RIGHT' type='real'/>
- <attribute name='BOOM_DOWN' type='real'/>
- <attribute name='BOOM_UP' type='real'/>
- <attribute name='DOLLY_FORWARD' type='real'/>
- <attribute name='DOLLY_BACKWARD' type='real'/>
- <attribute name='PAN_LEFT' type='real'/>
- <attribute name='PAN_RIGHT' type='real'/>
- <attribute name='TILT_UP' type='real'/>
- <attribute name='TILT_DOWN' type='real'/>
- <attribute name='ROLL_CLOCKWISE' type='real'/>
- <attribute name='ROLL_ANTICLOCKWISE' type='real'/>
- <attribute name='ZOOM_IN' type='real'/>
- <attribute name='ZOOM_OUT' type='real'/>
- </DType>
- <DType name='presence'>
- <subDOf name='speeds'/>
- </DType>
- Camera motion descriptor
- <DType name='CameraMotionD'>
- <attribute name='NumSegmentDescription' type='integer'/>
- <attribute name='DescriptionMode' type='boolean'/>
- <seq minOccurs='0' maxOccursPar='NumSegmentDescription'>
- <DTypeRef name='SegmentedCameraMotionD'/>
- </seq>
- </DType>
- <DType name='SegmentedCameraMotionD'>
- <DType name='start_time' type='time'/>
- <DType name='duration' type='timeDuration'/>
- <DTypeRef name='presence'/>
- <DTypeRef name='speeds'/>
- <DType name='FOE_FOC_HorizontalPosition' type='real'/>
- <DType name='FOE_FOC_VerticalPosition' type='real'/>
- </DType>