Title: Video Data Management Systems: Metadata and Architecture
1Video Data Management Systems: Metadata and Architecture
- Chapter 9 of Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media
2Objectives
- Designing the data model for a Video Data Management System requires:
- Good understanding of digital media
- Typical applications of digital media
- Types of queries
3Outline
- Introduction
- Video Data Management System (VDMS)
- Nature of Video Data
- Applications of Video
- Challenges in Video Data Management
- ViMOD: The Video Data Model
- Architecture for a Video Data Management System
4Introduction
- Video: audio-visual temporal data
- Streaming data (temporally extended) with high resolution and multiple channels
- Video data management system (VDMS)
- Storage of video on computer systems
- Content based retrieval
- Real-time synchronized delivery of video
- Content based retrieval
- Data modeling (especially the metadata)
- Automatic extraction of data models
- Query processing and retrieval mechanisms
5Introduction (Cont.)
- Developing a (meta) data model for video requires a good understanding of video as a medium, the typical applications of video, and the types of queries that will be encountered
6Video Data Management System (VDMS)
7What is a VDMS?
- A software system which provides
- Content based access to video data
- Audiovisual content of video (color, texture, voice similarity)
- Semantic content of video (topic of video, persons in a scene)
- Facilities
- Facilities provided by standard DBMS (insertion, deletion, schema definition)
- User interface for smooth interaction between the user and the video data collection
- Predefined set of query classes and an associated query interface
- Tools for navigation and manipulation of video data
8Example Scenario: Sporting Event VDMS
- Purpose
- Postgame analysis
- Plan strategies for future games
- Analyze game strategies of opposing teams
- Analyze the performance of players
- Scenario 1
- Question: Remember the OSU game from last fall?
- Query: Retrieve <Game=football> <School=OSU> <Year=1994>
- Response: The video is cued to the beginning of the OSU game of 1994
9Example Scenario: Sporting Event VDMS (Cont.)
- Scenario 2
- Question: Didn't OSU score a field goal in the 3rd quarter of the game?
- Query: Locate <Quarter=3> <Play=field-goal> <Team=OSU>
- Response: The retrieved video is marked with the time points of all field goal attempts
- Scenario 3
- Question: Can we see a close-up shot of this kick?
- Query: Retrieve <Play=field-goal> <Shot=Close up>
- Response: The database is searched for a close-up shot and the video is cued if the search is successful
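The Retrieve and Locate queries in Scenarios 1 to 3 can be read as conjunctions of attribute/value conditions evaluated against per-segment metadata. The following is a minimal Python sketch of that reading, assuming the metadata is held as plain dictionaries. The `Segment` record, the `locate` function, the attribute names, and all sample values are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """A video interval plus the attribute/value metadata attached to it (hypothetical layout)."""
    video_id: str
    start_s: float  # interval start, in seconds from the beginning of the video
    end_s: float    # interval end, in seconds
    attrs: dict = field(default_factory=dict)


def locate(segments, **conditions):
    """Return the segments whose metadata satisfies every attribute=value condition (exact match)."""
    return [
        seg for seg in segments
        if all(seg.attrs.get(key) == value for key, value in conditions.items())
    ]


# Illustrative data only; times and attribute values are made up for the example.
segments = [
    Segment("osu-1994", 5400.0, 5412.0,
            {"Quarter": 3, "Play": "field-goal", "Team": "OSU", "Shot": "wide"}),
    Segment("osu-1994", 5412.0, 5418.0,
            {"Quarter": 3, "Play": "field-goal", "Team": "OSU", "Shot": "close-up"}),
    Segment("osu-1994", 1200.0, 1230.0,
            {"Quarter": 1, "Play": "pass", "Team": "OSU", "Shot": "wide"}),
]

# Scenario 2: field goal plays by OSU in the 3rd quarter, reported as cue points.
attempts = locate(segments, Quarter=3, Play="field-goal", Team="OSU")
cue_points = [seg.start_s for seg in attempts]

# Scenario 3: a close-up shot of a field goal play.
close_ups = locate(segments, Play="field-goal", Shot="close-up")
```

An empty result from `locate` corresponds to the unsuccessful search mentioned in Scenario 3, in which case no video is cued.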
10Example Scenario: Sporting Event VDMS (Cont.)
- Scenario 4
- Question: Let's look at the track of the kicker's foot
- Query: Tracking Mode. Using the interface, a bounding box is placed around the kicker's foot to indicate the object to be tracked.
- Response: The system tracks the kicker's foot through the shot, and displays a track of the foot
- Scenario 5
- Question: Let's see other kickers with similar kicks in last year's NCAA football
- Query: Similarity Search. <YEAR=1993> <Game=NCAA-foot> <Play=field goal> <Match-Criteria=Intra video object location based matching>
- Response: The system searches through the NCAA games of 1993 for field goal attempts, compares the kickers' tracks across attempts, and returns a ranked set
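Scenario 5 ranks field goal attempts by how closely each kicker's track matches the query track. The slides do not define the matching measure, so the sketch below substitutes an assumed one: both tracks are resampled to the same number of points and compared by mean point-wise Euclidean distance. The function names, the resampling step, and the sample tracks are hypothetical.

```python
import math


def resample(track, n=16):
    """Linearly resample a track [(x, y), ...] to n points spaced evenly over its index range."""
    if len(track) == 1:
        return [track[0]] * n
    out = []
    for i in range(n):
        pos = i * (len(track) - 1) / (n - 1)
        lo = int(math.floor(pos))
        hi = min(lo + 1, len(track) - 1)
        t = pos - lo
        (x0, y0), (x1, y1) = track[lo], track[hi]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


def track_distance(a, b, n=16):
    """Mean point-wise Euclidean distance between two resampled tracks (assumed measure)."""
    ra, rb = resample(a, n), resample(b, n)
    return sum(math.dist(p, q) for p, q in zip(ra, rb)) / n


def rank_by_similarity(query_track, candidates):
    """Return (name, distance) pairs sorted from most similar to least similar."""
    scored = [(name, track_distance(query_track, trk)) for name, trk in candidates.items()]
    return sorted(scored, key=lambda item: item[1])


# Illustrative foot tracks in image coordinates; the values are made up.
query = [(0, 0), (1, 2), (2, 3)]
candidates = {
    "kick_a": [(0, 0), (1, 2), (2, 3)],
    "kick_b": [(0, 0), (1, 0), (2, 0)],
    "kick_c": [(0, 0.5), (1, 2.5), (2, 3.5)],
}
ranking = rank_by_similarity(query, candidates)
```

Resampling lets tracks of different lengths be compared, at the cost of ignoring differences in kick duration; a measure that should respect timing would need a different alignment step.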
11Nature of Video Data
12Content of Video: What is in Video?
- Video is an audiovisual medium of information presentation
- Semantic content
- Message or information conveyed by the video
- Criminal news story: what, when, who
- Audiovisual content
- Video clips and audio signals
- Criminal news story: associated sound track
- Distinction: amount of contextual information and knowledge required to extract contents
13Content of Video
14Semantic Content
- Content extraction
- Need background knowledge
- Complex; done manually and requires user interaction
- Metadata
- Example
- Emotion, Classification
- Managed similarly to textual information
- Access: finer grained than in a traditional library
- Scenes and shots (analogous to chapters and sections in books)
15Audiovisual Content
- Content extraction
- No background knowledge needed
- (Semi-)automatically
- Example
- Object recognition, object tracking over time, temporal events recognition, word and sentence recognition, unusual sound events
- Camera and object motion, color and texture properties, audio properties (like loudness, pitch)
16Unique Characteristics of Video
- How is video different from other classes of data?
- Data classification
- Alphanumeric data: generated from a finite set of symbols
- Essentially generated by human agency
- Example: free text data, computer programs, product data
- Non-alphanumeric data: not derived from a finite set of symbols
- Generated by an instrument or sensor
- Example: images, speech signals, MRI data, video data
17Unique Characteristics of Video (Cont.)
- Criteria for comparing alphanumeric and non-alphanumeric data
- Resolution: the detail that the media provides
- Production process: human agency vs. machine agency
- Ambiguity of interpretation: a measure of the number of interpretations derivable from the data
- Interpretation effort: a measure of the computational effort required to interpret a given unit of information
- Data volume: in terms of digital storage
- Similarity: how to measure the similarity between two units of information
18Comparison between Alphanumeric and Non-alphanumeric Data
19Applications of Video
20Introduction
- Identify the nature of queries which are used to derive the design of the video data models
- Example applications: feature films, news videos, sporting event videos, biomechanical analysis of sports, building security videos
- Analyzed from several perspectives
- Video intent: What is the purpose of making the video?
- Provides clues into the video structure, content, and organization
- Video content: What is the typical content?
- Depending on the domain of the video, the predictability of the content varies
21Introduction (Cont.)
- Analyzed from several perspectives (Cont.)
- Video production: How was the video made?
- Provides clues into the syntactic structure and the audio-visual properties of the video data
- Script control: visualization of a certain script? Audiovisual log?
- Filming control: environment, subject, video filming parameters
- Measure of the degree of control exercised by the filmmaker on these parameters
- Composition control
- Channel control: audio and video channels
- Video usage: dictates the queries that arise in the database context
22Feature Films
- Video intent: provide entertainment and convey the message of the director to the audience
- Video content
- A wide range of subjects (genres, like western movies and war movies), and each subject can be filmed in many different ways
- Given a particular class, the content is predictable
- Video production: a planned and controlled process
- Script control: very high and very structured
- Filming control: very high (location, action, cinematography)
- Composition control: very high
- Channel control: some films are visually oriented, others rely on aural information
23Feature Films (Cont.)
- Video usage
- Film viewers: for entertainment
- List films with Title=X, Actors=Y, Directors=Z
- List films with Genre=Western
- Film critics: for evaluation (require finer grain access)
- Find scene where Actor=X, Emotion=cry
- Find shot with Camera=stationary, Lens action=Zoom in
- Find scene with Special Effect=Morphing
- Film database managers: video rental, for statistics
- Number of rentals for Title=X, Actor=Y
- Average number of movies per customer per week
24News Video
- Video intent: convey the news to the audience
- News events that occurred over a given duration of time as observed by a certain team of people
- Background information on events (self-contained and understandable)
- Video content: unstructured, but has a definite presentation structure
- Main points -> segments (politics, sports) -> anchor person, reporter
- Video production: less controlled than a feature film
- Script control: limited to the structure of the news
- The stories are controlled; however, the exact content of the stories and their presentation are less controlled. Example: SNG (satellite news gathering)
- Filming control: studio environment (well controlled); news location environment (less controlled)
- Composition control: recorded reports vs. live reports
- Channel control: more in the audio channel
25News Video (Cont.)
- Video usage
- News browser: at the granularity of a news report
- Retrieve hockey events that occurred between 1994 and 1995
- Retrieve results of 1992 elections
- News producers and reporters
- Interested in researching facts related to a particular story
- Reuse news for news report production
- Nomination of a new presidential candidate: highlight the person's life beginning from birth
26Sporting Event Videos
- Video intent: a log of the sporting event; entertainment
- Video content: highly structured
- The structure in the game translates to structure in the video
- Large scale temporal structure: predictable
- More detailed structure (plays, passes): unknown
- Video production: comparable to that of news videos
- Script control: video maker has no control of the actual event
- Filming control
- Game environment cannot be modified for video
- Subject of video is the progress of the game, which cannot be controlled for the video
- Cinematography can be controlled to a large extent
27Sporting Event Videos (Cont.)
- Video production (Cont.)
- Composition control: controlled (game segments)
- Less than feature films and comparable to live news reports
- Channel control: visually oriented
- Complete control of the information distribution between the audio and video channels
- Video usage
- Casual viewers
- Locating game videos (like film viewers)
- Sports coaches, trainers
- Coaching teams, analyzing player performance, game strategies
28Classification of Video Queries
29Query Type
- Semantic Query
- Requires high level semantic recognition and interpretation of the video content
- Requires metadata generated manually
- Find scene with Actor=X, Emotion=Crying
- Audiovisual Query
- Requires metadata generated automatically or semi-automatically
- Find shot with Camera=Stationary, Lens Action=Zoom in
30Matching Required
- Exact match query
- Find scene with Actor=X
- Similarity match query
- Find all shots similar to this shot
31Function
- Location queries: locate video information within the DB
- Find scene with Actor=X
- Point to the beginning of scenes within the videos which contain actor X
- Tracking queries: track visual quantities within the video
- Track the ball through this shot
- Location of the ball in each of the frames in the shot
32Temporal Unit Type
- Unit query: complete units of video
- Find films with Actor=X
- Subunit query: subunits of video
- Find scenes with Actor=X
33Challenges in Video Data Management
34Issues in Managing Traditional Databases
- Data modeling: design application specific data representations which support a certain set of queries
- Data insertion: introduce new data items into an existing collection
- Extract the necessary information for instantiating a data model
- For example, adding a new employee into an employee DB
- Data organization: arrange data items with reference to each other in the collection (data indexing)
- Choice of fields or features to be used for data indexing and the choice of data structures for indexing the DB
- Data retrieval: extract data items from the collection
- Formulation and processing of queries
35Managing Video Data
- Data model
- Film viewer queries: limited to queries which locate feature films
- Need a high-level description of the video: maker, topic, actors
- Film critic and producer queries
- Require access to the video data at a granularity which is finer than locating feature films
- Require access to parts of a film like scene and shot
- Data model should support a segmented representation for video
- Biomechanical analysis queries
- Require the partitioning of video based on different portions of an object track
- Data model should support representation of raw data features like locations of objects over a period of time
36Managing Video Data (Cont.)
- Data insertion: related to the granularity of video data
- Film viewers: metadata like title, actors, directors
- Film critics and analysts: require some form of automation
- Segment the video data based on suitable criteria
- View each segment to extract the necessary details (description) about the segment
- Annotation and logging of the video segment
- Data organization
- Film viewers: use title, actors, directors as search keys
- Paths of objects over a segment: measure the distance between two paths
- Data retrieval: interface for presenting queries and video data
37Requirement Summary for Video Data Model
- A notion of time
- A segmented representation for time intervals
- A relationship between time intervals
- A set of descriptions associated with each time
interval
38ViMOD: The Video Data Model
39Video Data Model
- The basic unit of video is a temporal interval
- V
- Video Interval: [tb, te]
- Temporal Relations: R
- R = ((r1, v1), (r2, v2), ..., (rk, vk))
- Feature Count: n
- Feature Type: (?1, ?2, ..., ?n)
- Feature: (F1, F2, F3, ..., Fn)
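The components listed above (interval bounds tb and te, temporal relations R as pairs of a relation and a related interval, a feature count n, and n typed features F) map naturally onto a small record type. The following Python sketch is one possible encoding, not the chapter's definition: the class name, the field names, the relation name "contains", and the feature type strings are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


# eq=False keeps identity comparison, so intervals that reference each other
# through relations are never compared recursively.
@dataclass(eq=False)
class VideoInterval:
    """One unit of video: a temporal interval with relations and typed features."""
    t_begin: float  # tb
    t_end: float    # te
    # Temporal relations R: pairs (r_i, v_i) of a relation name and a related interval.
    relations: list = field(default_factory=list)
    # Features F_1..F_n, each stored together with its feature type.
    features: list = field(default_factory=list)

    def __post_init__(self):
        if self.t_end < self.t_begin:
            raise ValueError("interval must satisfy tb <= te")

    @property
    def feature_count(self) -> int:
        """n, the number of features attached to this interval."""
        return len(self.features)

    def add_feature(self, feature_type: str, value: Any) -> None:
        """Attach one feature value together with its (assumed) type name."""
        self.features.append((feature_type, value))

    def relate(self, relation: str, other: "VideoInterval") -> None:
        """Record that this interval stands in `relation` to `other`."""
        self.relations.append((relation, other))


# Illustrative use: a game video containing one field goal segment.
game = VideoInterval(0.0, 10800.0)
kick = VideoInterval(5400.0, 5412.0)
game.relate("contains", kick)
kick.add_feature("video-Q", {"play": "field-goal"})
kick.add_feature("video-R", [(5400.0, (12, 40)), (5400.5, (14, 38))])
```

Deriving the feature count from the feature list, rather than storing n separately, avoids the two values drifting apart.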
40Segmentation Criteria
- The basis on which a particular interval of the video can be chosen
- Grouping of criteria
- Syntactic segmentation criteria
- Domain independent
- Example: Shot (an image sequence generated by a single operation of the camera)
- Semantic segmentation criteria
- Domain specific
- Example: Anchor-person segment, News-reporter segment
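Because a shot is a syntactic, domain independent unit, shot boundaries can in principle be found without knowing what the video is about. The slides do not say how; a widely used heuristic, shown here purely as an assumed illustration, is to declare a boundary wherever the gray-level histograms of two consecutive frames differ sharply. Frames are modeled as flat lists of 0 to 255 values, and the bin count and threshold are arbitrary choices.

```python
def histogram(frame, bins=8):
    """Normalized gray-level histogram of a non-empty frame given as a flat list of 0..255 values."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]


def shot_boundaries(frames, threshold=0.5, bins=8):
    """Return the indices i at which a new shot is assumed to start at frames[i]."""
    boundaries = []
    prev = histogram(frames[0], bins)
    for i in range(1, len(frames)):
        cur = histogram(frames[i], bins)
        # L1 distance between two normalized histograms lies in [0, 2].
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold:
            boundaries.append(i)
        prev = cur
    return boundaries


def to_intervals(boundaries, n_frames):
    """Turn boundary indices into (first_frame, last_frame) shot intervals."""
    starts = [0] + boundaries
    ends = [b - 1 for b in boundaries] + [n_frames - 1]
    return list(zip(starts, ends))


# Synthetic example: three dark frames followed by two bright frames.
dark, bright = [10] * 64, [240] * 64
frames = [dark, dark, dark, bright, bright]
cuts = shot_boundaries(frames)
shots = to_intervals(cuts, len(frames))
```

Semantic criteria such as anchor-person segments cannot be obtained this way, since they depend on a domain model rather than on frame-to-frame differences.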
41Segmentation Criteria (Cont.)
42Interval Relationships
43Definition of Features
- A feature provides information about a video interval. Each feature has an associated feature type.
44Feature Classification Criteria
- Content Dependence
- Independent: the feature is not directly available from the video data
- Meta features
- Example: Budget of a video
- Dependent
- Data features
- Example: Story
- Temporal Extent
- Image: based on viewing a single frame
- Example: dominant color
- Video: based on a time interval
- Example: Feature track
45Feature Classification Criteria (Cont.)
- Labeling
- Domain model based labels
- Qualitative features (Q-features)
- Example (basketball): pass, dribble, dunk
- Low-level domain independent models
- Raw features (R-features)
- Example: object trajectories
46Type of Video Features
47Feature Type Classification in ViMod
48Meta Features
- Content independent features of video
- In general, apply to a complete video
- Examples
49Video Q-Features
- Content dependent, temporally extended, labeled features
- Has a value belonging to a finite set of labels
- Low level property
- Cinematographic properties
- Higher level properties
- Time frame, point of view
50Video Q-Feature Examples
51Video R-features
- Content dependent, temporally extended, raw data values
- Usually a set which is indexed by time
- Tracks of object motions within a video shot,
variations in lighting over time, variations in
audio level over time
52Image Q-Features
- Content dependent, single frame, labeled features
- Refer to a single instant of time in the video shot
- Has a value belonging to a finite set of labels
- Usually describe properties of the video that do not change over the time interval of the video
- Example: in a White House video, the building can be recognized from a single frame
- Low level property
- Higher level properties
53Image Q-Feature Examples
54Image R-Features
- Content dependent, single frame, raw feature values
- Raw image measurements made from frames in the video sequence
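The five feature types described on the preceding slides follow from three classification criteria: content dependence, temporal extent, and labeling. The Python sketch below encodes that decision as a small function. The enum names and the function signature are hypothetical; only the resulting type names come from the slides.

```python
from enum import Enum


class Extent(Enum):
    IMAGE = "single frame"
    VIDEO = "time interval"


class Labeling(Enum):
    QUALITATIVE = "label from a finite set"
    RAW = "raw measurement"


def feature_type(content_dependent, extent=None, labeling=None):
    """Map the three classification criteria to one of the five feature type names."""
    if not content_dependent:
        # Content independent features are meta features, whatever their extent.
        return "Meta feature"
    if extent is None or labeling is None:
        raise ValueError("content dependent features need an extent and a labeling")
    scope = "Video" if extent is Extent.VIDEO else "Image"
    kind = "Q-feature" if labeling is Labeling.QUALITATIVE else "R-feature"
    return f"{scope} {kind}"


# Examples taken from the slides, plus one generic raw per-frame measurement.
budget = feature_type(content_dependent=False)
dunk_label = feature_type(True, Extent.VIDEO, Labeling.QUALITATIVE)
trajectory = feature_type(True, Extent.VIDEO, Labeling.RAW)
building = feature_type(True, Extent.IMAGE, Labeling.QUALITATIVE)
frame_measurement = feature_type(True, Extent.IMAGE, Labeling.RAW)
```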
55ViMOD Architecture
56ViMOD Architecture
- Video server
- Database interface
- Metadata store
- Query processor
- Insertion module
- User interface
57Block Interactions
- Data insertion operation
- Database Interface
- Metadata store
- Insertion module
- User interface
- Data retrieval operation
- Query processor
- User interface
- Database interface
- Metadata store
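The two operations above name the blocks that take part in insertion and in retrieval. A minimal in-memory Python sketch of how those blocks might cooperate follows. Every class, method, and field name is hypothetical, the user interface is reduced to direct function calls, and the video server is left out because neither list mentions it.

```python
class MetadataStore:
    """In-memory stand-in for the metadata store."""

    def __init__(self):
        self.records = []


class DatabaseInterface:
    """Single access path to the metadata store for the other blocks."""

    def __init__(self, store):
        self._store = store

    def insert(self, record):
        self._store.records.append(record)

    def scan(self):
        return list(self._store.records)


class InsertionModule:
    """Turns an annotated segment from the user interface into a stored record."""

    def __init__(self, db):
        self._db = db

    def insert_segment(self, video_id, t_begin, t_end, **features):
        record = {"video_id": video_id, "interval": (t_begin, t_end), "features": features}
        self._db.insert(record)
        return record


class QueryProcessor:
    """Evaluates exact-match feature queries through the database interface."""

    def __init__(self, db):
        self._db = db

    def find(self, **conditions):
        return [
            rec for rec in self._db.scan()
            if all(rec["features"].get(k) == v for k, v in conditions.items())
        ]


# Stand-in for the user interface: wire the blocks together and drive both operations.
store = MetadataStore()
db = DatabaseInterface(store)
inserter = InsertionModule(db)
queries = QueryProcessor(db)

# Data insertion operation (illustrative values only).
inserter.insert_segment("news-01", 0.0, 42.5, segment="anchor-person", topic="politics")
inserter.insert_segment("news-01", 42.5, 90.0, segment="news-reporter", topic="politics")

# Data retrieval operation.
hits = queries.find(segment="anchor-person")
```

Routing both the insertion module and the query processor through one database interface keeps the storage layout of the metadata store hidden from the rest of the system.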