Title: Video Summarization by Spatial-Temporal Graph Optimization
Shi Lu, Michael R. Lyu, Irwin King
Department of Computer Science and Engineering
The Chinese University of Hong Kong, Shatin, N.T., Hong Kong SAR
{slu, lyu, king}@cse.cuhk.edu.hk
Finding a desired video in a large digital library is tedious, because
downloading and browsing through a whole video is time-consuming. To help
users, in this poster we present a novel scheme for generating short
summaries of longer video documents. To ensure the quality and flexibility
of the video summary, we model the video as a graph and select the summary
shots by dynamic programming. An experimental system has been developed.
- The system consists of the following modules, as shown in Figure 1
  - The video preprocessing module is responsible for detecting video shot boundaries and the distribution of important video features
  - With the preprocessing results, we can determine the candidate video shot set
  - The summarization module generates the video summary according to the user's requirements by graph optimization
- Candidate video shots selection
  - Detect important features on the timeline: human face, human voice, piercing noise (gunshot/explosion), fire color, etc.
  - Detect video shot breaks with a video segmentation method
  - Video shots with one or more important features are selected as candidates
- Model the candidate shots as a graph
  - The graph is a directed complete graph, as shown in Figure 2
  - Each vertex corresponds to a video shot, with a weight equal to the shot length
  - Each edge has a weight that combines the visual similarity and temporal distance between the two shots; the edge direction follows the shots' temporal order
- Select the video skim by optimization on the graph
  - Objective: achieve visual diversity and temporal coverage given the summary length
  - Search for the longest path in the graph, with the constraint that the sum of the vertex weights on the path is within the given summary length L
  - The constrained longest path in the graph can be found by dynamic programming
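
The constrained longest-path search can be sketched as a knapsack-style dynamic program. This is a minimal sketch under stated assumptions, not the authors' implementation: shot lengths are assumed to be positive integers (for example, seconds), candidate shots are indexed in temporal order so that every edge runs from an earlier shot to a later one, and the path score is taken to be the sum of its edge weights. The names `select_summary_shots`, `edge_weight`, and `budget` are hypothetical, and because the poster does not give the edge-weight formula, the caller supplies it.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def select_summary_shots(
    lengths: Sequence[int],
    edge_weight: Callable[[int, int], float],
    budget: int,
) -> Tuple[float, List[int]]:
    """Return (score, shot indices) of the best path within the length budget.

    Assumptions (not from the poster): lengths are positive integers,
    shots are in temporal order, and edge_weight(j, i) is defined for j < i.
    """
    n = len(lengths)
    neg_inf = float("-inf")
    # best[i][b]: highest edge-weight sum of a path that ends at shot i and
    # whose shots have a total length of exactly b.
    best = [[neg_inf] * (budget + 1) for _ in range(n)]
    prev: List[List[Optional[int]]] = [[None] * (budget + 1) for _ in range(n)]

    for i in range(n):
        li = lengths[i]
        if li > budget:
            continue  # shot i alone already exceeds the summary length
        best[i][li] = 0.0  # path consisting of shot i only
        for j in range(i):  # edges follow temporal order: j precedes i
            w = edge_weight(j, i)
            for b in range(li + lengths[j], budget + 1):
                cand = best[j][b - li]
                if cand != neg_inf and cand + w > best[i][b]:
                    best[i][b] = cand + w
                    prev[i][b] = j

    # Pick the best end state over all shots and all total lengths <= budget.
    best_score, end, end_b = neg_inf, None, 0
    for i in range(n):
        for b in range(budget + 1):
            if best[i][b] > best_score:
                best_score, end, end_b = best[i][b], i, b
    if end is None:
        return 0.0, []  # no shot fits within the budget

    # Walk the predecessor links back to the first shot of the path.
    path: List[int] = []
    i, b = end, end_b
    while i is not None:
        path.append(i)
        i, b = prev[i][b], b - lengths[i]
    path.reverse()
    return best_score, path


# Toy example with made-up numbers (not taken from the poster).
lengths = [4, 3, 5, 2]
weights = {(0, 1): 1.0, (0, 2): 3.0, (0, 3): 4.0,
           (1, 2): 2.0, (1, 3): 2.5, (2, 3): 1.5}
score, shots = select_summary_shots(lengths, lambda j, i: weights[(j, i)], budget=9)
print(score, shots)  # 4.0 [0, 3]
```

The table has one row per candidate shot and one column per possible total length, so the running time is on the order of n² · L for n candidate shots. In practice the budget L would presumably be derived from the requested summary length, for instance as the compression rate times the length of the original video.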
- User test
  - An objective evaluation method for video summary quality is still unavailable
  - We invite 10 people to watch several video summaries generated from several videos at compression rates of 0.15 and 0.30
  - Each test user answers questions about the content of the video
    - Who? (about the main actors) and What? (about the key events)
  - Two scores are calculated from the answers (scaled to 10)
  - Results are shown in Table 1