Title: 3D MOTION ESTIMATION FOR 3D VIDEO CODING
ICASSP 2012, Japan
Authors: Manoranjan Paul, Junbin Gao and
Michael Antolovich
Affiliation: School of Computing and Mathematics
Experimental Results and Conclusions
Computational Time: Fig 5 shows that the proposed
methods reduce computational time by around 60%
compared to H.264/MVC.
Rate-distortion Performance: The proposed scheme
outperforms H.264/MVC by 0.25 dB PSNR.
Interactivity: The proposed method supports
interactivity by removing the random-access
frame delay.
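To put the 0.25 dB figure in context, the sketch
below computes PSNR between an original and a
decoded frame. It assumes 8-bit samples (peak
value 255) and frames stored as nested lists; the
function name psnr is illustrative and not taken
from the poster. Since PSNR = 10 log10(peak^2 / MSE),
a 0.25 dB gain at the same bitrate corresponds to
an MSE ratio of 10^0.025, i.e. roughly 5.6% lower
mean squared error.

```python
import math


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized 2D frames (assumes 8-bit samples)."""
    squared_error = 0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            squared_error += (o - d) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical frames: PSNR is unbounded
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)
```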
Abstract
Existing multiview video coding (MVC)
technologies are not sufficiently agile to
support interactivity, and they are also
inefficient in terms of image quality and
computational complexity. In this paper, a novel
technique using 3D motion estimation (3D-ME) is
proposed to overcome these problems. A second
technique is also proposed, in which a dynamic
background frame (the most common frame in a
scene, i.e., McFIS) is used within the 3D-ME
technique to improve performance in occluded
areas.
Motivation and Problem Statement
Fig 1 shows the prediction structure of MVC
recommended by the H.264/MVC standard.
A frame may use 4 reference frames, and
encoding/decoding a frame sometimes requires
encoding/decoding a number of other frames in
advance. Thus, interactivity, two-way
communication, and VCR functionalities are
impossible or limited in the existing prediction
structure. Fig 2 shows the average similarity of
motion vectors among different views. The figure
confirms that the similarity ranges from 51% to
93%, which motivates us to encode all views
simultaneously using 3D motion estimation.
Proposed 3D Motion Estimation
In
the proposed 3D-ME technique, a 3D frame is
formed from the i-th frames of all views, and ME
is carried out on a 3D macroblock (the third
dimension is formed from the co-located
macroblocks of the different views). The
reference 3D frame is formed from the immediately
preceding frames, i.e., the (i-1)-th frames of
all views or the (i-1)-th McFISes of all views.
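The idea of matching a 3D macroblock against a 3D
reference frame can be sketched as follows. This
is a minimal illustration, not the authors'
implementation: it assumes a single motion vector
shared by the co-located blocks of all views, a
sum-of-absolute-differences (SAD) cost, and an
exhaustive search. The names sad_3d and
motion_search_3d, the block size, and the search
range are all hypothetical.

```python
# A "3D frame" is a list of 2D frames, one per view: frame3d[view][y][x].

def sad_3d(cur3d, ref3d, y, x, dy, dx, block):
    """SAD over a 3D macroblock: the same block position in every view."""
    total = 0
    for cur, ref in zip(cur3d, ref3d):  # third dimension: the views
        for r in range(block):
            for c in range(block):
                total += abs(cur[y + r][x + c] - ref[y + dy + r][x + dx + c])
    return total


def motion_search_3d(cur3d, ref3d, y, x, block=4, search=2):
    """Full search for one motion vector shared by all views.

    Returns ((dy, dx), cost) minimizing the 3D SAD within +/- `search` pixels.
    """
    height, width = len(cur3d[0]), len(cur3d[0][0])
    best_mv = (0, 0)
    best_cost = sad_3d(cur3d, ref3d, y, x, 0, 0, block)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = y + dy, x + dx
            # Skip candidates whose reference block falls outside the frame.
            if ry < 0 or rx < 0 or ry + block > height or rx + block > width:
                continue
            cost = sad_3d(cur3d, ref3d, y, x, dy, dx, block)
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost
```

Here ref3d would hold either the (i-1)-th frames
of all views or the (i-1)-th McFISes, and cur3d
the i-th frames. Because one search covers every
view at once, the per-view searches of a
conventional encoder are replaced by a single
pass, which is one plausible reading of where the
reported time saving comes from.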
Fig 4. Examples of McFISes and uncovered
background using the Vassar and Ball Room video
sequences. Upper row: original frames of Vassar
and Ball Room, respectively. Lower row: the
corresponding McFISes.
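The poster describes a McFIS only as the most
common frame in a scene and does not say how it
is generated. As a rough stand-in under that
description, the sketch below takes the per-pixel
most frequent value over a set of frames, so that
a briefly passing foreground object is replaced
by the background behind it. The function name
most_common_frame and the per-pixel mode are
assumptions made for illustration, not the
authors' McFIS generation method.

```python
from collections import Counter


def most_common_frame(frames):
    """Per-pixel mode over equally sized 2D frames (a stand-in for a McFIS)."""
    height, width = len(frames[0]), len(frames[0][0])
    background = []
    for y in range(height):
        row = []
        for x in range(width):
            counts = Counter(frame[y][x] for frame in frames)
            # Keep the value seen most often at this pixel position.
            row.append(counts.most_common(1)[0][0])
        background.append(row)
    return background
```

For example, with three one-row frames in which a
bright object (value 200) moves across a
background of value 10, every pixel shows the
background in two of the three frames, so the
result contains only the background.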
Fig 5. Average reduction in computational
complexity achieved by the proposed methods
(3D-ME and 3D-ME-McFIS) relative to H.264/MVC on
four video sequences, with search lengths of 15
and 31.
Fig 6. Rate-distortion performance of H.264/MVC
and the proposed schemes (3D-ME and 3D-ME-McFIS)
on four standard video sequences: Exit,
Ball Room, Vassar, and Break Dancing.
Contact details: Manoranjan Paul
Phone: 61 2 6338 4260
Email: mpaul_at_csu.edu.au
Fig 3 (b). Formation of 3D block