Title: Kai-Chao Yang
1 Hierarchical Prediction Structures in H.264/AVC
2 Outline
- Analysis of Hierarchical B Pictures and MCTF (ICME 2006)
- Multiple Description Video Coding using Hierarchical B Pictures (ICME 2007)
- Rate-Distortion Optimization for Fast Hierarchical Picture Transcoding (ISCAS 2006)
- All Related Research
3 Analysis of Hierarchical B Pictures and MCTF
- Heiko Schwarz, Detlev Marpe, and Thomas Wiegand
- ICME 2006
4 Hierarchical B-Pictures (1/2)
- Key pictures
- Hierarchical prediction structures
- Dyadic structure
- Non-dyadic structure
[Figure: a sequence of GOPs, each coded with hierarchical prediction; the key pictures bounding the GOPs are an IDR picture followed by I/P pictures]
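As a rough illustration of the dyadic structure named on this slide, the sketch below assigns a temporal level to each picture of a GOP and derives one valid coding order. The function names and the level-by-level ordering are assumptions made for illustration; an encoder may choose a different order, for example to reduce buffering.

```python
def temporal_level(poc, gop_size):
    """Temporal level of the picture at display position `poc` in a dyadic GOP.

    Level 0 is a key picture; higher levels are B pictures deeper in the hierarchy.
    """
    # A dyadic structure needs a power-of-two GOP size.
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("dyadic structure needs a power-of-two GOP size")
    pos = poc % gop_size
    if pos == 0:
        return 0
    level = gop_size.bit_length() - 1  # log2(gop_size), the deepest level
    while pos % 2 == 0:
        pos //= 2
        level -= 1
    return level


def coding_order(gop_size):
    """One valid coding order for display positions 1..gop_size of a GOP.

    Pictures are sorted by level, then by display position, so every
    reference picture is coded before the pictures that depend on it.
    """
    positions = range(1, gop_size + 1)
    return sorted(positions, key=lambda p: (temporal_level(p, gop_size), p))


if __name__ == "__main__":
    print([temporal_level(p, 8) for p in range(9)])
    print(coding_order(8))
```

With a GOP size of 8, the key picture at position 8 is coded first, then position 4, then 2 and 6, and finally the odd positions.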
5 Hierarchical B-Pictures (2/2)
- Coding delay
- Minimum coding delay hierarchy levels 1
- Memory requirement
- Maximum decoded picture buffer (DPB) size: 16 frames
- Reference picture buffering type
- Sliding window
- Adaptive memory control
- Memory management control operation (MMCO)
- 0: end the MMCO loop
- 1: mark a short-term frame as unused
- 2: mark a long-term frame as unused
- 3: assign a long-term index to a frame
- 4: specify the maximum long-term frame index
- 5: reset
- Minimum DPB size hierarchy levels
[Figure: coding order 0 to 5; frame buffer holding short-term and long-term frames at positions 0, 1, 2, ..., N-2, N-1, N, where a new frame replaces the oldest one]
Thomas Wiegand, Joint Committee Draft (CD),
Joint Video Team, JVT-C167, 6-10 May, 2002
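The two buffering types on this slide can be sketched as a toy reference buffer. This is a simplified model for illustration, not a conforming H.264 DPB: the class and method names are hypothetical, frames are plain integers, and MMCO operations 0 and 4 are not modeled.

```python
from collections import deque


class ReferenceBuffer:
    """Toy reference-picture buffer (illustrative only)."""

    def __init__(self, max_frames):
        self.max_frames = max_frames
        self.short_term = deque()  # oldest frame at the left
        self.long_term = {}        # long-term index -> frame id

    def add_sliding_window(self, frame):
        """Store a new reference frame; if the buffer is full, evict the
        oldest short-term frame first. Returns the evicted frame or None."""
        evicted = None
        used = len(self.short_term) + len(self.long_term)
        if used >= self.max_frames and self.short_term:
            evicted = self.short_term.popleft()
        self.short_term.append(frame)
        return evicted

    def mark_short_term_unused(self, frame):   # MMCO 1
        self.short_term.remove(frame)

    def mark_long_term_unused(self, index):    # MMCO 2
        del self.long_term[index]

    def assign_long_term(self, frame, index):  # MMCO 3
        self.short_term.remove(frame)
        self.long_term[index] = frame

    def reset(self):                           # MMCO 5
        self.short_term.clear()
        self.long_term.clear()


if __name__ == "__main__":
    buf = ReferenceBuffer(max_frames=3)
    for f in (0, 1, 2):
        buf.add_sliding_window(f)
    buf.assign_long_term(0, index=0)   # frame 0 becomes a long-term frame
    print(buf.add_sliding_window(3))   # evicts the oldest short-term frame
    print(list(buf.short_term), buf.long_term)
```

Long-term frames are never evicted by the sliding window here, which is why adaptive memory control is needed to release them explicitly.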
6 Coding Efficiency of Hierarchical B-Pictures
- $QP_k = QP_{k-1} + (k = 1\ ?\ 4 : 1)$
- Problem: PSNR fluctuations
[Figure panels: high spatial detail and slow regular motion; fast and complex motion]
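The QP rule on this slide reads as a cascade: level 1 adds 4 to the base-level QP, and each further level adds 1. A minimal sketch, assuming level 0 holds the key pictures and clipping to the H.264 maximum QP of 51 (the function name and the clipping step are illustrative additions):

```python
def cascaded_qp(base_qp, levels, max_qp=51):
    """QP per hierarchy level: QP_k = QP_{k-1} + (4 if k == 1 else 1)."""
    qps = [base_qp]
    for k in range(1, levels):
        step = 4 if k == 1 else 1
        qps.append(min(qps[-1] + step, max_qp))  # clip to the valid QP range
    return qps


if __name__ == "__main__":
    print(cascaded_qp(30, 4))
```

Because pictures at higher levels are quantized more coarsely than the key pictures, picture quality varies from frame to frame, which is one source of the PSNR fluctuations noted above.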
7 Visual Quality
- Comparison of visual quality
- Larger GOP sizes preserve finer detail in background regions.
[Figure: IBBP vs. GOP size 16]
8 MCTF Versus Hierarchical B-Pictures
- Drawbacks of MCTF
- Open-loop encoder control
- Significant cost in update stage
9 Multiple Description Video Coding using Hierarchical B Pictures
- Minglei Liu and Ce Zhu
- ICME 2007
10 Concept of Multiple Description Coding
- Multiple bit-streams are generated from one source signal and transmitted over separate channels
[Figure: MDC system. The MDC encoder turns the source signal into two descriptions, S1 and S2, sent over channel 1 and channel 2. The MDC decoder contains three decoders, producing the decoded signal from S1 alone, from S2 alone, and from S1 and S2 together.]
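11 The Proposed Architecture for MDC
- GOP size: 8
- Two output streams (S1, S2) are generated
[Figure: S1 uses GOPs bounded by frames i and i+8; S2 uses GOPs bounded by frames i+1 and i+9; the combination contains frames i, i+8, i+1, i+9, i+3, i+5, i+7, i+6, i+4, i+2]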
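To see what staggering the two GOPs by one frame does, the sketch below lists the hierarchy level of each frame in S1 and in S2. It assumes a dyadic hierarchy inside each GOP and key-picture offsets of 0 and 1, as the figure labels suggest; the slide does not state how the decoder combines the two streams, so the code only reports the levels. All names are hypothetical.

```python
def dyadic_level(pos, gop_size):
    """Hierarchy level of a frame `pos` frames after a key picture (0 = key)."""
    pos %= gop_size
    if pos == 0:
        return 0
    level = gop_size.bit_length() - 1  # log2(gop_size) for a power of two
    while pos % 2 == 0:
        pos //= 2
        level -= 1
    return level


def description_levels(num_frames, gop_size=8, offsets=(0, 1)):
    """For each frame i+n, return its hierarchy level in each description."""
    return [tuple(dyadic_level(n - off, gop_size) for off in offsets)
            for n in range(num_frames)]


if __name__ == "__main__":
    for n, (l1, l2) in enumerate(description_levels(10)):
        print(f"frame i+{n}: level {l1} in S1, level {l2} in S2")
```

Under these assumptions, a frame that sits at the deepest level in one description is at a shallower level in the other for every odd offset, and the key pictures of one stream are always deepest-level pictures of the other.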
12 Coding Efficiency (1/2)
- Improvement of coding efficiency
- Increasing QP values for higher layers
- Transmitting MVs only for higher layers
- Skipping frames at higher layers
13 Coding Efficiency (2/2)
Max. QP: 51 for the highest level
[Figure: side distortion and central distortion curves]
14 Rate-Distortion Optimization for Fast Hierarchical Picture Transcoding
- Huifeng Shen, Xiaoyan Sun, Feng Wu, and Shipeng Li
- ISCAS 2006
15 Rate Reduction Transcoding (1/3)
- Cascaded pixel-domain transcoding structure
- Fully decoding the original signal, and then
re-encoding it
A. Vetro, C. Christopoulos, and H. Sun, "Video transcoding architectures and techniques: an overview", IEEE Signal Processing Magazine, March 2003.
16 Rate Reduction Transcoding (2/3)
- Open-loop transcoding in coded domain
- Partially decoding the original signal and re-quantizing the DCT coefficients
- Drawback: drift
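The re-quantization step of open-loop transcoding can be sketched as follows. This is a minimal illustration assuming a plain uniform scalar quantizer with step sizes `q_in` and `q_out`; both names are hypothetical, and H.264 itself uses QP-indexed step sizes with an integer transform.

```python
def requantize(levels, q_in, q_out):
    """Map quantized coefficient levels from step size q_in to step size q_out."""
    out = []
    for level in levels:
        value = level * q_in                 # dequantize: reconstruct the coefficient
        mag = int(abs(value) / q_out + 0.5)  # requantize the magnitude, rounding to nearest
        out.append(mag if value >= 0 else -mag)
    return out


if __name__ == "__main__":
    print(requantize([12, -5, 3, 1, 0], q_in=4, q_out=10))
```

Small coefficients fall to zero under the coarser step, which saves rate. Since no reconstruction loop tracks the resulting error, pictures predicted from the transcoded ones accumulate a mismatch, which is the drift mentioned on this slide.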
17 Rate Reduction Transcoding (3/3)
- Closed-loop transcoding with drift compensation
- Partially decoding the original signal, and then compensating the drift introduced by re-quantization
18 Hierarchical B Pictures Transcoding
- Open-loop transcoding method can be used
- Motion information is unchanged; DCT coefficients are truncated, re-quantized, or partially discarded
- Drift inside a GOP will not propagate to other GOPs
- However, motion is more important in the hierarchical B-picture structure
- At low bit-rates, most bits are spent on motion information
- Proposed RDO model: combination of texture RDO and motion RDO
19 Traditional Rate-Distortion Model
- RD model
-
- S = (S1, ..., Sk) denotes k MBs
- I = (I1, ..., Ik) denotes the k coding parameters of S
- Fully decoding and re-encoding is needed!
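Rate-distortion optimization of this kind is usually written as a Lagrangian cost, J = D + lambda * R, minimized over the candidate coding parameters. The sketch below shows that selection for a single MB; the candidate names and the distortion and rate numbers are made up for illustration and do not come from the slides.

```python
def best_parameter(candidates, lam):
    """Pick the candidate that minimizes the Lagrangian cost J = D + lam * R.

    candidates: iterable of (name, distortion, rate) tuples.
    """
    return min(candidates, key=lambda c: c[1] + lam * c[2])


if __name__ == "__main__":
    # Hypothetical (mode, distortion, rate in bits) measurements for one MB.
    modes = [("skip", 100.0, 1), ("inter16x16", 40.0, 20), ("intra4x4", 10.0, 90)]
    for lam in (0.1, 1.0, 10.0):
        print(lam, best_parameter(modes, lam)[0])
```

Measuring D and R for every candidate is what makes this expensive in a transcoder: each measurement requires reconstructing the MB, which is why full decoding and re-encoding is needed.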
20 Proposed Rate-Distortion Model (1/4)
- Proposed RD model
-
-
-
- Claim
-
- Rtexture: rate spent in coding quantized DCT coefficients
- Rmotion: rate spent in coding MB modes, block modes, and MVs
-
- Dtexture: distortion caused by downscaled texture with unchanged MVs
- Dmotion: distortion caused by motion adjustment relative to the unchanged-motion case
21 Proposed Rate-Distortion Model (2/4)
- Texture RDO model
- To minimize the RD function,
- ?
- Let
- ?
- ? ?
-
N. Kamaci, Y. Altunbasak, and R. M. Mersereau, "Frame bit allocation for the H.264/AVC video coder via Cauchy-density-based rate and distortion models", IEEE Trans. on CSVT, Vol. 15, No. 8, Aug. 2005.
2.54
-5.35
22 Proposed Rate-Distortion Model (3/4)
- Motion RDO model
- Rmotion can be easily computed, but Dmotion is unknown
- Dmotion can be approximated by the MV mean-square error
A. Secker and D. Taubman, "Highly scalable video
compression with scalable motion coding", IEEE
Trans. on Image Processing, Vol. 13, No.8, August
2004.
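The MV mean-square error used to approximate Dmotion can be computed as below. This is a minimal sketch assuming one (dx, dy) vector per block and equal weighting of all blocks; any scaling from MV error to picture distortion is left out, and the function name is hypothetical.

```python
def mv_mse(original, adjusted):
    """Mean squared Euclidean error between two lists of (dx, dy) motion vectors."""
    if len(original) != len(adjusted) or not original:
        raise ValueError("need two non-empty MV lists of equal length")
    total = sum((ox - ax) ** 2 + (oy - ay) ** 2
                for (ox, oy), (ax, ay) in zip(original, adjusted))
    return total / len(original)


if __name__ == "__main__":
    original = [(4, -2), (1, 3), (0, 0), (-6, 5)]
    adjusted = [(4, -2), (0, 2), (0, 0), (-4, 4)]
    print(mv_mse(original, adjusted))
```

The appeal of this approximation is that it needs only the motion vectors, so a candidate motion adjustment can be scored without reconstructing any pixels.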
23 Proposed Rate-Distortion Model (4/4)
- Motion adjustment
- Original
- Adjustment
24 Simulation results
25 All related research
- Rate control optimization
- Bit allocation
- Trade-off between coding efficiency and delay
- Multi-view
- Temporal scalable coding in SVC
- Elimination of PSNR fluctuation?
- More efficient hierarchical structures?