Title: Introduction of Scalable Video Coding
1 Introduction of Scalable Video Coding
2 Outline
- What is Scalability
- Drawbacks of traditional scalable video coding
- Open-Loop Motion Prediction Video Coding
- Current SVC status
3 What is Scalability
- For video over networks with non-guaranteed QoS (Quality of Service), we want to provide the best possible service to every consumer
4 Scalability
- SNR scalability
  - Decode at different bit-rates
- Temporal scalability
  - Display at different frame rates
- Spatial scalability
  - Display at different frame sizes
5 How to Achieve Scalability
- One bitstream, multiple adaptations
[Figure: one bitstream adapted to several formats; labels: D1 30 fps, CIF 15 fps, QCIF 15 fps]
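The "one bitstream, multiple adaptations" idea can be sketched as a simple packet filter. This is only an illustration: the `Packet` type, the `extract` function, and the numbering of the spatial and temporal levels are hypothetical and not taken from any standard.

```python
from dataclasses import dataclass

# Hypothetical layer labels; the level numbering is an assumption for illustration.
@dataclass(frozen=True)
class Packet:
    spatial: int    # 0 = QCIF, 1 = CIF, 2 = D1
    temporal: int   # 0 = 15 fps, 1 = 30 fps
    payload: bytes

def extract(bitstream, max_spatial, max_temporal):
    """Keep only the packets needed for the requested operating point."""
    return [p for p in bitstream
            if p.spatial <= max_spatial and p.temporal <= max_temporal]

# One encoded stream containing every combination of layers.
stream = [Packet(s, t, b"") for s in range(3) for t in range(2)]

qcif_15 = extract(stream, max_spatial=0, max_temporal=0)
cif_15 = extract(stream, max_spatial=1, max_temporal=0)
d1_30 = extract(stream, max_spatial=2, max_temporal=1)
```

The encoder runs once; each consumer receives the subset of packets that matches its display size and frame rate, with no re-encoding.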
6 Traditional Scalable Video Coding
- MPEG-4 FGS (Fine Granularity Scalability profile)
  - Uses the Simple or Advanced Simple Profile as the base layer
  - Supports various layers of SNR enhancement
  - Can be combined with temporal scalability
  - Closed-loop video coding
7 FGS Scheme
- The base layer is coded at the lower bound of the bit-rate range
- The enhancement layer is provided to reach the upper bound
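MPEG-4 FGS codes the enhancement layer bit-plane by bit-plane, so the stream can be cut anywhere between the two bounds. A minimal sketch of that idea, assuming non-negative integer residuals and skipping the DCT, sign handling, and entropy coding; the function names and sample values are hypothetical.

```python
def to_bitplanes(residual, num_planes):
    """Split non-negative integer residuals into bit-planes, most significant first."""
    return [[(v >> p) & 1 for v in residual]
            for p in range(num_planes - 1, -1, -1)]

def from_bitplanes(planes, num_planes, length):
    """Rebuild residuals from however many leading bit-planes were received."""
    values = [0] * length
    for i, plane in enumerate(planes):
        weight = 1 << (num_planes - 1 - i)
        values = [v + bit * weight for v, bit in zip(values, plane)]
    return values

original = [200, 57, 131]
base = [192, 48, 128]          # coarse base-layer reconstruction (lower bound)
residual = [o - b for o, b in zip(original, base)]
planes = to_bitplanes(residual, num_planes=4)

def decode(received_planes):
    """Base layer plus the first `received_planes` enhancement bit-planes."""
    enh = from_bitplanes(planes[:received_planes], 4, len(base))
    return [b + e for b, e in zip(base, enh)]
```

Receiving no bit-planes returns the base layer alone; receiving all of them recovers the original samples, and anything in between gives an intermediate quality.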
8 MPEG-4 FGS
9 Performance of FGS
10 Drawbacks of FGS
- Closed-loop video coding
  - Error propagation
  - The reference frame is the reconstructed frame
  - DPCM loop
  - Reference frames in the encoder and decoder must be the same
- Weak scalability
  - PSNR penalty of FGS scalability
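The error-propagation problem of a closed loop can be illustrated with a toy DPCM coder. This is a sketch under strong assumptions: one sample per frame, a uniform quantizer, and a truncated stream modelled by re-quantizing the residuals with a coarser step. All names and values are hypothetical.

```python
def quantize(value, step):
    """Uniform quantizer: round to the nearest multiple of step."""
    return step * round(value / step)

def encode(frames, step):
    """Closed-loop DPCM: predict each sample from the previous reconstructed sample."""
    residuals, reference = [], 0
    for x in frames:
        r = quantize(x - reference, step)
        residuals.append(r)
        reference += r          # encoder tracks what a full-rate decoder reconstructs
    return residuals

def decode(residuals, step):
    """Decoder loop; a coarser step models a truncated (lower-rate) bitstream."""
    out, reference = [], 0
    for r in residuals:
        reference += quantize(r, step)
        out.append(reference)
    return out

frames = [10, 13, 17, 22, 26, 31, 35, 40]
residuals = encode(frames, step=1)
full = decode(residuals, step=1)         # same reference as the encoder
truncated = decode(residuals, step=4)    # reference differs from the encoder's
errors = [abs(a - b) for a, b in zip(frames, truncated)]
```

With the full stream the decoder reference matches the encoder and the frames are recovered. With the truncated stream the references diverge and the mismatch accumulates from frame to frame, which is the drift that forces FGS to keep the enhancement layer out of the prediction loop.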
11 Open-loop Motion Prediction Concept
- Advantages of open-loop video coding
  - Provides drift-free scalability
  - Temporal scalability
12 Motion-Compensated Temporal Filtering (MCTF)
For example: temporal level 3, spatial level 3
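13 MCTF Haar filter
- $L(m,n) = \frac{1}{2}\,[B(m,n) + A(m-k,\,n-l)]$
- $L(m,n) = B(m,n)$ if multiple connected
- $H(m,n) = A(m,n) - B(m+k,\,n+l)$
- $H(m,n) = A(m,n) - \hat{A}(m,n)$ if unconnected
Here $(k,l)$ is the block displacement and $\hat{A}(m,n)$ is a prediction of $A(m,n)$.
[Figure labels: unconnected, multiple connected]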
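A minimal sketch of the Haar analysis and synthesis steps for the connected-pixel case. It assumes a single global integer displacement (k, l) with wrap-around at the borders, so every pixel is connected exactly once, and it assumes the sign convention L(m,n) = [B(m,n) + A(m-k, n-l)]/2, H(m,n) = A(m,n) - B(m+k, n+l). Function names are hypothetical.

```python
def shift(frame, k, l):
    """Return S with S[m][n] = frame[(m - k) % rows][(n - l) % cols] (wrap-around)."""
    rows, cols = len(frame), len(frame[0])
    return [[frame[(m - k) % rows][(n - l) % cols] for n in range(cols)]
            for m in range(rows)]

def haar_analysis(A, B, k, l):
    """L(m,n) = [B(m,n) + A(m-k,n-l)] / 2 and H(m,n) = A(m,n) - B(m+k,n+l)."""
    A_at_B = shift(A, k, l)        # A(m-k, n-l)
    B_at_A = shift(B, -k, -l)      # B(m+k, n+l)
    L = [[(b + a) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(A_at_B, B)]
    H = [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B_at_A)]
    return L, H

def haar_synthesis(L, H, k, l):
    """Invert the analysis step: B = L - H(m-k,n-l)/2, then A = H + B(m+k,n+l)."""
    H_at_B = shift(H, k, l)        # H(m-k, n-l) = A(m-k, n-l) - B(m, n)
    B = [[lo - hi / 2 for lo, hi in zip(rl, rh)] for rl, rh in zip(L, H_at_B)]
    B_at_A = shift(B, -k, -l)
    A = [[hi + b for hi, b in zip(rh, rb)] for rh, rb in zip(H, B_at_A)]
    return A, B

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = shift(A, 1, 0)                 # B is A displaced by (k, l) = (1, 0)
L, H = haar_analysis(A, B, 1, 0)
A_rec, B_rec = haar_synthesis(L, H, 1, 0)
```

Because B is an exact displaced copy of A in this example, the high-pass frame H carries no energy, and the synthesis step returns both frames unchanged. Real sequences also contain unconnected and multiple-connected pixels, which this sketch does not handle.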
14 MCTF
- Can be extended to longer filters, e.g. the 5/3 filter
- The basic idea is still block displacement
- Traditional MCTF achieves perfect reconstruction (PR) only with integer-pixel motion accuracy
15 Why is sub-pixel MCTF not PR?
- PR is possible only if the interpolated A and B pixels can be obtained, and they are unavailable
- So only a 2-level temporal DWT can be performed
Closest integer to d_m
W is the interpolation filter
16 Performance of MCTF
- Lifting vs. invertible MC vs. integer-pixel
- Entropy coding should be 3DSB-FSSB, not EZBC
17 Motion-Compensated Temporal Lifting (MCTL)
- Perfect reconstruction at any fractional-pixel accuracy
- Invertible because of the lifting scheme
- Example: Haar filter
W(a->b): warping frame a to frame b
Inverse filtering
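A minimal sketch of why lifting stays invertible at fractional-pixel accuracy. It assumes 1-D frames, a half-pixel warp built from linear interpolation with repeated edge samples, and a predict/update ordering in which frame B is predicted from the warped frame A. All of these are illustrative assumptions, and the function names are hypothetical.

```python
def warp_half_pel(frame, direction):
    """Half-pixel shift by linear interpolation (edge samples are repeated)."""
    n = len(frame)
    out = []
    for i in range(n):
        j = min(max(i + direction, 0), n - 1)   # neighbour, clamped at the border
        out.append((frame[i] + frame[j]) / 2)
    return out

def mctl_haar_forward(A, B):
    w_ab = warp_half_pel(A, +1)                 # W(a->b): warp A toward B
    H = [b - w for b, w in zip(B, w_ab)]        # predict step
    w_ba = warp_half_pel(H, -1)                 # W(b->a): warp H back toward A
    L = [a + w / 2 for a, w in zip(A, w_ba)]    # update step
    return L, H

def mctl_haar_inverse(L, H):
    w_ba = warp_half_pel(H, -1)
    A = [lo - w / 2 for lo, w in zip(L, w_ba)]  # undo update
    w_ab = warp_half_pel(A, +1)
    B = [h + w for h, w in zip(H, w_ab)]        # undo predict
    return A, B

A = [4, 8, 6, 2, 10]
B = [5, 7, 3, 9, 1]
L, H = mctl_haar_forward(A, B)
A_rec, B_rec = mctl_haar_inverse(L, H)
```

The warp itself is never inverted. The decoder recomputes exactly the same warped signals from data it already has and subtracts them, so any interpolation filter W can be used.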
18 MCTL
W is the interpolation filter
19 MCTL Using 5/3 Filter
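A minimal sketch of the 5/3 lifting steps along the time axis. It assumes zero motion (the warping operators are replaced by the identity), one sample per frame, an even number of frames, and symmetric extension at both ends. Function names are hypothetical.

```python
def lift_53_forward(x):
    """5/3 temporal lifting over an even-length sequence of samples."""
    n = len(x) // 2
    even, odd = x[0::2], x[1::2]
    # Predict: each odd frame from the average of its two even neighbours.
    H = [odd[k] - (even[k] + even[min(k + 1, n - 1)]) / 2 for k in range(n)]
    # Update: each even frame with a quarter of the neighbouring high-pass frames.
    L = [even[k] + (H[max(k - 1, 0)] + H[k]) / 4 for k in range(n)]
    return L, H

def lift_53_inverse(L, H):
    """Run the two lifting steps backwards with the signs flipped."""
    n = len(L)
    even = [L[k] - (H[max(k - 1, 0)] + H[k]) / 4 for k in range(n)]
    odd = [H[k] + (even[k] + even[min(k + 1, n - 1)]) / 2 for k in range(n)]
    x = [0.0] * (2 * n)
    x[0::2], x[1::2] = even, odd
    return x

x = [1, 2, 3, 4, 5, 6, 7, 8]
L, H = lift_53_forward(x)
x_rec = lift_53_inverse(L, H)
```

With motion compensation, each neighbouring frame would be warped before it enters the predict or update step; the inverse keeps the same structure, which is what preserves perfect reconstruction.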
20 ABSMA vs. H.264
21 Current Wavelet Video Coding
- t+2D MCTF
  - Temporal decomposition first
  - Weak in spatial scalability
- 2D+t MCTF (in-band MCTF)
  - Spatial decomposition first
  - Weak in temporal scalability
- 2D+t+2D MCTF
  - Hybrid method
22 In-band MCTF Scheme
23 (2D+t+2D) Structure for MC 3D-DWT
24 Scalability for 3D-DWT Video Coding
- Scalability
  - SNR scalability
    - 3D EBCOT coding
  - Temporal scalability
  - Spatial scalability
- Other topics: scalable motion vectors
[Figure: transforms in the encoder and in the decoder; example decoding at half frame rate and ¼ frame size]
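A minimal sketch of how a decoder reaches half frame rate and ¼ frame size by keeping only the low-pass subbands. It assumes one Haar level in time with zero motion and one Haar level in space; the function names and the test video are hypothetical.

```python
def temporal_low(frames):
    """One temporal Haar level (zero motion assumed): average each frame pair."""
    return [[[(a + b) / 2 for a, b in zip(ra, rb)] for ra, rb in zip(fa, fb)]
            for fa, fb in zip(frames[0::2], frames[1::2])]

def spatial_low(frame):
    """One spatial Haar level: the LL band is the mean of each 2x2 block."""
    return [[(frame[r][c] + frame[r][c + 1]
              + frame[r + 1][c] + frame[r + 1][c + 1]) / 4
             for c in range(0, len(frame[0]), 2)]
            for r in range(0, len(frame), 2)]

# Four 4x4 frames; frame t is filled with the constant value t.
video = [[[float(t)] * 4 for _ in range(4)] for t in range(4)]

half_rate = temporal_low(video)                       # 2 frames of 4x4
quarter_size = [spatial_low(f) for f in half_rate]    # 2 frames of 2x2
```

The temporal high-pass frames and the spatial detail bands are simply not decoded, so the same bitstream serves the reduced operating point without re-encoding.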
25 Current SVC status
- Core Experiment 1
  - Adaptive Block-size Motion Alignment (ABSMA) by Microsoft Research Asia
- Core Experiment 2
  - H.264 (base layer) + MCTF (enhancement layer)