Title: Introduction to H.264 Video Standard
1Introduction to H.264 Video Standard
- Anurag Jain, Texas Instruments
2H.264 Background
- Jointly developed by ITU-T and MPEG.
- Up to 50% more efficient at the same visual quality compared to MPEG-4 ASP.
- Supports a wide range of applications (interlaced, progressive, low bit-rate, studio quality, digital cinema, etc.).
- Multiple profiles (Baseline, Main, Extended, High, FRExt).
- Good results obtained from interoperability tests, making it suitable for wide deployment in a short span of time.
3H.264 Encoder Block Diagram
- Quantization: finer step-size resolution for finer control of bit rate
- Intra prediction modes: 9 (4x4) + 4 (16x16) = 13 modes
- Entropy coding: single Universal VLC and Context Adaptive VLC, or Context-Based Adaptive Binary Arithmetic Coding
- Seven block sizes and shapes
- Multiple reference picture selection
- 1/4-pel motion estimation accuracy
- Referenced B-frames
- Transform: integer, 16-bit fixed-point transform with no mismatch
4Common Elements
- Common elements with other standards
- Macroblocks: 16x16 luma + 2 x 8x8 chroma samples
- Input: association of luma and chroma and conventional block motion displacement
- Motion vectors over picture boundaries
- Block Transform
- Variable block-size motion
- I, P and B picture coding types
5High Level Coding Tools
- Sequence and Picture Parameter Sets (SPS, PPS)
- Picture Order Count (POC)
- Decoded Picture Buffer (DPB)
- Slice group map (FMO)
- Multiple slices and arbitrary arrangements (ASO)
- Supplemental Enhancement Information (SEI)
- Hypothetical Reference Decoder (HRD)
- Video Usability Information (VUI)
6High Level Tools Coding Hierarchy
- A coded sequence contains one or more access units
- An access unit is a set of NAL units that contains all necessary information for decoding exactly one (primary) coded picture
- A coded picture is divided into slices (VCL NAL units)
- A slice contains a slice header and a set of macroblocks
- A macroblock contains a 16x16 luma block and two chroma blocks
- An I-slice contains a set of INTRA-coded macroblocks
- A P-slice contains a set of INTRA- and INTER-coded macroblocks
- An IDR (instantaneous decoding refresh) picture contains only I-slices (SI-slices too in Extended profile)
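To make the hierarchy concrete, here is a minimal sketch of how the first byte of a NAL unit could be split into its fields to tell parameter sets from slice data. The helper names (`parse_nal_header`, `is_vcl`) and the small type table are illustrative assumptions, not part of the slides, and only a handful of NAL unit types are listed.

```python
# Illustrative subset of NAL unit types (assumption: only the common ones are listed).
NAL_TYPE_NAMES = {
    1: "non-IDR slice",
    5: "IDR slice",
    6: "SEI",
    7: "SPS",
    8: "PPS",
    9: "access unit delimiter",
}


def parse_nal_header(first_byte):
    """Split the one-byte NAL unit header into its three fields."""
    return {
        "forbidden_zero_bit": (first_byte >> 7) & 0x1,  # 1 bit, expected to be 0
        "nal_ref_idc": (first_byte >> 5) & 0x3,         # 2 bits, 0 = not used for reference
        "nal_unit_type": first_byte & 0x1F,             # 5 bits
    }


def is_vcl(nal_unit_type):
    """Types 1..5 carry coded slice data (VCL); the rest are non-VCL."""
    return 1 <= nal_unit_type <= 5


if __name__ == "__main__":
    for byte in (0x67, 0x68, 0x65, 0x41):
        hdr = parse_nal_header(byte)
        name = NAL_TYPE_NAMES.get(hdr["nal_unit_type"], "other")
        print(hex(byte), hdr, name, "VCL" if is_vcl(hdr["nal_unit_type"]) else "non-VCL")
```

A decoder would typically use such a check to collect the NAL units of one access unit before decoding the picture.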
7Sequence Parameter Set
- Profile @ Level indicator
- Profile constraint indicator
- Sequence parameter set ID (0..31)
- Picture order count type and related information
- DPB info
- Picture size
- Frame/field coding flag
- Method for vector derivation of B-direct mode
- Frame cropping parameters
- VUI_parameters (Annex E, Video Usability Information)
8Picture Parameter Set
- Picture parameter ID (0..255)
- Sequence parameter ID (0..31)
- Entropy coding mode flag (CABAC/CAVLC)
- Slice POC info presence flag
- Slice group map parameters
- Max. number (1..16) of ref. frames used for decoding slices
- Weighted prediction flags
- Quantization scales (QP minus 26, range -26..25)
- Chroma QP offset for loop-filter (-12..12)
- Slice loop-filter control flag (alpha/beta table offsets)
- Whether INTRA prediction may use pixels of INTER-coded neighboring MBs
- Slice redundant picture parameters presence flag
9Slice Header
- Starting macroblock address
- Slice type (I, P, B, SI, SP )
- Temporal reference (frame_num)
- Picture parameter set ID (0..255)
- Interlaced frame/field coding, top/bottom field indicators
- IDR picture ID (0..65535)
- Slice POC parameters
- Redundant picture count (0..127, 0 for baseline)
- B-slice temporal or spatial direct mode indicator
- Max. number (1..32) of ref. pictures for decoding current slice
- Reference picture reordering parameters (DPB)
- Weighted prediction parameters
- DPB marking parameters (e.g. short-term, long-term prediction pictures)
- Slice delta QP (-26..25)
- SP switch flag and SP/SI slice QP
- Loop-filter indicator (disable_deblocking_filter_idc: 0 enabled, 1 disabled, 2 enabled but filtering across slice boundaries disabled)
- Loop-filter alpha/beta table access offset (-6..6)
- Slice group change cycle (derives the number of MBs in slice group 0)
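The PPS "QP minus 26" value and the slice delta QP combine into the initial luma QP of a slice. The sketch below assumes 8-bit video, where the resulting QP must lie in 0..51. The function name `slice_qp` is hypothetical, and the parameter names follow the usual syntax element names.

```python
def slice_qp(pic_init_qp_minus26, slice_qp_delta):
    """Initial luma QP of a slice: 26 + PPS offset + slice delta (8-bit video assumed)."""
    if not -26 <= pic_init_qp_minus26 <= 25:
        raise ValueError("pic_init_qp_minus26 out of range -26..25")
    qp = 26 + pic_init_qp_minus26 + slice_qp_delta
    if not 0 <= qp <= 51:
        raise ValueError("resulting slice QP out of range 0..51")
    return qp


if __name__ == "__main__":
    print(slice_qp(0, 0))    # PPS default, no slice adjustment
    print(slice_qp(2, -4))   # PPS starts at 28, slice lowers it by 4
```

Keeping the base QP in the PPS and only a small delta in each slice header keeps the per-slice overhead low.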
10Slice Group Maps
11Ordering of Slices within Slice Groups
12Low Level Coding Tools
- Motion compensated prediction
- Additional intra modes for spatial compensation
- Transform: 4x4 integer transform (Baseline, Main Profiles)
- Transform: 8x8 integer transform (High Profile)
- Quantization: scalar quantization
- Entropy coding: CABAC / CAVLC
- In-loop deblocking filter
13Enhanced MC (Inter Prediction)
- Every macroblock can be split in one of 7 ways for improved motion estimation
- Accuracy of motion compensation: 1/4 pixel
- Up to 5 reference frames for SDTV size @ Level 3
- Weighted predictions
- Reference B pictures
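The figure of 5 reference frames for SDTV at Level 3 follows from the decoded picture buffer limit of the level. The sketch below assumes a Level 3 limit of 8100 macroblocks in the DPB (a value quoted from memory of the level table, so treat it as an assumption) and a cap of 16 frames. The function name is hypothetical.

```python
# Assumption: Level 3 allows 8100 macroblocks in the decoded picture buffer.
MAX_DPB_MBS_LEVEL_3 = 8100


def max_dpb_frames(width, height, max_dpb_mbs=MAX_DPB_MBS_LEVEL_3):
    """Number of frames that fit in the DPB for a given picture size, capped at 16."""
    mbs_wide = (width + 15) // 16   # picture width in macroblocks
    mbs_high = (height + 15) // 16  # picture height in macroblocks
    return min(max_dpb_mbs // (mbs_wide * mbs_high), 16)


if __name__ == "__main__":
    print("720x576:", max_dpb_frames(720, 576))
    print("720x480:", max_dpb_frames(720, 480))
    print("352x288:", max_dpb_frames(352, 288))
```

Smaller pictures leave room for more reference frames at the same level, up to the cap.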
14B Slice - Direct Mode
- Direct mode
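- Forward / backward pair of bi-directional prediction
- Prediction signal is calculated by a linear combination of two blocks that are determined by the forward and backward motion vectors pointing to two reference pictures.
$$mvL0 = \frac{tb}{td}\, mvCol, \qquad mvL1 = -\frac{td - tb}{td}\, mvCol$$
where mvCol is the MV used in the co-located MB of the subsequent picture, tb is the temporal distance from the forward reference picture to the current picture, and td is the temporal distance from the forward reference picture to the subsequent picture.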
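A minimal sketch of the temporal direct scaling above, assuming integer motion vector components and simple rounding. The standard uses a specific fixed-point scaling that is not reproduced here, and the function name `temporal_direct` is illustrative.

```python
def temporal_direct(mv_col, tb, td):
    """Derive (mvL0, mvL1) from the co-located vector.

    mv_col: (x, y) vector of the co-located macroblock
    tb:     temporal distance, forward reference -> current picture
    td:     temporal distance, forward reference -> subsequent picture
    """
    if td == 0:
        raise ValueError("td must be non-zero")
    # mvL0 = tb * mvCol / td (simplified rounding, not the standard's fixed-point form)
    mv_l0 = tuple(round(tb * c / td) for c in mv_col)
    # mvL1 = -(td - tb) * mvCol / td, which equals mvL0 - mvCol
    mv_l1 = tuple(l0 - c for l0, c in zip(mv_l0, mv_col))
    return mv_l0, mv_l1


if __name__ == "__main__":
    # B picture halfway between its two references
    print(temporal_direct((8, -4), tb=1, td=2))
```

Because both vectors are derived from mvCol, no motion vector needs to be transmitted for a direct-mode block.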
15B Slice Multi-picture Reference Mode
- Generalized Bidirectional prediction
- Multiple reference pictures mode
- Two forward references: suitable for a region just before a scene change
- Two backward references: suitable for a region just after a scene change
16H.264 Intra Prediction
- 4 modes for 16x16 intra prediction
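The four 16x16 modes are vertical, horizontal, DC and plane. As an illustration, the sketch below implements the first three from the row of reconstructed pixels above the block (`top`) and the column to its left (`left`). The plane mode is omitted for brevity, both neighbors are assumed to be available, and the function names are hypothetical.

```python
def predict_vertical(top):
    """Copy the row above the block into every row."""
    n = len(top)
    return [list(top) for _ in range(n)]


def predict_horizontal(left):
    """Copy the pixel to the left of each row across that row."""
    n = len(left)
    return [[left[r]] * n for r in range(n)]


def predict_dc(top, left):
    """Fill the block with the rounded mean of the top and left neighbors."""
    n = len(top)
    shift = n.bit_length()  # log2(n) + 1 when n is a power of two (5 for n = 16)
    dc = (sum(top) + sum(left) + n) >> shift
    return [[dc] * n for _ in range(n)]


if __name__ == "__main__":
    top = [100] * 16
    left = [60] * 16
    print(predict_dc(top, left)[0][0])
    print(predict_vertical(top)[15][:4])
    print(predict_horizontal(left)[0][:4])
```

An encoder would typically evaluate each mode against the source block and keep the one with the lowest cost.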
17Luma Sub-Pixel Interpolation
18Chroma Sub-pel Calculation
If (vx, vy) is the luma vector, then xFracC = vx & 0x7, yFracC = vy & 0x7
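With 1/8-sample fractional offsets, a chroma prediction sample can be formed by bilinear weighting of its four integer-position neighbors. The sketch below assumes 4:2:0 video; A, B, C, D denote the top-left, top-right, bottom-left and bottom-right neighbors. The function name is illustrative.

```python
def chroma_sample(a, b, c, d, vx, vy):
    """Bilinear chroma interpolation at a 1/8-sample position.

    a, b: top-left and top-right integer samples
    c, d: bottom-left and bottom-right integer samples
    vx, vy: motion vector components (their low 3 bits give the fraction)
    """
    x_frac = vx & 0x7
    y_frac = vy & 0x7
    return ((8 - x_frac) * (8 - y_frac) * a
            + x_frac * (8 - y_frac) * b
            + (8 - x_frac) * y_frac * c
            + x_frac * y_frac * d
            + 32) >> 6  # weights sum to 64, +32 rounds to nearest


if __name__ == "__main__":
    print(chroma_sample(10, 20, 30, 40, 0, 0))  # integer position
    print(chroma_sample(10, 20, 30, 40, 4, 4))  # centre of the four samples
```

The four weights always sum to 64, so a flat region stays flat after interpolation.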
19Block Scanning Order in a MB
20Transform Quant
- Integer 4x4 DCT approximation (8x8 in the High Profile).
- 4x4 Hadamard transform applied to the DC coefficients of the transformed differences (i.e. residual coefficients) of the 4x4 blocks in INTRA_16x16 coded macroblocks.
- Scalar quantization.
[Figure: transform types. Labels: 4x4 Luma/Chroma AC, 8x8 Luma-Chroma, Hadamard]
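A minimal sketch of the 4x4 forward integer core transform, Y = Cf · X · Cf^T, using plain Python lists. The scaling that the standard folds into quantization is left out, and the helper names are illustrative.

```python
# Forward core transform matrix of the 4x4 integer DCT approximation.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Y = CF * X * CF^T, integer arithmetic only (no post-scaling)."""
    return matmul(matmul(CF, block), transpose(CF))


if __name__ == "__main__":
    flat = [[3] * 4 for _ in range(4)]
    for row in forward_core_transform(flat):
        print(row)
```

Since only additions, subtractions and shifts by one are needed, encoder and decoder can compute exactly the same result, which is what avoids mismatch.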
21Interlaced Coding
- Deblocking filter
- Frame / Field Adaptation
- Picture Adaptive Frame Field (PicAFF).
- Macroblock Adaptive Frame Field (MBAFF)
- Field scan and zig-zag scan options
Field Scan
Zig-zag Frame Scan
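The zig-zag frame scan of a 4x4 block can be generated by walking the anti-diagonals in alternating directions, as sketched below. Positions are raster indices (row * 4 + column). The field scan uses a different fixed table and is not reproduced here. The function names are illustrative.

```python
def zigzag_order(n=4):
    """Raster indices of an n x n block in zig-zag scan order."""
    order = []
    for d in range(2 * n - 1):              # d = row + column of the anti-diagonal
        rows = [r for r in range(n) if 0 <= d - r < n]
        if d % 2 == 0:
            rows.reverse()                  # even diagonals run bottom-left to top-right
        order.extend(r * n + (d - r) for r in rows)
    return order


def zigzag_scan(block):
    """Flatten a square block (list of rows) into zig-zag order."""
    n = len(block)
    return [block[i // n][i % n] for i in zigzag_order(n)]


if __name__ == "__main__":
    print(zigzag_order())
```

The scan places low-frequency coefficients first, so the non-zero values tend to cluster at the start of the sequence.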
22Entropy Coding
- Universal Variable Length Coding (UVLC) using Exp-Golomb codes.
- Context Adaptive VLC (CAVLC)
- Context Adaptive Binary Arithmetic Coding (CABAC)
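As an illustration of the Exp-Golomb codes mentioned above, the following sketch produces unsigned and signed codewords as bit strings and decodes the unsigned form. The function names `ue`, `se` and `decode_ue` are chosen to echo the usual descriptor names, but are otherwise illustrative.

```python
def ue(code_num):
    """Unsigned Exp-Golomb codeword: N zeros, then the (N+1)-bit binary of code_num + 1."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    bits = bin(code_num + 1)[2:]
    return "0" * (len(bits) - 1) + bits


def se(value):
    """Signed Exp-Golomb: positive v maps to 2v - 1, non-positive v maps to -2v."""
    code_num = 2 * value - 1 if value > 0 else -2 * value
    return ue(code_num)


def decode_ue(bitstring):
    """Decode one unsigned codeword from the start of a bit string."""
    zeros = len(bitstring) - len(bitstring.lstrip("0"))
    return int(bitstring[zeros:2 * zeros + 1], 2) - 1


if __name__ == "__main__":
    for k in range(6):
        print(k, ue(k))
    for v in (0, 1, -1, 2, -2):
        print(v, se(v))
```

Short codewords go to small values, which suits syntax elements whose small values are the most frequent.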
23CAVLC
Zig-zag order: 50 33 27 20 0 5 0 0 1 -1 0 0 0 0 0 0
- TotalCoeff: 7
- Trailing 1s: 2
- Trailing 1s signs: 1 0 (reverse order)
- Levels: 5 20 27 33 50 (reverse order)
- TotalZeros: 3
- RunsBefore: 0 2 1
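The symbols of the example can be derived mechanically from the zig-zag ordered coefficients. The sketch below only extracts the symbol values (it does not perform the table-based VLC coding itself), and the function name `cavlc_symbols` is hypothetical.

```python
def cavlc_symbols(coeffs):
    """Derive the CAVLC symbol values for one zig-zag ordered coefficient block."""
    nz = [i for i, c in enumerate(coeffs) if c != 0]  # positions of non-zero coefficients
    if not nz:
        return {"TotalCoeff": 0, "TrailingOnes": 0, "Signs": [],
                "Levels": [], "TotalZeros": 0, "RunsBefore": []}
    rev = [coeffs[i] for i in reversed(nz)]           # highest frequency first
    # Up to three consecutive +/-1 values at the high-frequency end
    t1 = 0
    for c in rev:
        if abs(c) == 1 and t1 < 3:
            t1 += 1
        else:
            break
    signs = [1 if c < 0 else 0 for c in rev[:t1]]     # 1 = negative
    levels = rev[t1:]                                 # remaining levels, reverse order
    total_zeros = nz[-1] + 1 - len(nz)                # zeros before the last non-zero
    # Zero runs preceding each coefficient, in reverse order, until no zeros are left
    runs, zeros_left = [], total_zeros
    positions = list(reversed(nz))
    for k in range(len(positions) - 1):
        if zeros_left == 0:
            break
        run = positions[k] - positions[k + 1] - 1
        runs.append(run)
        zeros_left -= run
    return {"TotalCoeff": len(nz), "TrailingOnes": t1, "Signs": signs,
            "Levels": levels, "TotalZeros": total_zeros, "RunsBefore": runs}


if __name__ == "__main__":
    block = [50, 33, 27, 20, 0, 5, 0, 0, 1, -1, 0, 0, 0, 0, 0, 0]
    for name, value in cavlc_symbols(block).items():
        print(name, value)
```

Running it on the coefficient sequence of the example reproduces the listed values.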
24Exp Golomb Coding
25Loop filter
26Profiles and Tools
27H.264 Profiles and Tools Graphical Representation
28FRExt Fidelity Range Extension
- Lossless representation
- Allows more than 8 bits per sample (up to 12 bits)
- Higher resolution for color representation (4:2:2, 4:4:4)
- Source editing functions like alpha blending
- Very high bit-rates (often with constant quality)
- Very high-resolution
- Color space transformation (YCgCo, YCbCr, RGB)
- RGB color representation
- Adaptive block transform sizes
- Quantization matrices
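To illustrate how a color space transformation can coexist with lossless representation, here is a sketch of the lifting-based, exactly reversible YCgCo variant (often called YCgCo-R). It is shown as an example of the idea only; whether this exact form matches what a given FRExt profile uses is an assumption. The function names are illustrative.

```python
def rgb_to_ycgco_r(r, g, b):
    """Forward lifting steps: integer in, integer out, exactly invertible."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co


def ycgco_r_to_rgb(y, cg, co):
    """Inverse lifting steps, undoing the forward steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b


if __name__ == "__main__":
    pixel = (200, 120, 35)
    coded = rgb_to_ycgco_r(*pixel)
    print(coded, ycgco_r_to_rgb(*coded))
```

Each lifting step is undone exactly by its inverse, so no rounding error accumulates, at the cost of one extra bit of range for the two chroma components.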
29Coding Efficiency
30Comparison of Standards
31Comparison of Standards (contd.)
32References
- Related groups
  - MPEG website: http://www.mpeg.org
  - JVT website: ftp://ftp.imtc-files.org/jvt-experts
  - www.mpegif.org
- Test software
  - H.264/AVC JM Software: http://bs.hhi.de/suehring/tml/download
- Test sequences
  - http://ise.stanford.edu/video.html
  - http://kbs.cs.tu-berlin.de/stewe/vceg/sequences.htm
  - http://www.its.bldrdoc.gov/vqeg
  - ftp.tnt.uni-hannover.de/pub/jvt/sequences/
  - http://trace.eas.asu.edu/yuv/yuv.html
33 THANKS