Title: Traditional Hybrid Video Coding
Group Testing for Video 2.0
Dane Barney, Tian Li, Matt Renzelmann, Mike Ringenburg, Gidon Shavit,
Richard Ladner, and Eve Riskin
GTV (Group Testing for Video) Block Diagram
- Traditional Hybrid Video Coding
- Each frame is compared to the previous, or reference, frame.
- The current frame is predicted by specifying the motion of pixel blocks between frames. Motion vectors are losslessly encoded.
- Non-linear motion and other changes make prediction inaccurate. The pixel-by-pixel difference between the prediction and the actual frame is called the residual image.
- The residual is transformed (typically with the Discrete Cosine Transform), resulting in real-valued transform coefficients.
- Coefficients are quantized, i.e. rounded down to a given precision that depends on a per-frame quantization parameter.
- Quantized coefficients are losslessly encoded and sent to the decoder.
- The decoder estimates the original coefficient values, inverts the transform, and uses the reconstructed residual and the previously reconstructed reference frame to recreate the current frame.
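The steps above can be sketched for a single block as follows. This is a minimal illustration, not the GTV or H.263 implementation: the function names, the 4x4 block size, the truncating quantizer, and the midpoint reconstruction rule are all assumptions chosen to keep the example short, and motion estimation and lossless coding are omitted (the prediction is simply given).

```python
import math

def dct_matrix(n):
    """Rows of the orthonormal DCT-II basis for length-n signals."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def encode_block(current, prediction, step):
    """Residual -> 2-D DCT -> quantized integer levels (hypothetical helper)."""
    n = len(current)
    c = dct_matrix(n)
    residual = [[current[y][x] - prediction[y][x] for x in range(n)]
                for y in range(n)]
    coeffs = matmul(matmul(c, residual), transpose(c))
    # Assumed quantizer: truncate toward zero in units of `step`.
    return [[int(v / step) for v in row] for row in coeffs]

def decode_block(levels, prediction, step):
    """Estimate coefficients, invert the DCT, add the prediction back."""
    n = len(levels)
    c = dct_matrix(n)
    # Assumed estimate: midpoint of the quantization interval, zero stays zero.
    est = [[0.0 if q == 0 else math.copysign((abs(q) + 0.5) * step, q)
            for q in row] for row in levels]
    residual = matmul(matmul(transpose(c), est), c)
    return [[prediction[y][x] + residual[y][x] for x in range(n)]
            for y in range(n)]

# Made-up sample data: a smooth prediction and a slightly different current block.
prediction = [[100, 102, 104, 106], [101, 103, 105, 107],
              [102, 104, 106, 108], [103, 105, 107, 109]]
current = [[101, 104, 103, 108], [99, 103, 109, 107],
           [102, 100, 106, 111], [105, 105, 104, 109]]

levels = encode_block(current, prediction, step=2.0)
reconstructed = decode_block(levels, prediction, step=2.0)
```

A larger `step` zeroes more coefficients, so fewer bits are needed but the reconstruction drifts further from the current block; this is the quality versus rate trade-off that the per-frame quantization parameter controls.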
This block diagram for GTV is similar in many respects to that of a traditional hybrid coder, such as H.263, except that GTV uses improved algorithms for quantization and lossless coefficient coding (the boxes colored in red).
- Our Coder: GTV 2.0
- Based on the H.263 and H.264 standards.
- Replaces quantization with bit-plane coding.
- Significance bits are compressed using context-based group testing.
- The key to performance is estimating the significance probabilities.
- Very precise rate control: coding can stop at almost any given output length without sacrificing quality. This enables us to apply high-quality rate control algorithms.
- Encoding and decoding speed needs improvement, especially when using many reference frames.
- Does not incorporate all of the features of H.264, and so is not yet directly comparable to either standard.
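To illustrate why bit-plane coding allows stopping at almost any output length, here is a minimal sketch of an embedded bit-plane coder for integer coefficients. It is an assumption-laden simplification: the function names are hypothetical, the bits are emitted raw rather than compressed with context-based group testing as in GTV, and the midpoint reconstruction of partially decoded coefficients is one possible choice among several.

```python
def encode_bitplanes(coeffs):
    """Emit bits from the most significant plane down; returns (num_planes, bits)."""
    num_planes = max((abs(c) for c in coeffs), default=0).bit_length()
    bits = []
    significant = [False] * len(coeffs)
    for plane in range(num_planes - 1, -1, -1):
        for i, c in enumerate(coeffs):
            bit = (abs(c) >> plane) & 1
            bits.append(bit)  # significance bit, or refinement bit if already significant
            if bit and not significant[i]:
                significant[i] = True
                bits.append(1 if c < 0 else 0)  # sign follows the first 1 bit
    return num_planes, bits

def decode_bitplanes(num_planes, bits, count):
    """Decode any prefix of the bit stream into approximate coefficients."""
    mags = [0] * count
    signs = [1] * count
    lowest = [num_planes] * count  # lowest plane decoded so far, per coefficient
    pos = 0
    done = False
    for plane in range(num_planes - 1, -1, -1):
        for i in range(count):
            if pos >= len(bits):
                done = True
                break
            bit = bits[pos]
            if bit and mags[i] == 0:
                if pos + 1 >= len(bits):  # sign bit was cut off: ignore this bit
                    done = True
                    break
                signs[i] = -1 if bits[pos + 1] else 1
                pos += 1
            pos += 1
            mags[i] |= bit << plane
            lowest[i] = plane
        if done:
            break
    out = []
    for m, s, low in zip(mags, signs, lowest):
        if m > 0 and low > 0:
            m += 1 << (low - 1)  # assumed estimate: middle of the remaining interval
        out.append(s * m)
    return out

coeffs = [37, -5, 0, 12, -1, 0, 3, -20]  # made-up sample coefficients
num_planes, bits = encode_bitplanes(coeffs)
approx = decode_bitplanes(num_planes, bits[:20], len(coeffs))  # truncated stream
exact = decode_bitplanes(num_planes, bits, len(coeffs))        # full stream
```

Because the most significant information comes first, a rate controller can cut the stream at the exact bit budget it wants, and the decoder still produces a usable approximation from whatever prefix it receives.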
GTV Then and Now
Motion Prediction Comparison
Half-pixel, bilinear interpolation
- Objects in videos do not move in whole pixel increments.
- Motion prediction performance improves by interpolating between pixels.
- Half-pixel bilinear interpolation is used in older video coders like H.263.
- H.264 introduced a new method of interpolation based on a 6-tap finite impulse response filter and quarter-pixel resolution.
- Higher resolution motion vectors allow for higher quality motion prediction.
- GTV uses the same interpolation scheme as H.264.
- The diagrams on the left compare the two schemes.
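The two interpolation schemes can be compared on a single row of samples. The sketch below is a one-dimensional simplification under stated assumptions: the function names are hypothetical, samples are 8-bit, edges are handled by repeating the border sample, and the 6-tap weights (1, -5, 20, 20, -5, 1) / 32 are those of the H.264 luma half-sample filter. Real coders apply the filters in two dimensions.

```python
def sample(row, j):
    """Read row[j], repeating the border sample outside the row (assumed padding)."""
    return row[min(max(j, 0), len(row) - 1)]

def half_pel_bilinear(row, i):
    """Half-pixel value between row[i] and row[i + 1], H.263-style average."""
    return (sample(row, i) + sample(row, i + 1) + 1) >> 1

def half_pel_6tap(row, i):
    """Half-pixel value between row[i] and row[i + 1] using a 6-tap FIR filter."""
    taps = (1, -5, 20, 20, -5, 1)
    acc = sum(t * sample(row, i - 2 + k) for k, t in enumerate(taps))
    value = (acc + 16) >> 5          # divide by 32 with rounding
    return min(max(value, 0), 255)   # clip to the 8-bit sample range

def quarter_pel(row, i):
    """Quarter-pixel value between row[i] and the 6-tap half-pixel sample."""
    return (sample(row, i) + half_pel_6tap(row, i) + 1) >> 1

ramp = [0, 10, 20, 30, 40, 50, 60, 70]     # made-up smooth signal
edge = [0, 0, 0, 0, 255, 255, 255, 255]    # made-up sharp edge
```

On smooth signals such as `ramp` the two half-pixel filters agree, while near sharp edges the longer filter can overshoot before clipping. The longer support is what lets it follow image detail more closely than a two-sample average.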
Quarter-pixel, 6-tap FIR filter interpolation
- Variable block sizes further improve prediction accuracy.
- No longer fixed 16x16 blocks, as in H.263.
- Blocks may be divided as shown below, as per H.264.
- Multiple reference frames
- Predict from frames further in the past.
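A rough sketch of how smaller blocks and several reference frames can each lower the prediction error is given below. Everything here is illustrative rather than taken from GTV: the helper names, the exhaustive integer-pixel search, the sum of absolute differences (SAD) as matching cost, and the synthetic frames are all assumptions, and the cost of transmitting the extra motion vectors is ignored.

```python
def sad(cur, ref, x, y, dx, dy, w, h):
    """Sum of absolute differences between a block and its displaced reference."""
    return sum(abs(cur[y + j][x + i] - ref[y + dy + j][x + dx + i])
               for j in range(h) for i in range(w))

def best_match(cur, refs, x, y, w, h, radius):
    """Exhaustive search over all reference frames; returns (cost, ref_idx, dx, dy)."""
    best = None
    for ref_idx, ref in enumerate(refs):
        rows, cols = len(ref), len(ref[0])
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                inside = (0 <= x + dx and x + dx + w <= cols and
                          0 <= y + dy and y + dy + h <= rows)
                if not inside:
                    continue  # skip candidates that leave the frame
                cost = sad(cur, ref, x, y, dx, dy, w, h)
                if best is None or cost < best[0]:
                    best = (cost, ref_idx, dx, dy)
    return best

def split_cost(cur, refs, x, y, size, radius):
    """Total cost when the block is divided into four independently predicted quarters."""
    half = size // 2
    return sum(best_match(cur, refs, x + ox, y + oy, half, half, radius)[0]
               for oy in (0, half) for ox in (0, half))

# Synthetic 16x16 frames: an older textured frame and a blank most-recent frame.
SIZE = 16
textured = [[(7 * x + 13 * y + 3 * x * y) % 256 for x in range(SIZE)]
            for y in range(SIZE)]
blank = [[0] * SIZE for _ in range(SIZE)]
refs = [blank, textured]

# The left and right halves of the current frame move differently.
cur = [[textured[y][min(x + 1, SIZE - 1)] if x < 8
        else textured[min(y + 1, SIZE - 1)][x]
        for x in range(SIZE)] for y in range(SIZE)]

whole = best_match(cur, refs, 4, 4, 8, 8, radius=2)   # one vector for the 8x8 block
split = split_cost(cur, refs, 4, 4, 8, radius=2)      # four vectors for 4x4 quarters
```

The block at (4, 4) straddles the boundary between the two motions, so each quarter can find its own match where a single vector has to compromise. A practical encoder would weigh this gain against the bits spent on the additional vectors and reference indices.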
Results
Average reconstructed image quality for the movie
Foreman. GTV 2.0 is presently less efficient
than H.264, but more efficient than H.263 and its
predecessor, GTV 1.0.
Variable Block Motion Prediction
Block division types
16x16, 16x8, 8x16, 8x8
8x8, 8x4, 4x8, 4x4