Title: JPEG and MPEG Standards
1JPEG and MPEG Standards
2Outline
- Transform-based Image and Video Coding
- Linear Transformation: DCT
- Quantization
- Scalar Quantization
- Vector Quantization
- Entropy Coding
- Video Coding: Motion Compensation
3Transform-based Image Coding
Input Image → Linear Transform → Quantization → Entropy Coding → Binary bit stream
4Linear Transform
- If the signal is formatted as a vector, a linear
transform can be formulated as a matrix-vector
product that transforms the signal into a
different domain.
- Examples:
- K-L Expansion
- Discrete Fourier Transform
- Discrete cosine transform
- Discrete wavelet transform
- Energy compaction property: the transformed
signal vector has a few large coefficients and
many small, nearly zero coefficients. The few
large coefficients can be encoded efficiently
with few bits while retaining the majority of
the energy of the original signal.
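The energy compaction property can be illustrated with a small sketch. It assumes an orthonormal 1-D DCT-II and a hypothetical smooth ramp of eight pixel values; neither the test signal nor the helper names come from the slides.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a sequence, written as a matrix-vector product."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)))
    return out

def energy_fraction(coeffs, keep):
    """Fraction of total energy held by the `keep` largest coefficients."""
    energies = sorted((c * c for c in coeffs), reverse=True)
    total = sum(energies)
    return sum(energies[:keep]) / total if total else 1.0

# Hypothetical smooth signal: a ramp of pixel values (an assumption).
signal = [100 + 3 * i for i in range(8)]
coeffs = dct_1d(signal)

# For a smooth signal, a couple of coefficients carry almost all the energy.
print(energy_fraction(coeffs, 2))
```

Because the transform is orthonormal, the total energy of the coefficients equals that of the signal, so the printed fraction is directly comparable to the original.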
5Block-based Image Coding
- An image is a 2D signal of pixel intensities
(including colors).
- A block-based image coding scheme partitions the
entire image into 8 by 8 or 16 by 16 (or other
size) blocks. The coding algorithm is then
applied to each block independently.
- Blocks may be overlapping or non-overlapping.
- Advantage: individual blocks can be processed
in parallel. On hand-held devices, only one
block needs to be loaded into main memory at a
time.
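A minimal sketch of the non-overlapping partitioning described above. It assumes the image is a list of rows whose dimensions are multiples of the block size; the function name is hypothetical.

```python
def split_into_blocks(image, size=8):
    """Yield (row, col, block) for each non-overlapping size x size block.

    `image` is a list of equal-length rows. Its dimensions are assumed to
    be multiples of `size`; padding is not handled in this sketch.
    """
    height, width = len(image), len(image[0])
    for r in range(0, height, size):
        for c in range(0, width, size):
            yield r, c, [row[c:c + size] for row in image[r:r + size]]

# Hypothetical 16 by 16 test image.
image = [[(x + y) % 256 for x in range(16)] for y in range(16)]
blocks = list(split_into_blocks(image))
print(len(blocks))  # 4 blocks of 8 by 8
```

Using a generator means only one block has to be materialized at a time, which matches the memory argument made for hand-held devices.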
6JPEG Image Coding Algorithms
7JPEG Decoding
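Figure: JPEG decoding process. Block labels: DC Huffman decoding followed by IDPCM on the DC path, AC Huffman decoding on the AC path, then IQ and IDCT, producing an 8x8 block.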
8Pre-Processing
- Color sub-sampling
- A color image is converted from RGB to YUV color
space. Each component of each pixel is 1 byte.
- The U and V planes are sub-sampled (4:1:1 scheme).
- For every 16 by 16 block of a color image, six 8
by 8 blocks are encoded.
- Level shifting: 128 is subtracted from each pixel
value so that it ranges over [-128, 127].
Four 8x8 blocks of luminance pixels plus two 8x8
sub-sampled chrominance blocks make up a 16 by
16 macro-block.
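The block count and level shifting above can be sketched as follows. The sub-sampling filter is an assumption (a plain 2x2 average that halves each chrominance plane in both dimensions, which is what yields one 8x8 block per plane); the slides do not say how the sub-sampled values are computed.

```python
def level_shift(plane):
    """Subtract 128 so that 8-bit samples fall in [-128, 127]."""
    return [[p - 128 for p in row] for row in plane]

def subsample_2x2(plane):
    """Halve a plane in each dimension by averaging 2x2 neighbourhoods
    (the averaging filter is an assumption of this sketch)."""
    return [[(plane[r][c] + plane[r][c + 1]
              + plane[r + 1][c] + plane[r + 1][c + 1] + 2) // 4
             for c in range(0, len(plane[0]), 2)]
            for r in range(0, len(plane), 2)]

def count_blocks(plane, size=8):
    """Number of size x size blocks that tile the plane."""
    return (len(plane) // size) * (len(plane[0]) // size)

# Hypothetical 16 by 16 macro-block with flat Y, U and V planes.
y = [[200] * 16 for _ in range(16)]
u = [[100] * 16 for _ in range(16)]
v = [[90] * 16 for _ in range(16)]

u_sub, v_sub = subsample_2x2(u), subsample_2x2(v)
total_blocks = count_blocks(y) + count_blocks(u_sub) + count_blocks(v_sub)
print(total_blocks)  # 4 luminance blocks + 2 chrominance blocks
```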
9Discrete Cosine Transform
- 8x8 two-dimensional separable DCT
- DCT is chosen because it leads to superior energy
compaction for natural images.
- F(0,0), the DC coefficient, ranges over
(-128x64/4, 127x16) and needs 12 bits to represent
(including the sign bit). 12 bits are more than
enough for the remaining AC coefficients
(u > 0 or v > 0).
10Inverse DCT (IDCT)
- 8x8 two-dimensional separable IDCT
- IDCT can be computed using the same routine as DCT
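A sketch of how one separable routine can serve for both the DCT and the IDCT. It assumes the orthonormal DCT-II scaling, for which the inverse matrix is simply the transpose; under this scaling the DC coefficient is the block sum divided by 8, so absolute coefficient ranges differ by a constant factor from those of other normalizations. All names are hypothetical.

```python
import math

N = 8
# Orthonormal DCT-II matrix: row u holds the u-th cosine basis vector.
C = [[(math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N))
      * math.cos((2 * x + 1) * u * math.pi / (2 * N))
      for x in range(N)] for u in range(N)]

def transpose(m):
    return [list(col) for col in zip(*m)]

def apply_rows(matrix, block):
    """Apply the 1-D transform `matrix` to every row of `block`."""
    return [[sum(matrix[u][x] * row[x] for x in range(N)) for u in range(N)]
            for row in block]

def transform_2d(matrix, block):
    """Separable 2-D transform: a 1-D pass over rows, then over columns."""
    rows_done = apply_rows(matrix, block)
    return transpose(apply_rows(matrix, transpose(rows_done)))

def dct_2d(block):
    return transform_2d(C, block)

def idct_2d(coeffs):
    # Same routine; only the matrix changes (transpose = inverse here).
    return transform_2d(transpose(C), coeffs)

# Hypothetical level-shifted block.
block = [[((3 * r + 5 * c) % 50) - 25 for c in range(N)] for r in range(N)]
restored = idct_2d(dct_2d(block))
max_error = max(abs(restored[r][c] - block[r][c])
                for r in range(N) for c in range(N))
print(max_error)  # close to zero, limited by floating-point rounding
```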
11DCT Basis Functions
12Quantization of DCT Coefficients
13DPCM of DC coefficients
- DC coding: the DC coefficients of all 8 by 8
blocks of the entire image are combined into a
sequence of DC coefficients.
- Next, DPCM is applied:
- DiffDC(block_i) = DC(block_i) - DC(block_{i-1})
- The DiffDC values are then encoded using Huffman
entropy coding.
- Example
- Original:
- 1216 → 1232 → 1224 → 1248 → 1248 → 1208
- After DPCM:
- 1216 → 16 → -8 → 24 → 0 → -40
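The DPCM step can be sketched as below. The sketch assumes the predictor for the first block is 0, which is consistent with the example (the first value, 1216, is passed through unchanged); the function names are hypothetical.

```python
def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous one."""
    previous = 0  # assumed initial predictor
    diffs = []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

def dpcm_decode(diffs):
    """Invert DPCM (the IDPCM step) by accumulating the differences."""
    total = 0
    values = []
    for d in diffs:
        total += d
        values.append(total)
    return values

original = [1216, 1232, 1224, 1248, 1248, 1208]
encoded = dpcm_encode(original)
print(encoded)               # [1216, 16, -8, 24, 0, -40]
print(dpcm_decode(encoded))  # [1216, 1232, 1224, 1248, 1248, 1208]
```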
14Huffman Encoding of DC Coefficients
- Encoding and decoding of Huffman codes is done via
a look-up table.
- In JPEG, DC coefficients (after DPCM) are first
grouped into categories according to their
magnitudes. Each category is treated as a symbol
and assigned a code from a Huffman table. For
example, -7 to -4 and 4 to 7 are listed as
category 3, which has the code "00".
- If the number is positive, its binary
representation is appended directly to the
Huffman code of the category. For example, 6 is
encoded as 00 110. If the number is negative,
the appended code is the 1's complement of its
magnitude. For example, -5 is encoded as 00 010.
- Question: given such a table, how would you
design dedicated hardware to implement the
encoding procedure?
15JPEG Huffman Table Categories
16JPEG DC Entropy Coding
- Example
- -9 is in category 4, hence the base code is 101.
- 1's complement of 9: 1C(1001) = 0110
- Code word: 101 0110 = 1010110
- Note that category 3 occurs most frequently and
hence has the shortest base code word.
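The category-plus-appended-bits scheme can be sketched as follows. Only the two base codes quoted in the text (category 3 and category 4) are included, so the table is deliberately partial; the function names are hypothetical.

```python
# Base codes quoted in the text; the remaining categories are not
# reproduced here, so this table is intentionally incomplete.
BASE_CODES = {3: "00", 4: "101"}

def category(value):
    """Number of bits needed for |value|, e.g. 4..7 and -7..-4 give 3."""
    return abs(value).bit_length()

def amplitude_bits(value):
    """Bits appended after the base code: plain binary for positive
    values, 1's complement of the magnitude for negative values."""
    size = category(value)
    if size == 0:
        return ""
    if value < 0:
        value += (1 << size) - 1  # equals the 1's complement of |value|
    return format(value, "0{}b".format(size))

def encode_dc_diff(value, base_codes=BASE_CODES):
    return base_codes[category(value)] + amplitude_bits(value)

print(encode_dc_diff(6))   # 00110
print(encode_dc_diff(-5))  # 00010
print(encode_dc_diff(-9))  # 1010110
```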
17AC Coefficients
- AC coefficients are first weighted with a
quantization matrix:
- C(i,j)/q(i,j) = Cq(i,j)
- They are then quantized.
- They are then scanned in zig-zag order into a
1D sequence that is subject to AC Huffman encoding.
- Question: given an 8 by 8 array, how do you convert
it into a vector according to the zig-zag scan
order? What is the algorithm?
Zig-Zag scan order
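One possible answer to the question above, as a sketch: walk the anti-diagonals of the array and alternate the direction of travel on each one. The quantization helper assumes rounding to the nearest integer, and the small matrix in the usage lines is hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n array in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col indexes one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)  # even diagonals run bottom-left to top-right
        order.extend((r, s - r) for r in rows)
    return order

def zigzag_scan(block):
    """Flatten a square block into a 1-D list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

def quantize(coeffs, q):
    """Cq(i,j) = C(i,j)/q(i,j), rounded to the nearest integer (assumed)."""
    return [[int(round(c / step)) for c, step in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, q)]

block = [[8 * r + c for c in range(8)] for r in range(8)]
print(zigzag_scan(block)[:6])  # [0, 1, 8, 16, 9, 2]
print(quantize([[100, -37], [5, 0]], [[16, 11], [12, 12]]))  # [[6, -3], [0, 0]]
```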
18AC Coefficients Huffman Encoding
- The symbols for encoding AC coefficients consist
of both the number of significant bits and the
run of 0s preceding the nonzero AC coefficient.
For example,
- 5 0 2 0 0 1 is encoded as 100101 11100110
110110
- This is according to the table below.
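The symbol formation can be sketched as below. The sketch only produces the (run, size) symbol and the appended amplitude bits for each nonzero coefficient; the Huffman base code for each symbol would come from the table, which is not reproduced here. End-of-block and long-run handling are omitted, and the function name is hypothetical.

```python
def run_size_symbols(ac_coeffs):
    """Return (zero_run, size, amplitude_bits) for each nonzero coefficient."""
    symbols = []
    run = 0
    for value in ac_coeffs:
        if value == 0:
            run += 1  # count zeros preceding the next nonzero value
            continue
        size = abs(value).bit_length()  # number of significant bits
        bits_value = value if value > 0 else value + (1 << size) - 1
        symbols.append((run, size, format(bits_value, "0{}b".format(size))))
        run = 0
    return symbols

print(run_size_symbols([5, 0, 2, 0, 0, 1]))
# [(0, 3, '101'), (1, 2, '10'), (2, 1, '1')]
```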
19Huffman Decoding
- A look-up table procedure.
- Challenge: how to perform decoding fast?
- Example: a Huffman table for six symbols
- The decoding process can be modeled as a finite
state machine with the following state diagram.
It decodes one bit of the input bit stream per
clock cycle.
- Question: how can this process be made fast
enough to match any input bit rate?
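The bit-serial finite state machine can be sketched in software as follows. The six-symbol code table is hypothetical, since the table from the slide is not reproduced here; states are represented by the code prefix read so far.

```python
# Hypothetical prefix code for six symbols (not the table from the slide).
CODE_TABLE = {"a": "0", "b": "10", "c": "110",
              "d": "1110", "e": "11110", "f": "11111"}

def build_fsm(code_table):
    """Map (state, bit) to (next_state, emitted_symbol or None)."""
    transitions = {}
    for symbol, code in code_table.items():
        for i in range(len(code)):
            state, bit = code[:i], code[i]
            if i == len(code) - 1:
                transitions[(state, bit)] = ("", symbol)  # emit, go to root
            else:
                transitions[(state, bit)] = (code[:i + 1], None)
    return transitions

def decode_bit_serial(bits, transitions):
    """Consume one input bit per iteration, like one bit per clock cycle."""
    state, out = "", []
    for bit in bits:
        state, symbol = transitions[(state, bit)]
        if symbol is not None:
            out.append(symbol)
    return out

fsm = build_fsm(CODE_TABLE)
print(decode_bit_serial("01011111110", fsm))  # ['a', 'b', 'f', 'c']
```

The input is consumed at a constant rate, but symbols appear at irregular intervals because code words have different lengths.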
20Huffman Decoding Implementation
- The FSM decoding model decodes one bit per clock
cycle → constant input rate.
- An FSM is a nonlinear recursive equation! →
A look-ahead transform may be applied to expedite
evaluation.
- Look-ahead means examining two or more bits of
the input stream per clock cycle.
- If as many bits as the maximum code word length
are examined per clock cycle, then it is possible
to produce one output symbol per clock cycle,
giving a constant output rate realization.
- The complexity of the combinational logic
required for the look-ahead transformation is the
most difficult part.
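A software analogue of full look-ahead, as a sketch: a table indexed by a window of maximum code word length returns one symbol per lookup. The six-symbol code is hypothetical, and the bit stream is assumed to contain only whole code words.

```python
# Hypothetical prefix code for six symbols (an assumption of this sketch).
CODE_TABLE = {"a": "0", "b": "10", "c": "110",
              "d": "1110", "e": "11110", "f": "11111"}
MAX_LEN = max(len(code) for code in CODE_TABLE.values())

def build_lookahead_table(code_table, max_len):
    """Map every max_len-bit window to (symbol, code word length)."""
    table = {}
    for symbol, code in code_table.items():
        pad = max_len - len(code)
        for tail in range(1 << pad):  # every possible continuation
            suffix = format(tail, "0{}b".format(pad)) if pad else ""
            table[code + suffix] = (symbol, len(code))
    return table

def decode_lookahead(bits, table, max_len):
    """Emit one symbol per lookup, i.e. a constant output rate."""
    out, pos = [], 0
    while pos < len(bits):
        window = bits[pos:pos + max_len].ljust(max_len, "0")  # pad the tail
        symbol, length = table[window]
        out.append(symbol)
        pos += length  # input is consumed at a variable rate
    return out

table = build_lookahead_table(CODE_TABLE, MAX_LEN)
print(len(table))                                        # 32
print(decode_lookahead("01011111110", table, MAX_LEN))  # ['a', 'b', 'f', 'c']
```

The table has 2 to the power of the maximum code word length entries, which mirrors the growth in combinational logic noted above.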
21Video Coding
- Video coding is often implemented as encoding a
sequence of images. Motion compensation is used
to exploit temporal redundancy between successive
frames.
- Examples: MPEG-I, MPEG-II, MPEG-IV, H.323, H.263,
etc.
- Existing video coding standards are based on JPEG
image compression as well as motion compensation.
22MPEG Encoding
Figure: MPEG encoder block diagram. Blocks: buffer control, DCT, Q, VLC, bit stream buffer, Q^-1, IDCT, motion estimation and compensation, frame buffer. Signals: current frame x(t), residue r, reconstructed residue Qr(t), predicted frame, reconstructed current frame, x(t-1), motion vectors.
This is a simplified block diagram where the
encoding of intra coded frames is not shown.
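The feedback path of the encoder can be sketched in a heavily simplified form. The sketch omits the DCT, the VLC and motion estimation, and uses a uniform scalar quantizer on the residue; the frames, the step size and all names are hypothetical. It only shows why the encoder reconstructs the frame itself: its frame buffer then holds exactly what the decoder will hold.

```python
def quantize(values, step):
    """Uniform scalar quantizer (an assumption; stands in for DCT + Q)."""
    return [int(round(v / step)) for v in values]

def dequantize(levels, step):
    return [level * step for level in levels]

def encode_frame(current, predicted, step):
    """Return the quantized residue and the encoder-side reconstruction."""
    residue = [c - p for c, p in zip(current, predicted)]
    levels = quantize(residue, step)  # this is what would be entropy coded
    reconstructed = [p + r for p, r in zip(predicted, dequantize(levels, step))]
    return levels, reconstructed

def decode_frame(levels, predicted, step):
    """Decoder side: predicted frame plus reconstructed residue."""
    return [p + r for p, r in zip(predicted, dequantize(levels, step))]

# Hypothetical pixel rows for the current and predicted frames.
current = [100, 104, 97, 120]
predicted = [98, 100, 100, 110]
step = 4

levels, encoder_view = encode_frame(current, predicted, step)
decoder_view = decode_frame(levels, predicted, step)
print(encoder_view == decoder_view)  # True
```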
23MPEG Decoding
VLD: Variable Length Decoding
Figure: MPEG decoder block diagram. Blocks: bit stream buffer, VLD, Q^-1, IDCT, motion compensation, frame buffer. Signals: received bit stream, reconstructed residue Qr(t), predicted frame, reconstructed current frame, x(t-1), motion vectors.
24Motion Estimation
- Three types of frames
- Intra (I): the frame is coded as if it is an
image
- Predicted (P): predicted from an I or P frame
- Bi-directional (B): forward and backward
predicted from a pair of I or P frames.
- A typical frame arrangement is (subscripts are
used to distinguish them)
- I1 B1 B2 P1 B3 B4 P2 B5 B6 I2
- P1, P2 are both forward-predicted from I1. B1, B2
are interpolated from I1 and P1; B3, B4 are
interpolated from P1, P2; and B5, B6 are
interpolated from P2, I2.
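The B-frame dependencies listed above can be derived mechanically, as in this sketch: each B frame is paired with the nearest I or P frame before it and after it. The function name is hypothetical, and references for P frames are not modeled.

```python
def b_frame_anchors(frames):
    """Map each B frame to its (previous, next) I or P anchor frames."""
    anchors = {}
    for i, name in enumerate(frames):
        if not name.startswith("B"):
            continue
        previous_anchor = next(f for f in reversed(frames[:i]) if f[0] in "IP")
        next_anchor = next(f for f in frames[i + 1:] if f[0] in "IP")
        anchors[name] = (previous_anchor, next_anchor)
    return anchors

frames = "I1 B1 B2 P1 B3 B4 P2 B5 B6 I2".split()
anchors = b_frame_anchors(frames)
for name in sorted(anchors):
    print(name, anchors[name])
```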
25Forward Motion Estimation
Figure: forward motion estimation. The current frame is constructed from different parts (blocks 1 to 16) of the reference frame.
26Block Motion Estimation
- MAD: mean absolute difference between the (i,j)-th
pixel of the current block x(i,j) and the
(i+m, j+n)-th pixel of the reference frame.
- (m, n), with -p ≤ m, n ≤ p, is the motion vector
corresponding to the macro-block. m and n span
the search range.
- It is similar to DPCM in the temporal domain, and
has less to do with object motion.
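A minimal sketch of block motion estimation by exhaustive (full) search, choosing the displacement (m, n) that minimizes the MAD within the search range. Full search is an assumption here, since the text does not name a search strategy; the test frames and all names are hypothetical.

```python
def mad(current, reference, top, left, m, n, size):
    """Mean absolute difference between the current block at (top, left)
    and the reference block displaced by (m, n)."""
    total = 0
    for i in range(size):
        for j in range(size):
            total += abs(current[top + i][left + j]
                         - reference[top + i + m][left + j + n])
    return total / (size * size)

def full_search(current, reference, top, left, size, p):
    """Try every (m, n) with -p <= m, n <= p and keep the lowest MAD."""
    height, width = len(reference), len(reference[0])
    best = None
    for m in range(-p, p + 1):
        for n in range(-p, p + 1):
            inside = (0 <= top + m and top + m + size <= height
                      and 0 <= left + n and left + n + size <= width)
            if not inside:
                continue  # candidate block falls outside the reference frame
            cost = mad(current, reference, top, left, m, n, size)
            if best is None or cost < best[0]:
                best = (cost, (m, n))
    return best[1], best[0]

def make_frame(patch_top, patch_left, size=16, patch=4):
    """Hypothetical test frame: zeros with one textured 4 by 4 patch."""
    frame = [[0] * size for _ in range(size)]
    for i in range(patch):
        for j in range(patch):
            frame[patch_top + i][patch_left + j] = 50 + 10 * i + j
    return frame

reference = make_frame(5, 6)  # patch located at row 5, column 6
current = make_frame(4, 8)    # same patch, now at row 4, column 8
vector, cost = full_search(current, reference, top=4, left=8, size=4, p=3)
print(vector, cost)  # (1, -2) 0.0
```

The search only minimizes a pixel difference, so the vector it returns is the best predictor and need not correspond to true object motion.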
27Video sequence Tennis frame 0
Prepared by Surin Kittitornkun
28Video sequence Tennis frame 1
29Frame Difference
30What is motion estimation?
31What is motion compensation?
32Motion Compensated Frame Difference