Title: Introduction to JPEG and MPEG
1Introduction to JPEG and MPEG
- Ingemar J. Cox
- University College London
2Outline
- Elementary information theory
- Lossless compression
- Quantization
- Fundamentals of images
- Discrete Cosine Transform (DCT)
- JPEG
- MPEG-1, MPEG-2
3Bibliography
- D. MacKay, Information Theory, Inference and
Learning Algorithms, Cambridge University Press, 2003.
http://www.inference.phy.cam.ac.uk/itprnn/book.html
- W. B. Pennebaker and J. L. Mitchell, JPEG Still
Image Data Compression Standard, Chapman & Hall,
1993 (ISBN 0-442-01272-1).
- G. K. Wallace, The JPEG Still-Picture
Compression Standard, IEEE Trans. on Consumer
Electronics, 38, 1, 18-34, 1992.
- http://en.wikipedia.org/wiki/JPEG
4Bibliography
- http://en.wikipedia.org/wiki/MPEG-2
- T. Sikora, MPEG Digital Video-Coding Standards,
IEEE Signal Processing Magazine, 82-100,
September 1997
5Elementary Information Theory
6Elementary Information Theory
- How much information does a symbol convey?
- Intuitively, the more unpredictable or surprising
it is, the more information is conveyed.
- Conversely, if we strongly expected something,
and it occurs, we have not learnt very much.
7Elementary Information Theory
- If p is the probability that a symbol will occur
- Then the amount of information, I, conveyed
is $I = \log_2 (1/p) = -\log_2 p$
- The information, I, is measured in bits
- It is the optimum code length for the symbol
8Elementary Information Theory
- The entropy, H, is the average information per
symbol: $H = -\sum_i p_i \log_2 p_i$
- Provides a lower bound on the average number of bits
per symbol that lossless compression can achieve
9Elementary Information theory
- A simple example. Suppose we need to transmit
four possible weather conditions
- Sunny
- Cloudy
- Rainy
- Snowy
- If all conditions are equally likely, p(s) = 0.25,
and H = 2
- I.e. we need a minimum of 2 bits per symbol
10Elementary information theory
- Suppose instead that it is
- Sunny 0.5 of the time
- Cloudy 0.25 of the time
- Rainy 0.125 of the time, and
- Snowy 0.125 of the time
- Then the entropy is
$H = 0.5 \cdot 1 + 0.25 \cdot 2 + 0.125 \cdot 3 + 0.125 \cdot 3 = 1.75$ bits per symbol
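The entropy of both weather distributions can be checked numerically. This is a minimal sketch that assumes the standard definition of entropy as the probability-weighted average of -log2(p); the function name `entropy` is ours, not from the slides.

```python
import math

def entropy(probabilities):
    """Average information per symbol, in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # all four conditions equally likely
skewed = [0.5, 0.25, 0.125, 0.125]   # sunny, cloudy, rainy, snowy

print(entropy(uniform))  # 2.0 bits per symbol
print(entropy(skewed))   # 1.75 bits per symbol
```

The skewed distribution needs fewer bits on average because the most common condition carries the least information.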
11Elementary Information Theory
- Variable length codewords
- Huffman code: integer code lengths
- Arithmetic codes: non-integer code lengths
12Elementary Information Theory
Weather   Probability   Information (bits)   Integer code
Sunny     0.5           1                    0
Cloudy    0.25          2                    10
Rainy     0.125         3                    110
Snowy     0.125         3                    111
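The code lengths in the table can be reproduced with a Huffman construction. The sketch below is one common way to build it with a priority queue; the function name `huffman_code_lengths` and the tie-breaking counter are our own choices, and only the lengths (not the exact bit patterns) are computed.

```python
import heapq

def huffman_code_lengths(probabilities):
    """Return {symbol: code length in bits} for a Huffman code."""
    # Heap entries: (probability, tie-breaker, symbols in this subtree).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probabilities}
    counter = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol beneath them.
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return lengths

weather = {"Sunny": 0.5, "Cloudy": 0.25, "Rainy": 0.125, "Snowy": 0.125}
lengths = huffman_code_lengths(weather)
average = sum(weather[s] * lengths[s] for s in weather)
print(lengths)   # {'Sunny': 1, 'Cloudy': 2, 'Rainy': 3, 'Snowy': 3}
print(average)   # 1.75
```

Because every probability here is a power of 1/2, the integer code lengths equal the information content exactly and the average code length meets the entropy.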
13Elementary Information Theory
- The previous illustration is an example of a lossless
code
- I.e. we are able to recover the information
exactly
14Elementary Information Theory
- Note that we have assumed that each symbol is
independent of the other symbols
- I.e. the current symbol provides no information
regarding the next symbol
15Quantization
- Quantization is the process of approximating a
continuous value (or a large range of values) by a (much)
smaller set of values: $Q(y) = \Delta \cdot \mathrm{Round}(y / \Delta)$
- Where Round(y) rounds y to the nearest integer
- Δ is the quantization step size
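A uniform quantizer of this kind can be sketched in a few lines. The sketch assumes the quantizer divides by the step size, rounds to the nearest integer to get an index, and multiplies the index by the step size to reconstruct; the names `quantize` and `dequantize` and the sample values are illustrative only.

```python
import math

def quantize(y, step):
    """Map y to an integer index: round y / step to the nearest integer."""
    # math.floor(x + 0.5) rounds halves upward; Python's built-in round()
    # would round halves to the nearest even integer instead.
    return math.floor(y / step + 0.5)

def dequantize(index, step):
    """Reconstruct an approximation of y from its quantization index."""
    return index * step

step = 2
for y in [-5, -3.2, -0.9, 0.4, 1.0, 2.7, 5]:
    q = quantize(y, step)
    print(y, q, dequantize(q, step))
```

The reconstructed value differs from the input by at most half a step, and that difference cannot be recovered.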
16Quantization
[Figure: quantization shown on number lines; tick labels run from -5 to 5, from -2 to 2, and over -4, -2, 0, 2, 4]
17Quantization
- Quantization plays an important role in lossy
compression
- This is where the loss happens
18Fundamentals of Images
19Fundamentals of images
- An image consists of pixels (picture elements)
- Each pixel represents luminance (and colour)
- Typically, 8 bits per pixel
20Fundamentals of images
- Colour
- Colour spaces (representations)
- RGB (red-green-blue)
- CMY (cyan-magenta-yellow)
- YUV
- Y = 0.3R + 0.6G + 0.1B (luminance)
- U = B - Y
- V = R - Y
- Greyscale
- Binary
21Fundamentals of images
- A TV frame is about 640x480 pixels
- If each pixel is represented by 8 bits for each
colour, then the total image size is
- 640 × 480 × 3 = 921,600 bytes or ≈ 7.4 Mbits
- At 30 frames per second, this would be
- ≈ 220 Mbits/second
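The arithmetic behind these figures is easy to reproduce. The sketch assumes three colour components of 8 bits each and counts 1 Mbit as 10^6 bits; the variable names are ours.

```python
width, height = 640, 480        # pixels per frame
bytes_per_pixel = 3             # 8 bits for each of three colour components
frames_per_second = 30

bytes_per_frame = width * height * bytes_per_pixel
bits_per_frame = bytes_per_frame * 8
bits_per_second = bits_per_frame * frames_per_second

print(bytes_per_frame)            # 921600
print(bits_per_frame / 1e6)       # 7.3728
print(bits_per_second / 1e6)      # 221.184
```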
22Fundamentals of images
- Do we need all these bits?
23Fundamentals of images
- Here is an image represented with 8 bits per pixel
24Fundamentals of images
- Here is the same image at 7 bits per pixel
25Fundamentals of images
26Fundamentals of images
27Fundamentals of images
28Fundamentals of images
- Do we need all these bits?
- No!
- The previous example illustrated the eye's
sensitivity to luminance
- We can build a perceptual model
- Only code what is important to the human visual
system (HVS)
- Usually a function of spatial frequency
29Fundamentals of Images
- Just as audio has temporal frequencies
- Images have spatial frequencies
- Transforms
- Fourier transform
- Discrete cosine transform
- Wavelet transform
- Hadamard transform
30Discrete cosine transform
31Basis functions
32Basis functions
33Basis functions
34Basis functions
35Basis functions
36Basis functions
37Basis functions
38Basis functions
39DCT Example
40Example
41Example
- DCT coefficients are
- 4.2426
- 0
- -3.1543
- 0
- 0
- 0
- -0.2242
- 0
42Example DCT decomposition
43Example DCT decomposition
44Example DCT decomposition
45Example summation of DCT terms
- First two non-zero coefficients
46Example summation of DCT terms
- All 3 non-zero coefficients
47Example
- What if we quantize DCT coefficients?
- Δ = 1
- Quantized DCT coefficients are
- 4
- 0
- -3
- 0
- 0
- 0
- 0
- 0
48Example
- Approximate reconstruction
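The whole example can be retraced in code: transform, quantize with a step size of 1, and reconstruct. The input signal is not given in the text of the slides, so the 8-sample sequence below is an assumption, chosen because its orthonormal DCT reproduces the coefficients listed earlier. The function names `dct` and `idct` are ours.

```python
import math

def dct(signal):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(signal)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        total = sum(x * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                    for i, x in enumerate(signal))
        coeffs.append(scale * total)
    return coeffs

def idct(coeffs):
    """Inverse of dct(): rebuild the sequence from its coefficients."""
    n = len(coeffs)
    signal = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos((2 * i + 1) * k * math.pi / (2 * n))
        signal.append(total)
    return signal

signal = [0, 1, 2, 3, 3, 2, 1, 0]   # assumed input, not taken from the slides
coeffs = dct(signal)
quantized = [math.floor(c + 0.5) for c in coeffs]   # step size of 1
approx = idct(quantized)

print([round(c, 4) for c in coeffs])
print(quantized)
print([round(x, 2) for x in approx])
```

Only two of the eight quantized coefficients are non-zero, yet the reconstruction stays close to the input; that is the property the later block-based coding relies on.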
49Example
502-D DCT Transform
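- Let i(x,y) represent an image with N rows and M
columns
- Its DCT I(u,v) is given by
$I(u,v) = \frac{2}{\sqrt{NM}} C(u) C(v) \sum_{x=0}^{N-1} \sum_{y=0}^{M-1} i(x,y) \cos\frac{(2x+1)u\pi}{2N} \cos\frac{(2y+1)v\pi}{2M}$
- where
$C(k) = 1/\sqrt{2}$ for $k = 0$ and $C(k) = 1$ otherwise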
51Fundamentals of images
- Discrete cosine transform
- Coefficients are approximately uncorrelated
- Except DC term
- C.f. original 8×8 pixel block
- Concentrates more power in the low frequency
coefficients
- Computationally efficient
- Block-based DCT
- Compute DCT on 8×8 blocks of pixels
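A block DCT can be computed separably: a 1-D DCT along every row, then along every column. The sketch below assumes the orthonormal DCT-II and uses a smooth horizontal ramp as a hypothetical 8×8 block to show how the power collects in the low-frequency corner. The names `dct_1d` and `dct_2d` are ours.

```python
import math

N = 8  # block size

def dct_1d(values):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(
            v * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i, v in enumerate(values)))
    return out

def dct_2d(block):
    """Separable 2-D DCT: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    # cols[v][u] holds coefficient (u, v); transpose back to row-major order.
    return [list(row) for row in zip(*cols)]

# A smooth horizontal ramp as a hypothetical 8x8 pixel block.
block = [[16 * x for x in range(N)] for _ in range(N)]
coeffs = dct_2d(block)

dc = coeffs[0][0]
total_energy = sum(c * c for row in coeffs for c in row)
low_energy = sum(coeffs[u][v] ** 2 for u in range(2) for v in range(2))
print(round(dc, 1))
print(round(low_energy / total_energy, 3))
```

For this smooth block, more than 99% of the energy sits in the four lowest-frequency coefficients. Real image blocks are less extreme, but the tendency is the same.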
52Fundamentals of images
- Basis functions for the 8×8 DCT (courtesy
Wikipedia)
53Fundamentals of JPEG
54Fundamentals of JPEG
Encoder: DCT -> Quantizer -> Entropy coder -> Compressed image data
Decoder: Compressed image data -> Entropy decoder -> Dequantizer -> IDCT
55Fundamentals of JPEG
- JPEG works on 8×8 blocks
- Extract 8×8 block of pixels
- Convert to DCT domain
- Quantize each coefficient
- Different stepsize for each coefficient
- Based on sensitivity of human visual system
- Order coefficients in zig-zag order
- Entropy code the quantized values
56Fundamentals of JPEG
- A common quantization table is
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99
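Applying the table means dividing each DCT coefficient by the step size at the same position and rounding to the nearest integer. The sketch below does that and then reverses it. The coefficient values are hypothetical, and the function names are ours.

```python
import math

LUMINANCE_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize_block(coeffs, table):
    """Divide each coefficient by its own step size and round."""
    return [[math.floor(c / q + 0.5) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]

def dequantize_block(levels, table):
    """Multiply each quantized level by its step size."""
    return [[l * q for l, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]

# Hypothetical DCT coefficients: a large DC term and a few AC terms.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 448.0
coeffs[0][1] = -290.0
coeffs[1][0] = 20.0
coeffs[7][7] = 30.0

levels = quantize_block(coeffs, LUMINANCE_TABLE)
restored = dequantize_block(levels, LUMINANCE_TABLE)
print(levels[0][0], levels[0][1], levels[1][0], levels[7][7])          # 28 -26 2 0
print(restored[0][0], restored[0][1], restored[1][0], restored[7][7])  # 448 -286 24 0
```

The high-frequency coefficient at position (7, 7) is quantized to zero even though it is larger than the coefficient at (1, 0), because its step size is much coarser.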
57Fundamentals of JPEG
0 1 5 6 14 15 27 28
2 4 7 13 16 26 29 42
3 8 12 17 25 30 41 43
9 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63
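The table gives, for each position in the 8×8 block, its place in the zig-zag scan. One way to generate that ordering is to walk the anti-diagonals and alternate direction on each one, as in this sketch. The function names are ours.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals (row + col constant). Odd diagonals are
    # traversed with the row increasing, even ones with the column increasing.
    return sorted(positions,
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else p[1]))

def zigzag_index_table(n=8):
    """Build the table of scan indices, one per block position."""
    table = [[0] * n for _ in range(n)]
    for index, (r, c) in enumerate(zigzag_order(n)):
        table[r][c] = index
    return table

def zigzag_scan(block):
    """Read a square block out as a 1-D list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

for row in zigzag_index_table():
    print(" ".join(f"{v:2d}" for v in row))
```

The scan places low-frequency coefficients first, so the zeros produced by quantization tend to cluster at the end of the sequence.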
58Fundamentals of JPEG
- Entropy coding
- Run-length encoding followed by either
- Huffman coding, or
- Arithmetic coding
- DC term treated separately
- Differential Pulse Code Modulation (DPCM)
- 2-step process
- Convert zig-zag sequence to a symbol sequence
- Convert symbols to a data stream
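The first of the two steps, turning coefficients into symbols, can be sketched as follows. This is a simplified illustration, not the exact JPEG symbol format: the standard pairs each zero run with a size category and appends amplitude bits, whereas here each symbol is a plain (zero run, value) pair. The function names, the "EOB" marker string, and the sample values are ours.

```python
def dpcm_encode(dc_values):
    """Code each DC term as its difference from the previous block's DC."""
    previous = 0
    diffs = []
    for dc in dc_values:
        diffs.append(dc - previous)
        previous = dc
    return diffs

def run_length_encode(ac_values):
    """Turn zig-zag ordered AC terms into (zero_run, value) symbols."""
    symbols = []
    run = 0
    for value in ac_values:
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    symbols.append("EOB")  # end of block: only zeros remain
    return symbols

def run_length_decode(symbols, length):
    """Rebuild the AC sequence, padding with zeros after end of block."""
    values = []
    for symbol in symbols:
        if symbol == "EOB":
            break
        run, value = symbol
        values.extend([0] * run + [value])
    return values + [0] * (length - len(values))

ac = [-26, 2, 0, 0, 1, 0, 0, 0, -1] + [0] * 54   # 63 AC terms of one block
symbols = run_length_encode(ac)
print(symbols)                          # [(0, -26), (0, 2), (2, 1), (3, -1), 'EOB']
print(dpcm_encode([28, 30, 29, 29]))    # [28, 2, -1, 0]
```

The second step would assign each symbol a Huffman or arithmetic code. The 63 AC terms collapse to five symbols here because the trailing zeros are covered by a single end-of-block marker.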
59Fundamentals of JPEG
- Modes
- Sequential
- Progressive
- Spectral selection
- Send lower frequency coefficients first
- Successive approximation
- Send lower precision first, and subsequently
refine
- Lossless
- Hierarchical
- Send low resolution image first
60Fundamentals of MPEG-1/2
61Fundamentals of MPEG
- A sequence of 2D images
- Temporal correlation as well as spatial
correlation
- TV broadcast
- Frame-based
- Field-based
62MPEG
- Moving Picture Experts Group
- Standard for video compression
- Similarities with JPEG
63MPEG
- Design is a compromise between
- Bit rate
- Encoder/decoder complexity
- Random access capability
64MPEG
- Images
- Spatial redundancy
- Perceptual redundancy
- Video
- Spatial redundancy
- Intraframe coding
- Temporal redundancy
- Interframe coding
- Perceptual redundancy
65MPEG
- Consider a sequence of n frames of video.
- It consists of
- I-frames
- P-frames
- B-frames
- A sequence of one I-frame followed by P- and
B-frames is known as a GOP
- Group of Pictures
- E.g. IBBPBBPBBPBBP
66MPEG
- I-frames
- Intraframe coded
- No motion compensation
- P-frames
- Interframe coded
- Motion compensation
- Based on past frames only
- B-frames
- Interframe coded
- Motion compensation
- Based on past and future frames
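The dependencies between frame types can be made concrete for a short GOP. The sketch assumes that a P-frame is predicted from the nearest preceding I- or P-frame and that a B-frame is predicted from the nearest I- or P-frame on each side. The function name `reference_frames` is ours, and frames are indexed in display order.

```python
def reference_frames(gop):
    """For each frame in display order, list the frames it is predicted from."""
    anchors = [i for i, kind in enumerate(gop) if kind in "IP"]
    references = []
    for i, kind in enumerate(gop):
        past = [a for a in anchors if a < i]
        future = [a for a in anchors if a > i]
        if kind == "I":
            references.append([])                      # intraframe coded
        elif kind == "P":
            references.append(past[-1:])               # nearest past anchor
        else:  # "B"
            references.append(past[-1:] + future[:1])  # past and future anchor
    return references

gop = "IBBPBBP"
for i, (kind, refs) in enumerate(zip(gop, reference_frames(gop))):
    print(i, kind, refs)
```

Frames 1 and 2 depend on frame 3, which is displayed after them, so the decoder must receive frame 3 first.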
67MPEG
- Motion-compensated prediction
- Divide current frame, i, into disjoint 16×16
macroblocks
- Search a window in previous frame, i-1, for
closest match
- Calculate the prediction error
- For each of the four 8×8 blocks in the
macroblock, perform DCT-based coding
- Transmit motion vector + entropy coded prediction
error (lossy coding)
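The search step can be sketched as an exhaustive block-matching search. The sketch assumes the sum of absolute differences (SAD) as the matching criterion and a search window of ±4 pixels; neither is specified in the slides, and practical encoders use larger windows and faster search strategies. The frames are synthetic and the function names are ours.

```python
def sad(current, previous, top, left, dy, dx, size):
    """Sum of absolute differences between a block and a shifted candidate."""
    return sum(abs(current[top + r][left + c] -
                   previous[top + dy + r][left + dx + c])
               for r in range(size) for c in range(size))

def best_motion_vector(current, previous, top, left, size=16, search=4):
    """Full search over a +/- search window; returns (dy, dx, error)."""
    height, width = len(previous), len(previous[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the previous frame.
            if not (0 <= top + dy <= height - size and
                    0 <= left + dx <= width - size):
                continue
            error = sad(current, previous, top, left, dy, dx, size)
            if best is None or error < best[2]:
                best = (dy, dx, error)
    return best

# Hypothetical 32x32 frames: the current frame is the previous one shifted
# two pixels to the right and one pixel down.
previous = [[(7 * r * r + 13 * c * c + r * c) % 256 for c in range(32)]
            for r in range(32)]
current = [[previous[(r - 1) % 32][(c - 2) % 32] for c in range(32)]
           for r in range(32)]

print(best_motion_vector(current, previous, top=8, left=8))   # (-1, -2, 0)
```

The returned vector points from the macroblock to its best match in the previous frame. An error of zero means the prediction is exact, so only the motion vector would need to be sent for this macroblock.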
68MPEG
- Like JPEG, the DC term is treated separately
- DPCM
- B-frames achieve high compression
- But they need a frame buffer and introduce delay