Why Compress - PowerPoint PPT Presentation

About This Presentation
Title:

Why Compress

Slides: 27
Provided by: padmamundu

Transcript and Presenter's Notes


1
Why Compress?
  • To reduce the volume of data to be transmitted
    (text, fax, images)
  • To reduce the bandwidth required for transmission
    and to reduce storage requirements (speech,
    audio, video)

2
Compression
  • How is compression possible?
  • Redundancy in digital audio, image, and video
    data
  • Properties of human perception
  • Digital audio is a series of sample values; an
    image is a rectangular array of pixel values;
    video is a sequence of images played out at a
    certain rate
  • Neighboring sample values are correlated

3
Redundancy
  • Adjacent audio samples are similar (predictive
    encoding); samples corresponding to silence
    carry no information (silence removal)
  • In a digital image, neighboring samples on a
    scanning line are normally similar (spatial
    redundancy)
  • In digital video, in addition to spatial
    redundancy, neighboring images in a video
    sequence may be similar (temporal redundancy)
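
Predictive encoding of similar adjacent samples can be sketched as follows. This is a minimal illustration, assuming the simplest possible predictor (the previous sample); the function names and the sample values are hypothetical.

```python
def delta_encode(samples):
    """Replace each sample by its difference from the previous sample."""
    previous, deltas = 0, []
    for s in samples:
        deltas.append(s - previous)
        previous = s
    return deltas


def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    total, samples = 0, []
    for d in deltas:
        total += d
        samples.append(total)
    return samples


samples = [100, 102, 103, 103, 101, 98]   # hypothetical audio samples
deltas = delta_encode(samples)
print(deltas)                 # [100, 2, 1, 0, -2, -3]
print(delta_decode(deltas))   # [100, 102, 103, 103, 101, 98]
```

After the first value, the differences are small numbers that need fewer bits than the samples themselves.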

4
Human Perception Factors
  • A compressed version of digital audio, image, or
    video need not represent the original information
    exactly
  • Perception sensitivities are different for
    different signal patterns
  • Human eye is less sensitive to the higher spatial
    frequency components than the lower frequencies
    (transform coding)

5
Classification
  • Lossless compression
      • for legal and medical documents, computer
        programs
      • exploits only data redundancy
  • Lossy compression
      • for digital audio, image, video where some
        errors or loss can be tolerated
      • exploits both data redundancy and human
        perception properties
  • Constant bit rate versus variable bit rate coding

6
Entropy
  • Amount of information I in a symbol with
    occurrence probability p: $I = \log_2(1/p)$
  • Symbols that occur rarely convey a large amount
    of information
  • Average information per symbol is called the
    entropy H
  • $H = \sum_i p_i \log_2(1/p_i)$ bits per codeword
  • Average number of bits per codeword is
    $\sum_i N_i p_i$, where $N_i$ is the number of
    bits for symbol i generated by the encoding
    algorithm
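
The quantities on this slide can be computed directly. The sketch below assumes a hypothetical four-symbol source and a hypothetical set of codeword lengths; the function names are illustrative.

```python
import math


def information(p):
    """Information content, in bits, of a symbol with probability p."""
    return math.log2(1 / p)


def entropy(probs):
    """Average information per symbol: sum of p_i * log2(1/p_i)."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)


def average_code_length(probs, lengths):
    """Average number of bits per codeword: sum of N_i * p_i."""
    return sum(n * p for n, p in zip(lengths, probs))


# Hypothetical source and hypothetical codeword lengths N_i.
probs = [0.5, 0.25, 0.125, 0.125]
lengths = [1, 2, 3, 3]
print(entropy(probs))                        # 1.75
print(average_code_length(probs, lengths))   # 1.75
```

For this source the average codeword length equals the entropy, so no code that assigns one codeword per symbol can do better.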

7
Huffman Coding
  • Assigns fewer bits to symbols that appear more
    often and more bits to the symbols that appear
    less often
  • Efficient when occurrence probabilities vary
    widely
  • The Huffman codebook is built from the set of
    symbols and their occurrence probabilities
  • Two properties
      • generates compact codes
      • prefix property
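
A minimal sketch of building a Huffman codebook, assuming the symbol probabilities are already known. The helper names and the example probabilities are hypothetical; `heapq` and `itertools.count` are from the Python standard library.

```python
import heapq
from itertools import count


def huffman_code(probs):
    """Build a codebook {symbol: bitstring} from {symbol: probability}."""
    if len(probs) == 1:                  # a single symbol still needs one bit
        return {sym: "0" for sym in probs}
    tiebreak = count()                   # keeps heap entries comparable on ties
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees, prefixing their codes.
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c0.items()}
        merged.update({s: "1" + code for s, code in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


def is_prefix_free(codebook):
    """True if no codeword is a prefix of another codeword."""
    codes = list(codebook.values())
    return not any(a != b and b.startswith(a) for a in codes for b in codes)


probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
book = huffman_code(probs)
print(book)                   # {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
print(is_prefix_free(book))   # True
```

The most probable symbol gets the shortest codeword, and the prefix check confirms that a decoder can split the bitstream without separators.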

8
Run-length Coding
  • Repeated occurrence of the same character is
    called a run
  • Number of repetitions is called the length of the
    run
  • Run of any length is represented by three
    characters
  • Example: eeeeeeetnnnnnnnn is encoded as @e7t@n8
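
A sketch of the three-character scheme: a marker, the repeated character, and the run length. It assumes "@" is the marker and never occurs in the input, that runs shorter than four characters are copied unchanged, and that a run length is a single digit (longer runs are split). These limits are assumptions for illustration.

```python
from itertools import groupby

MARKER = "@"   # assumed flag character; must not occur in the input
MIN_RUN = 4    # assumed threshold: shorter runs are copied unchanged
MAX_RUN = 9    # a single digit holds the run length


def rle_encode(text):
    out = []
    for ch, group in groupby(text):
        n = len(list(group))
        while n >= MIN_RUN:
            chunk = min(n, MAX_RUN)
            out.append(f"{MARKER}{ch}{chunk}")
            n -= chunk
        out.append(ch * n)          # leftover short run, copied literally
    return "".join(out)


def rle_decode(code):
    out, i = [], 0
    while i < len(code):
        if code[i] == MARKER:
            out.append(code[i + 1] * int(code[i + 2]))
            i += 3
        else:
            out.append(code[i])
            i += 1
    return "".join(out)


# Seven e's, one t, eight n's.
print(rle_encode("eeeeeeetnnnnnnnn"))   # @e7t@n8
print(rle_decode("@e7t@n8"))            # eeeeeeetnnnnnnnn
```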

9
Lempel-Ziv-Welch (LZW) Coding
  • Works by building a dictionary of phrases from
    the input stream
  • A token or an index is used to identify each
    distinct phrase
  • Number of entries in the dictionary determines
    the number of bits required for the index -- a
    dictionary with 25,000 words requires 15 bits to
    encode the index
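
The dictionary-building step can be sketched as below. This is a simplified encoder, assuming the dictionary starts with the 256 single-byte characters and grows without limit; the function names are hypothetical. The second helper reproduces the index-width arithmetic from the slide.

```python
def lzw_encode(text):
    """Return the list of dictionary indices produced for text."""
    dictionary = {chr(i): i for i in range(256)}   # assumed initial entries
    phrase, out = "", []
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch                            # keep extending the phrase
        else:
            out.append(dictionary[phrase])          # emit the known phrase
            dictionary[phrase + ch] = len(dictionary)
            phrase = ch
    if phrase:
        out.append(dictionary[phrase])
    return out


def index_bits(entries):
    """Bits needed to address a dictionary with the given number of entries."""
    return (entries - 1).bit_length()


print(lzw_encode("ABABABA"))   # [65, 66, 256, 258]
print(index_bits(25000))       # 15
```

Seven input characters are sent as four indices; 15 bits are needed for 25,000 entries because 2^14 = 16,384 is too small and 2^15 = 32,768 is enough.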

10
Arithmetic Coding
  • String of characters with occurrence
    probabilities make up a message
  • A complete message may be fragmented into
    multiple smaller strings
  • A codeword corresponding to each string is found
    separately
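
How one string can be mapped to a single codeword is sketched below, assuming the usual interval-narrowing form of arithmetic coding with exact fractions. The decoder is told the string length, which is an assumption of this sketch; the probabilities and names are hypothetical.

```python
from fractions import Fraction


def build_ranges(probs):
    """Assign each symbol a sub-interval [low, high) of [0, 1)."""
    ranges, low = {}, Fraction(0)
    for sym, p in probs.items():
        ranges[sym] = (low, low + p)
        low += p
    return ranges


def arith_encode(message, probs):
    """Narrow [0, 1) once per symbol; return the final interval."""
    ranges = build_ranges(probs)
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        s_low, s_high = ranges[sym]
        low, width = low + width * s_low, width * (s_high - s_low)
    return low, low + width   # any number in [low, high) identifies the string


def arith_decode(value, length, probs):
    """Recover length symbols from a number inside the final interval."""
    ranges = build_ranges(probs)
    out = []
    for _ in range(length):
        for sym, (s_low, s_high) in ranges.items():
            if s_low <= value < s_high:
                out.append(sym)
                value = (value - s_low) / (s_high - s_low)
                break
    return "".join(out)


probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, high = arith_encode("abca", probs)
print(low, high)                      # 11/32 23/64
print(arith_decode(low, 4, probs))    # abca
```

The width of the final interval equals the product of the symbol probabilities, so likely strings get wide intervals and can be identified with few bits.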

11
Summary
  • Statistical encoding exploits the fact that not
    all symbols in the source information occur with
    equal probability
  • Variable length codewords are used with the
    shortest ones used to encode symbols that occur
    most frequently
  • Static coding -- text type is pre-defined and
    codewords are derived once and used for all
    subsequent transfers
  • Dynamic coding -- type of text may vary from one
    transfer to another and the same set of codewords
    is generated at the transmitter and the receiver
    as the transfer takes place

12
Image and Video Compression
  • Two dimensional array of pixel values
  • Spatial redundancy and temporal redundancy
  • Human eye is less sensitive to chrominance signal
    than to luminance signal (U and V can be coarsely
    coded)
  • Human eye is less sensitive to the higher spatial
    frequency components
  • Human eye is less sensitive to quantizing
    distortion at high luminance levels

13
JPEG Encoder
  • International standards body -- Joint
    Photographic Experts Group
  • JPEG encoder schematic
  • Image/block preparation
  • DCT computation
  • Quantization
  • Entropy coding -- vectoring, differential
    encoding, run-length encoding, Huffman encoding
  • Frame building

14
Image/block Preparation
  • Source image as 2-D matrix of pixel values
  • R, G, B format requires three matrices, one each
    for R, G, B quantized values
  • In Y, U, V representation, the U and V matrices
    can be half the size of the Y matrix
  • Source image matrix is divided into 8 × 8 blocks
    (submatrices)
  • Smaller block size helps DCT computation;
    individual blocks are sequentially fed to the DCT,
    which transforms each block separately
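
Block preparation for one matrix can be sketched as below, assuming the image dimensions are already multiples of 8 (padding is not handled). The function name and the test image are hypothetical.

```python
def split_into_blocks(image, size=8):
    """Split a 2-D list into size x size blocks, row by row."""
    rows, cols = len(image), len(image[0])
    blocks = []
    for r in range(0, rows, size):
        for c in range(0, cols, size):
            blocks.append([row[c:c + size] for row in image[r:r + size]])
    return blocks


# Hypothetical 16 x 16 image whose pixel value encodes its position.
image = [[r * 16 + c for c in range(16)] for r in range(16)]
blocks = split_into_blocks(image)
print(len(blocks))          # 4
print(blocks[1][0][0])      # 8   (top-left pixel of the second block)
```

Each block in the returned list would then be passed to the DCT on its own.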

15
DCT Computation
  • Each pixel value in the 2-D matrix is quantized
    using 8 bits which produces a value in the range
    of 0 to 255 for the intensity/luminance values
    and the range of -128 to 127 for the
    chrominance values. All values are shifted to the
    range of -128 to 127 before computing DCT
  • All 64 values in the input matrix contribute to
    each entry in the transformed matrix
  • The value at location $F_{0,0}$ of the
    transformed matrix is called the DC coefficient
    and is proportional to the average of all 64
    values in the matrix
  • The other 63 values are called the AC
    coefficients; each has a spatial frequency
    associated with it
  • Spatial frequency increases as we move from left
    to right (horizontally) or from top to bottom
    (vertically). Low spatial frequencies are
    clustered in the top left corner.
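
A direct (slow) sketch of the forward 8 × 8 DCT, assuming the DCT-II normalization commonly used for JPEG. With this normalization the DC coefficient comes out as 8 times the block mean, so it tracks the average of the 64 values. The function name and the flat test block are hypothetical.

```python
import math

N = 8


def dct_2d(block):
    """Forward 8 x 8 DCT-II of a level-shifted block (values -128..127)."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):            # vertical frequency index
        for v in range(N):        # horizontal frequency index
            total = 0.0
            for x in range(N):    # every input value contributes
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out


# Flat block of luminance 100, shifted to the range -128..127.
flat = [[100 - 128] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))   # -224.0
```

A flat block has no spatial variation, so all 63 AC coefficients are zero up to rounding error and only the DC coefficient survives.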

16
Quantization
  • The human eye responds to the DC coefficient and
    the lower spatial frequency coefficients
  • If the magnitude of a higher frequency
    coefficient is below a certain threshold, the eye
    will not detect it
  • Set the frequency coefficients in the transformed
    matrix whose amplitudes are less than a defined
    threshold to zero (these coefficients cannot be
    recovered during decoding)
  • During quantization, the magnitudes of the DC and
    AC coefficients are reduced
  • A division operation is performed using the
    predefined threshold value as the divisor, and
    the quotient is rounded to an integer

17
Quantization Table
  • Threshold values vary for each of the 64 DCT
    coefficients and are held in a 2-D matrix
  • Trade off between the level of compression
    required and the information loss that is
    acceptable
  • JPEG standard includes two default quantization
    tables -- one for the luminance coefficients and
    the other for use with the two sets of
    chrominance coefficients. Customized tables may
    be used
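
Quantization with a table can be sketched as below. The table here is hypothetical (thresholds that simply grow with spatial frequency), not one of the JPEG default tables, and the coefficient values are made up for illustration.

```python
def quantize(coeffs, table):
    """Divide each DCT coefficient by its threshold and round to an integer."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


def dequantize(levels, table):
    """Decoder side: multiply back; the rounding loss is not recovered."""
    return [[level * q for level, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]


# Hypothetical table: larger divisors for higher frequencies.
table = [[16 + 4 * (u + v) for v in range(8)] for u in range(8)]

coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0], coeffs[0][1], coeffs[7][7] = -224.0, 45.0, 20.0

levels = quantize(coeffs, table)
print(levels[0][0], levels[0][1], levels[7][7])   # -14 2 0
restored = dequantize(levels, table)
print(restored[0][1], restored[7][7])             # 40 0
```

The high-frequency coefficient falls below its threshold and becomes zero, and the low-frequency one comes back as 40 instead of 45: this is where the loss occurs.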

18
Entropy Coding
  • Vectoring -- the 2-D matrix of quantized DCT
    coefficients is represented in the form of a
    single-dimensional vector
  • After quantization, most of the high frequency
    coefficients (lower right corner) are zero.
  • To exploit the number of zeros, a zig-zag scan of
    the matrix is used
  • Zig-zag scan allows the DC coefficient and the
    lower frequency AC coefficients to be scanned
    first
  • DC coefficients are encoded using differential
    encoding and AC coefficients are encoded using
    run-length encoding. Huffman coding is then used
    to encode both.
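
The zig-zag scan can be sketched by walking the anti-diagonals of the matrix and alternating direction. The function names are hypothetical, and a 3 × 3 matrix is used only to keep the example short.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs in zig-zag order for an n x n matrix."""
    order = []
    for s in range(2 * n - 1):        # s = row + col selects one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Odd diagonals run top-right to bottom-left, even ones the reverse.
        order.extend(diag if s % 2 else reversed(diag))
    return order


def zigzag_scan(matrix):
    """Flatten a square matrix into a vector in zig-zag order."""
    return [matrix[r][c] for r, c in zigzag_order(len(matrix))]


# Entries are numbered in the order the scan should visit them.
small = [[1, 2, 6],
         [3, 5, 7],
         [4, 8, 9]]
print(zigzag_scan(small))       # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(zigzag_order(8)[:6])      # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```

Because the scan starts in the top left corner, the zeros in the lower right corner end up together at the tail of the vector.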

19
Differential Encoding
  • DC coefficient is usually the largest in the
    transformed matrix.
  • DC coefficient varies slowly from one block to
    the next.
  • Only the difference in value of the DC
    coefficients is encoded. Number of bits required
    to encode is reduced.
  • The difference values are encoded in the form
    (SSS, value) where SSS field indicates the number
    of bits needed to encode the value and the value
    field indicates the binary form.
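
The (SSS, value) form can be sketched as below. Two details are assumptions of this sketch rather than statements from the slide: the predictor for the first block starts at 0, and a negative difference is written as the bit-complement of its magnitude. The DC values are hypothetical.

```python
def encode_dc(dc_values):
    """Differentially encode DC coefficients as (SSS, value bits) pairs."""
    pairs, previous = [], 0              # assumed initial predictor
    for dc in dc_values:
        diff = dc - previous
        previous = dc
        sss = abs(diff).bit_length()     # bits needed for the magnitude
        if diff >= 0:
            bits = format(diff, "b") if sss else ""
        else:
            # assumed convention: complement of |diff| in sss bits
            bits = format((1 << sss) - 1 + diff, f"0{sss}b")
        pairs.append((sss, bits))
    return pairs


# Hypothetical DC coefficients of five successive blocks.
print(encode_dc([12, 13, 11, 11, 10]))
# [(4, '1100'), (1, '1'), (2, '01'), (0, ''), (1, '0')]
```

Only the first block costs four value bits; the slowly varying differences that follow need two bits or fewer.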

20
Run-length Encoding
  • 63 values of the AC coefficients
  • Long strings of zeros because of the zig-zag scan
  • Each AC coefficient encoded as a pair of values
    -- (skip, value), skip indicates the number of
    zeros in the run and value is the next non-zero
    coefficient
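
The (skip, value) pairs can be generated as below. Ending the list with a (0, 0) pair to mean "the rest of the block is zero" is an assumption of this sketch, and the coefficient values are hypothetical.

```python
def encode_ac(ac_values):
    """Encode zig-zag-ordered AC coefficients as (skip, value) pairs."""
    pairs, skip = [], 0
    for v in ac_values:
        if v == 0:
            skip += 1                  # extend the current run of zeros
        else:
            pairs.append((skip, v))    # zeros skipped, then the value
            skip = 0
    pairs.append((0, 0))               # assumed end-of-block marker
    return pairs


# Hypothetical block: a few non-zero values, then zeros to the end.
ac = [6, 7, 0, 0, 0, 3, -1] + [0] * 56
print(len(ac))          # 63
print(encode_ac(ac))    # [(0, 6), (0, 7), (3, 3), (0, -1), (0, 0)]
```

The 63 coefficients are reduced to five pairs, which are then passed to the Huffman stage.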

21
Huffman Encoding
  • Long strings of binary digits replaced by shorter
    codewords
  • Prefix property of the Huffman codewords enables
    decoding the encoded bitstream unambiguously

22
Frame Building
  • Encapsulates the information relating to an
    encoded image

23
Video Compression
  • Video as a sequence of pictures (or frames)
  • JPEG algorithm applied to each frame -- moving
    JPEG (MJPEG). Exploits only spatial redundancy.
  • High correlation between successive frames. Only
    a small portion of each frame is involved with
    any motion that is taking place.
  • A combination of actual frame contents and
    predicted frame contents is used.
  • Motion estimation and motion compensation

24
Frame/Picture Types
  • Interframe and intraframe coding. High
    compression ratios can be achieved by using both.
    Random access requirement of image retrieval is
    satisfied by pure intraframe coding.
  • I-frames are coded without reference to other
    frames. Serve as reference pictures for
    predictive-coded frames.
  • P-frames are coded using motion compensated
    prediction from a past I-frame or P-frame.
  • B-frames are bidirectionally predictive-coded.
    Highest degree of compression, but require both
    past and future reference pictures for motion
    compensation.
  • D-frames are DC-coded. Of the DCT coefficients
    only the DC coefficients are present. Used in
    interactive applications like VoD for rewind and
    fast-forward operations.

25
Picture Sequence
  • Display order -- I B B P B B P B B I
  • Bitstream order -- I P B B P B B I B B (each
    reference frame is sent before the B-frames that
    depend on it)
  • Prediction span, Group of Pictures (GOP)

26
MPEG-video Encoding
  • Input frames are preprocessed (color space
    conversion and spatial resolution adjustment).
  • Frame types are decided for each frame/picture
  • Each picture is divided into macroblocks of
    16 × 16 pixels.
  • Macroblocks are intracoded for I frames and
    predictive coded or intracoded for P and B frames
  • Macroblocks are divided into six blocks of 8 × 8
    pixels (4 luminance and 2 chrominance). DCT is
    applied to each block; the transform coefficients
    are quantized, zig-zag scanned, and
    variable-length coded.