Lossless Compression - PowerPoint PPT Presentation

About This Presentation
Title:

Lossless Compression

Description:

Certain types of data require that no information is lost due to compression ... Select sub-interval corresponding to the symbol that actually occurs, and make ...

Slides: 23
Provided by: deann

Transcript and Presenter's Notes

Title: Lossless Compression


1
Chapter 20
  • Lossless Compression

2
Introduction
  • Redundant elements exist in virtually all forms
    of data
  • Compression can be used to reduce communications
    capacity requirements
  • Certain types of data require that no
    information is lost due to compression
  • Several widely used lossless compression
    techniques are critical to modern network
    performance
  • Run-length encoding
  • Arithmetic coding
  • Ziv-Lempel (LZ) dictionary

3
Coding Techniques
  • Run-length
  • Modified Huffman: MH
  • Modified Relative Element Address Designate
    (READ): MR
  • Modified Modified READ: MMR
  • Arithmetic: Pure and Interval
  • String Matching: LZ77, LZ78, LZW

4
Run Length Encoding
Compression Format
Examples
5
Run-length Encoding Efficiency
From Table 20.1
6
Run Length Coding Image Compression Example
0000000000
0000000000
0001111000
0001001000
0001111000
0000001000
0000001000
0001111000
0000000000
0000000000
Image: 100 pixels
Binary Code: 100 bits
23W 4B 6W 1B 2W 1B 6W 4B 9W 1B 9W 1B 6W 4B 23W
or, simply: 23 4 6 1 2 1 6 4 9 1 9 1 6 4 23
Simple Run-Length Coding: 15 characters = 120 bits
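The run counts on this slide can be reproduced with a short sketch. It assumes that 0 is a white pel, 1 is a black pel, and the ten rows are scanned left to right, top to bottom as one continuous stream. The names `IMAGE_ROWS` and `run_lengths` are illustrative only.

```python
from itertools import groupby

# The 10 x 10 bi-level image from the slide (assumed: 0 = white, 1 = black).
IMAGE_ROWS = [
    "0000000000",
    "0000000000",
    "0001111000",
    "0001001000",
    "0001111000",
    "0000001000",
    "0000001000",
    "0001111000",
    "0000000000",
    "0000000000",
]


def run_lengths(bits):
    """Return (symbol, run length) pairs for consecutive identical symbols."""
    return [(symbol, len(list(group))) for symbol, group in groupby(bits)]


runs = run_lengths("".join(IMAGE_ROWS))
labels = " ".join(f"{n}{'W' if s == '0' else 'B'}" for s, n in runs)
print(labels)         # 23W 4B 6W 1B 2W 1B 6W 4B 9W 1B 9W 1B 6W 4B 23W
print(len(runs) * 8)  # 120 bits if each run count is stored as one 8-bit character
```

With one 8-bit character per run, this small image takes 120 bits against 100 bits uncoded, so simple run-length coding only pays off when runs are long.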
7
Facsimile Compression
  • Characterized by black and white points (pels) on
    a page
  • Group 3
  • 200 ppi (H) x 100 or 200 ppi (V)
  • modem via analog phone line
  • Group 4
  • 200-400 ppi (H) x 200-400 ppi (V)
  • digital networks up to 64 kbps
  • ITU-T defines two lossless compression standards
  • MH/MR for Group 3
  • MMR for Group 4

8
MH Modified Huffman Code
  • Due to the inherent characteristics of text
    documents, variable-length coding can be used
    effectively
  • Huffman encoding applied one line at a time
  • count white and black space, e.g. w7, b5, w3,
    b9, ...
  • run length N = 64m + n
  • m = 0, 1, 2, ..., 27; n = 0, 1, 2, ..., 63
  • represent each run of black or white pels as a
    multiple of 64 plus a remainder and assign a
    Huffman code
  • terminating codes (n) used for N < 64
  • make-up codes (m) also needed for N ≥ 64
  • See Table 20.2
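The decomposition N = 64m + n can be sketched as below. This only splits a run length into its make-up multiple and terminating remainder. The actual Huffman codewords come from Table 20.2, which is not reproduced here, and the function name `split_run` is a hypothetical helper.

```python
MAX_M = 27  # largest make-up multiple listed on the slide
MAX_N = 63  # largest terminating remainder


def split_run(run_length):
    """Split a run length N into (m, n) such that N = 64*m + n."""
    if not 0 <= run_length <= 64 * MAX_M + MAX_N:
        raise ValueError("run length outside the range covered by m and n")
    m, n = divmod(run_length, 64)
    return m, n


for N in (7, 64, 200, 1728):
    m, n = split_run(N)
    needs_makeup = m > 0  # a make-up code is needed only when N >= 64
    print(N, m, n, needs_makeup)
```

A run of 200 pels, for example, splits into m = 3 and n = 8, so it would be sent as the make-up code for 192 followed by the terminating code for 8.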

Efficiency can be improved using MR.
9
MR Technique: Changing Picture Elements
  • Basis
  • About 75% of all transitions lie within ±1 pel
    of a transition in the line above.
  • changing elements (i.e. w to b, or b to w) can
    be identified based on what happened in the line
    before.
  • Encoding, then, is based on vertical as well as
    horizontal relationships between pels
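The idea of changing elements can be illustrated with a simplified sketch. It is not the MR coding procedure itself: it only finds where a line changes colour and checks whether each change lies within one pel of a change in the line above. The function names and the `tolerance` parameter are assumptions made for this illustration.

```python
def changing_elements(line):
    """Positions where a pel differs from the pel immediately to its left."""
    return [i for i in range(1, len(line)) if line[i] != line[i - 1]]


def near_reference(coding_line, reference_line, tolerance=1):
    """For each changing element in the coding line, report whether a changing
    element in the reference line (the line above) lies within `tolerance` pels."""
    reference = changing_elements(reference_line)
    return [
        (pos, any(abs(pos - ref) <= tolerance for ref in reference))
        for pos in changing_elements(coding_line)
    ]


# Two adjacent lines (0 = white, 1 = black), chosen only for illustration.
line_above = "0001111000"
line_below = "0001001000"
print(changing_elements(line_above))           # [3, 7]
print(near_reference(line_below, line_above))  # every change is within 1 pel
```

When most changes line up with the line above like this, coding the small vertical offset is cheaper than coding each run length again.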

10
MR Code Table
  • Notes
  • MR code is more error-sensitive than MH
  • ITU-T recommends using MH for every Kth scanning
    line: K = 2 for 3.85 lines/mm and K = 4 for 7.7
    lines/mm

11
Facsimile Compression Techniques
Note 1: Joint Bi-Level Image Experts Group (JBIG) coding,
based on the arithmetic coding technique
12
Huffman Coding Revisited
  • Huffman achieves maximum efficiency when all
    probabilities involved are negative powers of 2
  • Example
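The slide's own example did not survive in this transcript. As an illustrative substitute (the probabilities below are assumptions, not the original figures), the sketch builds Huffman code lengths and compares the average code length with the entropy, first for probabilities that are negative powers of 2 and then for a skewed two-symbol source.

```python
import heapq
from math import log2


def huffman_code_lengths(probs):
    """Return the Huffman code length for each symbol probability."""
    # Heap entries: (probability, tie-breaker, symbol indices in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:  # every symbol under the merged node gains one bit
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths


def average_length(probs):
    return sum(p * l for p, l in zip(probs, huffman_code_lengths(probs)))


def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)


dyadic = [0.5, 0.25, 0.125, 0.125]  # all negative powers of 2
skewed = [0.9, 0.1]                 # not powers of 2
print(huffman_code_lengths(dyadic), average_length(dyadic), entropy(dyadic))
print(huffman_code_lengths(skewed), average_length(skewed), entropy(skewed))
```

For the first source the average length equals the entropy (1.75 bits per symbol). For the second, Huffman must spend a whole bit per symbol although the entropy is under half a bit, which is the gap arithmetic coding is meant to close.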

13
Arithmetic Coding Techniques
  • Designed to provide efficient compression by
    approximating probabilities as negative powers of
    2
  • Used in JPEG and MPEG standards for lossless
    encoding
  • Basic Method (in brief)
  • arrange outcomes on the half-open unit interval
    [0,1)
  • approximate lower bound of outcome probabilities
    as a negative power of 2
  • encode symbol string using pure arithmetic or
    interval arithmetic algorithm

14
Arithmetic Coding - Unit Interval Arrangement
Three Symbol (A,B,C), Independent Sequence (from
page 557)
15
Arithmetic Coding Probability Intervals - Example
Three Symbol (A,B,C) Sequence
Drawbacks?
16
Pure Arithmetic Coding Technique - Example
  • Algorithm
  • Begin with the half-open interval [0,1)
  • 1. Subdivide current interval into sub-intervals,
    one for each symbol.
  • 2. Select sub-interval corresponding to the symbol
    that actually occurs, and make it the new current
    interval
  • Repeat steps 1 and 2 until the entire message is
    processed
  • 3. Output enough bits to distinguish the final
    interval from two adjacent intervals
  • 4. Output a special end-of-message symbol.
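The interval-narrowing loop (steps 1 and 2) can be sketched with exact fractions. The symbol probabilities below are assumptions chosen for illustration, since the figures from the slide are not in this transcript. The bit-output and end-of-message steps are left out.

```python
from fractions import Fraction

# Assumed probabilities for the three symbols (not taken from the slides).
PROBS = {"A": Fraction(1, 2), "B": Fraction(1, 4), "C": Fraction(1, 4)}


def cumulative_ranges(probs):
    """Assign each symbol its sub-interval of [0, 1), in dictionary order."""
    ranges, low = {}, Fraction(0)
    for symbol, p in probs.items():
        ranges[symbol] = (low, low + p)
        low += p
    return ranges


def narrow_interval(message, probs):
    """Repeat steps 1 and 2: subdivide, then keep the occurring symbol's part."""
    ranges = cumulative_ranges(probs)
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        sym_low, sym_high = ranges[symbol]
        low, high = low + width * sym_low, low + width * sym_high
    return low, high


low, high = narrow_interval("ABC", PROBS)
print(low, high)   # 11/32 3/8
print(high - low)  # 1/32, the product of the three symbol probabilities
```

The final width equals the probability of the whole message, so a likely message ends in a wide interval that needs few bits to identify. Exact fractions sidestep the precision problem that practical coders handle with fixed-width integer arithmetic.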

Drawbacks?
17
Interval Arithmetic Coding Technique - Example
18
String-Matching Algorithms
  • Algorithms stem from work by Ziv and Lempel in
    1977 and 1978, and improvements by Welch in 1984
  • LZ77 version used in PKZIP, gzip, zipit, etc.
  • LZ78 adds improvements based on tree-structured
    dictionary
  • LZW adds performance enhancements
  • LZ78/LZW used in V.42bis, GIF and Unix compress
  • Algorithms all use pattern matching to identify
    repeated symbol sequences
  • Basic Method (in brief)
  • scan input symbols
  • create codes for them on the fly
  • make dictionary entries to record the symbol-code
    pairs
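The basic method (scan, create codes on the fly, record dictionary entries) can be sketched in the LZW style. This is a minimal illustration, assuming a dictionary preloaded with the 256 single-character codes and unbounded code values. Real implementations limit the code width and pack codes into bits.

```python
def lzw_encode(text):
    """Encode text as a list of integer codes, growing the dictionary as it goes."""
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    current, codes = "", []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate               # keep extending the match
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = next_code  # record the new string
            next_code += 1
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decode(codes):
    """Rebuild the text; the decoder recreates the same dictionary on its own."""
    if not codes:
        return ""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:                                 # code defined by the step in progress
            entry = previous + previous[0]
        output.append(entry)
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(output)


codes = lzw_encode("ABABABA")
print(codes)              # [65, 66, 256, 258]
print(lzw_decode(codes))  # ABABABA
```

No dictionary is transmitted: the decoder derives every entry from the codes it has already received.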

19
LZ77 Scheme - Example
  • Store symbols in fixed-size sliding history
    buffer
  • New input kept in fixed-size look-ahead buffer
  • Attempt to match two or more characters from the
    beginning of the look-ahead buffer with
    characters in the sliding history buffer
  • No match: output first look-ahead symbol as 9-bit
    character and shift it into window
  • Match: continue to scan for longest match, then
    output triplet (indicator, pointer, length) and
    shift sliding window
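The scheme above can be sketched as follows. Tokens are kept as Python tuples rather than packed bits: `(0, symbol)` stands for a literal and `(1, pointer, length)` for a match, where the pointer is the distance back into the history buffer. The buffer sizes and the minimum match length of 2 are assumptions for illustration.

```python
def lz77_encode(data, window=16, lookahead=8, min_match=2):
    """Greedy LZ77: emit literals or (indicator, pointer, length) triplets."""
    tokens, pos = [], 0
    while pos < len(data):
        best_len, best_off = 0, 0
        max_len = min(lookahead, len(data) - pos)
        for cand in range(max(0, pos - window), pos):  # scan the history buffer
            length = 0
            while length < max_len and data[cand + length] == data[pos + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, pos - cand
        if best_len >= min_match:
            tokens.append((1, best_off, best_len))  # match found
            pos += best_len
        else:
            tokens.append((0, data[pos]))           # no match: literal symbol
            pos += 1
    return tokens


def lz77_decode(tokens):
    """Rebuild the text by copying from output that was already produced."""
    out = []
    for token in tokens:
        if token[0] == 1:
            _, offset, length = token
            for _ in range(length):  # symbol-by-symbol so overlapping copies work
                out.append(out[-offset])
        else:
            out.append(token[1])
    return "".join(out)


tokens = lz77_encode("abababab")
print(tokens)               # [(0, 'a'), (0, 'b'), (1, 2, 6)]
print(lz77_decode(tokens))  # abababab
```

A match may run past the end of the history buffer into the look-ahead buffer, as the `(1, 2, 6)` triplet shows, which is why the decoder copies one symbol at a time.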

20
LZ77 Scheme
Drawbacks?
21
LZ78/LZW Example
22
LZW Dictionary - Example
23
Tree-Based LZ Dictionary
Root Nodes