1
Source Coding
  • Data Compression
  • June 2009
  • A.J. Han Vinck

2
DATA COMPRESSION
  • NO LOSS of information and exact reproduction
  • (low compression ratio, around 1:4)
  • general problem statement
  • find a means for spending as little time as
    possible on packing as much data as possible
    into as little space as possible, and with no
    loss of information

3
GENERAL IDEA
  • represent likely symbols with short binary
    words
  • where likely is derived from
  • - prediction of the next symbol in the source
    output
  • q-ue  q-ua  q-ui  q-uo
  • q ?
  • q-00  q-01  q-10  q-11
  • - context between the source symbols, words,
    sounds; context in pictures

4
Why compress?
  • - Lossless compression often reduces file size by
    40% to 80%.
  • - More economical to transport and store
  • - Most Internet content is compressed for
    transmission
  • - Compression before encryption can make
    code-breaking difficult
  • - Conserve battery power and storage space on
    mobile devices
  • - Compression and decompression can be hardwired

5
Some history
  • 1948 Shannon-Fano coding
  • 1952 Huffman coding
  • reduced redundancy in symbol coding
  • demonstrably optimal variable-length symbol coding
  • 1977 Lempel-Ziv coding
  • first major dictionary method
  • maps repeated word patterns to code words

6
MODEL KNOWLEDGE
  • best performance → exact prediction!
  • exact prediction → no new information!
  • no new information → no message to
    transmit!

7
Example: no prediction
  • source C
  • message   0    1    2    3    4    5    6    7
  • code     000  001  010  011  100  101  110  111
  • representation length 3

8
Example: with prediction
  • ENCODE THE DIFFERENCE
  • probability   .25   .5   .25
  • difference    -1     0   +1
  • code          00     1   01
  • (block diagram: source C minus prediction P feeds the coder)
  • L = .25·2 + .5·1 + .25·2 = 1.5
    bit/difference symbol
  • (a small encoding sketch follows below)
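As a rough illustration of this difference-coding idea (not part of the original slides), here is a minimal Python sketch; the name encode_differences is ours, and it assumes all differences stay in {-1, 0, +1}.

  DIFF_CODE = {-1: "00", 0: "1", +1: "01"}   # the code from the slide

  def encode_differences(samples):
      # Encode each sample as the coded difference to its predecessor.
      # (In a real system the first sample would be sent separately.)
      bits = []
      prev = samples[0]
      for x in samples[1:]:
          bits.append(DIFF_CODE[x - prev])   # assumes the difference is -1, 0 or +1
          prev = x
      return "".join(bits)

  print(encode_differences([3, 4, 4, 3, 4]))   # differences +1, 0, -1, +1 -> "0110001"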

9
binary tree codes
the relation between source symbols and
codewords
(code tree with codewords A = 11, B = 10, C = 0)
General Properties
- every node has two successors: leaves and/or nodes
- the path to a leaf gives the corresponding codeword
- source letters are only assigned to leaves,
  i.e. no codeword is a prefix of another codeword
(a small decoding sketch follows below)
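A small Python sketch (ours, not from the slides) that decodes the code above by collecting bits until the buffer matches a codeword; a dict of codeword -> symbol stands in for the code tree.

  CODE = {"11": "A", "10": "B", "0": "C"}    # the tree code from this slide

  def decode(bits):
      out, buf = [], ""
      for b in bits:
          buf += b
          if buf in CODE:        # a leaf of the code tree has been reached
              out.append(CODE[buf])
              buf = ""
      return "".join(out)

  print(decode("11100"))   # 11 -> A, 10 -> B, 0 -> C, giving "ABC"

This only works because the code is prefix-free: no codeword is a prefix of another one.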
10
tree codes
Tree codes are prefix codes and uniquely
decodable, i.e. a string of codewords can be
uniquely decomposed into the individual
codewords. Non-prefix codes may also be uniquely
decodable, example: A = 1, B = 10, C = 100.
11
binary tree codes
The average codeword length  L = Σi P(i)·ni
Property: an optimal code has minimum L
Homework: show that L = Σ (node probabilities)
12
Tree encoding (1)
  • for data / text the compression should be
    lossless → no errors
  • STEP 1: assign messages to nodes
  •      P(i)     codeword   ni·P(i)
  • a    0.5      111        1.5
  • b    0.25     110        0.75
  • c    0.125    10         0.25
  • d    0.0625   01         0.125
  • e    0.0625   00         0.125
  • AVERAGE CODEWORD LENGTH 2.75 bit/source symbol

13
Tree encoding (2)
  • STEP 2: OPTIMIZE THE ASSIGNMENT (MINIMIZE the average
    length)
  •      P(i)     codeword   ni·P(i)
  • e    0.0625   1111       0.25
  • d    0.0625   1110       0.25
  • c    0.125    110        0.375
  • b    0.25     10         0.5
  • a    0.5      0          0.5
  • AVERAGE CODEWORD LENGTH 1.875 bit/source
    symbol! (a small check of both averages follows below)
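A small Python check (ours) of the two assignments above. The exact codeword digits follow the reconstruction in the tables; only the lengths matter for L.

  def average_length(code, prob):
      return sum(prob[s] * len(code[s]) for s in code)   # L = sum_i P(i) * n_i

  prob  = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.0625, "e": 0.0625}
  step1 = {"a": "111", "b": "110", "c": "10", "d": "01", "e": "00"}
  step2 = {"a": "0", "b": "10", "c": "110", "d": "1110", "e": "1111"}

  print(average_length(step1, prob))   # 2.75  bit/source symbol
  print(average_length(step2, prob))   # 1.875 bit/source symbol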

14
Kraft inequality
  • Prefix codes with M code words satisfy the Kraft
    inequality   Σk 2^(-nk) ≤ 1
  • where nk is the code word length for message k
  • Proof: let nM be the longest codeword length
  • then, in a code tree of depth nM, a terminal
    node at depth nk eliminates 2^(nM - nk) nodes
  • from the total number of 2^nM available nodes

15
example
(code tree of depth 4: a terminal node at depth 3
eliminates 2 of the depth-4 nodes, at depth 2 it
eliminates 4, and at depth 1 it eliminates 8)
Homework: can we replace "≤" by "=" in the Kraft
inequality? (a small Kraft-sum check follows below)
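A one-line Kraft check in Python (ours; kraft_sum is a hypothetical helper name).

  def kraft_sum(lengths):
      return sum(2 ** (-n) for n in lengths)   # sum_k 2^(-n_k)

  print(kraft_sum([3, 3, 2, 2, 2]))   # 1.0  <= 1: a prefix code with these lengths exists
  print(kraft_sum([1, 2, 2, 2]))      # 1.25 >  1: no prefix code is possible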
16
Kraft inequality
  • Suppose that the length specification of M code
    words satisfies the Kraft inequality,
  • then   Σi Ni·2^(-i) ≤ 1
  • where Ni is the number of code words of length
    i.
  • Then, we can construct a prefix code with the
    specified lengths.
  • Note that, for every level n,   Σi≤n Ni·2^(n-i) ≤ 2^n

17
Kraft inequality
  • From this,   Nn ≤ 2^n - Σi<n Ni·2^(n-i)
  • Interpretation: at every level fewer nodes are used
    than are available!
  • E.g. for level 3:   N3 ≤ 8 - 4·N1 - 2·N2,
  • i.e. 8 nodes minus the nodes cancelled by
    levels 1 and 2.

18
performance
  • Suppose that we select the code word lengths as
    nk = ⌈-log2 P(k)⌉
  • Then, a prefix code exists, since
    Σk 2^(-nk) ≤ Σk P(k) = 1
  • with average length   L = Σk P(k)·nk < H(U) + 1
    (a small numerical check follows below)
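A small numerical check of this choice of lengths (ours, for an arbitrary example distribution):

  import math

  probs = [0.4, 0.3, 0.2, 0.1]                      # example source
  n = [math.ceil(-math.log2(p)) for p in probs]     # n_k = ceil(-log2 P(k))
  H = -sum(p * math.log2(p) for p in probs)         # H(U)
  L = sum(p * nk for p, nk in zip(probs, n))        # average length

  print(n)                                   # [2, 2, 3, 4]
  print(sum(2 ** -nk for nk in n) <= 1)      # Kraft inequality holds -> prefix code exists
  print(H, L)                                # H(U) ~ 1.85, L = 2.4 < H(U) + 1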

19
Lower bound for prefix codes
  • We show that   L ≥ H(U)
  • We write   H(U) - L = Σk P(k)·log2( 2^(-nk) / P(k) ) ≤ 0
  • Equality can be established for   P(k) = 2^(-nk)
    (the full chain of inequalities is written out below)
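The chain of (in)equalities behind this slide, written out (our sketch of the standard argument, using log2 x ≤ (x - 1)·log2 e together with the Kraft inequality):

  H(U) - L = \sum_k P(k)\log_2\frac{1}{P(k)} - \sum_k P(k)\,n_k
           = \sum_k P(k)\log_2\frac{2^{-n_k}}{P(k)}
        \le \log_2(e)\sum_k P(k)\Big(\frac{2^{-n_k}}{P(k)} - 1\Big)
           = \log_2(e)\Big(\sum_k 2^{-n_k} - 1\Big) \le 0

so L ≥ H(U), with equality if and only if P(k) = 2^(-nk) for every k.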

20
Huffman coding (1)
The average codeword length  L = Σi P(i)·ni
Property: an optimal code has minimum L
Property: for an optimal code the two least
probable codewords
- have the same length,
- are the longest, and
- (by manipulating the assignment) differ only in the last code digit
Homework: proof
21
Huffman Coding optimality (2)
Given code C with average length L and M symbols,
construct C' (for C the codewords for the two least
probable symbols differ only in the last digit):
1. replace the 2 least probable symbols CM
and CM-1 in C by one symbol C'M-1 with
probability P'(M-1) = P(M) + P(M-1)
2. to minimize L, we have to minimize L'.
22
Huffman Coding (JPEG, MPEG, MP3)
  • 1: take the two smallest probabilities together,
    P(i) + P(j)
  • 2: replace symbols i and j by a new symbol
  • 3: go to 1 - until the end
  • Example (a construction sketch follows below)
  • merges: 0.1 + 0.1 = 0.2,   0.2 + 0.25 = 0.45,
    0.25 + 0.3 = 0.55,   0.45 + 0.55 = 1.00
  • resulting code
  • 0.3 → 11     0.25 → 01     0.25 → 10
  • 0.1 → 001    0.1 → 000
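A compact Python sketch of this procedure (ours; huffman_code is a hypothetical helper, and ties may be broken differently than on the slide, so the digits can differ while the lengths agree):

  import heapq

  def huffman_code(probs):
      """probs: dict symbol -> probability. Returns dict symbol -> codeword."""
      heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
      heapq.heapify(heap)
      code = {s: "" for s in probs}
      while len(heap) > 1:
          p1, _, grp1 = heapq.heappop(heap)    # the two smallest probabilities
          p2, i, grp2 = heapq.heappop(heap)
          for s in grp1:                        # prepend one more code digit
              code[s] = "0" + code[s]
          for s in grp2:
              code[s] = "1" + code[s]
          heapq.heappush(heap, (p1 + p2, i, grp1 + grp2))   # the merged symbol
      return code

  probs = {"A": 0.3, "B": 0.25, "C": 0.25, "D": 0.1, "E": 0.1}
  code = huffman_code(probs)
  print(code)                                           # e.g. A:11, B:01, C:10, D:000, E:001
  print(sum(probs[s] * len(code[s]) for s in code))     # average length 2.2 bit/symbol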

23
Properties
  • ADVANTAGES
  • uniquely decodable code
  • smallest average codeword length
  • DISADVANTAGES
  • LARGE tables give complexity
  • variable word length
  • sensitive to channel errors

24
Conclusion Huffman
  • Tree coding (Huffman) is not universal!
  •   it is only valid for one particular type of
    source!
  •  
  • For COMPUTER DATA the data reduction must be
  • lossless → no errors at reproduction
  • universal → effective for different types of data

25
Some comments
  • The Huffman code is not unique, but efficiency is
    the same!
  • For code alphabets larger than 2 a small
    modification is necessary (where?)

26
Performance Huffman
  • Using the probability distribution for the source
    U, a prefix code exists with average length
  • L < H(U) + 1
  • Since Huffman is optimum, this bound is also true
    for Huffman codes.
  • Problem if H(U) → 0
  • Improvements can be made when we take J symbols
    together, then
  • J·H(U) ≤ LJ < J·H(U) + 1
  • and
  • H(U) ≤ L = LJ/J < H(U) + 1/J
  • (a numerical illustration follows below)
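A small numerical illustration (ours). For simplicity it uses Shannon-type lengths ceil(-log2 P) on blocks of J symbols; Huffman coding of the blocks can only do better, so the per-symbol rate also obeys H(U) ≤ LJ/J < H(U) + 1/J:

  import math, itertools

  probs = {"a": 0.9, "b": 0.1}                         # a skewed example source
  H = -sum(p * math.log2(p) for p in probs.values())
  print("H(U) =", round(H, 3))

  for J in (1, 2, 4, 8):
      blocks = [math.prod(c) for c in itertools.product(probs.values(), repeat=J)]
      L_J = sum(p * math.ceil(-math.log2(p)) for p in blocks)  # average block length
      print(J, round(L_J / J, 3))      # per-symbol rate, always in [H(U), H(U) + 1/J)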

27
Example
28
Example
s1  Pr(s1) = 0.1
s2  Pr(s2) = 0.25
s3  Pr(s3) = 0.2
s4  Pr(s4) = 0.45
(Huffman tree: 0.1 + 0.2 = 0.3,  0.3 + 0.25 = 0.55,
0.55 + 0.45 = 1)
29
Encoding idea Lempel-Ziv-Welch (LZW)
Assume we have just read a segment w from the
text; a is the next symbol.
(picture: the text parsed into segment w followed by symbol a)
  • If wa is not in the dictionary,
  • write the index of w in the output file,
  • add wa to the dictionary, and set w = a.
  • If wa is in the dictionary,
  • process the next symbol with segment wa.

30
Encoding example
  • address 0: a   address 1: b   address 2: c
  • String: a a b a a c a b c a b c b                output   update
  • a a                  aa not in dictionary, add aa    0     aa 3
  • a a b                ab not in dictionary, add ab    0     ab 4
  • a a b a              ba not in dictionary, add ba    1     ba 5
  • a a b a a c          aa in dictionary, aac not       3     aac 6
  • a a b a a c a        ca not in dictionary            2     ca 7
  • a a b a a c a b c    abc not in dictionary           4     abc 8
  • a a b a a c a b c a b   cab not in dictionary        7     cab 9
  • (an encoder sketch follows below)
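A Python sketch of the encoder just described (ours; lzw_encode is a hypothetical name). Its first seven outputs reproduce the 0 0 1 3 2 4 7 of the example above:

  def lzw_encode(text, alphabet=("a", "b", "c")):
      dictionary = {s: i for i, s in enumerate(alphabet)}   # address 0: a, 1: b, 2: c
      w, output = "", []
      for a in text:
          if w + a in dictionary:          # segment wa already known: keep extending
              w = w + a
          else:                            # output index of w, add wa, restart with a
              output.append(dictionary[w])
              dictionary[w + a] = len(dictionary)
              w = a
      if w:
          output.append(dictionary[w])     # flush the last segment
      return output

  print(lzw_encode("aabaacabcabcb"))   # [0, 0, 1, 3, 2, 4, 7, 1, 2, 1]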

31
UNIVERSAL (LZW) (decoder)
  • 1. Start with the basic symbol set.
  • 2. Read a code c from the compressed file.
  • - The address c in the dictionary determines the
    segment w.
  • - Write w in the output file.
  • 3. Add wa to the dictionary, where a is the first
    letter of the next segment.

32
Decoding example
  • address 0: a   address 1: b   address 2: c
  • String                        input   update
  • a ?                             0     (? = first letter of next segment, still unknown)
  • a a !                           0     output a determines ? = a, update aa at 3
  • a a b .                         1     output b determines ! = b, update ab at 4
  • a a b a a .                     3     output aa, update ba at 5
  • a a b a a c .                   2     output c, update aac at 6
  • a a b a a c a b .               4     output ab, update ca at 7
  • a a b a a c a b c a .           7     output ca, update abc at 8

33
Conclusion (LZW)
  • IDEA: TRY to copy long parts of the source output
  • if the dictionary overflows
  • throw the least-recently used entry away in en- and
    decoder
  • universal
  • lossless

Homework: encode/decode the sequence
1001010110011... Try to solve the problem that
occurs!
34
Some history
  • GIF, TIFF, V.42bis modem compression standard,
    PostScript Level 2
  • 1977 published by Abraham Lempel and Jakob Ziv
  • 1984 LZ-Welch algorithm published in IEEE
    Computer
  • Sperry patent transferred to Unisys (1986)
  • GIF file format required use of the LZW algorithm

35
references
J. Ziv and A. Lempel, "A Universal Algorithm for
Sequential Data Compression," IEEE Transactions on
Information Theory, May 1977.
Terry Welch, "A Technique for High-Performance Data
Compression," IEEE Computer, June 1984.
36
Summary of operations
  • ENCODING              output       update   location
  • W1 A                  loc( W1 )    W1A      N
  • W2 F                  loc( W2 )    W2F      N+1
  • W3 X                  loc( W3 )    W3X      N+2
  • (A, F, X are the first letters of the next segments W2, W3, W4)
  • DECODE     input          output   update          location
  •            loc( W1 )      W1       W1?
  •            loc( W2 )      W2       W2?,  W1A       N
  •            loc( W3 )      W3       W3?,  W2F       N+1

37
Problem and solution
  • ENCODING              output       update   location
  • W1 A                  loc( W1 )    W1A      N
  • W2 = W1A, then F      loc( W2 )    W2F      N+1
  • DECODE     input           output   update   location
  •            loc( W1 )       W1       W1?
  •            loc( W2 ) = N   ?        W1A      N
  • Since W2 = W1A, the ? can be solved: the first letter of W2
    equals the first letter of W1, so the entry at location N is
    updated as W1A and the output is W2 = W1A
  • (a decoder sketch handling this case follows below)
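A matching decoder sketch (ours). The else branch implements exactly the solution above: a received index that is not yet in the dictionary must stand for W1 followed by the first letter of W1:

  def lzw_decode(codes, alphabet=("a", "b", "c")):
      dictionary = {i: s for i, s in enumerate(alphabet)}
      w = dictionary[codes[0]]
      output = [w]
      for c in codes[1:]:
          if c in dictionary:
              entry = dictionary[c]
          else:                              # index not defined yet: W2 = W1 + W1[0]
              entry = w + w[0]
          output.append(entry)
          dictionary[len(dictionary)] = w + entry[0]   # the first letter is now known
          w = entry
      return "".join(output)

  print(lzw_decode([0, 0, 1, 3, 2, 4, 7, 1, 2, 1]))   # "aabaacabcabcb"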

38
Shannon-Fano coding
Suppose that we have a source with M symbols.
Every symbol ui occurs with probability
P(ui). We try to encode symbol ui with
ni = ⌈-log2 P(ui)⌉ bits. Then the average
representation length is L = Σi P(ui)·ni < H(U) + 1.
39
code realization
Define Q(ui) = Σk<i P(uk), the cumulative probability
of the symbols preceding ui.
40
continued
Define: the codeword for ui is the binary
expansion of Q(ui), truncated to length ni.
Property: the code is a prefix code with the promised
lengths.
Proof: let i ≥ k + 1.
41
continued
  • Since Q(ui) - Q(uk) ≥ P(uk) ≥ 2^(-nk), the binary
    (radix-2) representations of Q(ui) and Q(uk) differ
    at least once within the first nk positions.
  • The codewords for Q(ui) and Q(uk) have lengths ni and nk.
  • Hence the truncated representation of Q(uk) can never
    be a prefix of the codeword for ui.

42
example
P(u0, u1, u2, u3, u4, u5, u6, u7) = (5/16, 3/16, 1/8, 1/8,
3/32, 1/16, 1/16, 1/32)
(a construction sketch follows below)
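A Python sketch (ours) of the construction of the previous slides applied to this example: ni = ceil(-log2 P(ui)), Q(ui) the sum of the preceding probabilities, and the codeword the first ni binary digits of Q(ui). Probabilities are assumed sorted in non-increasing order; exact fractions keep the expansions exact.

  import math
  from fractions import Fraction

  def shannon_fano_code(probs):
      codewords = []
      Q = Fraction(0)
      for p in probs:                      # probabilities sorted, non-increasing
          n = math.ceil(-math.log2(p))     # n_i = ceil(-log2 P(u_i))
          bits, q = "", Q
          for _ in range(n):               # first n_i digits of the binary expansion of Q(u_i)
              q *= 2
              bits += "1" if q >= 1 else "0"
              q -= int(q)
          codewords.append(bits)
          Q += p                           # Q(u_{i+1}) = Q(u_i) + P(u_i)
      return codewords

  P = [Fraction(5, 16), Fraction(3, 16), Fraction(1, 8), Fraction(1, 8),
       Fraction(3, 32), Fraction(1, 16), Fraction(1, 16), Fraction(1, 32)]
  print(shannon_fano_code(P))
  # ['00', '010', '100', '101', '1100', '1101', '1110', '11110']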
43
Enumerative coding
Suppose there are pn ones in a long sequence of length
n. According to Shannon we need about n·h(p) bits
to represent every such sequence. How do we realize
the encoding and decoding?
44
Enumerative coding
  • Solution: use a lexicographical ordering
  • Example: 2 ones in a sequence of length 6
  •        1 1 0 0 0 0
  •    9   0 1 1 0 0 0
  •    8   0 1 0 1 0 0
  •    7   0 1 0 0 1 0
  •    6   0 1 0 0 0 1
  •    5   0 0 1 1 0 0
  •    4   0 0 1 0 1 0
  •    3   0 0 1 0 0 1
  •    2   0 0 0 1 1 0
  •    1   0 0 0 1 0 1
  •    0   0 0 0 0 1 1

Encode: map a sequence to the number of sequences with
lower lexicographical order. Decode: reconstruct the
sequence from that index.
45
Enumerative encoding
Example: the index for the sequence 0 1 0 1 0 0
8   0 1 0 1 0 0
There are 2 sequences with prefix 0 1 0 0 (a length-2
remainder with 1 one): C(2,1) = 2
There are 6 sequences with prefix 0 0 (a length-4
remainder with 2 ones): C(4,2) = 6
Index = 6 + 2 = 8
46
Enumerative decoding
Given: a sequence of length 6 with 2 ones. What is
the sequence for index 8?
There are 10 sequences with prefix 0 (length-5 remainder, 2
ones); since 8 < 10, the sequence starts with 0.
There are 6 sequences with prefix 00 (length-4 remainder, 2
ones); since 8 ≥ 6, the sequence starts with 01. Running total 6.
There are 3 sequences with prefix 010 (length-3 remainder, 1
one); since 8 - 6 = 2 < 3, the sequence starts with 010 and not 011.
There are 2 sequences with prefix 0100 (length-2 remainder, 1
one); since 2 ≥ 2, the sequence starts with 0101.
Result: 010100 with 6 + 2 = 8. (A sketch of encoder and decoder
follows below.)
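A Python sketch (ours) of both directions, using binomial coefficients exactly as in the walk-throughs above (enum_encode / enum_decode are hypothetical names; math.comb needs Python 3.8+):

  from math import comb

  def enum_encode(seq):
      """Index = number of weight-k sequences that precede seq lexicographically."""
      n = len(seq)
      ones_left = sum(seq)
      index = 0
      for j, bit in enumerate(seq):
          if bit == 1:
              index += comb(n - j - 1, ones_left)   # sequences that agree so far but have a 0 here
              ones_left -= 1
      return index

  def enum_decode(index, n, k):
      """Reconstruct the length-n sequence with k ones for the given index."""
      seq, ones_left = [], k
      for j in range(n):
          c = comb(n - j - 1, ones_left)            # sequences that put a 0 in this position
          if index < c:
              seq.append(0)
          else:
              seq.append(1)
              index -= c
              ones_left -= 1
      return seq

  print(enum_encode([0, 1, 0, 1, 0, 0]))   # 8
  print(enum_decode(8, 6, 2))              # [0, 1, 0, 1, 0, 0]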
47
Enumerative encoding performance
The number of bits needed per n source outputs with pn
ones is ⌈log2 C(n, pn)⌉.
Asymptotically: efficiency → h(p) bits per source
output.
Note (added): for words of length n,
- first encode the number of ones in the block with
  log2(n+1) bits,
- then do the enumerative encoding with ≈ h(p) bits per
  source output.
The contribution (log2(n+1))/n disappears for large
n!
48
David A. Huffman
In 1951 David A. Huffman and his classmates in an
electrical engineering graduate course on
information theory were given the choice of a
term paper or a final exam. For the term paper,
Huffman's professor, Robert M. Fano, had assigned
what at first appeared to be a simple problem.
Students were asked to find the most efficient
method of representing numbers, letters or other
symbols using a binary code. Besides being a
nimble intellectual exercise, finding such a code
would enable information to be compressed for
transmission over a computer network or for
storage in a computer's memory. Huffman worked
on the problem for months, developing a number of
approaches, but none that he could prove to be
the most efficient. Finally, he despaired of ever
reaching a solution and decided to start studying
for the final. Just as he was throwing his notes
in the garbage, the solution came to him. "It
was the most singular moment of my life," Huffman
says. "There was the absolute lightning of
sudden realization."
49
The inventors
LZW (Lempel-Ziv-Welch) is an implementation of a
lossless data compression algorithm created by
Lempel and Ziv. It was published by Terry Welch
in 1984 as an improved version of the LZ78
dictionary coding algorithm developed by Abraham
Lempel and Jacob Ziv.
50
Intuitive Lempel Ziv (be careful !)
  • A source generates independent symbols 0 and 1
  • p(1) = 1 - p(0) = p
  • Then
  • there are roughly 2^(n·h(p)) typical sequences of length n,
  • every typical sequence has p(t) ≈ 2^(-n·h(p))
  • We expect that in a binary sequence of length N ≈
    2^(n·h(p)), every typical sequence occurs once
  • (with very high probability)

51
Intuitive Lempel Ziv (be careful !)
  • Idea for the Algorithm
  • Start with an initial sequence of length N
  • a. Generate a string of length n
  • (which is typical with high probability)
  • b. Transmit its starting position in the string
    of length N with log2 N bits
  • if not present, transmit the n bits as they
    occur
  • c. Delete the first n bits of the initial
    sequence and append the newly generated n bits.
    Go back to a, unless end of the source sequence

52
Intuitive Lempel Ziv (be careful !)
  • EFFICIENCY: the new n bits are typical with
    probability 1 - ε, where ε → 0
  • - if non-typical, transmit 0, followed by the n
    bits
  • - if typical, transmit 1, followed by log2 N bits
    for the position in the block
  • hence average bits/source output
  • ≈ (1-ε)·(log2 N)/n + 1/n ≈ h(p) bits/source
    output for large n and ε → 0!
  • NOTE
  • - if p changes, we can adapt N and n, or choose
    some worst case value in advance
  • - the typical words can also be stored in a
    memory. The algorithm then outputs the location
    of the new word. Each time, a new word is entered
    into the memory and one word is deleted.
  • Why is this not a good solution?