Title: Source Coding
1 Source Coding
- Data Compression
- June 2009
- A.J. Han Vinck
2 DATA COMPRESSION
- NO LOSS of information: exact reproduction
- (low compression ratio, e.g. 1:4)
- general problem statement:
- find a means for spending as little time as possible on packing as much data as possible into as little space as possible, with no loss of information
3 GENERAL IDEA
- represent likely symbols with short-length binary words, where "likely" is derived from
- - prediction of the next symbol in the source output:
- q-ue  q-ua  q-ui  q-uo
- q → ? (only a few continuations are likely, so two bits suffice)
- q-00  q-01  q-10  q-11
- - context between the source symbols, words, sounds; context in pictures
4 Why compress?
- - Lossless compression often reduces file size by 40 to 80%.
- - More economical to transport and store
- - Most Internet content is compressed for transmission
- - Compression before encryption can make code-breaking difficult
- - Conserve battery power and storage space on mobile devices
- - Compression and decompression can be hardwired
5 Some history
- 1948 Shannon-Fano coding
- 1952 Huffman coding
- reduced redundancy in symbol coding
- demonstrably optimal fixed-to-variable length symbol coding
- 1977 Lempel-Ziv coding
- first major dictionary method
- maps repeated word patterns to code words
6 MODEL KNOWLEDGE
- best performance → exact prediction!
- exact prediction → no new information!
- no new information → no message to transmit
7 Example: no prediction
- source → coder C
- message:  0   1   2   3   4   5   6   7
- code:    000 001 010 011 100 101 110 111
- representation length: 3 bits/message
8 Example: with prediction
- ENCODE THE DIFFERENCE
- difference:   -1    0   +1
- probability: 0.25  0.5  0.25
- code:         00    1    01
- (figure: source → predictor P → coder C → code)
- average length $L = 0.25 \cdot 2 + 0.5 \cdot 1 + 0.25 \cdot 2 = 1.5$ bit/difference symbol
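A minimal Python sketch of this difference coder (the sample sequence, and the assumption that successive samples differ by at most 1, are ours for illustration):

```python
# Encode differences between consecutive samples with the slide's code
# {-1: '00', 0: '1', +1: '01'}; the likelier difference gets the shorter word.
DIFF_CODE = {-1: '00', 0: '1', 1: '01'}

def encode_differences(samples):
    """Concatenate codewords for the differences s[k] - s[k-1]."""
    return ''.join(DIFF_CODE[cur - prev]
                   for prev, cur in zip(samples, samples[1:]))

# 5 differences (0, +1, 0, -1, 0) cost 7 bits here, i.e. 1.4 bit/difference.
print(encode_differences([3, 3, 4, 4, 3, 3]))  # -> '1011001'
```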
9 binary tree codes
the relation between source symbols and codewords:
A → 11, B → 10, C → 0
(figure: code tree with branches labeled 1 and 0)
General Properties
- every node has two successors: leaves and/or nodes
- the path to a leaf gives the connected codeword
- source letters are only assigned to leaves,
i.e. no codeword is the prefix of another codeword
10 tree codes
Tree codes are prefix codes and uniquely decodable, i.e. a string of codewords can be uniquely decomposed into the individual codewords. Non-prefix codes may also be uniquely decodable; example: A → 1, B → 10, C → 100 (every codeword starts with a 1, so the 1s mark the codeword boundaries).
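A small Python sketch of decoding with the prefix code of the previous slide (function and variable names are ours):

```python
# Greedy left-to-right decoding with the prefix code A -> 11, B -> 10, C -> 0.
# Because no codeword is a prefix of another, the first match is always right.
DECODE = {'11': 'A', '10': 'B', '0': 'C'}

def decode(bits):
    out, buf = [], ''
    for b in bits:
        buf += b
        if buf in DECODE:      # a complete codeword has been read
            out.append(DECODE[buf])
            buf = ''
    assert buf == '', 'bit string ended in the middle of a codeword'
    return ''.join(out)

print(decode('11100'))  # -> 'ABC'
```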
11 binary tree codes
The average codeword length is $L = \sum_i P(i) \, n_i$.
Property: an optimal code has minimum L.
Homework: show that $L = \sum (\text{node probabilities})$, summed over the intermediate nodes of the code tree.
12 Tree encoding (1)
- for data / text the compression should be lossless → no errors
- STEP 1: assign messages to nodes of a code tree (figure); this gives the lengths below
- symbol  P(i)    length n_i   n_i · P(i)
- a       0.5        3           1.5
- b       0.25       3           0.75
- c       0.125      2           0.25
- d       0.0625     2           0.125
- e       0.0625     2           0.125
- AVERAGE CODEWORD LENGTH: 2.75 bit/source symbol
13 Tree encoding (2)
- STEP 2: OPTIMIZE THE ASSIGNMENT (MINIMIZE the average length)
- symbol  P(i)    codeword   n_i · P(i)
- e       0.0625    1111        0.25
- d       0.0625    1110        0.25
- c       0.125     110         0.375
- b       0.25      10          0.5
- a       0.5       0           0.5
- AVERAGE CODEWORD LENGTH: 1.875 bit/source symbol!
14 Kraft inequality
- Prefix codes with M code words satisfy the Kraft inequality
- $\sum_{k=1}^{M} 2^{-n_k} \le 1$,
- where $n_k$ is the code word length for message k.
- Proof: let $n_M$ be the longest codeword length;
- then, in a code tree of depth $n_M$, the terminal node for message k eliminates $2^{n_M - n_k}$
- from the total number $2^{n_M}$ of available nodes at depth $n_M$.
15 example
(figure: code tree of depth 4 — a terminal node at depth 3 eliminates 2 nodes, at depth 2 eliminates 4, at depth 1 eliminates 8)
Homework: can we replace "≤" by "=" in the Kraft inequality?
16 Kraft inequality
- Suppose that the length specification of M code words satisfies the Kraft inequality:
- $\sum_{i=1}^{n} N_i \, 2^{-i} \le 1$,
- where $N_i$ is the number of code words of length i, and n is the largest length.
- Then, we can construct a prefix code with the specified lengths.
- Note that multiplying the inequality by $2^j$ gives, for every level $j \le n$, $\sum_{i=1}^{j} N_i \, 2^{j-i} \le 2^j$.
17 Kraft inequality
- From this, $N_j \le 2^j - N_1 2^{j-1} - N_2 2^{j-2} - \dots - N_{j-1} \cdot 2$.
- Interpretation: at every level fewer nodes are used than are available!
- E.g. for level 3, we have $2^3 = 8$ nodes minus the nodes cancelled by levels 1 and 2.
18 performance
- Suppose that we select the code word lengths as $n_k = \lceil -\log_2 P(k) \rceil$.
- Then, a prefix code exists, since
- $\sum_k 2^{-n_k} \le \sum_k 2^{\log_2 P(k)} = \sum_k P(k) = 1$,
- with average length
- $L = \sum_k P(k) \, n_k < \sum_k P(k)\,(-\log_2 P(k) + 1) = H(U) + 1$.
19 Lower bound for prefix codes
- We show that $L \ge H(U)$.
- We write $H(U) - L = \sum_k P(k) \log_2 \frac{2^{-n_k}}{P(k)} \le \log_2 \sum_k 2^{-n_k} \le 0$,
- where the first step uses Jensen's inequality and the second the Kraft inequality.
- Equality can be established for $P(k) = 2^{-n_k}$.
20 Huffman coding (1)
The average codeword length is $L = \sum_i P(i) \, n_i$.
Property: an optimal code has minimum L.
Property: for an optimal code the two least probable codewords
- have the same length,
- are the longest (by manipulating the assignment),
- differ only in the last code digit.
Homework: proof.
21 Huffman Coding optimality (2)
Given a code C with average length L and M symbols, construct a code C′
(for C′ the codewords for the two least probable symbols differ only in the last digit):
1. replace the 2 least probable symbols $c_M$ and $c_{M-1}$ in C by a single symbol $c'_{M-1}$ with probability $P'(M-1) = P(M) + P(M-1)$;
2. to minimize L, we then have to minimize the average length L′ of C′.
22 Huffman Coding (JPEG, MPEG, MP3)
- 1. take together the two smallest probabilities P(i) + P(j)
- 2. replace symbols i and j by a new symbol with this probability
- 3. go to 1, until one symbol of probability 1.00 is left
- Example (merges: 0.1 + 0.1 = 0.2; 0.2 + 0.25 = 0.45; 0.3 + 0.25 = 0.55; 0.45 + 0.55 = 1.00):
- probability  code
- 0.3          11
- 0.25         10
- 0.25         01
- 0.1          001
- 0.1          000
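A compact Python sketch of this repeated-merge procedure (heap-based; symbol names and tie-breaking are our choices), run on the example probabilities:

```python
# Huffman: repeatedly merge the two smallest probabilities (steps 1-2 above),
# then read the codewords off the resulting tree (one bit per branch).
import heapq
from itertools import count

def huffman(probs):
    order = count()                  # tie-breaker so the heap never compares trees
    heap = [(p, next(order), sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)    # the two smallest probabilities
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(order), (left, right)))
    code = {}
    def walk(node, prefix):          # leaves get the accumulated path as codeword
        if isinstance(node, tuple):
            walk(node[0], prefix + '0')
            walk(node[1], prefix + '1')
        else:
            code[node] = prefix or '0'
    walk(heap[0][2], '')
    return code

print(huffman({'u1': 0.3, 'u2': 0.25, 'u3': 0.25, 'u4': 0.1, 'u5': 0.1}))
# one optimal assignment: lengths 2, 2, 2, 3, 3 -> average 2.2 bit/symbol
```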
23 Properties
- ADVANTAGES
- uniquely decodable code
- smallest average codeword length
- DISADVANTAGES
- LARGE tables give complexity
- variable word length
- sensitive to channel errors
24 Conclusion Huffman
- Tree coding (Huffman) is not universal!
- it is only valid for one particular type of source!
- For COMPUTER DATA, data reduction should be
- lossless → no errors at reproduction
- universal → effective for different types of data
25 Some comments
- The Huffman code is not unique, but the efficiency is the same!
- For code alphabets larger than binary a small modification is necessary (where?)
26 Performance Huffman
- Using the probability distribution for the source U, a prefix code exists with average length
- $L < H(U) + 1$.
- Since Huffman is optimum, this bound is also true for Huffman codes.
- Problem if $H(U) \to 0$!
- Improvements can be made when we take J symbols together; then
- $J \cdot H(U) \le L_J < J \cdot H(U) + 1$
- and
- $H(U) \le L = L_J / J < H(U) + 1/J$.
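A worked instance of why blocking helps (the 0.1-bit entropy is an assumed value for illustration, not from the slides): a binary symbol coded on its own always costs at least 1 bit, however small H(U) is, while blocks of J = 10 symbols already guarantee

```latex
% assumed: a binary source with H(U) = 0.1 bit
H(U) \;\le\; \frac{L_J}{J} \;<\; H(U) + \frac{1}{J} = 0.1 + \frac{1}{10} = 0.2 \ \text{bit/symbol},
\quad \text{versus } L \ge 1 \text{ bit/symbol without blocking.}
```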
27 Example
28 Example
- s1: Pr(s1) = 0.1
- s2: Pr(s2) = 0.25
- s3: Pr(s3) = 0.2
- s4: Pr(s4) = 0.45
- (figure: Huffman tree with intermediate node probabilities 0.3, 0.55 and root 1)
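Reading the merge order off the tree (0.1 + 0.2 = 0.3, 0.3 + 0.25 = 0.55, 0.55 + 0.45 = 1) gives codeword lengths 3, 2, 3, 1 for s1…s4, so the bound of slide 26 can be checked directly:

```latex
L = 0.45 \cdot 1 + 0.25 \cdot 2 + 0.1 \cdot 3 + 0.2 \cdot 3 = 1.85 \ \text{bit/symbol},
\qquad
H(U) = -\sum_i \Pr(s_i)\,\log_2 \Pr(s_i) \approx 1.815,
```

so indeed $H(U) \le L < H(U) + 1$.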
29 Encoding idea: Lempel-Ziv-Welch (LZW)
Assume we have just read a segment w from the text, and a is the next symbol.
- If wa is not in the dictionary:
- Write the index of w in the output file.
- Add wa to the dictionary, and set w ← a.
- If wa is in the dictionary:
- Process the next symbol with segment wa.
30 Encoding example
- initial dictionary: address 0 = a, address 1 = b, address 2 = c
- String: a a b a a c a b c a b c b
- segment read                                        output   update
- a a          aa not in dictionary: output loc(a)       0     aa → 3
- a a b        ab not in dictionary: output loc(a)       0     ab → 4
- a a b a      ba not in dictionary: output loc(b)       1     ba → 5
- a a b a a c  aa in dictionary, aac not: output loc(aa) 3     aac → 6
- a a b a a c a          ca not: output loc(c)           2     ca → 7
- a a b a a c a b c      ab in dictionary, abc not       4     abc → 8
- a a b a a c a b c a b  ca in dictionary, cab not       7     cab → 9
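A minimal LZW encoder in Python following the rule of slide 29 (names are ours); on the example string it reproduces the outputs above, plus the final codes for the tail:

```python
# LZW encoding: grow segment w while w+a is in the dictionary; otherwise
# emit the index of w, add w+a at the next address, and restart from a.
def lzw_encode(text, alphabet=('a', 'b', 'c')):
    dictionary = {s: i for i, s in enumerate(alphabet)}
    w, out = '', []
    for a in text:
        if w + a in dictionary:
            w += a                                  # keep extending the segment
        else:
            out.append(dictionary[w])               # write the index of w
            dictionary[w + a] = len(dictionary)     # add wa at the next address
            w = a
    out.append(dictionary[w])                       # flush the last segment
    return out

print(lzw_encode('aabaacabcabcb'))  # -> [0, 0, 1, 3, 2, 4, 7, 1, 2]
```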
31 UNIVERSAL (LZW) decoder
- 1. Start with the basic symbol set.
- 2. Read a code c from the compressed file.
- - The address c in the dictionary determines the segment w.
- - Write w in the output file.
- 3. Add wa to the dictionary, where a is the first letter of the next segment.
32 Decoding example
- initial dictionary: address 0 = a, address 1 = b, address 2 = c
- decoded so far     input   update (completed by the first letter of the next segment)
- a                    0     ?
- a a                  0     aa → 3
- a a b                1     ab → 4
- a a b aa             3     ba → 5
- a a b aa c           2     aac → 6
- a a b aa c ab        4     ca → 7
- a a b aa c ab ca     7     abc → 8
33 Conclusion (LZW)
- IDEA: TRY to copy long parts of the source output
- if the dictionary overflows:
- throw the least-recently used entry away, in both encoder and decoder
- universal
- lossless
Homework: encode/decode the sequence 1001010110011... Try to solve the problem that occurs!
34 Some history
- used in GIF, TIFF, the V.42bis modem compression standard, PostScript Level 2
- 1977 LZ algorithm published by Abraham Lempel and Jacob Ziv
- 1984 LZ-Welch algorithm published in IEEE Computer
- Sperry patent transferred to Unisys (1986)
- GIF file format required use of the LZW algorithm
35 references
J. Ziv and A. Lempel, "A Universal Algorithm for Sequential Data Compression," IEEE Transactions on Information Theory, May 1977.
T. Welch, "A Technique for High-Performance Data Compression," IEEE Computer, June 1984.
36 Summary of operations
- ENCODING   output     update   location
- W1 A       loc(W1)    W1A      N
- W2 F       loc(W2)    W2F      N+1
- W3 X       loc(W3)    W3X      N+2
- DECODING   input      update   location
-            loc(W1)    W1 ?
-            loc(W2)    W2 ?     W1A → N
-            loc(W3)    W3 ?     W2F → N+1
- (? = the still-unknown next letter; each entry is completed by the first letter of the following segment)
37 Problem and solution
- ENCODING        output    update   location
- W1 A            loc(W1)   W1A      N
- W2 = W1A, F     loc(W2)   W2F      N+1
- DECODING        input             update   location
-                 loc(W1)           W1 ?
-                 loc(W2 = W1A)              W1A → N
- Here the decoder receives loc(W2) = N before entry N is completed. Since W2 = W1A, the ? can be solved: W2 is updated at location N as W1A, so the unknown letter A is the first letter of W1.
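A matching LZW decoder sketch in Python, including the fix for exactly this case (an index that points at the entry still being built must decode to W1 plus the first letter of W1; names are ours):

```python
# LZW decoding (slides 31-32). The special case below is the W2 = W1A
# situation of this slide: the received index is not in the dictionary yet.
def lzw_decode(codes, alphabet=('a', 'b', 'c')):
    dictionary = {i: s for i, s in enumerate(alphabet)}
    w = dictionary[codes[0]]
    out = [w]
    for c in codes[1:]:
        if c in dictionary:
            entry = dictionary[c]
        else:                       # index just created by the encoder:
            entry = w + w[0]        # W2 = W1 A  implies  A = first letter of W1
        out.append(entry)
        dictionary[len(dictionary)] = w + entry[0]   # complete the open entry
        w = entry
    return ''.join(out)

print(lzw_decode([0, 0, 1, 3, 2, 4, 7, 1, 2]))  # -> 'aabaacabcabcb'
```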
38 Shannon-Fano coding
Suppose that we have a source with M symbols. Every symbol $u_i$ occurs with probability $P(u_i)$. We try to encode symbol $u_i$ with $n_i = \lceil -\log_2 P(u_i) \rceil$ bits. Then the average representation length is $L = \sum_i P(u_i) \, n_i < H(U) + 1$.
39 code realization
Define $Q(u_i) = \sum_{k < i} P(u_k)$, the cumulative probability of the symbols preceding $u_i$ (so $Q(u_0) = 0$), with the symbols ordered by non-increasing probability.
40 continued
The codeword for $u_i$ is the binary expansion of $Q(u_i)$ of length $n_i$.
Property: the code is a prefix code with the promised length.
Proof: let $i \ge k+1$; then $Q(u_i) - Q(u_k) \ge P(u_k) \ge 2^{-n_k}$.
41 continued
- The binary radix-2 representations of $Q(u_i)$ and $Q(u_k)$ therefore differ at least once within the first $n_k$ positions.
- The codewords for $u_i$ and $u_k$ have lengths $n_i \ge n_k$ (the probabilities are non-increasing).
- Hence the truncated representation of $Q(u_k)$ can never be a prefix of the codeword for $u_i$.
42 example
P(u0, u1, u2, u3, u4, u5, u6, u7) = (5/16, 3/16, 1/8, 1/8, 3/32, 1/16, 1/16, 1/32)
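A Python sketch of the construction of slides 38-40 on this example (exact arithmetic with Fraction; names are ours):

```python
# Shannon-Fano code: n_i = ceil(-log2 P(u_i)); the codeword is the first
# n_i bits of the binary expansion of Q(u_i) = P(u_0) + ... + P(u_{i-1}).
from fractions import Fraction
import math

def shannon_code(probs):
    code, Q = [], Fraction(0)
    for p in probs:
        n = math.ceil(-math.log2(p))    # the promised length
        bits, q = '', Q
        for _ in range(n):              # n bits of the binary expansion of Q
            q *= 2
            bits += '1' if q >= 1 else '0'
            q -= int(q)
        code.append(bits)
        Q += p                          # cumulative probability
    return code

probs = [Fraction(5, 16), Fraction(3, 16), Fraction(1, 8), Fraction(1, 8),
         Fraction(3, 32), Fraction(1, 16), Fraction(1, 16), Fraction(1, 32)]
print(shannon_code(probs))
# -> ['00', '010', '100', '101', '1100', '1101', '1110', '11111']
```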
43 Enumerative coding
Suppose there are pn ones in a long sequence of length n. According to Shannon we need about $n \, h(p)$ bits to represent every such sequence, where h(p) is the binary entropy function. How do we realize the encoding and decoding?
44 Enumerative coding
- Solution: use lexicographical ordering.
- Example: 2 ones in a sequence of length 6
- index   sequence
- 14      1 1 0 0 0 0
- ...
- 9       0 1 1 0 0 0
- 8       0 1 0 1 0 0
- 7       0 1 0 0 1 0
- 6       0 1 0 0 0 1
- 5       0 0 1 1 0 0
- 4       0 0 1 0 1 0
- 3       0 0 1 0 0 1
- 2       0 0 0 1 1 0
- 1       0 0 0 1 0 1
- 0       0 0 0 0 1 1
Encode: map a sequence to the number of sequences with lower lexicographical order. Decode: reconstruct the sequence from this index.
45 Enumerative encoding
Example: the index for sequence 0 1 0 1 0 0 is 8:
- there are 6 sequences with prefix 0 0 (length-4 remainder with 2 ones): all lexicographically smaller;
- there are 2 sequences with prefix 0 1 0 0 (length-2 remainder with 1 one): also smaller;
- index = 6 + 2 = 8.
46 Enumerative decoding
Given a sequence of length 6 with 2 ones: what is the sequence for index 8?
- There are 10 sequences with prefix 0 (length 5, 2 ones); since 8 < 10, the sequence starts with 0.
- There are 6 sequences with prefix 00 (length 4, 2 ones); since 8 ≥ 6, the sequence starts with 01. (01 → 6)
- There are 3 sequences with prefix 010 (length 3, 1 one); starting with 011 would give index ≥ 6 + 3 = 9, hence the sequence starts with 010 and not 011. (010 → 6)
- There are 2 sequences with prefix 0100 (length 2, 1 one); since 8 ≥ 6 + 2, the sequence starts with 0101, and the rest is zeros: 010100 → 8.
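A Python sketch of both directions (math.comb counts the sequences exactly as in the two walk-throughs above; names are ours):

```python
# Enumerative coding for fixed-weight binary words: a 1 at position j makes
# all words that put a 0 there (with the same prefix) lexicographically
# smaller, and there are comb(remaining positions, remaining ones) of them.
from math import comb

def enum_encode(word):
    ones, index = word.count(1), 0
    for j, bit in enumerate(word):
        if bit == 1:
            index += comb(len(word) - 1 - j, ones)  # words with a 0 here
            ones -= 1
    return index

def enum_decode(index, n, ones):
    word = []
    for j in range(n):
        c = comb(n - 1 - j, ones)       # words that put a 0 at position j
        if index < c:
            word.append(0)
        else:
            word.append(1)
            index -= c
            ones -= 1
    return word

print(enum_encode([0, 1, 0, 1, 0, 0]))  # -> 8, as on slide 45
print(enum_decode(8, 6, 2))             # -> [0, 1, 0, 1, 0, 0]
```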
47 Enumerative encoding performance
The number of bits per n source outputs for pn ones is $\lceil \log_2 \binom{n}{pn} \rceil \approx n \, h(p)$.
Asymptotically the efficiency → h(p) bits per source output.
Note added: for words of length n,
- encode first the number of ones in the block with $\log_2(n+1)$ bits,
- then do the enumerative encoding with h(p) bits per source output.
The contribution $(\log_2(n+1))/n$ disappears for large n!
48 David A. Huffman
In 1951 David A. Huffman and his classmates in an
electrical engineering graduate course on
information theory were given the choice of a
term paper or a final exam. For the term paper,
Huffman's professor, Robert M. Fano, had assigned
what at first appeared to be a simple problem.
Students were asked to find the most efficient
method of representing numbers, letters or other
symbols using a binary code. Besides being a
nimble intellectual exercise, finding such a code
would enable information to be compressed for
transmission over a computer network or for
storage in a computer's memory. Huffman worked
on the problem for months, developing a number of
approaches, but none that he could prove to be
the most efficient. Finally, he despaired of ever
reaching a solution and decided to start studying
for the final. Just as he was throwing his notes
in the garbage, the solution came to him. "It
was the most singular moment of my life," Huffman
says. "There was the absolute lightning of
sudden realization."
49 The inventors
LZW (Lempel-Ziv-Welch) is an implementation of a
lossless data compression algorithm created by
Lempel and Ziv. It was published by Terry Welch
in 1984 as an improved version of the LZ78
dictionary coding algorithm developed by Abraham
Lempel and Jacob Ziv.
50 Intuitive Lempel-Ziv (be careful!)
- A source generates independent symbols 0 and 1 with
- $p(1) = 1 - p(0) = p$.
- Then:
- there are roughly $2^{nh(p)}$ typical sequences of length n,
- every typical sequence has probability $p(t) \approx 2^{-nh(p)}$.
- We expect that in a binary sequence of length $N \approx 2^{nh(p)}$, every typical sequence occurs once
- (with very high probability).
51 Intuitive Lempel-Ziv (be careful!)
- Idea for the algorithm:
- a. Start with an initial sequence of length N and generate a string of length n
- (which is typical with high probability).
- b. Transmit its starting position in the string of length N with $\log_2 N$ bits;
- if it is not present, transmit the n bits as they occur.
- c. Delete the first n bits of the initial sequence and append the newly generated n bits.
- Go back to a, unless the end of the source sequence is reached.
52 Intuitive Lempel-Ziv (be careful!)
- EFFICIENCY: the new n bits are typical with probability $1 - \varepsilon$, where $\varepsilon \to 0$
- - if non-typical, transmit 0, followed by the n bits
- - if typical, transmit 1, followed by $\log_2 N$ bits for the position in the block
- hence, the average number of bits per source output is
- $\approx \varepsilon + (1 - \varepsilon)\frac{\log_2 N}{n} + \frac{1}{n} \to h(p)$ for large n and $\varepsilon \to 0$!
- NOTE:
- - if p changes, we can adapt N and n, or choose some worst-case value in advance
- - the typical words can also be stored in a memory; the algorithm then outputs the location of the new word. Every time a new word is entered into the memory, one word is deleted.
- Why is this not a good solution?