Title: Modeling and Coding
1. Modeling and Coding
2. Announcement
3. Review
[Block diagram: Original data → Compression → Compressed data (fewer bits) → Reconstruction → Reconstructed data]
- Codec = Encoder + Decoder
- Lossless and lossy compression
4. Two Phases: Modeling and Coding
[Block diagram: Original data → Encoder → Compressed data (fewer bits)]
- Modeling
- Discover the structure in the data
- Extract information about any redundancy
- Coding
- Describe the model and the residual (how the data
differ from the model)
5. Example (1)
- 5 bits × 12 samples = 60 bits
- Representation using fewer bits?
6. Example: Modeling
7. Example: Coding
- Residuals: -1, 0, 1
- 2 bits × 12 samples = 24 bits (compared with 60 bits before compression)
We use the model to predict the value, then
encode the residual!
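A minimal sketch of this predict-then-code idea, using made-up sample values (the actual data on the slide are not reproduced here); the previous sample serves as the prediction:

```python
# Hypothetical 12 samples, each needing 5 bits when stored raw (values in 0..31).
samples = [9, 10, 11, 11, 12, 11, 12, 13, 14, 13, 14, 15]

# Modeling: predict each sample by the previous one (x_hat[n] = x[n-1]).
# Coding: send the residuals r[n] = x[n] - x_hat[n], which stay in {-1, 0, 1}.
residuals = [samples[n] - samples[n - 1] for n in range(1, len(samples))]
print(residuals)

raw_bits = 5 * len(samples)      # 60 bits without compression
coded_bits = 2 * len(samples)    # 2 bits per residual suffice for {-1, 0, 1}
print(raw_bits, coded_bits)      # 60 vs. 24
```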
8. Example (2)
9. Example (3)
- 106 bits
- 3 × 41 = 123 bits
10. Example (4)
Shorter codes are assigned to letters that occur
more frequently!
11. A Brief Introduction to Information Theory
12. Information Theory (1)
- A quantitative measure of information
- "You will win the lottery tomorrow."
- "The sun will rise in the east tomorrow."
- Self-information (Shannon, 1948): i(A) = -log2 P(A)
- P(A): the probability that the event A will happen
The amount of surprise or uncertainty in the message
13. Information Theory (2)
- For two independent events A, B: i(AB) = i(A) + i(B)
- Example: flipping a coin
- If the coin is fair
- P(H) = P(T) = ½
- i(H) = i(T) = -log2(½) = 1 bit
- If the coin is not fair
- P(H) = 1/8, P(T) = 7/8
- i(H) = 3 bits, i(T) ≈ 0.193 bits
- The occurrence of a HEAD conveys more information!
14. Information Theory (3)
- For a set of independent events Ai
- Entropy (the average self-information): H = Σ P(Ai) i(Ai) = -Σ P(Ai) log2 P(Ai)
- The coin example
- Fair coin (1/2, 1/2): H = P(H)i(H) + P(T)i(T) = 1 bit
- Unfair coin (1/8, 7/8): H ≈ 0.544 bits
- Bounds of H?
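A quick numerical check of the coin examples above (probabilities taken from the slides):

```python
import math

def self_info(p):
    """Self-information i(A) = -log2 P(A), in bits."""
    return -math.log2(p)

def entropy(probs):
    """Average self-information H = -sum p * log2 p, in bits."""
    return sum(p * self_info(p) for p in probs)

print(self_info(1/2))        # 1.0   -> fair coin: i(H) = i(T) = 1 bit
print(self_info(1/8))        # 3.0   -> unfair coin: i(H) = 3 bits
print(self_info(7/8))        # 0.193 -> unfair coin: i(T)
print(entropy([1/2, 1/2]))   # 1.0   bit
print(entropy([1/8, 7/8]))   # 0.544 bits
```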
15. Information Theory (4)
- A general source S
- Alphabet A = {1, 2, …, m}
- Output sequence X1, X2, …, Xn
- Entropy: H(S) = lim (n→∞) (1/n) H(X1, X2, …, Xn)
- Suppose X1, X2, …, Xn are independent and identically distributed (i.i.d.)
- Then H(S) reduces to the first-order entropy: H(S) = -Σ P(X1) log2 P(X1)
16. Information Theory (5)
- Entropy (cont.)
- A lower bound on the average number of bits per symbol that any lossless compression scheme can achieve
- Not possible to know exactly for a physical source
- Estimate!
- The estimate depends on our assumptions about the structure of the data
17. Estimation of Entropy (1)
- 1 2 3 2 3 4 5 4 5 6 7 8 9 8 9 10
- Assume the sequence is i.i.d.
- P(1) = P(6) = P(7) = P(10) = 1/16
- P(2) = P(3) = P(4) = P(5) = P(8) = P(9) = 2/16
- H = 3.25 bits
- Assume sample-to-sample correlation exists
- Model: xn = xn-1 + rn (residual sequence rn)
- 1 1 1 -1 1 1 1 -1 1 1 1 1 1 -1 1 1
- P(1) = 13/16, P(-1) = 3/16
- H ≈ 0.7 bits
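The same estimates, computed by counting relative frequencies (a sketch; the i.i.d. estimate is the first-order entropy):

```python
import math
from collections import Counter

def first_order_entropy(seq):
    """Entropy estimate assuming i.i.d. symbols, from relative frequencies."""
    counts, n = Counter(seq), len(seq)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

x = [1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10]

# Treating the samples as i.i.d.
print(first_order_entropy(x))                          # 3.25 bits

# Using the model x[n] = x[n-1] + r[n] and looking at the residuals instead.
residuals = [x[0]] + [x[n] - x[n - 1] for n in range(1, len(x))]
print(residuals)                                       # the 1 / -1 sequence above
print(first_order_entropy(residuals))                  # about 0.70 bits
```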
18. Estimation of Entropy (2)
- 1 2 1 2 3 3 3 3 1 2 3 3 3 3 1 2 3 3 1 2
- One symbol at a time
- P(1) = P(2) = ¼, P(3) = ½
- H = 1.5 bits/symbol
- 30 (= 1.5 × 20) bits are required to represent the sequence
- In blocks of two
- P(1 2) = ½, P(3 3) = ½
- H = 1 bit/block
- 10 (= 1 × 10) bits are required
In theory we can always capture the structure of the data by taking larger and larger block sizes, but this quickly becomes impractical.
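The same counting estimate, once per symbol and once per two-symbol block:

```python
import math
from collections import Counter

def entropy_of(items):
    """Entropy estimate from the relative frequencies of `items`."""
    counts, n = Counter(items), len(items)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

seq = [1, 2, 1, 2, 3, 3, 3, 3, 1, 2, 3, 3, 3, 3, 1, 2, 3, 3, 1, 2]

h_symbol = entropy_of(seq)                  # 1.5 bits/symbol
print(h_symbol, h_symbol * len(seq))        # -> 30 bits for 20 symbols

blocks = [tuple(seq[i:i + 2]) for i in range(0, len(seq), 2)]
h_block = entropy_of(blocks)                # 1 bit/block
print(h_block, h_block * len(blocks))       # -> 10 bits for 10 blocks
```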
19. Models
20. Models
- Physical models
- The physics of the data generation process
- Too complicated
- Probability models
- For A = {a1, a2, …, am}, we have P = {P(a1), P(a2), …, P(am)}
- The independence assumption
- Markov models
- Represent dependence in the data
21. Markov Models (1)
- k-th order model
- The probability of the next symbol depends on its preceding k symbols.
- First-order model
- Example: a binary image
- Two states: Sb (black pixel), Sw (white pixel)
- State probabilities: P(Sb), P(Sw)
- Transition probabilities: P(w|b), P(b|w), P(w|w), P(b|b)
22.
- Model with the i.i.d. assumption
- First-order Markov model
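A small sketch of why the Markov model helps for such data, using made-up probabilities for a mostly-white binary image (these numbers are illustrative, not from the slides):

```python
import math

# Hypothetical probabilities: long runs of white pixels, occasional black ones.
p_w, p_b = 0.9, 0.1          # state probabilities P(Sw), P(Sb)
p_b_given_w = 0.01           # P(b|w): switch from white to black
p_w_given_b = 0.09           # P(w|b): switch from black to white

def h(probs):
    return -sum(p * math.log2(p) for p in probs)

# Entropy under the i.i.d. assumption: ignore pixel-to-pixel dependence.
h_iid = h([p_w, p_b])

# Entropy of the first-order Markov model:
# H = P(Sw) * H(next | Sw) + P(Sb) * H(next | Sb)
h_markov = (p_w * h([1 - p_b_given_w, p_b_given_w])
            + p_b * h([1 - p_w_given_b, p_w_given_b]))

print(h_iid, h_markov)       # the Markov model gives a much lower entropy
```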
23. Coding
24. Coding (1)
- The assignment of binary sequences to elements of an alphabet
- Rate of the code: the average number of bits per symbol
- Fixed-length codes and variable-length codes
25. Coding (2)
26. Coding (3)
- Example of a code that is not uniquely decodable

  Letter   Codeword
  a1       0
  a2       1
  a3       00
  a4       11

- The string 100 decodes as a2 a3 and also as a2 a1 a1 (ambiguous!)
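A brute-force decoder for the code above makes the ambiguity explicit:

```python
def all_decodings(bits, code):
    """Return every way to parse `bits` as a concatenation of codewords."""
    if not bits:
        return [[]]
    results = []
    for letter, word in code.items():
        if bits.startswith(word):
            for rest in all_decodings(bits[len(word):], code):
                results.append([letter] + rest)
    return results

code = {"a1": "0", "a2": "1", "a3": "00", "a4": "11"}
print(all_decodings("100", code))   # [['a2', 'a1', 'a1'], ['a2', 'a3']] -> ambiguous
```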
27. Coding (4)
- A code that is not instantaneous, but is uniquely decodable
- The decoder's first guess may turn out to be wrong (oops!); the correct parse a2 a3 a3 a3 a3 a3 a3 a3 a3 is found only after reading further into the string
28. A Test for Unique Decodability
- Dangling suffix
- a = 010, b = 01011: a is a prefix of b, leaving the dangling suffix 11
- If a dangling suffix is a codeword ⇒ not uniquely decodable
- Code 1: {0, 01, 11} is uniquely decodable!
- Code 2: {0, 01, 10} is not uniquely decodable! e.g. 010 parses as (01)(0) and as (0)(10)
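A sketch of the dangling-suffix (Sardinas-Patterson) test, applied to the two codes above:

```python
def is_uniquely_decodable(codewords):
    """Sardinas-Patterson test: a code is uniquely decodable iff no
    dangling suffix generated from the codewords is itself a codeword."""
    code = set(codewords)

    def dangling(a, b):
        # suffix left over when a is a proper prefix of b, else None
        if b.startswith(a) and len(a) < len(b):
            return b[len(a):]
        return None

    # round 1: dangling suffixes between pairs of distinct codewords
    current = set()
    for a in code:
        for b in code:
            s = dangling(a, b)
            if s is not None:
                current.add(s)

    seen = set()
    while current:
        if current & code:            # a dangling suffix is a codeword
            return False
        seen |= current
        nxt = set()
        for s in current:             # next round: suffixes vs. codewords
            for c in code:
                for x in (dangling(s, c), dangling(c, s)):
                    if x is not None:
                        nxt.add(x)
        current = nxt - seen
    return True

print(is_uniquely_decodable(["0", "01", "11"]))   # True  (Code 1)
print(is_uniquely_decodable(["0", "01", "10"]))   # False (Code 2)
```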
29. Exercise

  Letter   Codeword
  a1       0
  a2       001
  a3       010
  a4       100

- Is this code uniquely decodable? The string 0 0 1 0 0 can be parsed as a2 a1 a1, as a1 a3 a1, or as a1 a1 a4.
30. Prefix Codes
- No codeword is a prefix of another codeword
- Uniquely decodable
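Checking the prefix property is a simple pairwise test; a small sketch:

```python
def is_prefix_code(codewords):
    """True if no codeword is a prefix of another (distinct) codeword."""
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False
    return True

print(is_prefix_code(["0", "10", "110", "111"]))   # True  -> prefix code
print(is_prefix_code(["0", "01", "11"]))           # False -> 0 is a prefix of 01
```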
31. Coding (cont.)
- For compression
- Uniquely decodable
- Short codewords
- Instantaneous (easier to decode)
- What sets of code lengths are possible?
- Can we always use prefix codes?
32. The Kraft-McMillan Inequality (1)
- A uniquely decodable code with codeword lengths l1, …, lN exists only if
  2^(-l1) + 2^(-l2) + … + 2^(-lN) ≤ 1
- A uniquely decodable code with lengths 1, 2, 3, 3 exists, since ½ + ¼ + ⅛ + ⅛ = 1
- e.g. {0, 01, 011, 111}
- No uniquely decodable code with lengths 2, 2, 2, 2, 2 exists, since ¼ + ¼ + ¼ + ¼ + ¼ > 1
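The two length sets from this slide, checked numerically:

```python
def kraft_sum(lengths):
    """Left-hand side of the Kraft-McMillan inequality: sum of 2^(-l)."""
    return sum(2 ** -l for l in lengths)

print(kraft_sum([1, 2, 3, 3]))       # 1.0  <= 1 -> lengths are feasible
print(kraft_sum([2, 2, 2, 2, 2]))    # 1.25 >  1 -> no uniquely decodable code
```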
33. The Kraft-McMillan Inequality (2)
- Given lengths l1, …, lN that satisfy the inequality, we can always find a prefix code with codeword lengths l1, …, lN
- There is a prefix code with lengths 1, 2, 3, 3, since ½ + ¼ + ⅛ + ⅛ = 1
- e.g. {0, 10, 110, 111}
- There is a prefix code with lengths 2, 2, 2, since ¼ + ¼ + ¼ < 1
- e.g. {00, 10, 01}
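A standard construction (not necessarily the one used in the lecture) that turns any feasible set of lengths into a prefix code:

```python
def prefix_code_from_lengths(lengths):
    """Build a binary prefix code with the given codeword lengths,
    assuming they satisfy the Kraft inequality. Codewords are assigned
    in order of increasing length, taking the next unused value each time."""
    assert sum(2 ** -l for l in lengths) <= 1, "Kraft inequality violated"
    codewords = []
    value = 0                         # next codeword, read as an integer
    prev_len = 0
    for l in sorted(lengths):
        value <<= (l - prev_len)      # extend to the new length
        codewords.append(format(value, f"0{l}b"))
        value += 1                    # skip past this codeword's subtree
        prev_len = l
    return codewords

print(prefix_code_from_lengths([1, 2, 3, 3]))   # ['0', '10', '110', '111']
print(prefix_code_from_lengths([2, 2, 2]))      # ['00', '01', '10']
```

Assigning codewords in order of increasing length and always taking the next unused value guarantees that no earlier, shorter codeword can be a prefix of a later one.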
34. The Kraft-McMillan Inequality (3)
- Combining both results
- There is a uniquely decodable code with lengths l1, …, lN if and only if there is a prefix code with the same lengths (both are equivalent to the lengths satisfying the inequality)!
- We can always use prefix codes
- For any non-prefix uniquely decodable code, we can always find a prefix code with the same codeword lengths