Title: Modeling and Coding
1. Modeling and Coding
2. Announcement
3. Review
[Block diagram: Original data → Compression → Compressed data (fewer bits) → Reconstruction → Reconstructed data]
- Codec = Encoder + Decoder
- Lossless and lossy compression
4. Two Phases: Modeling and Coding
[Block diagram: Original data → Encoder → Compressed data (fewer bits)]
- Modeling
- Discover the structure in the data
- Extract information about any redundancy
- Coding
- Describe the model and the residual (how the data
differ from the model)
5. Example (1)
- 5 bits × 12 samples = 60 bits
- Representation using fewer bits?
6. Example: Modeling
7. Example: Coding
- Residuals: -1, 0, 1
- 2 bits × 12 samples = 24 bits (compared with 60 bits before compression)
We use the model to predict the value, then
encode the residual!
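A minimal sketch of this predict-then-code idea, using made-up sample values (the actual data on the slide are not reproduced here); the previous sample serves as the prediction:

```python
# Hypothetical 12 samples, each needing 5 bits when stored raw (values in 0..31).
samples = [9, 10, 11, 11, 12, 11, 12, 13, 14, 13, 14, 15]

# Modeling: predict each sample by the previous one (x_hat[n] = x[n-1]).
# Coding: send the residuals r[n] = x[n] - x_hat[n], which stay in {-1, 0, 1}.
residuals = [samples[n] - samples[n - 1] for n in range(1, len(samples))]
print(residuals)

raw_bits = 5 * len(samples)      # 60 bits without compression
coded_bits = 2 * len(samples)    # 2 bits per residual suffice for {-1, 0, 1}
print(raw_bits, coded_bits)      # 60 vs. 24
```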
8. Example (2)
9. Example (3)
- 106 bits
- 3 × 41 = 123 bits
10. Example (4)
Shorter codes are assigned to letters that occur
more frequently!
11. A Brief Introduction to Information Theory
12. Information Theory (1)
- A quantitative measure of information
- "You will win the lottery tomorrow."
- "The sun will rise in the east tomorrow."
- Self-information (Shannon, 1948): i(A) = -log2 P(A)
- P(A): the probability that the event A will happen
The amount of surprise or uncertainty in the message
13. Information Theory (2)
- For two independent events A, B: i(AB) = i(A) + i(B)
- Example: flipping a coin
- If the coin is fair
- P(H) = P(T) = ½
- i(H) = i(T) = -log2(½) = 1 bit
- If the coin is not fair
- P(H) = 1/8, P(T) = 7/8
- i(H) = 3 bits, i(T) ≈ 0.193 bits
- The occurrence of a HEAD conveys more information!
14. Information Theory (3)
- For a set of independent events Ai
- Entropy (the average self-information): H = Σ P(Ai) i(Ai) = -Σ P(Ai) log2 P(Ai)
- The coin example
- Fair coin (1/2, 1/2): H = P(H)i(H) + P(T)i(T) = 1 bit
- Unfair coin (1/8, 7/8): H ≈ 0.544 bits
- Bounds of H?
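A quick numerical check of the coin examples above (probabilities taken from the slides):

```python
import math

def self_info(p):
    """Self-information i(A) = -log2 P(A), in bits."""
    return -math.log2(p)

def entropy(probs):
    """Average self-information H = -sum p * log2 p, in bits."""
    return sum(p * self_info(p) for p in probs)

print(self_info(1/2))        # 1.0   -> fair coin: i(H) = i(T) = 1 bit
print(self_info(1/8))        # 3.0   -> unfair coin: i(H) = 3 bits
print(self_info(7/8))        # 0.193 -> unfair coin: i(T)
print(entropy([1/2, 1/2]))   # 1.0   bit
print(entropy([1/8, 7/8]))   # 0.544 bits
```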
15. Information Theory (4)
- A general source S
- Alphabet A = {1, 2, …, m}
- Output sequence X1, X2, …, Xn
- Entropy: H(S) = lim (n→∞) (1/n) H(X1, X2, …, Xn)
- Suppose X1, X2, …, Xn are independent and identically distributed (i.i.d.)
- Then H(S) reduces to the first-order entropy: H(S) = -Σ P(X1) log2 P(X1)
16. Information Theory (5)
- Entropy (cont.)
- A lower bound on the average number of bits per symbol that any lossless compression scheme can achieve
- Not possible to know exactly for a physical source
- Estimate!
- The estimate depends on our assumptions about the structure of the data
17. Estimation of Entropy (1)
- 1 2 3 2 3 4 5 4 5 6 7 8 9 8 9 10
- Assume the sequence is i.i.d.
- P(1) = P(6) = P(7) = P(10) = 1/16
- P(2) = P(3) = P(4) = P(5) = P(8) = P(9) = 2/16
- H = 3.25 bits
- Assume sample-to-sample correlation exists
- Model: xn = xn-1 + rn (residual sequence rn)
- 1 1 1 -1 1 1 1 -1 1 1 1 1 1 -1 1 1
- P(1) = 13/16, P(-1) = 3/16
- H ≈ 0.7 bits
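The same estimates, computed by counting relative frequencies (a sketch; the i.i.d. estimate is the first-order entropy):

```python
import math
from collections import Counter

def first_order_entropy(seq):
    """Entropy estimate assuming i.i.d. symbols, from relative frequencies."""
    counts, n = Counter(seq), len(seq)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

x = [1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10]

# Treating the samples as i.i.d.
print(first_order_entropy(x))                          # 3.25 bits

# Using the model x[n] = x[n-1] + r[n] and looking at the residuals instead.
residuals = [x[0]] + [x[n] - x[n - 1] for n in range(1, len(x))]
print(residuals)                                       # the 1 / -1 sequence above
print(first_order_entropy(residuals))                  # about 0.70 bits
```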
18. Estimation of Entropy (2)
- 1 2 1 2 3 3 3 3 1 2 3 3 3 3 1 2 3 3 1 2
- One symbol at a time
- P(1) = P(2) = ¼, P(3) = ½
- H = 1.5 bits/symbol
- 30 (= 1.5 × 20) bits are required to represent the sequence
- In blocks of two
- P(1 2) = ½, P(3 3) = ½
- H = 1 bit/block
- 10 (= 1 × 10) bits are required
In theory we can always capture the structure of the data by taking larger and larger block sizes, but this quickly becomes impractical.
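The same counting estimate, once per symbol and once per two-symbol block:

```python
import math
from collections import Counter

def entropy_of(items):
    """Entropy estimate from the relative frequencies of `items`."""
    counts, n = Counter(items), len(items)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

seq = [1, 2, 1, 2, 3, 3, 3, 3, 1, 2, 3, 3, 3, 3, 1, 2, 3, 3, 1, 2]

h_symbol = entropy_of(seq)                  # 1.5 bits/symbol
print(h_symbol, h_symbol * len(seq))        # -> 30 bits for 20 symbols

blocks = [tuple(seq[i:i + 2]) for i in range(0, len(seq), 2)]
h_block = entropy_of(blocks)                # 1 bit/block
print(h_block, h_block * len(blocks))       # -> 10 bits for 10 blocks
```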
19. Models
20. Models
- Physical models
- The physics of the data generation process
- Too complicated
- Probability models
- For A = {a1, a2, …, am}, we have P = {P(a1), P(a2), …, P(am)}
- The independence assumption
- Markov models
- Represent dependence in the data
21. Markov Models (1)
- k-th order model
- The probability of the next symbol depends on its preceding k symbols.
- First-order model
- Example: a binary image
- Two states: Sb (black pixel), Sw (white pixel)
- State probabilities: P(Sb), P(Sw)
- Transition probabilities: P(w|b), P(b|w), P(w|w), P(b|b)
22.
- Model with the i.i.d. assumption
- First-order Markov model
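A small sketch of why the Markov model helps for such data, using made-up probabilities for a mostly-white binary image (these numbers are illustrative, not from the slides):

```python
import math

# Hypothetical probabilities: long runs of white pixels, occasional black ones.
p_w, p_b = 0.9, 0.1          # state probabilities P(Sw), P(Sb)
p_b_given_w = 0.01           # P(b|w): switch from white to black
p_w_given_b = 0.09           # P(w|b): switch from black to white

def h(probs):
    return -sum(p * math.log2(p) for p in probs)

# Entropy under the i.i.d. assumption: ignore pixel-to-pixel dependence.
h_iid = h([p_w, p_b])

# Entropy of the first-order Markov model:
# H = P(Sw) * H(next | Sw) + P(Sb) * H(next | Sb)
h_markov = (p_w * h([1 - p_b_given_w, p_b_given_w])
            + p_b * h([1 - p_w_given_b, p_w_given_b]))

print(h_iid, h_markov)       # the Markov model gives a much lower entropy
```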
23. Coding
24. Coding (1)
- The assignment of binary sequences to elements of an alphabet
- Rate of the code: the average number of bits per symbol
- Fixed-length codes and variable-length codes
25. Coding (2)
26. Coding (3)
- Example of a code that is not uniquely decodable

  Letter   Codeword
  a1       0
  a2       1
  a3       00
  a4       11

- The string 100 decodes as a2 a3 and also as a2 a1 a1 (ambiguous!)
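A brute-force decoder for the code above makes the ambiguity explicit:

```python
def all_decodings(bits, code):
    """Return every way to parse `bits` as a concatenation of codewords."""
    if not bits:
        return [[]]
    results = []
    for letter, word in code.items():
        if bits.startswith(word):
            for rest in all_decodings(bits[len(word):], code):
                results.append([letter] + rest)
    return results

code = {"a1": "0", "a2": "1", "a3": "00", "a4": "11"}
print(all_decodings("100", code))   # [['a2', 'a1', 'a1'], ['a2', 'a3']] -> ambiguous
```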
27. Coding (4)
- A code that is not instantaneous, but is uniquely decodable
- The decoder's first guess may turn out to be wrong (oops!); the correct parse a2 a3 a3 a3 a3 a3 a3 a3 a3 is found only after reading further into the string
28. A Test for Unique Decodability
- Dangling suffix
- a = 010, b = 01011: a is a prefix of b, leaving the dangling suffix 11
- If a dangling suffix is a codeword ⇒ not uniquely decodable
- Code 1: {0, 01, 11} is uniquely decodable!
- Code 2: {0, 01, 10} is not uniquely decodable! e.g. 010 parses as (01)(0) and as (0)(10)
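A sketch of the dangling-suffix (Sardinas-Patterson) test, applied to the two codes above:

```python
def is_uniquely_decodable(codewords):
    """Sardinas-Patterson test: a code is uniquely decodable iff no
    dangling suffix generated from the codewords is itself a codeword."""
    code = set(codewords)

    def dangling(a, b):
        # suffix left over when a is a proper prefix of b, else None
        if b.startswith(a) and len(a) < len(b):
            return b[len(a):]
        return None

    # round 1: dangling suffixes between pairs of distinct codewords
    current = set()
    for a in code:
        for b in code:
            s = dangling(a, b)
            if s is not None:
                current.add(s)

    seen = set()
    while current:
        if current & code:            # a dangling suffix is a codeword
            return False
        seen |= current
        nxt = set()
        for s in current:             # next round: suffixes vs. codewords
            for c in code:
                for x in (dangling(s, c), dangling(c, s)):
                    if x is not None:
                        nxt.add(x)
        current = nxt - seen
    return True

print(is_uniquely_decodable(["0", "01", "11"]))   # True  (Code 1)
print(is_uniquely_decodable(["0", "01", "10"]))   # False (Code 2)
```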
29. Exercise

  Letter   Codeword
  a1       0
  a2       001
  a3       010
  a4       100

- Is this code uniquely decodable? The string 0 0 1 0 0 can be parsed as a2 a1 a1, as a1 a3 a1, or as a1 a1 a4.
30. Prefix Codes
- No codeword is a prefix of another codeword
- Uniquely decodable
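Checking the prefix property is a simple pairwise test; a small sketch:

```python
def is_prefix_code(codewords):
    """True if no codeword is a prefix of another (distinct) codeword."""
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False
    return True

print(is_prefix_code(["0", "10", "110", "111"]))   # True  -> prefix code
print(is_prefix_code(["0", "01", "11"]))           # False -> 0 is a prefix of 01
```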
31. Coding (cont.)
- For compression
- Uniquely decodable
- Short codewords
- Instantaneous (easier to decode)
- What sets of code lengths are possible?
- Can we always use prefix codes?
32. The Kraft-McMillan Inequality (1)
- A uniquely decodable code with codeword lengths l1, …, lN exists only if
  2^(-l1) + 2^(-l2) + … + 2^(-lN) ≤ 1
- A uniquely decodable code with lengths 1, 2, 3, 3 exists, since ½ + ¼ + ⅛ + ⅛ = 1
- e.g. {0, 01, 011, 111}
- No uniquely decodable code with lengths 2, 2, 2, 2, 2 exists, since ¼ + ¼ + ¼ + ¼ + ¼ > 1
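The two length sets from this slide, checked numerically:

```python
def kraft_sum(lengths):
    """Left-hand side of the Kraft-McMillan inequality: sum of 2^(-l)."""
    return sum(2 ** -l for l in lengths)

print(kraft_sum([1, 2, 3, 3]))       # 1.0  <= 1 -> lengths are feasible
print(kraft_sum([2, 2, 2, 2, 2]))    # 1.25 >  1 -> no uniquely decodable code
```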
33. The Kraft-McMillan Inequality (2)
- Given lengths l1, …, lN that satisfy the inequality, we can always find a prefix code with codeword lengths l1, …, lN
- There is a prefix code with lengths 1, 2, 3, 3, since ½ + ¼ + ⅛ + ⅛ = 1
- e.g. {0, 10, 110, 111}
- There is a prefix code with lengths 2, 2, 2, since ¼ + ¼ + ¼ < 1
- e.g. {00, 10, 01}
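A standard construction (not necessarily the one used in the lecture) that turns any feasible set of lengths into a prefix code:

```python
def prefix_code_from_lengths(lengths):
    """Build a binary prefix code with the given codeword lengths,
    assuming they satisfy the Kraft inequality. Codewords are assigned
    in order of increasing length, taking the next unused value each time."""
    assert sum(2 ** -l for l in lengths) <= 1, "Kraft inequality violated"
    codewords = []
    value = 0                         # next codeword, read as an integer
    prev_len = 0
    for l in sorted(lengths):
        value <<= (l - prev_len)      # extend to the new length
        codewords.append(format(value, f"0{l}b"))
        value += 1                    # skip past this codeword's subtree
        prev_len = l
    return codewords

print(prefix_code_from_lengths([1, 2, 3, 3]))   # ['0', '10', '110', '111']
print(prefix_code_from_lengths([2, 2, 2]))      # ['00', '01', '10']
```

Assigning codewords in order of increasing length and always taking the next unused value guarantees that no earlier, shorter codeword can be a prefix of a later one.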
34. The Kraft-McMillan Inequality (3)
- Combining both results
- There is a uniquely decodable code with lengths l1, …, lN if and only if there is a prefix code with the same lengths (both are equivalent to the lengths satisfying the inequality)!
- We can always use prefix codes
- For any non-prefix uniquely decodable code, we can always find a prefix code with the same codeword lengths