Title: Variable-Length Codes: Huffman Codes
Chapter 4
Variable-Length Codes: Huffman Codes

Outline
- 4.1 Introduction
- 4.2 Unique Decoding
- 4.3 Instantaneous Codes
- 4.4 Construction of Instantaneous Codes
- 4.5 The Kraft Inequality
- 4.6 Huffman Codes
4.1 Introduction
- Consider the problem of efficient coding of messages to be sent over a noiseless channel:
  - maximize the number of messages that can be sent in a given period of time;
  - transmit a message in the shortest possible time;
  - make the codewords as short as possible.
4.2 Unique Decoding
- Source symbols (alphabet): s1, ..., sq
- Code alphabet: C1, C2, ..., Cr
- X is a random variable
- X ∈ {s1, ..., sq} with probabilities p1, ..., pq
- X is observed over and over again, i.e., it generates a sequence of symbols drawn from s1, ..., sq
- Ex: s1 → 000, s2 → 111
[Diagram: each source symbol si is encoded as a string of code symbols Ci Cj Ck.]
- The collection of all codewords is called a code.
- Our objective: minimize the average codeword length
  $L = \sum_{i=1}^{q} p_i l_i$
- Unique decodability: the received message must have a single, unique possible interpretation.
- Ex: source alphabet {s1, s2, s3, s4}, code alphabet {0, 1}
  - s1 → 0
  - s2 → 01
  - s3 → 11
  - s4 → 00
- Then 0011 can be read either as s4 s3 or as s1 s1 s3, so this code doesn't satisfy unique decodability.
- Ex:
  - s1 → 0
  - s2 → 010
  - s3 → 01
  - s4 → 10
  Then 010 can be read as s1 s4, as s2, or as s3 s1, so it also doesn't satisfy unique decodability.
- Ex:
  - s1 → 0
  - s2 → 01
  - s3 → 011
  - s4 → 111
  This is a uniquely decodable code. (A brute-force check of such ambiguities is sketched below.)
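To see these ambiguities mechanically, here is a minimal brute-force sketch (the function name `parses` and the dictionary contents are mine, for illustration) that enumerates every way a received string splits into codewords:

```python
def parses(msg, code):
    """Return every way msg can be split into codewords of the given code."""
    if msg == "":
        return [[]]  # one way to parse the empty string: emit nothing
    results = []
    for sym, word in code.items():
        if msg.startswith(word):
            for rest in parses(msg[len(word):], code):
                results.append([sym] + rest)
    return results

# The code {0, 01, 11, 00}: 0011 has two readings, so not uniquely decodable.
print(parses("0011", {"s1": "0", "s2": "01", "s3": "11", "s4": "00"}))
# [['s1', 's1', 's3'], ['s4', 's3']]

# The first example above: 010 has three readings.
print(parses("010", {"s1": "0", "s2": "010", "s3": "01", "s4": "10"}))
# [['s1', 's4'], ['s2'], ['s3', 's1']]
```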
- Definition: The nth extension of a code is simply all possible concatenations of n symbols of the original source code.
- No two encoded concatenations can be the same, even for different extensions.
- Every finite sequence of code characters corresponds to at most one message.
- Every distinct sequence of source symbols has a corresponding encoded sequence that is unique.
4.3 Instantaneous Codes
s1 → 0, s2 → 10, s3 → 110, s4 → 111
[Decoding tree: from the initial state, 0 leads to s1; 1 then 0 leads to s2; 1, 1, 0 leads to s3; 1, 1, 1 leads to s4.]
- Note that each bit of the received stream is examined only once, and that the terminal states of this tree are the four source symbols s1, s2, s3, and s4.
- Definition: A code is instantaneous if it is decodable without lookahead (i.e., a word can be recognized as soon as it is complete).
- When a complete symbol is received, the receiver knows this immediately and does not have to look further before deciding which message symbol was received.
- A code is instantaneous iff no codeword si is a prefix of another codeword sj. (A mechanical check of this condition is sketched below.)
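The prefix condition is easy to test; a minimal sketch (the function name is mine) using the fact that in sorted order a prefix is always immediately followed by a word it prefixes:

```python
def is_instantaneous(codewords):
    """True iff no codeword is a prefix of another codeword."""
    words = sorted(codewords)
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))

print(is_instantaneous(["0", "10", "110", "111"]))  # True: the tree code above
print(is_instantaneous(["0", "01", "011", "111"]))  # False: 0 is a prefix of 01
```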
- The existence of a decoding tree ⟺ instantaneous decodability.
- Ex: Let n be a positive integer. A comma code is a code with codewords
  $1,\ 01,\ 001,\ \ldots,\ \underbrace{0\cdots0}_{n-1}1,\ \underbrace{0\cdots0}_{n}$
- The 1 serves as a comma marking the end of a codeword.
- Because a comma code is prefix-free, it is an instantaneous code.
- Ex: s1 → 0, s2 → 01, s3 → 011, s4 → 111 is not an instantaneous code, but it is still a uniquely decodable code.
  - Ex: 01111111 must be decoded as s2 s4 s4, but the decoder cannot recognize any symbol until it has scanned the whole run of 1s.
- So it had better use a comma code: s1 → 1, s2 → 01, s3 → 001, s4 → 000.
- I.C. ⊂ U.D.: every instantaneous code is uniquely decodable, and an I.C. is better than a merely U.D. code because it needs no lookahead. (A decoding sketch follows.)
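A sketch of instantaneous decoding with that comma code (assuming the four-word code s1 → 1, s2 → 01, s3 → 001, s4 → 000; the function name is mine): each symbol is emitted the moment its last bit arrives, with no lookahead:

```python
def decode_comma(bits, max_len=3):
    """Decode the binary comma code {1, 01, 001, 000}: a word ends at the
    first 1, or after max_len zeros (the single comma-less word)."""
    table = {"1": "s1", "01": "s2", "001": "s3", "000": "s4"}
    out, word = [], ""
    for b in bits:
        word += b
        if b == "1" or len(word) == max_len:  # word complete: emit immediately
            out.append(table[word])
            word = ""
    return out

print(decode_comma("0011000"))  # ['s3', 's1', 's4']
```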
4.4 Construction of Instantaneous Codes
- Given five symbols si in the source code S, consider two instantaneous codes:
  C1: s1 → 0, s2 → 10, s3 → 110, s4 → 1110, s5 → 1111
  C2: s1 → 00, s2 → 01, s3 → 10, s4 → 110, s5 → 111
- Both C1 and C2 are instantaneous codes; which one is better?
- Answer: it depends on the frequency of occurrence of the symbols, as the sketch below illustrates.
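For example, under a skewed distribution (the probabilities below are illustrative assumptions, not from the slides) C1 gives the shorter average length, while under a uniform distribution C2 wins:

```python
C1 = {"s1": "0", "s2": "10", "s3": "110", "s4": "1110", "s5": "1111"}
C2 = {"s1": "00", "s2": "01", "s3": "10", "s4": "110", "s5": "111"}

def avg_len(code, probs):
    """Average codeword length L = sum of p_i * l_i."""
    return sum(p * len(code[s]) for s, p in probs.items())

skewed  = {"s1": 0.5, "s2": 0.25, "s3": 0.125, "s4": 0.0625, "s5": 0.0625}
uniform = {s: 0.2 for s in C1}
print(avg_len(C1, skewed),  avg_len(C2, skewed))   # 1.875 2.125 -> C1 better
print(avg_len(C1, uniform), avg_len(C2, uniform))  # 2.8   2.4   -> C2 better
```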
4.5 The Kraft Inequality
- Theorem: A necessary and sufficient condition for the existence of an instantaneous code S of q symbols si (i = 1, ..., q) with encoded words of lengths l1 ≤ l2 ≤ ... ≤ lq is
  $\sum_{i=1}^{q} \frac{1}{r^{l_i}} \le 1,$
  where r is the radix (number of symbols) of the alphabet of the encoded symbols.
- Thm: An instantaneous code with word lengths n1, n2, ..., nM exists iff
  $\sum_{i=1}^{M} D^{-n_i} \le 1,$
  where D is the size of the code alphabet.
- (⇐) For simplicity, we assume D = 2 and induct on the maximum word length H.
- (1) When H = 1: n1 = 1 and n2 = 1, giving s1 → 0, s2 → 1.
  [Tree: the root branches on 0 to s1 and on 1 to s2; this is OK for a tree of length 1.]
- (2) Suppose the construction works whenever H ≤ h. When H = h + 1, partition the required lengths into two groups, each with Kraft sum ≤ 1/2 (possible because the terms are dyadic); reducing every length in a group by 1 doubles its sum to ≤ 1, so by hypothesis each group has a decoding tree of height ≤ h, and attaching the two trees below the root handles H = h + 1.
- By induction, the inequality is sufficient.
- Another proof (⇒): Let C = {c1, c2, ..., cM} with codeword lengths l1, ..., lM, and let L = max li. If x = ci y1 ⋯ y(L−li), where the yj are any code symbols, then x cannot be in C, because ci is a prefix of x. For each i there are D^(L−li) such words x of length L not in C, and these sets are disjoint since C is prefix-free; hence
  $\sum_{i=1}^{M} D^{L-l_i} \le D^{L}, \quad\text{i.e.,}\quad \sum_{i=1}^{M} D^{-l_i} \le 1.$
- ⇒ If there are w1 codewords of length 1, then w1 ≤ r. If there are w2 codewords of length 2, then w2 ≤ r² − w1r (they must avoid the w1 used prefixes); likewise, w3 ≤ r³ − w1r² − w2r, and so on.
- Rearranged:
  $w_1 \le r;\quad w_1 r + w_2 \le r^2;\quad w_1 r^2 + w_2 r + w_3 \le r^3;\ \ldots;\quad w_1 r^{n-1} + w_2 r^{n-2} + \cdots + w_n \le r^n$
- So if the last inequality is satisfied, then all of the earlier ones hold (the terms are nonnegative).
- ⇒ Dividing the last inequality by r^n gives exactly Kraft's inequality.
- Note: A code may obey the Kraft inequality and still not be instantaneous.
- Ex: {0, 01, 011, 111} has Kraft sum 2⁻¹ + 2⁻² + 2⁻³ + 2⁻³ = 1 ≤ 1, but it is not an I.C. (0 is a prefix of 01); Kraft guarantees only that some instantaneous code with these lengths exists.
- Ex: A binary block code (error-correcting code) with $2^k$ codewords of length n has Kraft sum $2^k \cdot 2^{-n} = 2^{k-n} \le 1$ for k ≤ n. (A Kraft-sum check is sketched below.)
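A quick Kraft-sum check for the examples in this section (the helper name is mine; exact arithmetic via fractions avoids float noise):

```python
from fractions import Fraction

def kraft_sum(lengths, r=2):
    """Kraft sum: sum over codewords of r^(-l_i). An instantaneous code
    with these codeword lengths exists iff the sum is <= 1."""
    return sum(Fraction(1, r ** l) for l in lengths)

print(kraft_sum([1, 2, 3, 3]))  # 1: satisfied, e.g. by {0, 01, 011, 111}
print(kraft_sum([1, 1, 2]))     # 5/4 > 1: no instantaneous code has these lengths
```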
- Ex: Comma code of radix D:
  length 1: 1 codeword (it must be the comma itself)
  length 2: D − 1 codewords
  length 3: (D − 1)² codewords
  ...
  length k: (D − 1)^(k−1) codewords,
  and the Kraft sum is $\sum_{k \ge 1} (D-1)^{k-1} D^{-k} = 1$.
- The Kraft inequality can be extended to any uniquely decodable code.
- McMillan Inequality. Thm: A uniquely decodable code with word lengths l1, l2, ..., lq exists iff
  $\sum_{i=1}^{q} r^{-l_i} \le 1$
  (r is the size of the code alphabet).
- (⇐) Trivial, because an I.C. is one kind of U.D. code, and Kraft already gives an I.C. with these lengths.
- (⇒) Let l be the length of the longest codeword, i.e., l = max li, and expand
  $\Big(\sum_{i=1}^{q} r^{-l_i}\Big)^{n} = \sum_{k=n}^{nl} N_k\, r^{-k},$
  where Nk is the number of sequences of n codewords whose encodings have total length k.
- Unique decodability forces Nk ≤ r^k (the number of distinct sequences of length k in radix r), so the right side is at most nl − n + 1 ≤ nl. If the sum were K > 1, we could find an n such that K^n > nl, a contradiction.
4.6 Huffman Codes
- Lemma: If a code C is optimal within the class of instantaneous codes, then C is optimal within the entire class of U.D. codes.
- pf: Suppose some U.D. code C' has a smaller average codeword length than C. Let n1, n2, ..., nM be the codeword lengths of C'. By the McMillan inequality, C' satisfies the Kraft inequality, so an instantaneous code with the same lengths exists.
- So C is not optimal within I.C., a contradiction.
- Optimal Codes
- Given a binary I.C. C with codeword lengths n1, ..., nM associated with probabilities p1, ..., pM.
- For convenience, let p1 ≥ p2 ≥ ... ≥ pM-1 ≥ pM
  (and ni ≤ ni+1 ≤ ... ≤ ni+r whenever pi = pi+1 = ... = pi+r).
- Then if C is optimal within the class of I.C., C must have the following properties:
- (a) More probable symbols have shorter codewords: if pj > pk, then nj ≤ nk.
- (b) The two least probable symbols have codewords of equal length, i.e., nM-1 = nM.
- (c) Among the codewords of length nM, two agree in all digits except the last one.
- Ex: x1 → 0, x2 → 100, x3 → 101, x4 → 1101, x5 → 1110 doesn't satisfy (c); it has to be, e.g., x4 → 1101, x5 → 1100.
- pf:
- (a) If (pj > pk) and (nj > nk), then we can construct a better code C' by interchanging codewords j and k.
- (b) From (a), if pM-1 > pM then nM-1 ≤ nM; by the assumed ordering, if pM-1 = pM then nM-1 ≤ nM. If nM-1 < nM, we may shorten the longest codeword to make nM-1 = nM and still have an I.C. better than the original one.
- (c) If condition (c) is not true, we may drop the last digit of all such codewords to obtain a better code.
- Huffman coding: construction of optimal (instantaneous) codes.
- Let x1, ..., xM be an array of symbols with probabilities p1, ..., pM (p1 ≥ p2 ≥ ... ≥ pM).
- (1) Combine xM-1, xM into a single symbol xM-1,M with probability pM-1 + pM.
- (2) Assume we can construct an optimal code C2 for x1, x2, ..., xM-1,M.
- (3) Now construct a code C1 for x1, ..., xM as follows (a code sketch follows the worked example below):
  - The codewords associated with x1, ..., xM-2 in C1 are exactly the same as the corresponding codewords of C2.
  - Let wM-1,M be the codeword of xM-1,M in C2.
  - The codewords for xM-1, xM in C1 are:
    - either wM-1,M 0 → xM-1, or
    - wM-1,M 1 → xM
- Claim: C1 is an optimal code for the set of probabilities p1, ..., pM.
- Ex: successive reductions (each step merges the two least probable symbols):
  Step 0: x1 0.3, x2 0.25, x3 0.2, x4 0.1, x5 0.1, x6 0.05
  Step 1: x1 0.3, x2 0.25, x3 0.2, x5,6 0.15, x4 0.1
  Step 2: x1 0.3, x2 0.25, x4,5,6 0.25, x3 0.2
  Step 3: x3,4,5,6 0.45, x1 0.3, x2 0.25
  Step 4: x1,2 0.55, x3,4,5,6 0.45
Splitting back, assigning codewords from the last reduction outward:
  x1,2 → 0, x3,4,5,6 → 1
  x1 → 00, x2 → 01
  x3 → 10, x4,5,6 → 11
  x4 → 110, x5,6 → 111
  x5 → 1110, x6 → 1111
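A compact sketch of the whole construction using a priority queue (a standard way to implement the repeated merging; the function name is mine). Because Huffman codes are not unique, the codewords printed may differ from those above, but the lengths are optimal and the average length, 2.4, is the same:

```python
import heapq

def huffman(probs):
    """Binary Huffman code: repeatedly merge the two least probable nodes,
    prefixing 0/1 onto the partial codewords inside each merged node."""
    # heap entries: (probability, tie-break counter, {symbol: partial codeword})
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # least probable node
        p1, _, c1 = heapq.heappop(heap)  # second least probable node
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))
        count += 1
    return heap[0][2]

probs = {"x1": 0.3, "x2": 0.25, "x3": 0.2, "x4": 0.1, "x5": 0.1, "x6": 0.05}
code = huffman(probs)
print(code)
print(sum(p * len(code[s]) for s, p in probs.items()))  # 2.4, as in the example
```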
- pf:
- We assume that C1 is not optimal, and let C1' be an optimal instantaneous code for x1, ..., xM. Then C1' has codewords w1, w2, ..., wM with lengths n1, n2, ..., nM.
- If there are only two symbols of maximum length in a tree, they must have their last decision node in common, and they must be the two least probable symbols. Before we reduce the tree, these two symbols contribute nM(pM + pM-1) to the average length, and after the reduction they contribute (nM − 1)(pM + pM-1).
- So the reduction shortens the average code length by exactly (pM + pM-1).
- Average length of C1 > average length of C1' --- (1)
- After reducing both: average length of C2 > average length of C2' (both sides of (1) decrease by pM + pM-1).
- But C2 is optimal, a contradiction.
- If there are more than two symbols of the maximum length, we can use the following proposition: symbols whose codewords have the same length may be interchanged without changing the average code length.
- So we can move the two least probable symbols onto sibling leaves and proceed as before.
- Huffman encoding is not unique.
Code I: p1 = 0.4 → 00, p2 = 0.2 → 10, p3 = 0.2 → 11, p4 = 0.1 → 010, p5 = 0.1 → 011
Average length L = 0.4·2 + 0.2·2 + 0.2·2 + 0.1·3 + 0.1·3 = 2.2

Or

Code II: p1 = 0.4 → 1, p2 = 0.2 → 01, p3 = 0.2 → 000, p4 = 0.1 → 0010, p5 = 0.1 → 0011
Average length L = 0.4·1 + 0.2·2 + 0.2·3 + 0.1·4 + 0.1·4 = 2.2
- Which encoding is better?
- Var(I) = 0.4(2−2.2)² + 0.2(2−2.2)² + 0.2(2−2.2)² + 0.1(3−2.2)² + 0.1(3−2.2)² = 0.16 (good!)
- Var(II) = 0.4(1−2.2)² + 0.2(2−2.2)² + 0.2(3−2.2)² + 0.1(4−2.2)² + 0.1(4−2.2)² = 1.36
- Code I is preferred: the smaller variance means more uniform codeword lengths, as computed in the sketch below.
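The same computation as a quick sketch:

```python
probs   = [0.4, 0.2, 0.2, 0.1, 0.1]
lengths = {"I": [2, 2, 2, 3, 3], "II": [1, 2, 3, 4, 4]}

for name, ls in lengths.items():
    L   = sum(p * l for p, l in zip(probs, ls))          # average length
    var = sum(p * (l - L) ** 2 for p, l in zip(probs, ls))  # variance of length
    print(name, L, round(var, 2))  # I: 2.2 0.16   II: 2.2 1.36
```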