Title: Lossless Compression - Statistical Model, Part II: Arithmetic Coding
1 Lossless Compression - Statistical Model, Part II
Arithmetic Coding
2 CONTENTS
- Introduction to Arithmetic Coding
- Arithmetic Coding Decoding Algorithm
- Generating a Binary Code for Arithmetic Coding
- Higher-order and Adaptive Modeling
- Applications of Arithmetic Coding
3 Arithmetic Coding
- Huffman codes have to be an integral number of bits long, while the entropy value of a symbol is almost always a fractional number, so the theoretically possible compression cannot be achieved.
- For example, if a statistical method assigns a 90% probability to a given character, the optimal code length would be -log2(0.9) ≈ 0.15 bits.
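As a quick check of this figure, a one-line computation (Python):

```python
import math
# A symbol with probability 0.9 ideally needs -log2(0.9) bits:
print(-math.log2(0.9))   # ~0.152 -- no whole number of bits can match this
```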
4 Arithmetic Coding
- Arithmetic coding bypasses the idea of replacing an input symbol with a specific code. It replaces a stream of input symbols with a single floating-point output number.
- Arithmetic coding is especially useful when dealing with sources with small alphabets, such as binary sources, and alphabets with highly skewed probabilities.
5 Arithmetic Coding Example (1)
Suppose that we want to encode the message BILL GATES.

| Character | Probability | Range      |
|-----------|-------------|------------|
| (space)   | 1/10        | [0.0, 0.1) |
| A         | 1/10        | [0.1, 0.2) |
| B         | 1/10        | [0.2, 0.3) |
| E         | 1/10        | [0.3, 0.4) |
| G         | 1/10        | [0.4, 0.5) |
| I         | 1/10        | [0.5, 0.6) |
| L         | 2/10        | [0.6, 0.8) |
| S         | 1/10        | [0.8, 0.9) |
| T         | 1/10        | [0.9, 1.0) |
6 Arithmetic Coding Example (1)
[Figure: the interval narrows with each symbol of BILL GATES — from [0.0, 1.0) to [0.2, 0.3) after B, [0.25, 0.26) after I, [0.256, 0.258) after L, [0.2572, 0.2576) after the second L, and so on down to the final interval.]
7 Arithmetic Coding Example (1)

| New character | Low value    | High value   |
|---------------|--------------|--------------|
| B             | 0.2          | 0.3          |
| I             | 0.25         | 0.26         |
| L             | 0.256        | 0.258        |
| L             | 0.2572       | 0.2576       |
| (space)       | 0.25720      | 0.25724      |
| G             | 0.257216     | 0.257220     |
| A             | 0.2572164    | 0.2572168    |
| T             | 0.25721676   | 0.25721680   |
| E             | 0.257216772  | 0.257216776  |
| S             | 0.2572167752 | 0.2572167756 |
8 Arithmetic Coding Example (1)
- The final value, called a tag, 0.2572167752, uniquely encodes the message BILL GATES.
- Any value between 0.2572167752 and 0.2572167756 can serve as a tag for the encoded message and can be uniquely decoded.
9 Arithmetic Coding
- Encoding algorithm for arithmetic coding:
- low = 0.0; high = 1.0
- while not EOF do
  - read(c)
  - range = high - low
  - high = low + range * high_range(c)
  - low = low + range * low_range(c)
- enddo
- output(low)
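As a concrete sketch of this loop in Python (the dict layout and helper names are assumptions; `Fraction` keeps the narrowing exact instead of drifting in floating point):

```python
from fractions import Fraction

# Cumulative model for BILL GATES from slide 5: symbol -> (low_range, high_range)
RANGES = {
    ' ': (Fraction(0, 10), Fraction(1, 10)),
    'A': (Fraction(1, 10), Fraction(2, 10)),
    'B': (Fraction(2, 10), Fraction(3, 10)),
    'E': (Fraction(3, 10), Fraction(4, 10)),
    'G': (Fraction(4, 10), Fraction(5, 10)),
    'I': (Fraction(5, 10), Fraction(6, 10)),
    'L': (Fraction(6, 10), Fraction(8, 10)),
    'S': (Fraction(8, 10), Fraction(9, 10)),
    'T': (Fraction(9, 10), Fraction(1, 1)),
}

def encode(message, ranges):
    low, high = Fraction(0), Fraction(1)
    for c in message:
        rng = high - low
        high = low + rng * ranges[c][1]   # compute high first: both lines use the OLD low
        low  = low + rng * ranges[c][0]
    return low                            # any value in [low, high) works as a tag

print(float(encode("BILL GATES", RANGES)))   # 0.2572167752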
10 Arithmetic Coding
- Decoding is the inverse process.
- Since 0.2572167752 falls between 0.2 and 0.3, the first character must be B.
- Remove the effect of B from 0.2572167752 by first subtracting the low value of B (0.2), giving 0.0572167752.
- Then divide by the width of B's range (0.1). This gives a value of 0.572167752.
11 Arithmetic Coding
- Then find which range that value lands in, which is the range of the next letter, I.
- The process repeats until r reaches 0 or the known length of the message is exhausted.
12 Arithmetic Coding Example (1)

| r            | c       | Low | High | Range | New r = (r - Low)/Range |
|--------------|---------|-----|------|-------|-------------------------|
| 0.2572167752 | B       | 0.2 | 0.3  | 0.1   | 0.572167752             |
| 0.572167752  | I       | 0.5 | 0.6  | 0.1   | 0.72167752              |
| 0.72167752   | L       | 0.6 | 0.8  | 0.2   | 0.6083876               |
| 0.6083876    | L       | 0.6 | 0.8  | 0.2   | 0.041938                |
| 0.041938     | (space) | 0.0 | 0.1  | 0.1   | 0.41938                 |
| 0.41938      | G       | 0.4 | 0.5  | 0.1   | 0.1938                  |
| 0.1938       | A       | 0.1 | 0.2  | 0.1   | 0.938                   |
| 0.938        | T       | 0.9 | 1.0  | 0.1   | 0.38                    |
| 0.38         | E       | 0.3 | 0.4  | 0.1   | 0.8                     |
| 0.8          | S       | 0.8 | 0.9  | 0.1   | 0.0                     |
13 Arithmetic Coding
- Decoding algorithm:
- r = input_code
- repeat
  - find c such that r falls in its range
  - output(c)
  - r = r - low_range(c)
  - r = r / (high_range(c) - low_range(c))
- until r = 0
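And a matching decoder sketch, continuing the Python example from slide 9. It stops after a known message length, which is more robust than testing r against exactly 0:

```python
def decode(tag, length, ranges):
    r, out = tag, []
    for _ in range(length):
        # find the symbol whose range contains r
        c = next(s for s, (lo, hi) in ranges.items() if lo <= r < hi)
        lo, hi = ranges[c]
        out.append(c)
        r = (r - lo) / (hi - lo)   # subtract the low end, divide by the width
    return ''.join(out)

print(decode(encode("BILL GATES", RANGES), 10, RANGES))   # BILL GATES
```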
14 Arithmetic Coding Example (2)
Suppose that we want to encode the message 1 3 2 1, using the model P(1) = 0.8, P(2) = 0.02, P(3) = 0.18, i.e. the ranges 1 → [0.0, 0.8), 2 → [0.8, 0.82), 3 → [0.82, 1.0).
15 Arithmetic Coding Example (2)
[Figure: interval narrowing for 1 3 2 1 — [0.0, 1.0) → [0.0, 0.8) after 1, [0.656, 0.80) after 3, [0.7712, 0.77408) after 2, and [0.7712, 0.773504) after the final 1.]
16 Arithmetic Coding Example (2)
Encoding:

| New character | Low value | High value |
|---------------|-----------|------------|
| (start)       | 0.0       | 1.0        |
| 1             | 0.0       | 0.8        |
| 3             | 0.656     | 0.800      |
| 2             | 0.7712    | 0.77408    |
| 1             | 0.7712    | 0.773504   |
17 Arithmetic Coding Example (2)
Decoding:

| r        | c | Low  | High | Range | New r                            |
|----------|---|------|------|-------|----------------------------------|
| 0.772352 | 1 | 0.0  | 0.8  | 0.8   | (0.772352 - 0)/0.8 = 0.96544     |
| 0.96544  | 3 | 0.82 | 1.0  | 0.18  | (0.96544 - 0.82)/0.18 = 0.808    |
| 0.808    | 2 | 0.8  | 0.82 | 0.02  | (0.808 - 0.8)/0.02 = 0.4         |
| 0.4      | 1 | 0.0  | 0.8  |       |                                  |
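The encode/decode sketch from slides 9 and 13 reproduces this example, given the Example (2) model as exact fractions:

```python
# Model from slide 14: P(1) = 0.8, P(2) = 0.02, P(3) = 0.18
RANGES2 = {
    '1': (Fraction(0, 1),    Fraction(8, 10)),
    '2': (Fraction(8, 10),   Fraction(82, 100)),
    '3': (Fraction(82, 100), Fraction(1, 1)),
}
tag = encode("1321", RANGES2)
print(float(tag))               # 0.7712, the low end of [0.7712, 0.773504)
print(decode(tag, 4, RANGES2))  # 1321
```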
18 Arithmetic Coding
- In summary, the encoding process is simply one of narrowing the range of possible numbers with every new symbol.
- The new range is proportional to the predefined probability attached to that symbol.
- Decoding is the inverse procedure, in which the range is expanded in proportion to the probability of each symbol as it is extracted.
19 Arithmetic Coding
- The coding rate theoretically approaches the high-order entropy.
- Not as popular as Huffman coding because multiplications and divisions are needed.
- Average bits/byte on 14 files (program, object, text, etc.):

| Huffman | LZW  | LZ77/LZ78 | Arithmetic |
|---------|------|-----------|------------|
| 4.99    | 4.71 | 2.95      | 2.48       |
20 Generating a Binary Code for Arithmetic Coding
- Problem
  - The binary representation of some of the generated floating-point values (tags) would be infinitely long.
  - We need increasing precision as the length of the sequence increases.
- Solution
  - Synchronized rescaling and incremental encoding.
21 Generating a Binary Code for Arithmetic Coding
- If the upper bound and the lower bound of the interval are both less than 0.5, rescale the interval and transmit a 0 bit.
- If the upper bound and the lower bound of the interval are both greater than 0.5, rescale the interval and transmit a 1 bit.
- Mapping rules: E1: [0, 0.5) → [0, 1), E1(x) = 2x; E2: [0.5, 1) → [0, 1), E2(x) = 2(x - 0.5).
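A minimal Python sketch of these two rules (exact fractions again; real coders work in integer arithmetic and also need a third rule for intervals that straddle 0.5, omitted here):

```python
def encode_rescaled(message, ranges):
    low, high, bits = Fraction(0), Fraction(1), []
    for c in message:
        rng = high - low
        high = low + rng * ranges[c][1]
        low  = low + rng * ranges[c][0]
        # rescale while the interval lies entirely in one half of [0, 1)
        while high <= Fraction(1, 2) or low >= Fraction(1, 2):
            if high <= Fraction(1, 2):        # E1: emit 0, then x -> 2x
                bits.append(0)
                low, high = 2 * low, 2 * high
            else:                             # E2: emit 1, then x -> 2(x - 0.5)
                bits.append(1)
                low, high = 2 * low - 1, 2 * high - 1
    bits.append(1)   # terminate: after rescaling, 0.5 always lies in [low, high)
    return bits

print(encode_rescaled("1321", RANGES2))   # [1, 1, 0, 0, 0, 1, 1] -> 1100011
```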
22 Arithmetic Coding Example (2)
[Figure: encoding 1 3 2 1 with rescaling. After 3 the interval [0.656, 0.8) lies above 0.5, so a 1 is emitted and it rescales to [0.312, 0.6). After 2 the interval [0.5424, 0.54816) rescales through [0.0848, 0.09632), [0.1696, 0.19264), [0.3392, 0.38528), [0.6784, 0.77056) to [0.3568, 0.54112), emitting 1 0 0 0 1. The final 1 gives [0.3568, 0.504256).]
23 Encoding
- Any binary value between the final lower and upper bounds can be transmitted as the tag.
24 Decoding
- Decoding the bit stream, which starts with 1100011.
- The number of bits needed to distinguish the tag is ⌈log2(1/P(x))⌉ + 1 bits, where P(x) is the probability of the whole sequence.
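Interpreting 1100011 as the binary fraction 0.1100011 and feeding it to the decoder sketch from slide 13 recovers the message:

```python
tag = Fraction(0b1100011, 2 ** 7)   # 0.1100011 in binary = 99/128 = 0.7734375
print(float(tag))                   # inside the final interval [0.7712, 0.773504)
print(decode(tag, 4, RANGES2))      # 1321
```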
25 Higher-order and Adaptive Modeling
- To obtain good compression ratios with statistical-model compression methods, the model should:
  - accurately predict the frequency/probability of symbols in the data stream;
  - produce a non-uniform distribution.
- Finite-context modeling provides better prediction ability.
26 Higher-order and Adaptive Modeling
- Finite-context modeling
  - Calculate the probabilities for each incoming symbol based on the context in which the symbol appears.
  - e.g., in English text, 'u' is far more probable when the previous symbol is 'q'.
  - The order of the model refers to the number of previous symbols that make up the context.
  - e.g., an order-2 model conditions each symbol on the two symbols preceding it.
- In information theory, this type of finite-context modeling is called a Markov process/system (a sketch of gathering such statistics follows).
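A tiny sketch of how such context statistics might be gathered (order-1, plain counting; the sample text and names here are only illustrative):

```python
from collections import defaultdict, Counter

def order1_counts(text):
    """Count each symbol's frequency conditioned on the previous symbol."""
    counts = defaultdict(Counter)
    for prev, cur in zip(text, text[1:]):
        counts[prev][cur] += 1
    return counts

counts = order1_counts("this is the thing")
ctx = counts['t']                    # statistics for the context 't'
print(ctx['h'] / sum(ctx.values()))  # P('h' | 't') = 1.0 in this tiny sample
```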
27 Higher-order and Adaptive Modeling
- Problem
  - As the order of the model increases linearly, the memory consumed by the model increases exponentially.
  - e.g., for q symbols and order k, the table size will be q^k; a byte alphabet (q = 256) at order 3 already needs 256^3 ≈ 16.7 million context entries.
- Solution
  - Adaptive modeling
28 Higher-order and Adaptive Modeling
- Adaptive modeling
  - In adaptive data compression, both the compressor and decompressor start with the same model.
  - The compressor encodes a symbol using the existing model, then updates the model to account for the new symbol.
  - The decompressor likewise decodes a symbol using the existing model, then updates the model.
29 Higher-order and Adaptive Modeling
- Adaptive data compression has a slight disadvantage in that it starts compressing with less-than-optimal statistics.
- Once the cost of transmitting the statistics along with the compressed data is accounted for, however, an adaptive algorithm will usually perform better than a fixed statistical model.
- Adaptive compression also pays a cost in speed for updating the model.
30 Higher-order and Adaptive Modeling
- Encoding phase:
- low = 0.0; high = 1.0
- while not EOF do
  - read(c)
  - range = high - low
  - high = low + range * high_range(context, c)
  - low = low + range * low_range(context, c)
  - update_model(context, c)
  - context = c
- enddo
- output(low)
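Here is one possible Python sketch of this encoding phase with an order-1 adaptive model; the class layout and the start-context argument are assumptions, not part of the slides:

```python
from fractions import Fraction
from collections import defaultdict

class AdaptiveModel:
    """Order-1 adaptive model: one frequency table per one-symbol context.
    Every count starts at 1 so no symbol ever has zero probability."""
    def __init__(self, alphabet):
        self.alphabet = list(alphabet)
        self.tables = defaultdict(lambda: {s: 1 for s in self.alphabet})

    def ranges(self, context, c):
        """Return (low_range, high_range) of symbol c under the given context."""
        table = self.tables[context]
        total, cum = sum(table.values()), 0
        for s in self.alphabet:
            if s == c:
                return Fraction(cum, total), Fraction(cum + table[s], total)
            cum += table[s]

    def update(self, context, c):
        self.tables[context][c] += 1

def adaptive_encode(message, alphabet, context):
    model = AdaptiveModel(alphabet)
    low, high = Fraction(0), Fraction(1)
    for c in message:
        rng = high - low
        lo_r, hi_r = model.ranges(context, c)
        high = low + rng * hi_r
        low  = low + rng * lo_r
        model.update(context, c)   # update AFTER coding, so the decoder can mirror it
        context = c
    return low
```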
31 Higher-order and Adaptive Modeling
- Instead of just having a single context table, we now have a set of q context tables.
- Every symbol is encoded using the context table of the previously seen symbol, and only the statistics for the selected context get updated after the symbol is seen.
32 Higher-order and Adaptive Modeling
- Decoding phase:
- r = input_code
- repeat
  - search context_table[context] for c such that r falls in its range
  - output(c)
  - range = high_range(context, c) - low_range(context, c)
  - r = r - low_range(context, c)
  - r = r / range
  - update_model(context, c)
  - context = c
- until r = 0
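And the mirror-image decoding phase for the same sketch; because both sides build identical starting models and apply identical updates, they stay in sync:

```python
def adaptive_decode(tag, length, alphabet, context):
    model = AdaptiveModel(alphabet)   # must match the encoder's starting model
    r, out = tag, []
    for _ in range(length):
        # search the current context table for the symbol whose range holds r
        for c in model.alphabet:
            lo, hi = model.ranges(context, c)
            if lo <= r < hi:
                break
        out.append(c)
        r = (r - lo) / (hi - lo)
        model.update(context, c)      # identical update keeps both models in sync
        context = c
    return ''.join(out)

msg = "BILL GATES"
tag = adaptive_encode(msg, " ABEGILST", " ")
print(adaptive_decode(tag, len(msg), " ABEGILST", " "))   # BILL GATES
```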
33 Applications: The JBIG Standard
- JBIG --- Joint Bi-level Image Experts Group
- JBIG was issued in 1993 by ISO/IEC for the progressive lossless compression of binary and low-precision gray-level images (typically having fewer than 6 bits/pixel).
- The major advantages of JBIG over other existing standards are its capability of progressive encoding and its superior compression efficiency.
34 The JBIG Standard: Context-based arithmetic coder
- The core of JBIG is an adaptive context-based arithmetic coder.
- Suppose the probability of encountering a black pixel, p, is 0.2 and the probability of encountering a white pixel, q, is 0.8.
- Using a single arithmetic coder, the entropy is H = -0.2 log2(0.2) - 0.8 log2(0.8) ≈ 0.722 bits/pixel.
35 The JBIG Standard: Context-based arithmetic coder
- Group the data into Set A (80%) and Set B (20%) and use two coders (checked in the sketch below):
  - Set A: p_w = 0.95, p_b = 0.05, H_A ≈ 0.286
  - Set B: p_w = 0.3, p_b = 0.7, H_B ≈ 0.881
  - Then the average H = 0.8 H_A + 0.2 H_B ≈ 0.405 bits/pixel.
- The number of possible context patterns is 1024. The JBIG coder uses 1024 or 4096 coders.
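The three entropies above can be verified with a short Python computation:

```python
import math

def H(p):
    """Binary entropy in bits/pixel for black-pixel probability p."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(H(0.2))               # one coder for everything: ~0.722
hA, hB = H(0.05), H(0.30)   # Set A: 5% black; Set B: 70% black (H(0.3) = H(0.7))
print(hA, hB)               # ~0.286, ~0.881
print(0.8 * hA + 0.2 * hB)  # weighted average: ~0.405 bits/pixel
```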
36 Experimental Results
37 Experimental Results
38 Conclusions
- Compression-ratio tests show that statistical modeling can perform at least as well as dictionary-based methods, but the high-order programs are at present somewhat impractical because of their resource requirements.
- JPEG and MPEG-1/2 use Huffman and arithmetic coding preprocessed by DPCM.
- JPEG-LS
- JPEG 2000 and MPEG-4 use arithmetic coding only.
- Order-3 models give the best performance for Unix files.