Title: Recent Advances in Error/Erasure Correcting and Coding
1. Recent Advances in Error/Erasure Correcting and Coding
2. Overview of Presentation
- Background on Error Correcting Codes
- Some Powerful Codes
  - Reed-Solomon Codes
  - Low-Density Parity Check (LDPC) Codes (rediscovered)
  - Tornado Codes
- Digital Fountain Approach
  - Fountain Codes
  - LT Codes
  - Raptor Codes
- Comparisons of Different Codes
- Applications
3. Block Codes
- Accepts a k-bit information symbol and transforms it into an n-bit symbol.
- It is a fixed-length code.
- Must be implemented using a combinational logic circuit.
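As a minimal sketch of the idea (our illustration, not from the slides): the simplest systematic block code appends a single parity bit, mapping a fixed-length k-bit block to a (k+1)-bit codeword. Names such as `encode_parity` are ours.

```python
def encode_parity(bits):
    """Systematic (k+1, k) single-parity block code: append the XOR of all data bits."""
    parity = 0
    for b in bits:
        parity ^= b
    return bits + [parity]

def check_parity(codeword):
    """A valid codeword XORs to 0; any single bit error flips the overall parity."""
    acc = 0
    for b in codeword:
        acc ^= b
    return acc == 0
```

For example, `encode_parity([1, 0, 1, 1])` yields `[1, 0, 1, 1, 1]`, and flipping any one bit of that codeword makes `check_parity` fail.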
4. Some Important Types of Block Codes
- Linear Block Codes
  - The sum of any two code words results in a third unique codeword.
- Systematic Codes
  - The data bits are also present in the generated codeword.
- BCH Codes
  - Generalization of the Hamming code for multiple error correction.
  - A very special class of linear codes known as Goppa codes.
- Cyclic Codes
  - Important subclass of linear block codes for which encoding and decoding can be implemented easily.
  - A cyclic shift of a code word yields another code word.
- Group Codes
  - Same as linear block codes; also known as generalized parity check codes.
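The closure property of linear block codes can be checked exhaustively for a small code. This sketch uses a generator matrix for the systematic (7,4) Hamming code; the specific matrix and function names are our example choices.

```python
import itertools

# Generator matrix G = [I | P] of a systematic (7,4) Hamming code.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(msg):
    """Codeword = msg * G over GF(2)."""
    return [sum(m & g for m, g in zip(msg, col)) % 2 for col in zip(*G)]

codebook = {tuple(encode(list(m))) for m in itertools.product([0, 1], repeat=4)}

# Linearity: the GF(2) sum (XOR) of any two codewords is again a codeword.
closed = all(
    tuple(a ^ b for a, b in zip(c1, c2)) in codebook
    for c1 in codebook for c2 in codebook
)
```

All 16 messages map to distinct codewords, and every pairwise XOR lands back in the codebook, which is exactly the closure property stated above.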
5. Convolutional Codes
- Generates an n-bit message block from a k-bit block.
- Differs from block codes in that the encoder has memory of order m, the constraint length.
- Must be implemented using a sequential logic circuit.
- Decoders have high complexity (decoding operations) per output bit.
- More suited for low-speed digitized voice traffic (lower integrity) than for high-speed data needing high integrity.
- If the decoder loses synchronization or makes a mistake, errors will propagate.
- Performance depends on the constraint length.
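A shift register makes the encoder "memory" explicit. The sketch below shows a standard rate-1/2, constraint-length-3 convolutional encoder with generators 7 and 5 in octal; this is a common textbook choice, not a code taken from the slides.

```python
def conv_encode(bits, g1=0b111, g2=0b101):
    """Rate-1/2 convolutional encoder with constraint length 3.

    The 2-bit state is the encoder memory that distinguishes
    convolutional codes from (memoryless) block codes.
    """
    state = 0
    out = []
    for b in bits:
        reg = (b << 2) | state                    # current input bit plus memory
        out.append(bin(reg & g1).count("1") % 2)  # first generator tap sum
        out.append(bin(reg & g2).count("1") % 2)  # second generator tap sum
        state = reg >> 1                          # shift the register
    return out
```

The impulse response `conv_encode([1, 0, 0])` is `[1, 1, 1, 0, 1, 1]`: the two generator sequences 111 and 101, interleaved.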
6. Concatenated Codes
- A concatenated code uses two levels of coding, an inner code and an outer code, to obtain the desired performance.
- The inner code is designed to correct most of the channel errors.
- The outer code is a higher-rate (lower-redundancy) code which reduces the probability of error to a specified level.
- Concatenated codes are effective against a mixture of random errors and bursts.
7. Concatenated Codes
- Having two steps reduces the complexity required to obtain the desired error rate (compared to a single code).
- The outer code is typically an RS code.
- The inner decoder uses all the available soft-decision data to provide the best performance under Gaussian conditions.
- For a one-level concatenated code, if the minimum distances of the inner and outer codes are d1 and d2, then the minimum distance of the concatenation is at least d1·d2.
- The choice of the inner code is dictated by the application:
  - For high data rates, the inner code should be a block code.
  - For predominantly slower Gaussian channels, the inner code should be a convolutional code.
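The d1·d2 bound can be checked by brute force on a toy concatenation (the component codes here are our own choice): a (3,1) repetition inner code (d1 = 3) wrapped around a (3,2) single-parity outer code (d2 = 2) should give minimum distance at least 6.

```python
import itertools

def outer_encode(msg):
    """(3,2) single-parity outer code, minimum distance d2 = 2."""
    return list(msg) + [msg[0] ^ msg[1]]

def inner_encode(bits):
    """(3,1) repetition inner code, minimum distance d1 = 3."""
    return [b for bit in bits for b in (bit, bit, bit)]

def concat_encode(msg):
    return inner_encode(outer_encode(msg))

words = [concat_encode(list(m)) for m in itertools.product([0, 1], repeat=2)]
# Minimum pairwise Hamming distance over all distinct codewords.
dmin = min(sum(a != b for a, b in zip(w1, w2))
           for i, w1 in enumerate(words) for w2 in words[i + 1:])
```

Here `dmin` comes out to 6 = d1·d2, so this toy concatenation meets the bound with equality.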
8. Turbo Codes
- An important class of convolutional codes proposed in 1993.
- Achieve large coding gains by combining two or more relatively simple building blocks (component codes).
- A serial concatenation of codes is often used for power-limited systems.
- A popular approach is to have an RS outer code (applied first, removed last) with an inner code (applied last, removed first).
- Of all practical error correction methods known to date, turbo codes, together with LDPC codes, come closest to approaching the Shannon limit, the theoretical limit of the maximum information transfer rate over a noisy channel.
- Turbo codes make it possible to increase the available bandwidth without increasing the power of a transmission, or they can be used to decrease the amount of power used to transmit at a certain data rate.
- The main drawbacks are the relatively high decoding complexity and latency, which make them unsuitable for some applications.
9. Block Codes versus Convolutional Codes
- For convolutional codes, decoder complexity increases as the redundancy decreases, whereas for block codes, complexity decreases as the redundancy decreases. For high code rates, this favors block codes.
- In practice, the speed of convolutional decoders is limited compared to block decoders; fast and powerful RS block decoders have been built to operate at rates above 120 Mb/s.
- Convolutional codes are weak against burst errors, while RS codes are superior at handling them.
10. RS Codes versus Convolutional Codes
- Which is better?
- For digitized voice, a concatenated code is better.
- For high-speed, high-integrity applications, RS/block codes are better.
11. Erasure Correcting Approaches
- Gallager Codes (1962), rediscovered as LDPC
- RS Codes (1960), Reed and Solomon
  - Complexity in theory and in the encoding/decoding algorithms (finite-field operations)
- Tornado Codes (1997), Luby et al.
  - Naïve and simple linear-equation encoding/decoding algorithm
  - Complexity and theory in the design and analysis of the irregular graph structure
- LT Codes (1998), Luby
  - Simpler, scalable irregular graph structure
  - Simpler analysis and design
  - Independent generation of each encoding symbol
  - Unlimited number of encoding symbols, i.e. rateless codes
  - First practical realization of a digital fountain
- Raptor Codes (2001), Shokrollahi
  - Even better practical realization of a digital fountain
12. Reed-Solomon Codes
- Reed-Solomon codes are special and widely implemented because they are "almost perfect": the extra (redundant) symbols added by the encoder are at a minimum for any level of error correction, so no bits are wasted.
- RS codes are easier to decode than most other non-binary codes.
- RS codes allow the correction of erasures. They have a combined error-and-erasure correction capability of 2t + e ≤ n − k, for t errors and e erasures.
- RS codes can correct twice as many erasures as errors (an erasure being an error whose location is known, whereas for an error the decoder has no information about the location).
- RS codes exist on b-bit symbols for every value of b.
13. RS Codes
- Nonbinary cyclic codes with m-bit symbols, where m > 2.
- RS(n, k) codes on m-bit symbols exist for all n and k for which 0 < k < n < 2^m + 2.
- RS codes achieve the largest possible code minimum distance for any linear code with the same encoder input and output block lengths.
- RS codes are particularly useful for burst-error correction, that is, for channels that have memory.
- Any linear code is capable of correcting n − k symbol erasure patterns if the n − k erased symbols all happen to lie on the parity symbols.
- However, RS codes have the remarkable property that they can correct any set of n − k symbol erasures in the block.
- RS codes can be designed to have any redundancy.
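The any-(n − k)-erasures property follows from viewing RS encoding as polynomial evaluation: any k of the n evaluations determine the degree-(k−1) message polynomial. The sketch below is our illustration only; it uses the prime field GF(257) instead of the GF(2^m) arithmetic of real RS implementations, purely to keep the arithmetic readable.

```python
P = 257  # a prime field stands in for GF(2^m) to keep the arithmetic simple

def rs_encode(msg, n):
    """Evaluate the degree-(k-1) message polynomial at n distinct points."""
    return [(x, sum(c * pow(x, j, P) for j, c in enumerate(msg)) % P)
            for x in range(n)]

def polymul_lin(poly, r):
    """Multiply a coefficient list (low order first) by (x - r) mod P."""
    out = [0] * (len(poly) + 1)
    for t, c in enumerate(poly):
        out[t] = (out[t] - r * c) % P
        out[t + 1] = (out[t + 1] + c) % P
    return out

def rs_decode(points, k):
    """Recover the message from ANY k surviving (x, y) pairs via Lagrange
    interpolation -- i.e. the code tolerates any n - k erasures."""
    pts = points[:k]
    msg = [0] * k
    for i, (xi, yi) in enumerate(pts):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(pts):
            if j != i:
                basis = polymul_lin(basis, xj)
                denom = denom * (xi - xj) % P
        scale = yi * pow(denom, P - 2, P) % P  # Fermat inverse of the denominator
        msg = [(m + scale * c) % P for m, c in zip(msg, basis)]
    return msg
```

Encoding the message [42, 7, 99] at n = 6 points and keeping any 3 of them recovers the message exactly, no matter which 3 survive.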
14. Properties of RS Codes
- A special case of BCH codes.
- Excellent erasure correcting properties.
- Popular because they can combat combinations of both random and burst errors.
- They also have a long block length, assuring a sharp performance curve.
- RS codes exhibit a very sharp improvement of block error rate with an improvement of channel quality, making them ideal for data.
- Since they cannot exploit soft-decision data well, they are used in concatenated codes.
- Real-number arithmetic is avoided since they operate directly on bits.
15. RS Code Performance (Error Correction, No Erasures)
16. Tornado Codes
- A class of erasure codes that support error correction and have fast encoding and decoding algorithms.
- Software-based implementations of Tornado codes are about 100 times faster on small lengths, and about 10,000 times faster on larger lengths, than software-based Reed-Solomon erasure codes, while having only slightly worse overhead.
- Tornado codes are fixed-rate, near-optimal erasure correcting codes.
- They use sparse bipartite graphs to trade encoding and decoding speed for reception overhead.
- Since the introduction of Tornado codes, many other similar codes have emerged, most notably Online codes, LT codes, and Raptor codes. They are all based on sparse graphs.
- Tornado codes are block erasure codes whose encoding and decoding times are linear in n.
- They are not rateless.
- Running times for the encoding and decoding algorithms are proportional to the block length, which makes them slow at small rates.
17. Structure of Tornado Codes
- Each column acts as redundancy for the column on its left.
- Redundant packets are generated by taking the XOR of the packets to the left.
- Tornado codes have the erasure property that to recover k source packets we need slightly more than k encoding packets.
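The XOR construction means a single missing packet in a group can be rebuilt from the parity packet. Here is a minimal sketch of that principle (the layered irregular graph of a real Tornado code is deliberately omitted):

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

source = [b"pkt0", b"pkt1", b"pkt2", b"pkt3"]

# One redundant packet: the XOR of the source packets to its "left".
parity = source[0]
for pkt in source[1:]:
    parity = xor_bytes(parity, pkt)

# If source[2] is erased, XORing the parity with the survivors recovers it,
# because each surviving packet cancels out of the XOR.
recovered = parity
for pkt in (source[0], source[1], source[3]):
    recovered = xor_bytes(recovered, pkt)
```
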
18. Tornado Codes versus RS Codes
- The advantage of Tornado codes over standard codes is that they trade a slight increase in reception overhead for decreased encoding and decoding times.
- The algorithm runs in real time, and the file is reconstructed as soon as enough packets arrive.
- Source and sink need to agree on the graph a priori.
19. Digital Fountain
- The key property of a digital fountain is that the source data can be reconstructed from any subset of the encoding packets equal in total length to the source data.
- The client can reconstruct the data using any k of the encoded packets.
- Ideally, the encoding and decoding processes require very little processing overhead.
20. Digital Fountain
- A DF is conceptually simpler, more efficient, and applicable to a broader class of networks than previous approaches.
- The key property of a DF is that the source data can be reconstructed intact from any subset of the encoding packets equal in length to the source data.
- RS codes had unacceptably high running times, which led to the development of Tornado codes.
- DF codes are record-breaking sparse-graph codes for channels with erasures.
- Retransmissions are needless, as per Shannon's theory.
- RS codes work only for small k and n, and the failure probability needs to be determined beforehand.
21. Properties of DF Codes
- The required encoding is only as long as the data.
- Rateless: the number of encoding packets that can be generated is potentially limitless; the encoder can produce an unlimited flow of encoding packets.
- Data is always recoverable from the required amount of encoding.
- Low complexity for encoding and decoding.
- Can encode large amounts of data.
- Ideal under conditions of:
  - high, unknown, or variable loss
  - large RTT
  - intermittent connectivity
  - one-way channels
- Can be used as background data transport.
22. Properties of DF Codes
- Every packet is useful.
- Can serve a large number of clients without the need to maintain state.
- One user with a poor connection cannot affect others.
- The advantages are pronounced when applied to broadcast/multicast.
23. DF Encoder
- This encoding defines a graph connecting source packets and encoding packets.
- If the mean degree d is smaller than K, then the graph is sparse.
- The decoder needs to know the degree of each received packet and the source packets it is connected to.
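One encoding packet of a fountain/LT encoder can be sketched as follows (the function name and the toy degree distribution are ours): draw a degree d from the distribution, pick d distinct source packets uniformly at random, and XOR them together.

```python
import random

def lt_encode_symbol(source, degree_dist, rng):
    """Generate one encoding symbol plus the list of source indices it covers.

    In practice the neighbor set is communicated implicitly, e.g. by sending
    the PRNG seed, so the decoder can reconstruct the same graph.
    """
    degrees, weights = zip(*sorted(degree_dist.items()))
    d = rng.choices(degrees, weights=weights)[0]   # degree from the distribution
    neighbors = rng.sample(range(len(source)), d)  # d distinct source packets
    sym = 0
    for i in neighbors:
        sym ^= source[i]                           # XOR the chosen packets
    return sym, neighbors
```

Returning the neighbor list alongside the symbol mirrors the slide's point: the decoder must know each packet's degree and neighbors to use it.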
24. LT-Code Decoding Process
- Initially, all the input symbols are uncovered.
- Step 1: all encoding symbols with exactly one neighbor are released to cover their unique neighbor.
- The set of covered input symbols not yet processed is called the ripple.
- In each step, one input symbol from the ripple is processed.
- The process ends when the ripple is empty.
- The process fails if there is at least one uncovered input symbol at the end.
- The process succeeds if all the input symbols are covered at the end.
- The goal of the degree distribution design is to release encoding symbols slowly, keeping the ripple small while ensuring that it does not disappear.
25. DF Decoder
26. LT Codes
- Rateless erasure codes.
- LT codes are universal in the sense that they:
  - are near optimal for every erasure channel, and
  - are very efficient as the data length grows.
- Close to the minimum number of encoded symbols can be generated and sent to the receivers.
- The number of packets receivers need to regenerate the original data is also close to the minimum.
27. LT Codes
- Objectives:
  - As few encoding symbols as possible, on average, are required to ensure success of the LT process.
  - The average degree of the encoding symbols is as low as possible (the average degree times K measures the work required for complete recovery of the data).
- A soliton wave is one in which dispersion balances refraction perfectly.
- Soliton distribution: the basic property required of a good degree distribution is that input symbols are added to the ripple at the same rate as they are processed.
28. Design of the Degree Distribution
- A few encoding packets must have high degree (close to K).
- Many packets must have low degree so that the total number of operations is kept small.
- Ideally, the received graph should have exactly one check node of degree 1 at each iteration.
- ρ is the ideal soliton distribution: ρ(1) = 1/K and ρ(d) = 1/(d(d−1)) for d = 2, …, K.
- The ideal soliton distribution works poorly in practice because of variance (fluctuations), which means that
  - during decoding, there may be no nodes of degree 1, and
  - some source nodes will receive no connections.
- The expected degree under this distribution is roughly ln K.
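The ideal soliton distribution is easy to write down and check numerically (this sketch is our illustration):

```python
import math

def ideal_soliton(K):
    """rho(1) = 1/K and rho(d) = 1/(d(d-1)) for d = 2..K."""
    rho = {1: 1.0 / K}
    for d in range(2, K + 1):
        rho[d] = 1.0 / (d * (d - 1))
    return rho

K = 10_000
rho = ideal_soliton(K)
total = sum(rho.values())                         # telescopes to exactly 1
mean_degree = sum(d * p for d, p in rho.items())  # approximately ln K
```

The sum telescopes to 1 (so ρ really is a distribution), and the mean degree comes out near ln K, matching the expected-degree claim above.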
29. Robust Soliton Distribution
- Adds two extra parameters, c and δ.
- δ is a bound on the probability that decoding fails after a given number of packets has been received.
- c is a constant that can be viewed as a free parameter.
- τ is a positive function added to ρ: with S = c · ln(K/δ) · √K, we set τ(d) = S/(Kd) for d = 1, …, K/S − 1, τ(K/S) = (S/K) · ln(S/δ), and τ(d) = 0 for larger d.
- µ is the robust soliton distribution: µ(d) = (ρ(d) + τ(d))/Z, where Z = Σd (ρ(d) + τ(d)) normalizes the sum.
- The expected degree under µ is O(ln(K/δ)).
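A sketch of the robust soliton distribution under the standard definitions from Luby's LT paper (edge cases such as K/S exceeding K are ignored here for simplicity):

```python
import math

def robust_soliton(K, c, delta):
    """mu(d) proportional to rho(d) + tau(d); the spike in tau at d ~ K/S keeps
    the ripple from dying out, and delta bounds the failure probability."""
    S = c * math.log(K / delta) * math.sqrt(K)
    rho = {1: 1.0 / K}
    rho.update({d: 1.0 / (d * (d - 1)) for d in range(2, K + 1)})
    pivot = round(K / S)
    tau = {d: S / (K * d) for d in range(1, pivot)}
    tau[pivot] = S * math.log(S / delta) / K      # the spike
    Z = sum(rho.values()) + sum(tau.values())     # normalizer
    return {d: (rho.get(d, 0.0) + tau.get(d, 0.0)) / Z for d in range(1, K + 1)}

mu = robust_soliton(10_000, c=0.1, delta=0.5)
```

The normalizer Z is also meaningful on its own: K·Z received symbols suffice for decoding to succeed with probability at least 1 − δ.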
30. Tornado Codes versus LT Codes
- Both RS and Tornado codes are systematic codes, whereas LT codes are not systematic.
- Encoder and decoder memory usage in Tornado codes is much higher than with LT codes.
- LT codes are rateless, i.e. further encoding packets can be generated on the fly. In contrast, given k and n, Tornado codes produce only n encoding packets.
- Tornado codes are generated from graphs with constant maximum degree; LT codes use graphs of logarithmic density.
31. Low-Density Parity Check (LDPC) Codes
- Shannon showed the existence of capacity-achieving codes, but achieving capacity is only part of the story: for practical communication, we need fast encoding and decoding algorithms.
- LDPC codes are the linear codes associated with sparse bipartite graphs.
- Any linear code has a representation as a code associated with a bipartite graph.
- LDPC codes are good for error correction.
- The graph gives rise to a linear code of block length n and dimension k.
- LDPC codes are equipped with very fast (probabilistic) encoding and decoding algorithms.
32. LDPC Structure
- The graph gives rise to a linear code of block length n and dimension n − r.
- The codewords are those vectors such that, for all check nodes, the sum of the neighboring positions among the message nodes is 0.
- The code, the graph, and the parity-check matrix are equivalent descriptions.
- If the matrix/graph is sparse, the code is called an LDPC code.
- Sparsity is the key property that allows for the algorithmic efficiency.
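The codeword condition can be stated in a few lines. H below is a small, hand-picked sparse parity-check matrix (r = 3 check nodes, n = 6 message nodes), our example rather than one from the slides:

```python
import itertools

# Sparse parity-check matrix: r = 3 check nodes, n = 6 message nodes.
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def is_codeword(c):
    """c is a codeword iff every check node sees an even number of ones
    among its neighbors, i.e. H c^T = 0 over GF(2)."""
    return all(sum(h * b for h, b in zip(row, c)) % 2 == 0 for row in H)

# Block length n = 6, dimension n - r = 3, so there are 2^3 = 8 codewords.
codewords = [c for c in itertools.product([0, 1], repeat=6) if is_codeword(c)]
```

Since H has full rank here, the dimension is exactly n − r = 3 and the enumeration finds all 8 codewords; message passing exploits exactly the per-check parity tests that `is_codeword` performs.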
33. More on LDPC Codes
- The general class of decoding algorithms for LDPC codes is called message-passing algorithms.
- One important subclass is the belief propagation algorithm.
- The performance of regular graphs deteriorates as the degree of the message nodes increases.
- One instance of LDPC codes using a specific degree distribution is a capacity-achieving sequence called Tornado codes.
- Another capacity-achieving sequence has also been developed.
34. Raptor Codes
- An extension of LT codes.
- The first known class of fountain codes with linear-time encoding and decoding.
- Raptor codes encode a given message consisting of a number of symbols, k, into a potentially limitless sequence of encoding symbols such that knowledge of any k or more encoding symbols allows the message to be recovered with some non-zero probability.
35. Raptor Codes
- The basic idea behind Raptor codes is a pre-coding of the input symbols prior to the application of an appropriate LT code.
- The probability that the message can be recovered increases with the number of symbols received above k, becoming very close to one once the number of received symbols is only very slightly larger than k.
- A symbol can be any size, from a single bit to hundreds or thousands of bytes.
- Raptor codes may be systematic or non-systematic.
- Used in:
  - the 3rd Generation Partnership Project (3GPP) for mobile cellular wireless broadcast and multicast, and
  - the DVB-H standards for IP datacast to handheld devices.
36. Comparisons among the Different Codes
- Data length
  - RS: severely limited in practice
  - Tornado: moderate to large
  - LT: moderate to large
  - Raptor: small to large
- Encoding length
  - RS, Tornado: limited to a small multiple of the data length
  - LT, Raptor: unlimited; on-the-fly/dynamic generation
- Flexibility to receive from multiple sources
  - RS, Tornado: hard to coordinate
  - LT, Raptor: no coordination needed
- Memory requirements
  - RS, Tornado: proportional to the encoding length
  - LT, Raptor: proportional to the data length
37. Comparisons
- Computational work
  - RS: O(k) per encoding symbol, O(k) to decode
  - Tornado: O(1) per encoding symbol, O(1) to decode
  - LT: O(ln k) per encoding symbol, O(k ln k) to decode
  - Raptor: O(1) per encoding symbol, O(k) to decode
- Reception overhead
  - RS: none
  - Tornado, LT, Raptor: a small constant fraction
- Failure probability
  - RS: zero
  - Tornado: small
  - LT: tiny
  - Raptor: really tiny
38. Application: A Scalable Protocol for Distributing Software
- The following properties are desired for scalable and reliable multicast:
- Reliable
  - The file is guaranteed to be delivered completely to all receivers.
- Efficient
  - The overhead incurred must be minimal.
  - The user should not be able to distinguish between a multicast service and a point-to-point service.
- On demand
  - Clients should be able to initiate the service at their discretion.
- Tolerant
  - The service must be tolerant of a heterogeneous set of receivers.
- No feedback channel!
39. More Application Domains
- DF codes are currently applied to reliable multicast.
- Robust distributed storage.
- Delivery of streaming content.
- Delivery of content to mobile clients in wireless networks.
- Peer-to-peer applications.
- Delivery of content along multiple paths for improved resilience.
- Transport-layer design for unicast is under exploration.
40. References
- John W. Byers, Michael Luby, Michael Mitzenmacher, and Ashutosh Rege, "A Digital Fountain Approach to Reliable Distribution of Bulk Data," SIGCOMM '98.
- Michael Luby, "LT Codes," The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002.
- David MacKay, Information Theory, Inference, and Learning Algorithms.
- Bernard Sklar, Digital Communications: Fundamentals and Applications.
- E. Berlekamp, R. Peile, and S. Pope, "The Application of Error Control to Communications," 1987.
- Michael Luby's USENIX presentation.
41. Message Passing