
1
Recent Advances in Error/Erasure Correcting and
Coding
  • Vijay Subramanian

2
Overview of Presentation
  • Background on Error Correcting Codes.
  • Some Powerful Codes
  • Reed-Solomon Codes
  • Low-Density Parity Check (LDPC) Codes
    (Rediscovered)
  • Tornado Codes
  • Digital Fountain Approach
  • Fountain Codes
  • LT Codes
  • Raptor Codes
  • Comparisons of Different Codes
  • Applications

3
Block Codes
  • Accepts a k-bit information block and transforms
    it into an n-bit codeword.
  • It is a fixed-length code.
  • Implemented using a combinational logic circuit
    (no memory is needed).

4
Some Important Types of Block Codes
  • Linear Block Codes
  • The sum of any two codewords is itself a
    codeword.
  • Systematic Code
  • The data bits also are present in the generated
    codeword.
  • BCH
  • Generalization of Hamming codes for
    multiple-error correction.
  • A special subclass of the linear codes known as
    Goppa codes.
  • Cyclic Codes
  • Important subclass of linear block codes where
    encoding and decoding can be implemented easily.
  • Cyclic shift of a code word yields another code
    word.
  • Group Codes
  • Same as linear block codes. Also known as
    generalized parity check codes.
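
The linearity and systematic properties above can be sketched in Python with a Hamming(7,4) code in systematic form G = [I | P] (an illustrative choice of parity matrix, not taken from the slides):

```python
# Systematic generator matrix G = [I | P] for a Hamming(7,4) code.
G = [
    [1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

def encode(bits):
    """Multiply the 4-bit message by G over GF(2)."""
    return [sum(b * g for b, g in zip(bits, col)) % 2 for col in zip(*G)]

c1 = encode([1, 0, 1, 1])
c2 = encode([0, 1, 1, 0])
c3 = [a ^ b for a, b in zip(c1, c2)]     # sum (XOR) of two codewords
assert c3 == encode([1, 1, 0, 1])        # linearity: still a codeword
assert c1[:4] == [1, 0, 1, 1]            # systematic: data bits visible
```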

5
Convolutional Codes
  • Generates an n-bit block from a k-bit block.
  • Differ from block codes in that the encoder has
    memory (memory order m, or constraint length).
  • Must be implemented using a sequential logic
    circuit.
  • Decoders have high complexity (decoding
    operations) per output bit.
  • More suited to low-speed digitized voice traffic
    (lower integrity) than to high-speed data
    needing high integrity.
  • If the decoder loses synchronization or makes a
    mistake, errors propagate.
  • Performance depends on the constraint length.

6
Concatenated Codes
  • A concatenated code uses two levels of coding
  • An inner code and an outer code to obtain desired
    performance.
  • Inner code is designed to correct most of the
    channel errors.
  • The outer code is a higher rate (lower
    redundancy) code which reduces the probability of
    error to a specified level.
  • Concatenated codes are effective against a
    mixture of random errors and bursts.

7
Concatenated Codes
  • Having 2 steps reduces the complexity required to
    obtain a desired error rate (compared to a single
    code).
  • The outer code is typically an RS code.
  • The inner decoder uses all the available
    soft-decision data to provide the best
    performance under Gaussian conditions.
  • For a one-level concatenated code, if the minimum
    distances of the inner and outer codes are d1 and
    d2, then the minimum distance of the
    concatenation is at least d1·d2.
  • Choice of the inner code is dictated by the
    application:
  • For high data rates, the inner code should be a
    block code.
  • For predominantly slower Gaussian channels, the
    inner code should be a convolutional code.

8
Turbo Codes
  • Important class of codes built from convolutional
    component codes, proposed in 1993.
  • Achieves large coding gains by combining two or
    more relatively simple building blocks (component
    codes).
  • A serial concatenation of codes is often used for
    power limited systems.
  • A popular approach is to have a RS outer code
    (applied first, removed last) with an inner code
    (applied last, removed first).
  • Of all practical error correction methods known
    to date, turbo codes, together with LDPC codes,
    come closest to approaching the Shannon limit,
    the theoretical limit of maximum information
    transfer rate over a noisy channel.
  • Turbo codes make it possible to increase
    available bandwidth without increasing the power
    of a transmission, or they can be used to
    decrease the amount of power used to transmit at
    a certain data rate.
  • Main drawbacks are the relatively high decoding
    complexity and latency, which make them
    unsuitable for some applications.

9
Block Codes versus Convolutional Codes
  • For convolutional codes, decoder complexity
    increases as the redundancy decreases, whereas
    for block codes, complexity decreases as the
    redundancy decreases. For high code rates, this
    favors block codes.
  • Practically, the speed of convolutional decoders
    is limited compared to block decoders. On the
    other hand, fast and powerful RS block decoders
    have been built to operate at rates above
    120 Mb/s.
  • Convolutional codes are weak against burst
    errors, while RS codes handle them well.

10
RS Codes versus Convolutional Codes
  • Which is better?
  • For digitized voice, concatenated code is better.
  • For high speed/high integrity, RS/block codes are
    better.

11
Erasure Correcting Approaches
  • Gallager Codes 1962, Rediscovered as LDPC
  • RS Codes 1960, Reed and Solomon
  • Complexity in Theory and encoding/decoding
    algorithms (Finite Field Operations)
  • Tornado Codes 1997, Luby et al.
  • Naïve and simple linear equation
    encoding/decoding algorithm.
  • Complexity and theory in design and analysis of
    irregular graph structure.
  • LT Codes 1998, Luby
  • Simpler scalable irregular graph structure.
  • Simpler analysis and design
  • Independent generation of each encoding symbol
  • Unlimited number of encoding symbols i.e.
    rateless codes
  • First practical realization of a digital
    fountain.
  • Raptor Codes 2001, Shokrollahi
  • Even better practical realization of a DF.

12
Reed-Solomon Codes
  • Reed-Solomon codes are special and widely
    implemented because they are "almost perfect":
    the redundant symbols added by the encoder are
    at a minimum for any level of error correction,
    so no bits are wasted.
  • RS codes are easier to decode than most other
    non-binary codes.
  • RS codes allow the correction of erasures. RS
    codes have a combined error and erasure
    correction capability of 2t + s ≤ n − k, where t
    is the number of errors and s the number of
    erasures.
  • RS codes can correct twice as many erasures as
    errors (an error being one for which the decoder
    has no location information).
  • RS codes exist on b-bit symbols for every value
    of b.

13
RS Codes
  • Nonbinary cyclic codes with m-bit symbols, where
    m > 2.
  • RS(n, k) codes on m-bit symbols exist for all n
    and k for which 0 < k < n < 2^m + 2.
  • RS codes achieve the largest possible code
    minimum distance for any linear code with the
    same encoder input and output block lengths.
  • RS codes are particularly useful for burst-error
    correction, i.e., for channels that have memory.
  • Any linear code is capable of correcting n − k
    symbol erasure patterns if the n − k erased
    symbols all happen to lie on the parity symbols.
  • However, RS codes have the remarkable property
    that they can correct any set of n − k symbol
    erasures in the block.
  • RS codes can be designed to have any redundancy.
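
The any-(n − k)-erasures property can be sketched with polynomial evaluation and Lagrange interpolation. The prime field GF(257) and the evaluation points 1..n are illustrative stand-ins for the GF(2^m) arithmetic of real RS codecs:

```python
P = 257  # prime field, a stand-in for the usual GF(2^m)

def lagrange_eval(pts, x):
    """Evaluate at x the unique polynomial through the (xi, yi) pairs."""
    total = 0
    for xi, yi in pts:
        num = den = 1
        for xj, _ in pts:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n):
    """Systematic RS-style encoding: coded symbol x is the degree-(k-1)
    polynomial through the data points, evaluated at x = 1..n."""
    pts = list(enumerate(data, start=1))
    return [lagrange_eval(pts, x) for x in range(1, n + 1)]

data = [10, 20, 30, 40]                        # k = 4 data symbols
code = encode(data, 7)                         # n = 7: tolerates 3 erasures
survivors = [(x, code[x - 1]) for x in (2, 5, 6, 7)]  # any k symbols left
recovered = [lagrange_eval(survivors, x) for x in range(1, 5)]
assert recovered == data                       # data fully rebuilt
```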

14
Properties of RS codes
  • Special case of BCH codes.
  • Excellent erasure correcting property.
  • They are popular because they can combat
    combinations of both random and burst errors.
  • They also have a long block length, assuring a
    sharp performance curve.
  • RS codes exhibit a very sharp improvement of
    block error-rate with an improvement of channel
    quality making their use ideal for data.
  • Since they cannot exploit soft-decision data
    well, they are used in concatenated codes.
  • Real-number arithmetic is avoided, since RS codes
    operate directly on finite-field symbols.

15
RS Code Performance (Error Correction, No Erasures)
16
Tornado Codes
  • Class of erasure-correcting codes with fast
    encoding and decoding algorithms.
  • Software-based implementations of Tornado codes
    are about 100 times faster on small lengths and
    about 10,000 times faster on larger lengths than
    software-based Reed-Solomon erasure codes while
    having only slightly worse overhead.
  • Tornado codes are fixed rate, near optimal
    erasure correcting codes
  • Use sparse bipartite graphs to trade encoding and
    decoding speed for reception overhead.
  • Since the introduction of Tornado codes, many
    other similar codes have emerged, most notably
    Online codes, LT codes and Raptor codes; these
    are all based on sparse graphs.
  • Tornado codes are block erasure codes that have
    linear in n encoding and decoding times.
  • They are not rateless.
  • Running times of the encoding and decoding
    algorithms are proportional to the block length,
    which makes them slow at low rates.

17
Structure of Tornado Codes
  • Each column acts as redundancy for the column on
    its left.
  • Redundant packets are generated by taking the
    EXOR of the packets to the left.
  • Tornado codes have the erasure property that to
    recover k source packets we need slightly more
    than k encoding packets.
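
The XOR redundancy described above can be sketched in a few lines; the one-byte "packets" and the single check group are toy choices:

```python
from functools import reduce

# Toy one-byte packets; a redundant packet is the XOR of its neighbors.
src = [b"\x13", b"\x2a", b"\x5f"]
xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))
check = reduce(xor, src)                 # redundant packet for this group

# Suppose src[1] is erased: XOR the check packet with the survivors.
recovered = reduce(xor, [check, src[0], src[2]])
assert recovered == src[1]
```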

18
Tornado Codes versus RS Codes
  • Advantage of Tornado codes over standard codes is
    that they trade off a slight increase in
    reception overhead for decreased encoding and
    decoding times.
  • The algorithm runs in real time, and the file is
    reconstructed as soon as enough packets arrive.
  • Source and Sink need to agree on the graph a
    priori.

19
Digital Fountain
  • Digital Fountain
  • Key property of a digital fountain is that the
    source data can be reconstructed from any subset
    of the encoding packets equal in total length to
    the source data.
  • The client can reconstruct the data using any k
    of the encoded packets.
  • Ideally, the encoding and decoding processes
    require very little processing overhead.

20
Digital Fountain
  • A DF is conceptually simpler, more efficient and
    applicable to a broader class of networks than
    previous approaches.
  • The key property of a DF is that the source data
    can be reconstructed intact from any subset of
    the encoding packets equal in length to the
    source data.
  • RS codes had unacceptably high running times
    which led to the development of Tornado codes.
  • DF codes are record-breaking sparse graph codes
    for channels with erasures.
  • Retransmissions are needless as per Shannon's
    theory.
  • RS codes work only for small k and n, and the
    failure probability needs to be determined
    beforehand.

21
Properties of DF Codes
  • The required encoding is only as long as the
    data.
  • Rateless: the number of encoding packets that can
    be generated is potentially limitless, so the
    sender can produce an unlimited flow of encoding
    symbols.
  • Data is always recoverable once the required
    amount of encoding is received.
  • Low complexity for encoding and decoding.
  • Can encode large amount of data.
  • Ideal under conditions such as
  • High / unknown / variable loss
  • Large RTT
  • Intermittent connectivity
  • One-way channels
  • Can be used as background data transport

22
Properties of DF Codes
  • Every packet is useful.
  • Can serve a large number of clients without need
    to maintain state.
  • One user with poor connection cannot affect
    others.
  • Advantages are pronounced when we apply this to
    broadcast/multicast.

23
DF Encoder
  • This encoding defines a graph connecting source
    packets and encoding packets.
  • If the mean degree d is smaller than K then the
    graph is sparse.
  • The decoder needs to know the degree of each
    received packet and the source packets it is
    connected to.
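
The encoder just described can be sketched as follows; the uniform degree choice is a placeholder (real systems draw the degree from a soliton distribution), and the packet layout is illustrative:

```python
import random
from functools import reduce

def encode_packet(source, rng):
    """One encoding packet: pick a degree d, choose d distinct source
    packets, and carry their XOR plus the neighbor list the decoder needs."""
    d = rng.randint(1, len(source))               # placeholder degree choice
    neighbors = rng.sample(range(len(source)), d)
    value = reduce(lambda a, b: a ^ b, (source[i] for i in neighbors))
    return neighbors, value

rng = random.Random(0)
source = [0x13, 0x2A, 0x5F, 0x77]                 # toy one-byte packets
pkt = encode_packet(source, rng)                  # (neighbor list, XOR)
```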

24
LT-code Decoding Process
  • Initially, all the input symbols are uncovered.
  • Step 1: all encoding symbols with exactly one
    neighbor are released to cover their unique
    neighbor.
  • The set of covered input symbols not yet
    processed is called the ripple.
  • In each step, 1 input symbol from the ripple is
    processed.
  • The process ends when the ripple is empty.
  • The process fails if there is at least one
    uncovered input symbol at the end.
  • Process succeeds if all the input symbols are
    covered at the end.
  • Goal of the degree distribution design is to
    slowly release encoding symbols while keeping the
    ripple small while ensuring that it does not
    disappear.
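
The release/cover/process steps above can be sketched as a peeling decoder on toy integer symbols (a compressed sketch that merges release and processing into one loop):

```python
def lt_decode(packets, k):
    """Peeling decoder: packets is a list of (neighbor_set, xor_value).
    Returns the k input symbols, or None if the ripple empties first."""
    packets = [[set(ns), v] for ns, v in packets]
    covered = {}
    while len(covered) < k:
        # Release: find an encoding symbol with exactly one neighbor.
        rel = next((p for p in packets if len(p[0]) == 1), None)
        if rel is None:
            return None                      # ripple empty: decoding fails
        (i,) = rel[0]
        covered[i] = rel[1]                  # cover input symbol i
        # Process: XOR the covered value out of every symbol touching i.
        for p in packets:
            if i in p[0]:
                p[0].remove(i)
                p[1] ^= covered[i]
    return [covered[i] for i in range(k)]

# Three toy input symbols 5, 9, 12 and three received encoding symbols:
received = [({0}, 5), ({0, 1}, 5 ^ 9), ({1, 2}, 9 ^ 12)]
assert lt_decode(received, 3) == [5, 9, 12]
assert lt_decode([({0, 1}, 5 ^ 9)], 2) is None   # ripple empties: failure
```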

25
DF Decoder
26
LT Codes
  • Rateless erasure codes
  • LT Codes are universal in the sense that they
  • Are near optimal for every erasure channel
  • And are very efficient as the data length grows.
  • Close to the minimum number of encoded symbols
    can be generated and sent to the receivers
  • Number of packets receivers need to generate the
    original data is also close to the minimum.

27
LT Codes
  • Objectives
  • As few encoding symbols as possible on average
    are required to ensure success of the LT process.
  • The average degree of the encoding symbols is as
    low as possible; the average degree times the
    number of encoding symbols gives the total number
    of symbol operations needed to recover the data.
  • Soliton wave is one where dispersion balances
    refraction perfectly.
  • Soliton Distribution The basic property required
    of a good degree distribution is that input
    symbols are added to the ripple at the same rate
    as they are processed.

28
Design of Degree Distribution
  • A few encoding packets must have high degree
    (close to K)
  • Many packets must have low degree so that the
    total number of operations is kept small.
  • Ideally, the received graph should have exactly
    one check node of degree 1 at each iteration.
  • The ideal soliton distribution ρ achieves this in
    expectation:
    ρ(1) = 1/K,  ρ(d) = 1/(d(d − 1)) for d = 2, ..., K.
  • It works poorly in practice because of variance
    (fluctuations), which means
  • During decoding, there may be no nodes of degree
    1.
  • Some source nodes will receive no connections.
  • Expected degree under this distribution is ln K.
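
The ideal soliton distribution can be written out and checked numerically (a sketch; K = 1000 is an arbitrary illustrative size):

```python
import math

def ideal_soliton(K):
    """rho[d] for d = 1..K: rho(1) = 1/K, rho(d) = 1/(d(d-1)) otherwise."""
    rho = [0.0] * (K + 1)                   # index 0 unused
    rho[1] = 1.0 / K
    for d in range(2, K + 1):
        rho[d] = 1.0 / (d * (d - 1))
    return rho

K = 1000
rho = ideal_soliton(K)
assert abs(sum(rho) - 1.0) < 1e-9           # the weights telescope to 1
mean_degree = sum(d * rho[d] for d in range(1, K + 1))
# mean_degree is roughly ln K (plus the Euler-Mascheroni constant)
```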

29
Robust Soliton Distribution
  • Adds 2 extra parameters, c and δ.
  • δ is a bound on the probability that decoding
    fails after a given number of packets has been
    received.
  • c is a constant that can be viewed as a free
    parameter.
  • τ is a positive function that boosts low degrees
    and adds a spike at degree K/S, where
    S = c · ln(K/δ) · √K:
    τ(d) = S/(dK) for d = 1, ..., K/S − 1;
    τ(K/S) = S · ln(S/δ)/K;  τ(d) = 0 for d > K/S.
  • µ, the robust soliton distribution, is the
    normalized sum of ρ and τ:
    µ(d) = (ρ(d) + τ(d))/Z, where Z = Σd (ρ(d) + τ(d)).
  • Expected degree is O(ln(K/δ)).
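
A sketch of the robust soliton construction under Luby's standard definitions; the parameter values (K = 10000, c = 0.2, δ = 0.05) are arbitrary illustrations:

```python
import math

def robust_soliton(K, c, delta):
    """mu(d) proportional to rho(d) + tau(d), with S = c*ln(K/delta)*sqrt(K);
    about K*Z encoding symbols then suffice except with probability delta."""
    S = c * math.log(K / delta) * math.sqrt(K)
    spike = int(round(K / S))
    rho = [0.0] * (K + 1)
    rho[1] = 1.0 / K
    for d in range(2, K + 1):
        rho[d] = 1.0 / (d * (d - 1))
    tau = [0.0] * (K + 1)
    for d in range(1, min(spike, K + 1)):
        tau[d] = S / (d * K)                       # boost for low degrees
    if 1 <= spike <= K:
        tau[spike] = S * math.log(S / delta) / K   # spike at d = K/S
    Z = sum(rho) + sum(tau)
    mu = [(r + t) / Z for r, t in zip(rho, tau)]
    return mu, Z

mu, Z = robust_soliton(K=10000, c=0.2, delta=0.05)
assert abs(sum(mu) - 1.0) < 1e-9   # a proper probability distribution
```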

30
Tornado Codes versus LT codes
  • Both RS and Tornado codes are systematic codes
    whereas LT codes are not systematic.
  • Encoder and decoder memory usage in Tornado codes
    is much more than with LT codes.
  • LT codes are rateless i.e. further encoding
    packets can be generated on the fly. In contrast,
    given k and n, Tornado codes produce only n
    encoding packets.
  • Tornado codes are generated from graphs with
    constant maximum degree; LT codes use graphs of
    logarithmic density.

31
Low-Density Parity Check (LDPC) Codes
  • Shannon showed the existence of
    capacity-achieving codes, but achieving capacity
    is only part of the story: for practical
    communication, we need fast encoding and
    decoding algorithms.
  • Linear codes associated with sparse bipartite
    graphs are called LDPC codes.
  • Any linear code has a representation as a code
    associated with a bipartite graph.
  • LDPC codes are good for error correction.
  • The graph gives rise to a linear code of block
    length n and dimension k.
  • LDPC codes are equipped with very fast
    (probabilistic) encoding and decoding algorithms.

32
LDPC Structure
  • The graph gives rise to a linear code of block
    length n and dimension n-r.
  • The codewords are those vectors such that for all
    check nodes the sum of the neighboring positions
    among the message nodes is 0.
  • Code Graph Matrix
  • If the matrix/graph is sparse, the code is called
    LDPC.
  • Sparsity is the key property that allows for the
    algorithmic efficiency.
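
The codeword condition above can be sketched directly: a vector is a codeword iff every check node's neighborhood sums to 0 mod 2. H below is a toy sparse parity-check matrix (r = 3 checks, n = 6 message nodes, so dimension n − r = 3), not drawn from any standard:

```python
# Toy sparse parity-check matrix: row = check node, column = message node.
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def is_codeword(v):
    """True iff every check's neighboring positions sum to 0 over GF(2)."""
    return all(sum(h * x for h, x in zip(row, v)) % 2 == 0 for row in H)

assert is_codeword([0, 0, 0, 0, 0, 0])     # the zero vector always works
assert is_codeword([1, 1, 0, 0, 1, 1])     # satisfies all three checks
assert not is_codeword([1, 0, 0, 0, 0, 0]) # violates checks 1 and 3
```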

33
More on LDPC Codes
  • The general class of decoding algorithms for LDPC
    codes is called message-passing algorithms.
  • One important subclass is the belief propagation
    algorithm.
  • Performance of the regular graphs deteriorates as
    the degree of the message nodes increases.
  • One instance of LDPC codes using a specific
    degree distribution is a capacity achieving
    sequence called Tornado codes.
  • Another capacity achieving sequence has also been
    developed.

34
Raptor Codes
  • Extension of LT codes.
  • First known class of fountain codes with linear
    time encoding and decoding.
  • Raptor codes encode a given message consisting of
    a number of symbols, k, into a potentially
    limitless sequence of encoding symbols such that
    knowledge of any k or more encoding symbols
    allows the message to be recovered with some
    non-zero probability.

35
Raptor Codes
  • The basic idea behind Raptor codes is a
    pre-coding of the input symbols prior to the
    application of an appropriate LT-code.
  • The probability that the message can be recovered
    increases with the number of symbols received
    beyond k, becoming very close to one once the
    number of received symbols is only slightly
    larger than k.
  • A symbol can be any size, from a single bit to
    hundreds or thousands of bytes.
  • Raptor codes may be systematic or non-systematic.
  • Used in
  • 3rd Generation Partnership Project for use in
    mobile cellular wireless broadcast and multicast
  • DVB-H standards for IP datacast to handheld
    devices.
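
The pre-coding idea can be sketched with a toy stand-in: extend the k input symbols with one parity symbol (in place of the real high-rate pre-code, e.g. an LDPC code), then LT-encode the extended block. The uniform degree choice is again a placeholder:

```python
import random
from functools import reduce

xor = lambda a, b: a ^ b
data = [3, 7, 11]                              # k = 3 toy input symbols
# Pre-code: extend with one parity symbol (stand-in for the real pre-code).
precoded = data + [reduce(xor, data)]

def lt_packet(symbols, rng):
    """LT stage: XOR of a random subset (placeholder degree choice)."""
    d = rng.randint(1, len(symbols))
    idx = rng.sample(range(len(symbols)), d)
    return idx, reduce(xor, [symbols[i] for i in idx])

rng = random.Random(1)
stream = [lt_packet(precoded, rng) for _ in range(6)]  # potentially endless
```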

36
Comparisons among the Different Codes
  • Data Length
  • RS: severely limited in practice
  • Tornado: moderate to large
  • LT: moderate to large
  • Raptor: small to large
  • Encoding Length
  • RS, Tornado: limited to a small multiple of the
    data length
  • LT, Raptor: unlimited, with on-the-fly/dynamic
    generation
  • Flexibility to receive from multiple sources
  • RS, Tornado: hard to coordinate
  • LT, Raptor: no coordination needed
  • Memory Requirements
  • RS, Tornado: proportional to encoding length
  • LT, Raptor: proportional to data length

37
Comparisons
  • Computational Work
  • RS: O(k) per encoding symbol, O(k) to decode
  • Tornado: O(1) per encoding symbol, O(1) to decode
  • LT: O(ln k) per encoding symbol, O(k ln k) to
    decode
  • Raptor: O(1) per encoding symbol, O(k) to decode
  • Reception Overhead
  • RS: none
  • Tornado, LT, Raptor: small constant fraction
  • Failure Probability
  • RS: zero
  • Tornado: small
  • LT: tiny
  • Raptor: really tiny

38
Application Scalable Protocol for Distributing
Software
  • Following properties are desired for scalable and
    reliable multicast.
  • Reliable
  • The file is guaranteed to be delivered completely
    to all receivers.
  • Efficient
  • Overhead incurred must be minimal.
  • The user should not be able to distinguish
    between a multicast service and a point-to-point
    service.
  • On demand
  • Clients should be able to initiate the service at
    their discretion.
  • Tolerant
  • Service must be tolerant of a heterogeneous set
    of receivers.
  • No feedback channel!

39
More Application Domains
  • DF codes are currently being applied to reliable
    multicast.
  • Robust Distributed Storage
  • Delivery of streaming content
  • Delivery of content to mobile clients in wireless
    networks
  • Peer-to-peer applications
  • Delivery of content along multiple paths for
    improved resilience.
  • Transport layer design for unicast is under
    exploration.

40
References
  • John W. Byers, Michael Luby, Michael
    Mitzenmacher and Ashutosh Rege, "A Digital
    Fountain Approach to Reliable Distribution of
    Bulk Data," SIGCOMM 1998.
  • Michael Luby, "LT Codes," The 43rd Annual IEEE
    Symposium on Foundations of Computer Science,
    2002.
  • David MacKay, Information Theory, Inference, and
    Learning Algorithms.
  • Bernard Sklar, Digital Communications:
    Fundamentals and Applications.
  • E. Berlekamp, R. Peile and S. Pope, "The
    Application of Error Control to Communications,"
    1987.
  • Michael Luby's USENIX presentation.

41
Message Passing