Noise, Information Theory, and Entropy (cont.)

1
Noise, Information Theory, and Entropy (cont.)
  • CS414 Spring 2007
  • By Karrie Karahalios, Roger Cheng, Brian Bailey

2
Coding Intro - revisited
  • Assume alphabet K of {A, B, C, D, E, F, G, H}
  • In general, if we want to distinguish n different
    symbols, we will need ⌈log2 n⌉ bits per symbol,
    i.e., 3 here.
  • Can code alphabet K as: A = 000, B = 001,
    C = 010, D = 011, E = 100, F = 101, G = 110,
    H = 111

3
Coding Intro - revisited
  • BACADAEAFABBAAAGAH is encoded as the following
    string of 54 bits (fixed-length code):
  • 001000010000011000100000101000001001000000000110000111

4
Coding Intro
  • With this coding: A = 0, B = 100, C = 1010,
    D = 1011, E = 1100, F = 1101, G = 1110, H = 1111
  • 100010100101101100011010100100000111001111
  • 42 bits, saving more than 20% in space
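
A minimal Python sketch (not from the original slides) that checks the
two counts above; the dictionaries simply transcribe the codes from
slides 2 and 4:

    fixed = {'A': '000', 'B': '001', 'C': '010', 'D': '011',
             'E': '100', 'F': '101', 'G': '110', 'H': '111'}
    var = {'A': '0', 'B': '100', 'C': '1010', 'D': '1011',
           'E': '1100', 'F': '1101', 'G': '1110', 'H': '1111'}
    msg = "BACADAEAFABBAAAGAH"

    def encode(m, code):
        # Concatenate the codeword of each symbol in the message
        return ''.join(code[s] for s in m)

    print(len(encode(msg, fixed)))  # 54 bits
    print(len(encode(msg, var)))    # 42 bits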

5
Huffman Tree
A (8), B (3), C (1), D (1), E (1), F (1), G (1), H (1)
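
A sketch (an assumed implementation, not course code) of building this
tree in Python with a min-heap; the exact codewords depend on how ties
are broken, but the code lengths (A: 1 bit, B: 3 bits, C-H: 4 bits)
match the variable-length code on slide 4:

    import heapq
    import itertools

    freqs = {'A': 8, 'B': 3, 'C': 1, 'D': 1, 'E': 1, 'F': 1, 'G': 1, 'H': 1}
    tie = itertools.count()                # tie-breaker so heap tuples always compare
    heap = [(f, next(tie), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)

    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)  # repeatedly merge the two
        f2, _, right = heapq.heappop(heap) # least-frequent subtrees
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))

    def codes(node, prefix=''):
        # Leaves are symbols; internal nodes are (left, right) pairs
        if isinstance(node, str):
            return {node: prefix or '0'}
        left, right = node
        return {**codes(left, prefix + '0'), **codes(right, prefix + '1')}

    print(codes(heap[0][2]))               # codeword lengths: A=1, B=3, C..H=4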
6
Limitations
  • Diverges from the lower limit (entropy) when the
    probability of a particular symbol becomes high
  • always uses an integral number of bits per symbol
  • Must send the code book with the data
  • lowers overall efficiency
  • Must determine the frequency distribution
  • it must remain stable over the data set

7
Arithmetic Coding
  • Replace the stream of input symbols with a single
    floating point number
  • bypasses replacing individual symbols with codes
  • Use the probability distribution of symbols (as
    they appear) to successively narrow the original
    range
  • The longer the sequence, the greater the
    precision of the floating point number
  • naively requires infinite precision (but this is
    achievable in practice; see slide 21)

8
Encoding Example
  • Encode BILL
  • p(B) = 1/4, p(I) = 1/4, p(L) = 2/4
  • Assign each symbol a subrange of [0.0, 1.0)
    based on p
  • Successively reallocate the low-high range based
    on the sequence of input symbols

Symbol Low  High
B      0    0.25
I      0.25 0.50
L      0.50 1.00
9
Encoding Example
  • When B appears, take the symbol's portion
    [0.0, 0.25) of the current range [0.0, 1.0)

Symbol Low  High
B      0.00 0.25
10
Encoding Example
  • When I appears, take the symbol's portion
    [0.25, 0.50) of the current range [0.0, 0.25)

Symbol Low    High
B      0.00   0.25
I      0.0625 0.125
11
Encoding Example
  • When L appears, take the symbol's portion
    [0.50, 1.0) of the current range [0.0625, 0.125)

Symbol Low     High
B      0.00    0.25
I      0.0625  0.125
L      0.09375 0.125
12
Encoding Example
  • When the second L appears, take the symbol's
    portion [0.50, 1.0) of the current range
    [0.09375, 0.125)

Symbol Low      High
B      0.00     0.25
I      0.0625   0.125
L      0.09375  0.125
L      0.109375 0.125
13
Encoding Example
Symbol Low      High
B      0.00     0.25
I      0.0625   0.125
L      0.09375  0.125
L      0.109375 0.125

The final low value (0.109375) encodes the entire
sequence. In fact, ANY value within the final range
[0.109375, 0.125) will encode the entire sequence.
14
Encoding Algorithm
set low to 0.0
set high to 1.0
WHILE input symbols remain
    range = high - low
    get symbol
    high = low + high_range(symbol) * range
    low  = low + low_range(symbol) * range
END WHILE
output any value in [low, high)
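
A runnable Python version of this loop (a sketch, not the original
course code), using exact fractions to sidestep floating-point
precision; note that high is updated first, since both updates need
the old low:

    from fractions import Fraction

    ranges = {'B': (Fraction(0), Fraction(1, 4)),    # symbol ranges from slide 8
              'I': (Fraction(1, 4), Fraction(1, 2)),
              'L': (Fraction(1, 2), Fraction(1))}

    def arith_encode(msg):
        low, high = Fraction(0), Fraction(1)
        for sym in msg:
            span = high - low
            lo_r, hi_r = ranges[sym]
            high = low + hi_r * span
            low = low + lo_r * span
        return low, high                 # any value in [low, high) encodes msg

    print(arith_encode("BILL"))          # (7/64, 1/8) = (0.109375, 0.125)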
15
Decoding Example
E = 0.109375;  0.109375 in [0.0, 0.25)  -> output B
E = (0.109375 - 0.0) / 0.25 = 0.4375
               0.4375 in [0.25, 0.5)    -> output I
E = (0.4375 - 0.25) / 0.25 = 0.75
               0.75 in [0.5, 1.0)       -> output L
E = (0.75 - 0.5) / 0.5 = 0.5
               0.5 in [0.5, 1.0)        -> output L
E = (0.5 - 0.5) / 0.5 = 0.0             -> STOP

Symbol Low  High
B      0    0.25
I      0.25 0.50
L      0.50 1.00
16
Decoding Algorithm
get encoded number
DO
    find the symbol whose range contains encoded
    output the symbol
    range = high(symbol) - low(symbol)
    encoded = (encoded - low(symbol)) / range
UNTIL (EOF)
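
A matching Python sketch of the decoder (again an assumed
implementation); it decodes a known message length rather than relying
on an explicit EOF symbol:

    from fractions import Fraction

    ranges = {'B': (Fraction(0), Fraction(1, 4)),    # same ranges as the encoder
              'I': (Fraction(1, 4), Fraction(1, 2)),
              'L': (Fraction(1, 2), Fraction(1))}

    def arith_decode(value, length):
        out = []
        for _ in range(length):
            for sym, (lo_r, hi_r) in ranges.items():
                if lo_r <= value < hi_r:       # symbol whose range contains value
                    out.append(sym)
                    value = (value - lo_r) / (hi_r - lo_r)  # rescale and continue
                    break
        return ''.join(out)

    print(arith_decode(Fraction(7, 64), 4))    # -> 'BILL'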
17
Code Transmission
  • Transmit any number within the final range
  • choose the number that requires the fewest bits
  • Recall that the minimum number of bits required
    to represent an ensemble x1...xN is
    -log2 p(x1...xN) = -Σ log2 p(xi)
  • Note that we are not comparing directly to H
    because no code book is generated

18
Compute Size of Interval
  • Interval is [L, L + S)
  • Size of interval: S = Π p(si)
  • For ensemble BILL:
  • S = .25 × .25 × .5 × .5 = .015625
  • Check against the algorithm's result:
  • .125 - .109375 = .015625

Symbol Low  High
B      0    0.25
I      0.25 0.50
L      0.50 1.00
19
Number of Bits to Represent S
  • Requires ⌈-log2 S⌉ + 1 bits (minimum) to specify
    a value within an interval of size S
  • where S = Π p(si), the size of the final interval
  • Essentially the same as the minimum number of
    bits from slide 17, since -log2 S = -Σ log2 p(si)

20
Determine Representation
  • Compute the midpoint L + S/2
  • truncate its binary representation after
    k = ⌈-log2 S⌉ + 1 bits
  • The truncated number still lies within [L, L + S),
    as truncation lowers the midpoint by less than
    2^-k ≤ S/2
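
A quick numeric check of this rule for BILL (a sketch; the interval
values come from slides 12 and 18):

    import math

    L, S = 0.109375, 0.015625            # final interval for BILL
    k = math.ceil(-math.log2(S)) + 1     # = 7 bits
    mid = L + S / 2                      # = 0.1171875
    # First k bits of mid's binary expansion
    bits = ''.join(str(int(mid * 2 ** (i + 1)) % 2) for i in range(k))
    print(bits)   # '0001111': 0.0001111 binary = 15/128 = 0.1171875,
                  # which indeed lies inside [0.109375, 0.125)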

21
Practical Notes
  • Achieve the effect of infinite precision using
    fixed-width integers as shift registers
  • represent only the fractional part of each range
  • as the precision of each range increases, the
    most significant bits of low and high will match
  • shift out the matching MSB and continue the
    algorithm
  • Caveat
  • underflow can occur if the ranges approach the
    same number without their MSBs becoming equal

22
Exercise Huffman vs Arithmetic
  • Given the message AAAAB where p(A) = .9 and
    p(B) = .1
  • Huffman code
  • (a) compute the entropy (H)
  • (b) build the Huffman tree (simple)
  • (c) compute the average codeword length
  • (d) compute the number of bits needed to encode
    the message
  • Arithmetic coding
  • (a) compute the theoretical minimum number of
    bits to transmit the message
  • (b) compute the final value that represents the
    message
  • (c) independent of (b), what is the minimum
    number of bits needed to represent the final
    interval? How does this value compare to (a)?
    How does it compare to Huffman part (d)?

23
Error detection and correction
  • Error detection is the ability to detect errors
    caused by noise or other impairments during
    transmission from the transmitter to the
    receiver.
  • Error correction additionally enables locating
    the errors and correcting them.
  • Error detection always precedes error
    correction.

24
Error Detection
  • Data transmission can contain errors
  • Single-bit errors
  • Burst errors of length n, where n is the distance
    between the first and last errored bits in the
    data block
  • How to detect errors?
  • If only the data is transmitted, errors cannot
    be detected
  • Send extra information with the data that
    satisfies a special relationship
  • Add redundancy

25
Error Detection Methods
  • Vertical Redundancy Check (VRC) / Parity Check
  • Longitudinal Redundancy Check (LRC)
  • Checksum
  • Cyclic Redundancy Check (CRC)

26
Vertical Redundancy Check (VRC), aka Parity Check
  • Vertical Redundancy Check (VRC)
  • Append a single bit at the end of the data block
    such that the total number of ones is even
    -> even parity (odd parity is similar)
    Even parity: 0110011 -> 01100110,
                 0110001 -> 01100011
    Odd parity:  0110011 -> 01100111,
                 0110001 -> 01100010
  • Performance
  • Detects all odd numbers of bit errors in a data
    block
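
A one-function Python sketch of even parity (illustrative, not from
the slides):

    def add_even_parity(bits):
        parity = bits.count('1') % 2   # 1 if the block has an odd number of ones
        return bits + str(parity)      # appended bit makes the total even

    print(add_even_parity('0110011'))  # '01100110'
    print(add_even_parity('0110001'))  # '01100011'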

27
Longitudinal Redundancy Check (LRC)
  • Longitudinal Redundancy Check (LRC)
  • Organize the data into a table and create a
    parity bit for each column
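
A small Python sketch of the idea (hypothetical byte values): XOR-ing
all bytes of a block computes an even-parity bit for every bit column
at once:

    from functools import reduce

    def lrc(block):
        # XOR of all bytes = column-wise even parity over the block
        return reduce(lambda a, b: a ^ b, block, 0)

    data = [0b01100011, 0b01010101, 0b11110000]
    print(format(lrc(data), '08b'))    # '11000110'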

28
LRC
  • Performance
  • Detects all burst errors up to length n (the
    number of columns)
  • Misses a burst error of length n+1 if the first
    and last bits are inverted and the n-1 bits
    between them are uninverted

29
Parallel Parity
  • One bit error gives 2 parity errors (one row,
    one column), so the flipped bit can be located
    and corrected.
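
A Python sketch with made-up data showing how the failing row and
column parities pinpoint a single flipped bit:

    rows = ['0110011', '1010101', '0011110']       # hypothetical data block

    def parities(block):
        row_p = [r.count('1') % 2 for r in block]        # parity per row
        col_p = [c.count('1') % 2 for c in zip(*block)]  # parity per column
        return row_p, col_p

    sent = parities(rows)
    corrupted = ['0110011', '1000101', '0011110']  # bit (row 1, col 2) flipped
    got = parities(corrupted)
    bad_rows = [i for i, (a, b) in enumerate(zip(sent[0], got[0])) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(sent[1], got[1])) if a != b]
    print(bad_rows, bad_cols)                      # [1] [2] -> flip that bit back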

30
Checksum
  • Used by upper-layer protocols
  • Similar to LRC, but uses one's complement
    arithmetic
  • Ex. (a block of bytes to checksum, in hex):
  • 2 40 05 80 FB 12 00 26 B4 BB 09 B4 12 28 74 11 BB
    12 00 2E 22 12 00 26 75 00 00 FA 12 00 26 25 00 3A
    F5 00 DA F7 12 00 26 B5 00 06 74 10 12 00 2E 22 F1
    74 11 12 00 2E 22 74 13 12 00 2E 22 B4

31
Cyclic Redundancy Check
  • Powerful error detection scheme
  • Rather than addition, binary division is used
    -> finite field algebra (Galois fields)
  • Can be easily implemented with a small amount of
    hardware
  • shift registers
  • XOR (for addition and subtraction)

32
CRC
  • Let us assume k message bits and n bits of
    redundancy
  • Associate the bits with the coefficients of a
    polynomial, e.g.
    1 0 1 1 0 1 1 ->
    1·x^6 + 0·x^5 + 1·x^4 + 1·x^3 + 0·x^2 + 1·x + 1
    = x^6 + x^4 + x^3 + x + 1

33
CRC
  • Let M(x) be the message polynomial
  • Let P(x) be the generator polynomial
  • P(x) is fixed for a given CRC scheme
  • P(x) is known by both sender and receiver
  • Create a block polynomial F(x) based on M(x) and
    P(x) such that F(x) is divisible by P(x)

34
CRC
  • Sending
  • Multiply M(x) by x^n
  • Divide x^n·M(x) by P(x)
  • Ignore the quotient and keep the remainder C(x)
  • Form and send F(x) = x^n·M(x) + C(x)
  • Receiving
  • Receive F(x)
  • Divide F(x) by P(x)
  • Accept if the remainder is 0, reject otherwise
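
A bit-level Python sketch of this procedure; the message and generator
are classic textbook example values rather than anything from these
slides (P(x) = x^4 + x + 1 = 10011):

    def mod2_div(bits, poly_bits):
        # Long division over GF(2): XOR the generator under each leading 1
        n = len(poly_bits) - 1
        reg = list(bits)
        for i in range(len(bits) - n):
            if reg[i] == '1':
                for j, p in enumerate(poly_bits):
                    reg[i + j] = str(int(reg[i + j]) ^ int(p))
        return ''.join(reg[-n:])              # the remainder

    msg, poly = '1101011011', '10011'
    c = mod2_div(msg + '0000', poly)          # divide x^n * M(x), keep C(x)
    print(c)                                  # '1110'
    assert mod2_div(msg + c, poly) == '0000'  # receiver: F(x) divisible by P(x)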

35
Properties of CRC
  • Sent F(x), but received F'(x) = F(x) + E(x).
    When will E(x)/P(x) have no remainder, i.e.,
    when does CRC fail to catch an error?
  • Single bit error -> E(x) = x^i
    If P(x) has two or more terms, P(x) will not
    divide E(x)
  • 2 isolated single-bit errors (double errors):
    E(x) = x^i + x^j, i > j
    E(x) = x^j (x^(i-j) + 1)
    Provided that P(x) is not divisible by x, a
    sufficient condition to detect all double errors
    is that P(x) does not divide (x^t + 1) for any t
    up to i-j (i.e., the block length)

36
Properties of CRC
  • Odd number of bit errors: if (x + 1) is a factor
    of P(x), all odd numbers of bit errors are
    detected.
    Proof: Assume E(x) has an odd number of terms and
    that (x + 1) is a factor, so E(x) = (x + 1) T(x).
    Evaluate at x = 1: E(1) = 1, since there is an
    odd number of terms, but (x + 1) T(x) evaluates
    to (1 + 1) T(1) = 0
    -> E(x) ≠ (x + 1) T(x), a contradiction

37
Properties of CRC
  • Short burst errors (length t ≤ n, the number of
    redundant bits):
    E(x) = x^j (x^(t-1) + ... + 1) -> length t,
    starting at bit position j.
    If P(x) has an x^0 term and t ≤ n, P(x) will not
    divide E(x) -> all burst errors of length up to n
    are detected
  • Long burst errors (length t = n + 1):
    Undetectable only if the burst error is identical
    to P(x). P(x) = x^n + ... + 1; E(x) = 1 ... 1;
    the n-1 bits between x^n and x^0 must match.
    Probability of not detecting the error is 2^-(n-1)
  • Longer burst errors (length t > n + 1):
    Probability of not detecting the error is 2^-n

38
Error Correction
  • Hamming codes (more next week)