Title: Digitization and Information Theory
1 Digitization and Information Theory
2 Digitizing information
- Digitizing signals
- Nyquist's theorem etc.
- Quantization error
- Quantization
- Channel codes
- Pulse Code Modulation
- Parity and error correction
- Block codes
- Probability and Information theory
3 Quantization in time
- Sampling rate or sampling frequency: Fs
- Time between samples: 1/Fs.
- The continuous, analog signal is converted to a discrete set of digital samples. Do we lose the information between samples?
4 No, no loss if done right!
- Examine the extreme case of a very long time between samples compared to the period.
- Nyquist's theorem: the sample rate must be at least twice the highest frequency in the signal.
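A quick numerical check of this (my own sketch, not from the slides; the 7 Hz / 3 Hz / 10 samples-per-second numbers are arbitrary choices): a tone above half the sampling rate produces exactly the same samples as a lower-frequency alias, so the original cannot be recovered.

```python
import numpy as np

Fs = 10.0                          # sampling rate: 10 samples per second
t = np.arange(20) / Fs             # 20 sample times

x_fast = np.cos(2 * np.pi * 7 * t)   # 7 Hz tone, above Fs/2 = 5 Hz
x_alias = np.cos(2 * np.pi * 3 * t)  # 3 Hz tone (7 = 10 - 3)

# Sampled below twice its frequency, the 7 Hz tone gives exactly the same
# sample values as the 3 Hz tone, so the two cannot be told apart.
print(np.allclose(x_fast, x_alias))   # True
```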
5 Sine becomes a square
- A sine wave becomes a square wave when sampled at the Nyquist rate. However, if we filter the square wave to eliminate all harmonics except the fundamental, we recover the sine.
6 Quantization in amplitude
- The number of levels is typically quoted in bits, e.g. 8 bits implies 2^8 - 1 = 255 levels. (Do a 3-bit example.)
- The first bit represents negative or positive.
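A minimal sketch of this kind of sign-magnitude quantization (the function name, step-size convention, and use of numpy are my own choices, not the slide's):

```python
import numpy as np

def quantize(x, n_bits, full_scale=1.0):
    """Round x to the nearest of the 2**n_bits - 1 sign-magnitude levels."""
    q = full_scale / (2 ** (n_bits - 1))      # step size Q; full scale = Q * 2**(N-1)
    levels = np.round(x / q)                  # integer level number
    levels = np.clip(levels, -(2 ** (n_bits - 1) - 1), 2 ** (n_bits - 1) - 1)
    return levels * q

x = np.linspace(-1, 1, 9)
print(quantize(x, 3))    # 3-bit example: only 2**3 - 1 = 7 distinct output values
```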
7 Signal to Noise ratio and Bits
- If the difference between levels is Q, then the maximum signal amplitude is Q·2^(N-1).
- The RMS signal is Q·2^(N-1)/√2. (Derive!)
- Now what is the RMS value of the error that results because the levels are quantized? The error is uniformly distributed between levels.
8 Signal to Noise ratio
- Signal to Noise ratio = (Srms/Erms)^2.
- Example: N = 8 means S/N = 98304.
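These numbers can be checked directly. The sketch below uses the standard result that an error uniformly distributed over one step Q has RMS value Q/√12, which is the derivation the previous slide asks for:

```python
import math

N = 8
Q = 1.0                                   # step size; it cancels in the ratio
S_rms = Q * 2 ** (N - 1) / math.sqrt(2)   # full-scale sine
E_rms = Q / math.sqrt(12)                 # error uniformly spread over one step
snr = (S_rms / E_rms) ** 2
print(snr)                                # 98304.0
print(10 * math.log10(snr))               # about 49.9 dB
```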
9 Quantization leads to distortion
- Low-level signals become distorted by quantization: in an extreme example a low-frequency sine wave becomes a square wave.
- Does the Nyquist filter get rid of the higher harmonics in this case? Why or why not?
10 Adding Noise lessens distortion!
- Dithering: intentionally adding low-level noise.
- The Nyquist filter takes out the high harmonics associated with the spiky transitions, smoothing the wave to get closer to the original.
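A small illustration of why dither helps (my own example, not from the slides): a value stuck between two levels always rounds the same way, but with a little added noise the rounded outputs average back to the true value.

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 0.3          # lies between the quantization levels 0 and 1
Q = 1.0                   # quantization step

# Without dither: rounding always gives 0, an error of 0.3 every time.
print(np.round(true_value / Q) * Q)             # 0.0

# With dither: add uniform noise of +/- half a step before rounding.
dither = rng.uniform(-Q / 2, Q / 2, size=100_000)
dithered = np.round((true_value + dither) / Q) * Q
print(dithered.mean())    # ~0.3: the quantization error now averages away
```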
11 Pulse Width Modulation
12 Pulse Width Modulation
13 Low pass filter acts as an integrator
- Vout is the average of the Vin signal if the Vin signal changes much faster than the RC filter can respond.
- E.g. if the duty cycle D = 0.5 then Vout is halfway between ymin and ymax (see the sketch below).
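A rough simulation of this behaviour (my own sketch; the component values and PWM frequency are arbitrary choices): an RC low-pass driven by a fast PWM square wave settles near D × Vmax.

```python
import numpy as np

RC = 1e-3                 # filter time constant: 1 ms
f_pwm = 100e3             # PWM frequency: 100 kHz, much faster than 1/RC
D = 0.5                   # duty cycle
Vmax = 5.0

dt = 1.0 / (f_pwm * 100)            # 100 time steps per PWM period
t = np.arange(0, 20 * RC, dt)       # simulate for 20 time constants
vin = np.where((t * f_pwm) % 1.0 < D, Vmax, 0.0)   # PWM square wave

vout = np.zeros_like(t)
for i in range(1, len(t)):
    # dVout/dt = (Vin - Vout) / RC, integrated with a simple Euler step
    vout[i] = vout[i - 1] + dt * (vin[i - 1] - vout[i - 1]) / RC

print(vout[-1])           # ~2.5 V, i.e. D * Vmax, with a small ripple
```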
14 Transmitting and storing digital information: Pulse Code Modulation
- Pulse Code Modulation is the most common method to store and transmit digitized signals. In PCM the digitized value in each time window is stored as a binary number.
- In PCM the values are listed sequentially; the 4-bit words are separated by spaces in the example below (in practice the receiver knows the number of bits of digitization).
- 0010 0101 0110 0110 0101 0011 0001 1010 1101 1101
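A tiny sketch of how a receiver that knows the word size recovers the values (the bit stream is the one above; the 4-bit word size is the slide's assumption):

```python
stream = "0010010101100110010100110001101011011101"
bits_per_word = 4

words = [stream[i:i + bits_per_word] for i in range(0, len(stream), bits_per_word)]
values = [int(w, 2) for w in words]
print(words)    # ['0010', '0101', '0110', '0110', '0101', ...]
print(values)   # [2, 5, 6, 6, 5, 3, 1, 10, 13, 13]
```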
15 Channel Codes: transmitting PCM data
- Return To Zero (RTZ) representation of 10101
16 Non Return to Zero (NRZ)
- Representation of 101011 (red lines indicate the time divisions for each bit)
17 Modified Frequency Modulation (MFM)
- Representation of 101011 (Transition means 1)
18 Phase Encoding
- Negative-going transition = 1
- Positive-going transition = 0
- Representation of 101011
- Self-clocking
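A sketch of two of these channel codes as level sequences (the encoding functions and the +1/-1 level convention are my own; each bit cell is split into two half-cells so the mid-cell transitions of phase encoding are visible, and MFM is not attempted here):

```python
def nrz(bits):
    """NRZ: hold the level high for a 1 and low for a 0 for the whole bit cell."""
    return [(+1 if b == "1" else -1) for b in bits for _ in range(2)]

def phase_encode(bits):
    """Phase encoding: a negative-going mid-cell transition is a 1,
    a positive-going transition is a 0, as defined on the slide."""
    out = []
    for b in bits:
        out += [+1, -1] if b == "1" else [-1, +1]
    return out

print(nrz("101011"))
print(phase_encode("101011"))
```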
19 Parity and Error correction
- Error correction: most transmission methods and storage media are unreliable. E.g. CD writers make an average of 165 errors per second.
- We insert redundancy (extra information) to allow us to detect and correct errors.
- English is redundant:
- Thx dgg ape my homtwork
20 Parity Bit
- Add an extra bit: 0 if there is an even number of 1s in a binary word, 1 if there is an odd number of 1s in a binary word.
- Word   Parity bit
- 1001   0
- 1101   1
- 1111   0
- 0001   1
- After every 4 words add an extra parity word.
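A minimal sketch that reproduces the parity table above (the function name is my own):

```python
def parity_bit(word):
    """Even parity: 0 if the word has an even number of 1s, 1 if odd."""
    return word.count("1") % 2

for word in ["1001", "1101", "1111", "0001"]:
    print(word, parity_bit(word))   # reproduces the table on the slide
```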
21 ISBN Numbers on Books
- The last digit of an ISBN number is a type of parity called a checksum digit.
- It is a modulo-11 parity system.
- Example: 0-89006-711-2
- 0×10 + 8×9 + 9×8 + 0×7 + 0×6 + 6×5 + 7×4 + 1×3 + 1×2 = 207
- The last number is chosen to make the total add up to an integer multiple of 11.
- 2×1 added to 207 gives 209 = 11×19 (with no remainder).
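The same modulo-11 check in code (a sketch; the weights 10 down to 1 match the sum above, which weights the check digit by 1, and the handling of a check digit 'X' meaning 10 is a standard ISBN-10 detail not mentioned on the slide):

```python
def isbn10_valid(isbn):
    """Check an ISBN-10: the weighted sum of all ten digits must be a multiple of 11."""
    digits = [10 if c.upper() == "X" else int(c) for c in isbn if c != "-"]
    weights = range(10, 0, -1)                      # 10, 9, ..., 2, 1
    total = sum(w * d for w, d in zip(weights, digits))
    return total % 11 == 0

print(isbn10_valid("0-89006-711-2"))   # True: 207 + 2*1 = 209 = 11*19
```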
22 Block Codes
- Block codes not only find errors but locate and correct them without the need for retransmission.
- Each row below shows the transmitted word with its parity bit, the received word with its parity bit, and the parity recalculated from the received word; the last row is the column parity word.
- Transmitted   Received   Recalculated
- 1001 0        1001 0     0
- 1101 1        1101 1     1
- 0110 0        0111 0     1
- 0011 0        0011 0     0
- 0001          0001       0000
- The row and the column whose received and recalculated parities disagree locate the bad bit (row 3, bit 4 here), which is flipped back to correct it.
- red = parity, green = calculated parity, purple = bad bit
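A sketch of the row-and-column parity idea using the block above (the list layout and function names are my own): the failing row check and failing column check intersect at the bad bit, which is flipped back.

```python
def parity(bits):
    return sum(bits) % 2   # even parity: 0 for an even number of 1s

sent = [[1, 0, 0, 1], [1, 1, 0, 1], [0, 1, 1, 0], [0, 0, 1, 1]]
row_parity = [parity(r) for r in sent]              # 0 1 0 0
col_parity = [parity(c) for c in zip(*sent)]        # 0 0 0 1

# Received block: one bit has been flipped (row 3, column 4 on the slide).
received = [[1, 0, 0, 1], [1, 1, 0, 1], [0, 1, 1, 1], [0, 0, 1, 1]]

bad_rows = [i for i, r in enumerate(received) if parity(r) != row_parity[i]]
bad_cols = [j for j, c in enumerate(zip(*received)) if parity(c) != col_parity[j]]

if bad_rows and bad_cols:
    i, j = bad_rows[0], bad_cols[0]
    received[i][j] ^= 1                             # flip the located bad bit
    print(f"corrected bit at row {i + 1}, column {j + 1}")
print(received == sent)                             # True
```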
23 Information
- How can we quantify information?
- The information content of a message is a measure of its surprise. Sounds abstract, but surprise is related to the probability of the message: a highly unlikely message contains a lot of information, and vice versa.
- We must do a quick review of the mathematics of probability.
24 Probability
- Probability is a statistical concept. The probability of an event is determined by the results of repeated independent trials: it is the number of outcomes of a particular result divided by the total number of trials.
- Some probabilities are obvious by symmetry, e.g. tossing a coin: p(H) = 0.5, p(T) = 0.5.
- Some require an actual test, e.g. tack tossing.
25 Probability
- Probability is a dimensionless number between 0 and 1.
- Does probability depend on history? If I flip 25 heads in a row, is a tail more likely on the next toss?
26 Probability of independent events
- The probability of two events A and B occurring one after another is the product of the probabilities: p(AB) = p(A)p(B).
- Example: What is the probability of a couple's first two children both being boys? What are the odds of 3 boys in a row?
- Probability trees: a diagram method to plot out all outcomes along with their probabilities. The total probability of all outcomes must be 1.
27 Probability of dependent events
- Dependent events: the first trial affects the probabilities of the second.
- Be careful with dependent events, e.g. taking cards from a deck. Odds of 2 kings dealt as the hidden cards in a Texas Hold'em hand.
- Odds of a flush dealt from a complete deck.
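A quick sketch of the two card questions (my own arithmetic and use of math.comb; the flush count here includes straight and royal flushes):

```python
import math

# Two kings as the two hidden (hole) cards: the second draw depends on the first.
p_two_kings = (4 / 52) * (3 / 51)
print(p_two_kings)                       # ~0.0045, i.e. 1 in 221

# Five-card flush dealt from a full deck: choose a suit, then 5 of its 13 cards.
flushes = 4 * math.comb(13, 5)           # includes straight and royal flushes
p_flush = flushes / math.comb(52, 5)
print(p_flush)                           # ~0.0020
```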
28 Averages with probability
- The average of a set of quantities whose probabilities are known is given by the sum of the probability-times-value products over all possible values.
- Example: What is the average value of a single die thrown many times?
- Example: Random walk.
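A sketch of the die example (the simulation size and seed are arbitrary choices): the probability-weighted sum gives 3.5, and a long run of simulated throws averages close to it.

```python
import random

# Expected value: sum of probability * value over all faces.
expected = sum((1 / 6) * face for face in range(1, 7))
print(expected)                                   # 3.5

# Compare with the average of many simulated throws.
random.seed(0)
throws = [random.randint(1, 6) for _ in range(100_000)]
print(sum(throws) / len(throws))                  # close to 3.5
```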
29 Entropy
- A system tends to move towards its most likely configuration. This configuration is the most random. Entropy is a measure of randomness.
- Example 1: List all the states of 4 coins. What mix of heads and tails is most likely?
- Example 2: 100 coins on a tray, all with heads facing up. Is this high or low entropy? Now intermittently whack the tray, flipping a few coins. Which direction does the distribution of heads and tails go?
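A short enumeration for Example 1 (my own sketch): of the 16 equally likely states of 4 coins, the 2-heads / 2-tails mix occurs in the most states.

```python
from itertools import product
from collections import Counter

states = list(product("HT", repeat=4))            # all 2**4 = 16 states
counts = Counter(state.count("H") for state in states)
print(counts)   # 2 heads occurs in 6 of 16 states, the most likely mix
```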
30 Encoding
- PCM can often be a very inefficient means of sending information.
- The efficiency of information storage or transmission can be increased by using short codes for frequently used symbols and longer codes for less frequently used symbols.
- Example: consider a data source with 2 symbols, A with probability p(A) = 0.8 and B with probability p(B) = 0.2.
31 Compression Example
- ABAAAABAAAAABABAAAAAAAABABAAAA
- Code 1: 010000100000101000000001010000 (30 digits)
- Code 2: 10 0 0 110 0 0 110 110 0 0 0 10 10 0 0 (24 digits)
- Code 3: 101 0 110 0 11101 0 0 100 101 0 (22 digits)
- Code 1: the obvious code, A = 0, B = 1.
- Symbol   Probability   Representation   Digits
- A        0.8           0                0.8
- B        0.2           1                0.2
-                                         1.0
- Conclusion: 1 digit per letter.
32 Other encoding schemes
- Code 2: group pairs of letters
- Symbols   Prob.   Representation   Digits
- AA        0.64    0                0.64
- AB        0.16    10               0.32
- BA        0.16    110              0.48
- BB        0.04    111              0.12
-                                    1.56
- 1.56 digits for 2 letters, i.e. 0.78 digits per letter.
33 Yet another coding scheme
- Code 3: group in 3s.
- Symbols   Prob.    Representation   Digits
- AAA       0.512    0                0.512
- AAB       0.128    100              0.384
- ABA       0.128    101              0.384
- BAA       0.128    110              0.384
- ABB       0.032    11100            0.160
- BAB       0.032    11101            0.160
- BBA       0.032    11110            0.160
- BBB       0.008    11111            0.040
-                                     2.184
- 2.184 digits for 3 letters, i.e. 0.728 digits per letter.
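A sketch that reproduces the three averages from p(A) = 0.8, p(B) = 0.2 and the code tables on these slides:

```python
import math

pA, pB = 0.8, 0.2
codes = {
    "Code 1": {"A": "0", "B": "1"},
    "Code 2": {"AA": "0", "AB": "10", "BA": "110", "BB": "111"},
    "Code 3": {"AAA": "0", "AAB": "100", "ABA": "101", "BAA": "110",
               "ABB": "11100", "BAB": "11101", "BBA": "11110", "BBB": "11111"},
}

for name, table in codes.items():
    group = len(next(iter(table)))                 # letters per symbol group
    avg = sum(math.prod(pA if c == "A" else pB for c in sym) * len(word)
              for sym, word in table.items())
    print(f"{name}: {avg:.3f} digits per {group} letters "
          f"= {avg / group:.3f} digits per letter")   # 1.0, 0.78, 0.728
```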
34 Information Theory
- Claude Shannon (1948): the quantitative study of information.
- Postulates:
- A signal consists of a series of messages, each conveying information from a source. The information is unknown before its arrival.
- Each message need not contain the same amount of information.
- The information content can be measured by the degree of uncertainty which is removed upon the arrival of the message.
35 Additive not Multiplicative scale
- We want a measure of information that is additive. As each message of a signal arrives it should carry a certain amount of information that adds to the previous information.
- Information is related to probability, but probabilities combine multiplicatively. How can we change × into +? A log scale.
- Example: hats come in 3 sizes and 2 colors, so there are 3 × 2 = 6 combinations.
36 Information content of a message
- Information content of a message: I = -log2(p), where p is the probability of the message.
- Note the log to base 2. Why? Because the information age is binary, i.e. a two-level system.
- Why the negative?
37 How do I find log2(y)?
- Remember the definition of a log: in the equation y = 2^x, x is log2(y).
- Take log10 of both sides: log10(y) = x log10(2), so log2(y) = log10(y)/log10(2).
38 Info content of a signal
- We defined the information content of a single message. The info content of a signal is the average information content for a large number of messages that make up a typical signal. For a signal with n possible messages the average info per message is Iave = -Σ pi·log2(pi), summed over all n messages.
39 Example
- Letters of the alphabet (26 of them). Assume they occur with equal probability in a message: pi = 1/26.
- The average information content per message is Iave = -26 × (1/26)·log2(1/26) = log2(26) ≈ 4.7 bits.
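A one-line check of this number, also using the log10 trick from slide 37 (my own sketch):

```python
import math

# log2(26) computed two ways: directly, and via log10 as on slide 37.
print(math.log2(26))                      # ~4.70 bits
print(math.log10(26) / math.log10(2))     # same value
```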
40 What does 4.7 mean?
- 4.7 bits per message is the average information content per message. Compare this number to the number of bits required to send 26 letters. How many? 5 bits (2^5 = 32, so we have a few left over).
- Example: A gauge has 100 levels. How many bits are required to encode the information if every level is equally probable? How about p1 = 0.5 and p2 through p100 = 1/198 each?
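A sketch of the gauge example (my own arithmetic for the two cases):

```python
import math

# Case 1: 100 equally probable levels.
print(math.log2(100))                          # ~6.64 bits, so 7 bits are needed

# Case 2: p1 = 0.5 and the other 99 levels share the rest equally (1/198 each).
probs = [0.5] + [1 / 198] * 99
print(-sum(p * math.log2(p) for p in probs))   # ~4.3 bits per message
```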
41 Efficiency and Redundancy
- Code efficiency is defined as I/M, where I is the average info content and M is the number of encoding bits per message.
- Code redundancy = M - I bits/message.
42 Huffman Code
- A method to create an efficient code if the message probabilities are known.
- Form the Huffman tree:
- List the messages in descending order of probability.
- Draw a tree to combine the least likely pairs of signals first.
- Keep grouping until all are paired.
- Start with 1 and 0 at the far end of the tree. Move back, adding 1 and 0 at each junction.
- Read from the end of the tree to each message to get its code (a code sketch follows the example on slide 45).
43 Example
- The signal contains 7 messages with probabilities
as shown below
44 What is the number of bits per message using the Huffman code?
45 What is the theoretical minimum number of bits per message using information theory?
- Iave = -[0.305·log2(0.305) + 0.227·log2(0.227) + 0.161·log2(0.161) + 0.134·log2(0.134) + 0.098·log2(0.098) + 0.05·log2(0.05) + 0.024·log2(0.024)]
- = 2.494 bits per message.
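A sketch of the Huffman construction from slide 42 applied to these probabilities (heapq, the message names m1 to m7, and the exact 0/1 labelling are my own choices; the code lengths and the 2.54-bit average match the slide's result, and the entropy matches the 2.494 above):

```python
import heapq
import math

# Probabilities from slide 45; the message names are placeholders.
probs = {"m1": 0.305, "m2": 0.227, "m3": 0.161, "m4": 0.134,
         "m5": 0.098, "m6": 0.05, "m7": 0.024}

# Heap entries: (probability, tie-breaker, subtree); a subtree is either a
# message name or a pair of subtrees.
heap = [(p, i, name) for i, (name, p) in enumerate(probs.items())]
heapq.heapify(heap)
counter = len(heap)
while len(heap) > 1:                      # combine the two least likely first
    p1, _, t1 = heapq.heappop(heap)
    p2, _, t2 = heapq.heappop(heap)
    counter += 1
    heapq.heappush(heap, (p1 + p2, counter, (t1, t2)))

codes = {}
def assign(tree, prefix=""):
    """Move back through the tree, adding a 0 or 1 at each junction."""
    if isinstance(tree, str):
        codes[tree] = prefix or "0"
        return
    assign(tree[0], prefix + "0")
    assign(tree[1], prefix + "1")

assign(heap[0][2])

avg_bits = sum(probs[m] * len(c) for m, c in codes.items())
entropy = -sum(p * math.log2(p) for p in probs.values())
print(codes)
print(f"Huffman average: {avg_bits:.3f} bits/message")   # ~2.54
print(f"entropy:         {entropy:.3f} bits/message")    # ~2.494
```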
46 Redundancy and efficiency
- Actual code value: M = 2.54 bits per message
- Theoretical minimum: I = 2.494 bits per message
- Redundancy = M - I = 2.54 - 2.494 = 0.046 bits per message
- Efficiency = I/M = 2.494/2.54 ≈ 98%
47 That's all folks