Title: Physics of information
1. Physics of information
- C. E. Shannon, "Communication in the presence of noise", Proc. Inst. Radio Eng. (1949)
- F. Attneave, "Some informational aspects of visual perception", Psych. Rev. (1954)
Ori Katz, ori.katz@weizmann.ac.il
2. Talk overview
- Information capacity of a physical channel
- Redundancy, entropy and compression
- Connection to biological systems
Emphasis: concepts, intuitions, and examples
3. A little background
- An extension of "A Mathematical Theory of Communication" (1948)
- The basis for the field of information theory (the first use in print of "bit")
- Shannon worked for Bell Labs at the time
- His Ph.D. thesis, "An Algebra for Theoretical Genetics", was never published
- Built the first juggling machine ("W. C. Fields") and a mechanical mouse with learning capabilities ("Theseus")
[Photos: Theseus, the learning mouse; the W. C. Fields juggling machine]
4. A general communication system
- Shannon's route for this abstract problem:
- Encoder: codes each message → a continuous waveform s(t)
- Sampling theorem: s(t) is represented by a finite number of samples
- Geometric representation: the samples → a point in Euclidean space
- Analyze the addition of noise (the physical channel)
- → a limit on the reliable transmission rate
5. The (Nyquist/Shannon) sampling theorem
- Transmitted waveform: a continuous function of time s(t), whose bandwidth W is limited by the physical channel: S(f > W) = 0
- Sample its values at discrete times: Δt = 1/fs (fs = the sampling frequency)
- s(t) can be represented exactly by the discrete samples Vn as long as
- fs ≥ 2W (the Nyquist sampling rate)
- Result: a waveform of duration T is represented by 2WT numbers, i.e. a vector in a 2WT-dimensional space (see the sketch below)
- V = (s(1/2W), s(2/2W), ..., s(2WT/2W))
[Figure: the Fourier (frequency) domain, with S(f > W) = 0 outside the band]
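The reconstruction behind the theorem can be made concrete numerically. A minimal Python sketch (the original demos used Matlab; all parameter values here are invented for the example): a band-limited tone sampled at fs = 2W is rebuilt from its samples by Whittaker-Shannon (sinc) interpolation.

import numpy as np

W = 3.0                               # bandwidth [Hz] (illustrative choice)
fs = 2 * W                            # Nyquist sampling rate
T = 4.0                               # duration [s]
k = np.arange(int(T * fs))            # sample indices: 2WT = 24 numbers
V = np.cos(2 * np.pi * 2.0 * k / fs)  # samples of a 2 Hz tone (inside the band)

# Whittaker-Shannon interpolation: s(t) = sum_k V_k * sinc(fs*t - k)
t = np.linspace(1.0, 3.0, 500)        # interior times (finite-T edge effects ignored)
s_rec = sum(V[j] * np.sinc(fs * t - j) for j in range(len(V)))

err = np.max(np.abs(s_rec - np.cos(2 * np.pi * 2.0 * t)))
print(f"max interior reconstruction error: {err:.3f}")  # shrinks as T grows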
6. An example of the Nyquist rate: a music CD
- Audible human-ear frequency range: 20 Hz - 20 kHz
- The Nyquist rate is therefore 2 × 20 kHz = 40 kHz
- The CD sampling rate, 44.1 kHz, fulfills the Nyquist rate
- Anecdotes:
- The exact rate was inherited from late-70s magnetic-tape storage conversion devices
- A long debate between Philips (44,056 samples/sec) and Sony (44,100 samples/sec)...
7. The geometric representation
- Each continuous signal s(t) of duration T and bandwidth W is mapped to
- → a point in a 2WT-dimensional space (its coordinates are the sampled amplitudes)
- V = (x1, x2, ..., x2WT) = (s(1/2W), ..., s(2WT/2W))
- In our example:
- A 1-hour CD recording → a single point in a space having
- 44,100 × 60 sec × 60 min ≈ 158.8×10^6 dimensions (!!)
- The norm (distance²) in this space measures the signal power / total energy → a Euclidean metric
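To make the mapping tangible, a small Python sketch (the one-second toy signal is invented for the illustration): it reproduces the dimension count above and shows that the squared Euclidean norm of the sample vector measures the signal's energy.

import numpy as np

fs = 44_100                      # CD sampling rate [samples/s]
T = 60 * 60                      # one hour [s]
print(f"{fs * T:,} dimensions")  # 158,760,000, as on the slide

x = np.random.randn(fs)          # one second of a toy unit-power signal
energy = np.sum(x**2)            # squared norm of the sample vector
print(f"average power ~ {energy / len(x):.3f}")  # ~1 for unit-variance samples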
8. Addition of noise in the channel
- Example in a 3-dimensional space (the first 3 samples on the CD)
- V = (x1, x2, ..., x2WT) = (s(Δt), s(2Δt), ..., s(T))
[Figure: mapping of the first three samples to a point in (x1, x2, x3) space]
- Addition of white Gaussian (thermal) noise with an average power N smears each point into a spherical cloud of radius ≈ √N
- For large T, the noise power → N (a statistical average)
- → the received point is located on a spherical shell, at a distance = the noise amplitude ≈ √N
- → the clouded sphere of uncertainty becomes rigid (see the sketch below)
- V_{S+N} = (s(Δt)+n(Δt), s(2Δt)+n(2Δt), ..., s(T)+n(T))
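This concentration is easy to check numerically. A minimal Python sketch (the dimension counts and noise power are invented for the example): as the number of samples grows, the per-sample norm of a Gaussian noise vector concentrates around √N, turning the fuzzy cloud into a thin, rigid shell.

import numpy as np

N = 0.5                                  # noise power per sample (assumed)
for n in (3, 100, 10_000):               # 2WT-like dimension counts
    noise = np.sqrt(N) * np.random.randn(1000, n)       # 1000 noise vectors
    radii = np.linalg.norm(noise, axis=1) / np.sqrt(n)  # per-sample radius
    print(f"n={n}: mean={radii.mean():.3f}, spread={radii.std():.4f}")
# the mean stays near sqrt(N) ~ 0.707 while the spread shrinks with n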
9. The number of distinguishable messages
- Reliable transmission: the receiver must distinguish between any two different messages under the given noise conditions
[Figure: noise spheres of radius √N around signal points of amplitude √P, drawn in (x1, x2, x3) space]
- The maximum number of distinguishable messages (M) → the sphere-packing problem in 2TW dimensions
- The longer the mapped message, the more rigid the spheres
- → the probability of error is as small as one wants (reliable transmission)
10. The channel capacity
- The number of distinguishable messages (coded as signals of length T): M ≈ ((P+N)/N)^(WT)
- The number of distinguishable bits: log2 M = WT log2(1 + P/N)
- The reliably transmittable bit-rate (bits per unit time):
- C = W log2(1 + P/N) (in bits/second)
- The celebrated channel-capacity theorem by Shannon
- Shannon also proved that C can actually be reached
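As a one-liner in code, the theorem reads as follows (a minimal sketch; the telephone-like numbers in the usage line are invented):

import math

def capacity_bps(W_hz: float, snr: float) -> float:
    """Shannon capacity C = W log2(1 + P/N), in bits/second."""
    return W_hz * math.log2(1 + snr)

print(f"{capacity_bps(3_000, 1_000):,.0f} bps")  # ~30 kbps for a 3 kHz, P/N = 1000 channel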
11. Gaussian white noise = thermal noise?
- With no signal, the receiver measures a fluctuating noise
- In our example: pressure fluctuations of the air molecules impinging on the microphone (thermal energy)
- The statistics of thermal noise is Gaussian: P(s(t) = v) ∝ exp(−(m/2kT) v²)
- The power spectral density is constant (power spectrum |S(f)|² = const)
[Figure: the flat power spectrum of white noise vs. the decaying spectra of pink/brown noise]
12. Some examples of physical channels
- Channel capacity limit: C = W log2(1 + P/N) (in bits/second)
- 1) Speech (e.g., this lecture)
- W ≈ 20 kHz, P/N ≈ 1 - 100 → C ≈ 20,000 - 130,000 bps
- Actual bit-rate: (2 words/sec) × (5 letters/word) × (5 bits/letter) = 50 bps
- 2) Visual sensory channel
- Bandwidth (W) ≈ (images/sec) × (receptors/image) × (two eyes) = 25 × 50×10^6 × 2 = 2.5×10^9 Hz
- P/N > 256
- → C ≈ 2.5×10^9 × log2(256) = 20×10^9 bps
- A two-hour movie:
- → 2 hours × 60 min × 60 sec × 20 Gbps ≈ 1.4×10^14 bits ≈ 15,000 GBytes (a DVD holds 4.7 GByte)
- We are not using the full channel capacity → the information is redundant
- Simplify processing by compressing the signal
- Extract only the essential information (what is essential?!)
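These estimates follow directly from the capacity formula; a short check reusing the capacity sketch above:

import math
cap = lambda W, snr: W * math.log2(1 + snr)

print(f"{cap(20e3, 1):,.0f}")    # speech, P/N = 1   -> 20,000 bps
print(f"{cap(20e3, 100):,.0f}")  # speech, P/N = 100 -> ~133,000 bps
W_eye = 25 * 50e6 * 2            # (images/s) x (receptors/image) x (two eyes) = 2.5e9 Hz
print(f"{cap(W_eye, 255):,.0f}") # log2(256) = 8 -> 20e9 bps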
13. Redundant information: a demonstration (using Matlab)
Original sample: 44.1 Ks/s × 16 bit/sample ≈ 705 kbps (CD quality)
14. With only 4 bits per sample
44.1 Ks/s × 4 bit/sample = 176.4 kbps
15. With only 3 bits per sample
44.1 Ks/s × 3 bit/sample = 132.3 kbps
16. With only 2 bits per sample
44.1 Ks/s × 2 bit/sample = 88.2 kbps
17. With only 1 bit per sample (!)
44.1 Ks/s × 1 bit/sample = 44.1 kbps
It does not sound too good, but the essence is there.
The main reason: not all of phase space is accessible to the mouth/ear.
Another (smart) example: a high-compression mp3 algorithm @ 16 kbps
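The original demo ran in Matlab; an equivalent Python sketch of the requantization step (the 440 Hz test tone is invented; a real demo would load the CD samples instead):

import numpy as np

def requantize(samples: np.ndarray, bits: int) -> np.ndarray:
    """Keep only 2**bits amplitude levels per sample (input in [-1, 1])."""
    levels = 2 ** bits
    q = np.round((samples + 1) / 2 * (levels - 1))  # map to integer levels
    return q / (levels - 1) * 2 - 1                 # back to [-1, 1]

x = np.sin(2 * np.pi * 440 * np.arange(44_100) / 44_100)  # 1 s test tone
for b in (4, 3, 2, 1):
    print(f"{b} bit/sample:", np.round(requantize(x, b)[:4], 2))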
18. Visual redundancy / compression
- Images: the redundancies identified in Attneave's paper → the basis of image-compression formats
- Attneave's example: "a bottle on a table" (1954), 80×50 pixels, which contains:
- edges
- short-range similarities
- patterns
- repetitions
- symmetries
- etc., etc.
What information is essential?? (evolution?)
(2008) A 400×600 image: 704 KByte as .bmp → 30.6 KByte, 10.9 KByte, 8 KByte, 6.3 KByte, 5 KByte, and 4 KByte as increasingly compressed .jpg
- Movies: the same, plus consecutive images are similar
- Text: a future language lesson (Lilach David)
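The .bmp-vs-.jpg comparison above is a one-liner with the Pillow library; a hedged sketch ("bottle.bmp" is a hypothetical stand-in for the slide's image file):

import os
from PIL import Image

img = Image.open("bottle.bmp")            # hypothetical 400x600 bitmap
for q in (75, 30, 10, 5):                 # decreasing JPEG quality settings
    name = f"bottle_q{q}.jpg"
    img.save(name, quality=q)             # JPEG exploits the redundancies listed above
    print(name, os.path.getsize(name) // 1024, "KByte")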
19. How much can we compress?
- How many bits are needed to code a message?
- Intuitively: bits = log2 M (M = the number of possible messages)
- Regularities/lawfulness → a smaller M
- Some messages are more probable → can do better than log2 M
- A message can be coded (without loss of information) with
- H = −Σi P(Mi) log2 P(Mi) bits on average
- Intuition:
- Shorter bit-strings can be used for the more probable messages
20. A lossless-compression example (an entropy code)
- Example: M = 4 possible messages (e.g., tones)
- A (94%), B (2%), C (2%), D (2%)
- 1) Without compression: 2 bits/message
- A→00, B→01, C→10, D→11
- 2) A better code:
- A→0, B→10, C→110, D→111
- <bits/message> = 0.94×1 + 0.02×2 + 2×(0.02×3) = 1.1 bits/msg
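A quick numerical check of this example (a small sketch; it also prints the entropy bound, about 0.42 bits/message, which an even smarter code could approach by coding blocks of messages together):

import math

p    = {"A": 0.94, "B": 0.02, "C": 0.02, "D": 0.02}
code = {"A": "0", "B": "10", "C": "110", "D": "111"}

avg_len = sum(p[m] * len(code[m]) for m in p)        # expected bits/message
H = -sum(pm * math.log2(pm) for pm in p.values())    # entropy lower bound
print(f"<bits/message> = {avg_len:.2f}, H = {H:.2f}")  # 1.10 vs ~0.42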
21. Why entropy?
- The only measure that fulfills 4 physical requirements:
- H = 0 if P(Mi) = 1
- A message with P(Mi) = 0 does not contribute
- Maximum entropy for equally distributed messages
- Addition of two independent message-spaces:
- H_xy = H_x + H_y
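The additivity requirement is easy to verify numerically; a minimal sketch (the two distributions are arbitrary choices):

import numpy as np

def H(p):                                # Shannon entropy in bits
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                         # P(Mi) = 0 contributes nothing
    return -np.sum(p * np.log2(p))

px = np.array([0.5, 0.25, 0.25])
py = np.array([0.9, 0.1])
pxy = np.outer(px, py).ravel()           # joint distribution of independent spaces
print(H(px) + H(py), H(pxy))             # equal, as required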
22. The speech vocoder (VOice-CODer)
Models the vocal tract with a small number of parameters.
Exploits the lawfulness of speech (a small subspace only) → fails for musical input.
Used by Skype / Google Talk / GSM (8 - 15 kbps).
The ancestor of modern speech CODECs (COder-DECoders).
[Figure: the human vocal organ]
23. Intuition for H = −Σ p log2 p
- An (almost) general example:
- Suppose that the message (a song) is composed of a series of n symbols (tones)
- Each symbol is one of K possible symbols (tones)
- Each symbol i appears with probability Pi (averaged over all possible communications), i.e. some symbols are used more than others
- How many different messages are possible with the given distribution Pi?
- How many bits do we need to encode all of the possible messages? (See the counting sketch below.)
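The counting can be made explicit: the number of messages with the given symbol counts is the multinomial coefficient n!/(n1!⋯nK!), and by Stirling's approximation, (1/n) log2 of that count tends to H = −Σ Pi log2 Pi. A small numerical sketch (K = 3 and the Pi values are invented):

from math import lgamma, log, log2

def log2_multinomial(counts):
    """log2 of n! / (n1! * n2! * ...), via log-gamma to avoid huge integers."""
    n = sum(counts)
    return (lgamma(n + 1) - sum(lgamma(c + 1) for c in counts)) / log(2)

n = 10_000
P = (0.5, 0.3, 0.2)                        # assumed symbol probabilities
counts = [int(n * p) for p in P]
H = -sum(p * log2(p) for p in P)           # ~1.485 bits/symbol
print(log2_multinomial(counts) / n, H)     # nearly equal for large n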
24. Link to biological systems
- Information is conveyed via a physical channel:
- cell to cell, DNA to cell, a cell to its descendant, neurons / the nervous system
- The physical channel:
- concentrations of molecules (mRNA, ions, ...) as a function of space and time
- Bandwidth limit:
- parameters cannot change at an infinite rate (diffusion and chemical-reaction timescales)
- Signal to noise:
- thermal fluctuations, the environment
- A major difference: transmission is not 100% reliable
- → Model: an overlap of non-rigid uncertainty clouds
- Use the channel-capacity theorem at your own risk...
25. Summary
- Physical channel: the capacity theorem
- SNR, bandwidth
- The geometric representation
- Entropy as a measure of redundancy
- Link to biological systems