Title: Some recent results in mathematics related to data transmission:
1Some recent results in mathematics related to
data transmission
- Michel Waldschmidt
- Université P. et M. Curie - Paris VI
- Centre International de Mathématiques Pures et
Appliquées - CIMPA
India, October-November 2007
http//www.math.jussieu.fr/miw/
2French Science Today
India October- November 2007
Some recent results in mathematics related to
data transmission
Starting with card tricks, we show how
mathematical tools are used to detect and to
correct errors occurring in the transmission of
data. These so-called "error-detecting codes"
and "error-correcting codes" enable
identification and correction of the errors
caused by noise or other impairments during
transmission from the transmitter to the
receiver. They are used in compact disks to
correct errors caused by scratches, in satellite
broadcasting, in digital money transfers, and in
telephone connections. They are also useful for
improving the reliability of data storage media,
as well as for correcting errors caused when a hard
drive fails. The National Aeronautics and Space
Administration (NASA) has used many different
error-correcting codes for deep-space
telecommunications and orbital missions.
3French Science Today
India November 2007
Some recent results in mathematics related to
data transmission
Most of the theory arises from earlier
developments of mathematics which were far
removed from any concrete application. One of the
main tools is the theory of finite fields, which
was invented by Galois in the 19th century, for
solving polynomial equations by means of
radicals. The first error-correcting code
happened to occur in a sports newspaper in Finland
in 1930. The mathematical theory of information
was created half a century ago by Claude Shannon.
The mathematics behind these technical devices
is being developed in a number of scientific
centers all around the world, including in India
and in France.
4French Science Today
Mathematical aspects of Coding Theory in France
The main teams in the domain are gathered in the
group C2 ''Coding Theory and Cryptography'',
which belongs to a more general group (GDR)
''Mathematical Informatics''.
5French Science Today
The most important are:
- INRIA Rocquencourt
- Université de Bordeaux
- ENST Télécom Bretagne
- Université de Limoges
- Université de Marseille
- Université de Toulon
- Université de Toulouse
6INRIA
Brest
Limoges
Bordeaux
Marseille
Toulon
Toulouse
7Institut National de Recherche en Informatique et
en Automatique
http//www-rocq.inria.fr/codes/
National Research Institute in Computer Science
and Automation
8Institut de Mathématiques de Bordeaux
http//www.math.u-bordeaux1.fr/maths/
Lattices and combinatorics
9École Nationale Supérieure des Télécommunications
de Bretagne
http//departements.enst-bretagne.fr/sc/recherche/
turbo/
Turbocodes
10Research Laboratory of LIMOGES
http//www.xlim.fr/
11Marseille Institut de Mathématiques de Luminy
Arithmetic and Information Theory Algebraic
geometry over finite fields
12http//grim.univ-tln.fr/
Université du Sud Toulon-Var
Boolean functions
13Université de Toulouse Le Mirail
Algebraic geometry over finite fields
http//www.univ-tlse2.fr/grimm/algo
14GDR IM: Groupe de Recherche Informatique
Mathématique
http//www.gdr-im.fr/
- The GDR ''Mathematical Informatics'' gathers all
the French teams which work on computer science
problems with mathematical methods.
15Some instances of scientific domains of the GDR
IM
http//www.gdr-im.fr/
- Calcul Formel (Symbolic computation)
- ARITH Arithmétique (Arithmetics)
- COMBALG Combinatoire algébrique (Algebraic
Combinatorics)
16French Science Today
Mathematical Aspects of Coding Theory in India
- Indian Institute of Technology Bombay
- Indian Institute of Science Bangalore
- Indian Institute of Technology Kanpur
- Panjab University Chandigarh
- University of Delhi, Delhi
17Chandigarh
Delhi
Kanpur
Bombay
Bangalore
18IIT Bombay: Indian Institute of Technology
http//www.iitb.ac.in/
- Department of Mathematics
- Department of Electrical Engineering
19http//www.iisc.ernet.in/
- Department of Mathematics
- Finite fields and Coding Theory: classification of
permutation polynomials, study of the PAPR
(peak-to-average power ratio) of families of codes,
construction of codes with low PAPR.
20IIT Kanpur: Indian Institute of Technology
http//www.iitk.ac.in/
21Department of Mathematics
http//www.puchd.ac.in/
22http//www.du.ac.in/
- Department of Mathematics
23Error Correcting Codes, by Priti Shankar
http//www.ias.ac.in/resonance/
- How Numbers Protect Themselves
- The Hamming Codes, Volume 2 Number 1
- Reed Solomon Codes, Volume 2 Number 3
24Playing cards
25I know which card you selected
- Among a collection of playing cards, you select
one without telling me which one it is.
- I ask you some questions, which you answer by yes or
no.
- Then I am able to tell you which card you
selected.
26 2 cards
- You select one of these two cards
- I ask you one question and you answer yes or no.
- I am able to tell you which card you selected.
27 2 cards: one question suffices
28 4 cards
29First question: is it one of these two?
30Second question: is it one of these two?
31 4 cards: 2 questions suffice
Y Y
Y N
N Y
N N
32 8 Cards
33First question: is it one of these?
34Second question: is it one of these?
35Third question: is it one of these?
36 8 Cards: 3 questions
YYY
YYN
YNY
YNN
NYY
NYN
NNY
NNN
37Yes / No
- 0 / 1
- Yin / Yang
- True / False
- White / Black
- Heads / Tails (tossing a coin)
38 3 questions, 8 solutions
0 0 0
0 0 1
0 1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1
39 16 Cards
- If you select one card among a set of 16, I
shall know which one it is, once you answer my 4
questions by yes or no.
40(No Transcript)
41Label the 16 cards
42In binary expansion
43Ask the questions so that the answers are
44The 4 questions
- Is the first digit 0 ?
- Is the second digit 0 ?
- Is the third digit 0 ?
- Is the fourth digit 0 ?
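The four questions can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `card_from_answers` is hypothetical, and each answer is assumed to be a boolean where True means "yes, that digit is 0".

```python
def card_from_answers(answers):
    """Rebuild the card label (0..15) from the 4 yes/no answers.

    answers[i] is True when the player says "yes, digit i is 0".
    """
    label = 0
    for yes in answers:
        bit = 0 if yes else 1      # "yes" means the digit is 0
        label = 2 * label + bit    # append the digit to the binary expansion
    return label


# Answers yes, no, no, yes give the digits 0 1 1 0, i.e. card number 6.
print(card_from_answers([True, False, False, True]))  # 6
```

Each of the 16 possible answer patterns gives a different label, which is why 4 questions suffice.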
45More difficult
46One answer may be wrong
- Consider the same problem, but you are allowed to
give (at most) one wrong answer.
- How many questions are required so that I am able
to know whether your answers are right or not?
And if they are right, to know the card you
selected?
47Detecting one mistake
- If I ask one more question, I shall be able to
detect if there is one of your answers which is
not compatible with the others.
- And if you made no mistake, I shall tell you
which is the card you selected.
48Detecting one mistake with 2 cards
- With two cards I just ask the same
question twice.
- If both your answers are the same, you did not
lie and I know which card you selected.
- If your answers are not the same, I know that one
is right and one is wrong (but I don't know which
one is correct!).
49 4 cards
50First question: is it one of these two?
51Second question: is it one of these two?
52Third question: is it one of these two?
53 4 cards: 3 questions
Y Y Y
Y N N
N Y N
N N Y
54 4 cards: 3 questions
0 0 0
0 1 1
1 0 1
1 1 0
55Correct triple of answers
Wrong triple of answers
One change in a correct triple of answers yields
a wrong triple of answers
56Boolean addition
- even + even = even
- even + odd = odd
- odd + even = odd
- odd + odd = even
57Parity bit
- Use one more bit which is the Boolean sum of the
previous ones.
- Now for a correct answer the sum of the bits
should be 0.
- If there is exactly one error, the parity bit
will detect it: the sum of the bits will be 1
instead of 0.
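A minimal sketch of the parity bit, assuming words are represented as Python lists of 0s and 1s (the names `add_parity_bit` and `error_detected` are illustrative only):

```python
def add_parity_bit(bits):
    """Append the Boolean sum of the bits, so the total sum becomes even."""
    return bits + [sum(bits) % 2]


def error_detected(word):
    """A word with an odd sum cannot be a correct word."""
    return sum(word) % 2 == 1


word = add_parity_bit([1, 0, 1])   # [1, 0, 1, 0]
corrupted = word.copy()
corrupted[1] ^= 1                  # flip one bit to simulate one wrong answer

print(error_detected(word))        # False
print(error_detected(corrupted))   # True
```

The check reports that something went wrong but not where, so a single parity bit detects one error without correcting it.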
58 8 Cards
59 4 questions for 8 cards
Use the 3 previous questions plus the parity bit
question
60First question: is it one of these?
61Second question: is it one of these?
62Third question: is it one of these?
63Fourth question: is it one of these?
64 16 cards, at most one wrong answer: 5 questions
to detect the mistake
65Ask the 5 questions so that the answers are
66Correcting one mistake
- Again I ask you questions where your answer is
yes or no, again you are allowed to give at most
one wrong answer, but now I want to be able to
know which card you selected - and also to tell
you whether and when you lied.
67With 2 cards
- I repeat the same question three times.
- The most frequent answer is the right one: vote
with the majority.
68With 4 cards
I repeat my two questions three times each,
which makes 6 questions
69With 8 Cards
I repeat 3 times my 3 questions, which makes 9
questions
70With 16 cards, this process requires 3 × 4 = 12
questions
71 16 cards, 7 questions
- We shall see that 7 questions suffice to recover
the exact result if there is at most one wrong
answer.
72Error correcting codes
73Coding Theory
- Coding theory is the branch of mathematics
concerned with transmitting data across noisy
channels and recovering the message. Coding
theory is about making messages easy to read:
don't confuse it with cryptography, which is the
art of making messages hard to read!
74Claude Shannon
- In 1948, Claude Shannon, working at Bell
Laboratories in the USA, inaugurated the whole
subject of coding theory by showing that it was
possible to encode messages in such a way that
the number of extra bits transmitted was as small
as possible. Unfortunately his proof did not give
any explicit recipes for these optimal codes.
75Richard Hamming
- Around the same time, Richard Hamming, also at
Bell Labs, was using machines with lamps and
relays that had an error-detecting code. The digits
from 1 to 9 were sent on rows of 5 lamps, with
two on and three off. Errors were very frequent;
they were easy to detect, but then one had
to restart the process.
76The first correcting codes
- For his research, Hamming was allowed to have
the machine working during the weekend only, and
it ran in automatic mode. At each error
the machine stopped until the next Monday
morning.
- "If it can detect the error," complained
Hamming, "why can't it correct it?"
77The origin of Hamming's code
- He decided to find a device so that the machine
would not only detect the errors but also correct
them.
- In 1950, he published details of his work on
explicit error-correcting codes with information
transmission rates more efficient than simple
repetition.
- His first attempt produced a code in which four
data bits were followed by three check bits which
allowed not only the detection but the correction
of a single error.
78(No Transcript)
79Codes and Geometry
- 1949: Marcel Golay (a radar specialist)
produced two remarkably efficient codes.
- Eruptions on Io (Jupiter's volcanic moon)
- 1963: John Leech uses Golay's ideas for sphere
packing in dimension 24; classification of
finite simple groups
- 1971: no other perfect code than the two found by
Golay.
80Error Correcting Codes Data Transmission
- Telephone
- CD or DVD
- Image transmission
- Sending information through the Internet
- Radio control of satellites
81Applications of error correcting codes
- Transmissions by satellites
- Compact discs
- Cellular phones
82- Olympus Mons on the planet Mars
- Image from Mariner 9 in 1971.
83- Information was sent to the earth using an
error correcting code which corrected 7 bits
among 32.
- In each group of 32 bits, 26 are control bits
and the 6 others contain the information.
84- Between 1969 and 1973 the NASA Mariner probes
used a powerful Reed--Muller code capable of
correcting 7 errors out of 32 bits transmitted,
consisting now of 6 data bits and 26 check bits!
Over 16,000 bits per second were relayed back to
Earth.
The North polar cap of Mars, taken by Mariner 9
in 1972.
85Voyager 1 and 2 (1977)
- Journey: Cape Canaveral, Jupiter, Saturn, Uranus,
Neptune.
- Sent information by means of a binary code which
corrected 3 errors on words of length 24.
86Mariner 9 spacecraft (1971)
- Sent black and white photographs of Mars
- Grid of 600 by 600, each pixel being assigned one
of 64 brightness levels
- Reed-Muller code with 64 words of 32 letters,
minimal distance 16, correcting 7 errors, rate
3/16
87Voyager (1979-81)
- Color photos of Jupiter and Saturn
- Golay code with 4096 = 2^12 words of 24 letters,
minimal distance 8, corrects 3 errors, rate 1/2.
- 1998: loss of control of the SOHO satellite, recovered
thanks to double correction by turbo code.
88NASA's Pathfinder mission on Mars
- The power of the radio transmitters on these
craft is only a few watts, yet this information
is reliably transmitted across hundreds of
millions of miles without being completely
swamped by noise.
Sojourner rover and Mars Pathfinder lander
89Listening to a CD
- On a CD as well as on a computer, each sound is
coded by a sequence of 0s and 1s, grouped in
octets
- Further octets are added which detect and correct
small mistakes.
- In a CD, two codes join forces and manage to
handle situations with a vast number of errors.
90Coding the sound on a CD
On CDs the signal is encoded digitally. To guard
against scratches, cracks and similar damage,
two "interleaved" codes are used, which can
correct up to 4,000 consecutive errors (about
2.5 mm of track).
- Using a finite field with 256 elements, it is
possible to correct 2 errors in each word of 32
octets with 4 control octets for 28 information
octets.
91A CD of high quality may have more than 500 000
errors!
- After processing of the signal in the CD player,
these errors do not lead to any disturbing noise.
- Without error-correcting codes, there would be no
CD.
92 1 second of audio signal = 1 411 200 bits
- The mathematical theory of error correcting codes
provides more reliability and at the same time
decreases the cost. It is also used for data
transmission via the internet or satellites.
93Codes and Mathematics
- Algebra (discrete mathematics, finite fields, linear
algebra, ...)
- Geometry
- Probability and statistics
94Finite fields and coding theory
- Solving algebraic equations with radicals: finite
fields theory, Evariste Galois (1811-1832)
- Construction of regular polygons with ruler and
compass
- Group theory
Srinivasa Ramanujan (1887-1920)
95Coding Theory
Source -> Coded Text -> (transmission) -> Coded Text -> Receiver
96Error correcting codes
Source -> Coded Text -> (transmission with noise) -> Coded Text -> Receiver
97 Principle of coding theory
- Only certain words are allowed (code =
dictionary of valid words).
- The "useful" letters carry the information,
the other ones (control bits) allow detecting or
correcting errors.
98Detecting one error by sending twice the message
- Send twice each bit
- 2 code words among 4 = 2^2 possible words
- (1 useful letter among 2)
- Code words
- (two letters)
- 0 0
- and
- 1 1
- Rate 1/2
99- Principle of codes detecting one error
- Two distinct code words differ in at least
two letters
100Detecting one error with the parity bit
- Code words (three letters)
- 0 0 0
- 0 1 1
- 1 0 1
- 1 1 0
- Parity bit: (x y z) with z = x + y.
- 4 code words (among 8 words with 3 letters),
- 2 useful letters (among 3).
- Rate 2/3
101Code Words / Non Code Words
- 0 0 0 / 0 0 1
- 0 1 1 / 0 1 0
- 1 0 1 / 1 0 0
- 1 1 0 / 1 1 1
- Two distinct code words
- have at least two distinct letters.
102Check bit
- In the International Standard Book Number (ISBN)
system used to identify books, the last digit of the
ten-digit number is a check digit.
- The Chemical Abstracts Service (CAS) method of
identifying chemical compounds and the United States
Postal Service (USPS) both use check digits.
- Modems and computer memory chips compute checksums.
- One or more check digits are commonly embedded in
credit card numbers.
103Correcting one error by repeating three times
- Code words
- (three letters)
- 0 0 0
- 1 1 1
- Rate 1/3
- Send each bit three times
- 2 code words
- among 8 possible ones
- (1 useful letter among 3)
104- Correct 0 0 1 as 0 0 0
- 0 1 0 as 0 0 0
- 1 0 0 as 0 0 0
- and
- 1 1 0 as 1 1 1
- 1 0 1 as 1 1 1
- 0 1 1 as 1 1 1
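The correction table above is a majority vote. A short sketch, with hypothetical function names and words represented as lists of bits:

```python
def encode_repetition(bit, times=3):
    """Send the same bit several times."""
    return [bit] * times


def decode_majority(word):
    """Return the bit that occurs most often in the received word."""
    return 1 if 2 * sum(word) > len(word) else 0


received = encode_repetition(1)    # [1, 1, 1]
received[0] ^= 1                   # one error during transmission: [0, 1, 1]
print(decode_majority(received))   # 1
```

With three repetitions one error is always outvoted; two errors would be decoded wrongly.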
105- Principle of codes correcting one error
- Two distinct code words differ in at least three
letters
106Hamming Distance between two words
- number of places in which the two words
- differ
- Examples
- (0,0,1) and (0,0,0) have distance 1
- (1,0,1) and (1,1,0) have distance 2
- (0,0,1) and (1,1,0) have distance 3
- Richard W. Hamming (1915-1998)
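The definition translates directly into code. The sketch below assumes words are tuples of equal length; `hamming_distance` is an illustrative name.

```python
def hamming_distance(u, v):
    """Number of places in which the two words differ."""
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(u, v))


print(hamming_distance((0, 0, 1), (0, 0, 0)))  # 1
print(hamming_distance((1, 0, 1), (1, 1, 0)))  # 2
print(hamming_distance((0, 0, 1), (1, 1, 0)))  # 3
```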
107Hamming distance 1
108Hamming's unit sphere
- The unit sphere around a word includes the words
at distance at most 1
109At most one error
110Words at distance at least 3
111Decoding
112The code (0 0 0) (1 1 1)
- The set of words with three letters (eight
elements) splits into two balls
- The centers are (0,0,0) and (1,1,1)
- Each of the two balls consists of elements at
distance at most 1 from the center
113Two or three 0: ball with center (0,0,0), containing (0,0,1), (0,1,0), (1,0,0)
Two or three 1: ball with center (1,1,1), containing (1,1,0), (1,0,1), (0,1,1)
114 Generalization of the check bit for correcting
one error
- The check bit makes it possible to detect one error by
adding one bit.
- Is it possible to extend this idea in order to
correct one error?
- Answer by Hamming in 1950: YES!
115Hamming's code
116How to compute e, f, g from a, b, c, d (Boolean addition)
e = a + b + d
f = a + c + d
g = a + b + c
117Hamming code
- Words of 7 letters
- Code words (16 = 2^4 among 128 = 2^7)
- (a b c d e f g)
- with
- e = a + b + d
- f = a + c + d
- g = a + b + c
- Rate 4/7
118 16 code words of 7 letters
- 0 0 0 0 0 0 0
- 0 0 0 1 1 1 0
- 0 0 1 0 0 1 1
- 0 0 1 1 1 0 1
- 0 1 0 0 1 0 1
- 0 1 0 1 0 1 1
- 0 1 1 0 1 1 0
- 0 1 1 1 0 0 0
- 1 0 0 0 1 1 1
- 1 0 0 1 0 0 1
- 1 0 1 0 1 0 0
- 1 0 1 1 0 1 0
- 1 1 0 0 0 1 0
- 1 1 0 1 1 0 0
- 1 1 1 0 0 0 1
- 1 1 1 1 1 1 1
Two distinct code words have at least three
distinct letters
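The table of 16 code words can be regenerated from the three check bits, and the distance claim can be verified by brute force. A sketch (the function name `encode` is illustrative; the check bits are e = a + b + d, f = a + c + d, g = a + b + c with Boolean addition):

```python
from itertools import combinations, product


def encode(a, b, c, d):
    """Append the three check bits e, f, g to the data bits a, b, c, d."""
    e = (a + b + d) % 2
    f = (a + c + d) % 2
    g = (a + b + c) % 2
    return (a, b, c, d, e, f, g)


codewords = [encode(*data) for data in product((0, 1), repeat=4)]

# Smallest number of differing letters over all pairs of distinct code words.
min_distance = min(
    sum(x != y for x, y in zip(u, v)) for u, v in combinations(codewords, 2)
)

print(len(codewords), min_distance)  # 16 3
```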
119The binary code of Hamming (1950)
- It is a linear code (the sum of two code words
is a code word) and the 16 balls of radius 1 with
centers in the code words cover all the space of
the 128 binary words of length 7
- (each word has 7 neighbors: (7 + 1) × 16 = 128).
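Because the 16 balls cover the 128 words without overlap, a received word with at most one error can be corrected by looking for the unique code word at distance at most 1. The sketch below uses brute-force search for clarity (a practical decoder would use syndromes instead); the names `encode`, `distance` and `correct` are illustrative.

```python
from itertools import product


def encode(a, b, c, d):
    """Hamming code word (a b c d e f g) with e=a+b+d, f=a+c+d, g=a+b+c."""
    return (a, b, c, d, (a + b + d) % 2, (a + c + d) % 2, (a + b + c) % 2)


CODEWORDS = [encode(*data) for data in product((0, 1), repeat=4)]


def distance(u, v):
    return sum(x != y for x, y in zip(u, v))


def correct(word):
    """Return the unique code word within distance 1 of the received word."""
    nearby = [cw for cw in CODEWORDS if distance(word, cw) <= 1]
    if len(nearby) != 1:
        raise ValueError("expected exactly one code word within distance 1")
    return nearby[0]


sent = encode(1, 0, 1, 1)          # (1, 0, 1, 1, 0, 1, 0)
received = list(sent)
received[5] ^= 1                   # one transmission error
print(correct(tuple(received)) == sent)  # True
```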
120Playing cards
121 7 questions to find the selected card among 16,
with one possible wrong answer
- Replace the cards by labels from 0 to 15 and
write the binary expansions of these:
- 0000, 0001, 0010, 0011
- 0100, 0101, 0110, 0111
- 1000, 1001, 1010, 1011
- 1100, 1101, 1110, 1111
- Using the Hamming code, get 7 digits.
- Select the questions so that Yes = 0 and No = 1
122 7 questions to find the selected number in
{0, 1, 2, ..., 15} with one possible wrong answer
- Is the first binary digit 1?
- Is the second binary digit 1?
- Is the third binary digit 1?
- Is the fourth binary digit 1?
- Is the number in {1, 2, 4, 7, 9, 10, 12, 15}?
- Is the number in {1, 2, 5, 6, 8, 11, 12, 15}?
- Is the number in {1, 3, 4, 6, 8, 10, 13, 15}?
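The seven questions can be played by a program. The sketch below is an illustration under stated assumptions: answers are recorded as 1 for yes and 0 for no (a convention chosen here for readability), the three sets are copied from the last three questions, and the names `truthful_answers` and `find_number` are hypothetical.

```python
Q5 = {1, 2, 4, 7, 9, 10, 12, 15}
Q6 = {1, 2, 5, 6, 8, 11, 12, 15}
Q7 = {1, 3, 4, 6, 8, 10, 13, 15}


def truthful_answers(n):
    """The 7 honest answers (1 = yes, 0 = no) for the number n in 0..15."""
    digits = [(n >> shift) & 1 for shift in (3, 2, 1, 0)]
    return digits + [int(n in Q5), int(n in Q6), int(n in Q7)]


def find_number(answers):
    """Return (number, index of the wrong answer or None).

    Works whenever at most one of the 7 answers is wrong.
    """
    for n in range(16):
        truth = truthful_answers(n)
        wrong = [i for i in range(7) if truth[i] != answers[i]]
        if len(wrong) <= 1:
            return n, (wrong[0] + 1 if wrong else None)
    raise ValueError("more than one wrong answer")


answers = truthful_answers(11)     # [1, 0, 1, 1, 0, 1, 0]
answers[2] ^= 1                    # lie on the third question
print(find_number(answers))        # (11, 3)
```

The search is unambiguous because any two honest answer patterns differ in at least three places, so a single lie cannot bring a pattern within one step of another number's pattern.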
123The Hat Problem
124The Hat Problem
- Three people are in a room, each has a hat on his
head, the color of which is black or white. Hat
colors are chosen randomly. Everybody sees the
color of the hat on everyone else's head, but not on
their own. People do not communicate with each
other.
- Everyone gets to guess (by writing on a piece of
paper) the color of their hat. They may write
Black/White/Abstain.
125- The people in the room win together or lose
together.
- The team wins if at least one of the three people
did not abstain, and everyone who did not abstain
guessed the color of their hat correctly.
- How will this team decide a good strategy with a
high probability of winning?
126- Simple strategy: they agree that two of them
abstain and the other guesses randomly.
- Probability of winning: 1/2.
- Is it possible to do better?
127- Hint
- Improve the odds by using the available
information: everybody sees the color of the hat
on everyone's head but his own.
128- Better strategy: if a member sees two different
colors, he abstains. If he sees the same color
twice, he guesses that his hat has the other
color.
129- The two people with white hats see one white
hat and one black hat, so they abstain.
The one with a black hat sees two white hats,
so he writes black.
They win!
130- The two people with black hats see one white
hat and one black hat, so they abstain.
The one with a white hat sees two black hats,
so he writes white.
They win!
131 Everybody sees two white hats, and therefore
writes black on the paper.
132 Everybody sees two black hats, and therefore
writes white on the paper.
133two white or two black
134three white or three black
135- The team wins exactly when the three hats do not
all have the same color, that is in 6 cases out
of a total of 8
- Probability of winning: 3/4.
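The count of 6 winning cases out of 8 can be checked by enumerating all distributions of hats. In this sketch white is coded as 0 and black as 1, `None` stands for "abstain", and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def guess(seen):
    """Strategy of one player who sees the two other hats."""
    first, second = seen
    if first != second:
        return None            # two different colors: abstain
    return 1 - first           # same color twice: bet on the other color


def team_wins(hats):
    guesses = [
        guess([h for j, h in enumerate(hats) if j != i]) for i in range(3)
    ]
    bets = [(g, hats[i]) for i, g in enumerate(guesses) if g is not None]
    # At least one bet, and every bet made is correct.
    return bool(bets) and all(g == h for g, h in bets)


wins = sum(team_wins(hats) for hats in product((0, 1), repeat=3))
print(wins, Fraction(wins, 8))  # 6 3/4
```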
136Connection with error detecting codes
- Replace white by 0 and black by 1
- hence the distribution of colors becomes a
word of three letters on the alphabet {0, 1}
- Consider the centers of the balls (0,0,0) and
(1,1,1).
- The team bets that the distribution of colors is
not one of the two centers.
137Assume the distribution of hats does not
correspond to one of the centers (0, 0, 0) and
(1, 1, 1). Then
- One color occurs exactly twice (the word has both
digits 0 and 1).
- Exactly one member of the team sees the
same color twice: this corresponds to 0 0 in case he
sees two white hats, 1 1 in case he sees two
black hats.
- Hence he knows the center of the ball: (0, 0,
0) in the first case, (1, 1, 1) in the second
case.
- He bets the missing digit does not yield the
center.
138- The two others see two different colors, hence
they do not know the center of the ball. They
abstain.
- Therefore the team wins when the distribution of
colors does not correspond to one of the centers
of the balls.
- This is why the team wins in 6 cases.
139- Now if the word corresponding to the distribution
of the hats is one of the centers, all members of
the team bet the wrong answer!
- They lose in 2 cases.
140Hat problem with 7 people
For 7 people in the room in place of 3, what is
the best strategy and what is its probability of
winning?
Answer: the best strategy gives a probability
of winning of 7/8
141The Hat Problem with 7 people
- The team bets that the distribution of the hats
does not correspond to any of the 16 elements of the
Hamming code
- Loses in 16 cases (they all fail)
- Wins in 128 - 16 = 112 cases (one bets correctly, the
6 others abstain)
- Probability of winning: 112/128 = 7/8
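The 112 winning cases can also be confirmed by enumeration. The sketch below assumes one concrete way to implement the bet: each player tries both colors for his own hat, and if one of them completes the word into a Hamming code word, he bets on the other color; otherwise he abstains. The check bits are e = a + b + d, f = a + c + d, g = a + b + c, and all names are illustrative.

```python
from itertools import product


def encode(a, b, c, d):
    return (a, b, c, d, (a + b + d) % 2, (a + c + d) % 2, (a + b + c) % 2)


CODE = {encode(*data) for data in product((0, 1), repeat=4)}


def guess(hats, i):
    """Bet of player i, who uses every hat except hats[i]."""
    for own in (0, 1):
        candidate = hats[:i] + (own,) + hats[i + 1:]
        if candidate in CODE:
            return 1 - own     # bet that the word is NOT a code word
    return None                # no code word in sight: abstain


def team_wins(hats):
    guesses = [guess(hats, i) for i in range(7)]
    bets = [(g, hats[i]) for i, g in enumerate(guesses) if g is not None]
    return bool(bets) and all(g == h for g, h in bets)


wins = sum(team_wins(hats) for hats in product((0, 1), repeat=7))
print(wins, 128 - wins)  # 112 16
```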
142SPORT TOTO: the oldest error correcting code
- A match between two players (or teams) may give
three possible results: either player 1 wins, or
player 2 wins, or else there is a draw.
- There is a lottery, and a winning ticket needs to
have at least three correct bets. How many
tickets should one buy to be sure to win?
143 4 matches, 3 correct forecasts
- For 4 matches, there are 3^4 = 81 possibilities.
- A bet on 4 matches is a sequence of 4 symbols
from {1, 2, x}. Each such ticket has exactly 3 correct
answers in 8 cases, and 4 correct answers in 1 case.
- Hence each ticket is winning in 9 cases.
- Therefore a minimum of 9 tickets is required to
be sure to win.
144 9 tickets (in columns)
x x x 1 1 1 2 2 2
x 1 2 x 1 2 x 1 2
x 1 2 1 2 x 2 x 1
x 1 2 2 x 1 1 2 x
This is an error correcting code on the
alphabet {1, 2, x} with rate 1/2
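That 9 tickets are enough can be checked by brute force over the 81 possible outcomes. The sketch below assumes the four rows of the table read as "xxx111222", "x12x12x12", "x1212x2x1" and "x122x112x", each column being one ticket; the function names are illustrative.

```python
from itertools import product

ROWS = ["xxx111222", "x12x12x12", "x1212x2x1", "x122x112x"]
TICKETS = ["".join(column) for column in zip(*ROWS)]   # one ticket per column


def correct_bets(ticket, outcome):
    """Number of matches on which the ticket agrees with the outcome."""
    return sum(t == o for t, o in zip(ticket, outcome))


def some_ticket_wins(outcome):
    return any(correct_bets(ticket, outcome) >= 3 for ticket in TICKETS)


outcomes = ["".join(o) for o in product("12x", repeat=4)]
print(len(TICKETS), len(outcomes), all(some_ticket_wins(o) for o in outcomes))
# 9 81 True
```

Since each ticket wins in exactly 9 of the 81 cases, 9 tickets can only cover everything if no outcome is covered twice, which is what makes this code perfect.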
145Sphere Packing
- While Shannon and Hamming were working on
information transmission in the States, John
Leech invented similar codes while working on
Group Theory at Cambridge. This research included
work on the sphere packing problem and culminated
in the remarkable, 24-dimensional Leech lattice,
the study of which was a key element in the
programme to understand and classify finite
symmetry groups.
146Sphere packing
The kissing number is 12
147Sphere Packing
- Kepler Problem: maximal density of a packing of
identical spheres
- π / √18 = 0.740 480 49...
- Conjectured in 1611.
- Proved in 1999 by Thomas Hales.
- Connections with crystallography.
148Current trends
- In the past two years the goal of finding
explicit codes which reach the limits predicted
by Shannon's original work has been achieved. The
constructions require techniques from a
surprisingly wide range of pure mathematics:
linear algebra, the theory of fields and
algebraic geometry all play a vital role. Not
only has coding theory helped to solve problems
of vital importance in the world outside
mathematics, it has enriched other branches of
mathematics, with new problems as well as new
solutions.
149Directions of research
- Theoretical questions of existence of specific
codes
- connection with cryptography
- lattices and combinatorial designs
- algebraic geometry over finite fields
- equations over finite fields
150Some recent results in mathematics related to
data transmission
- Michel Waldschmidt
- Université P. et M. Curie - Paris VI
- Centre International de Mathématiques Pures et
Appliquées - CIMPA
India, October-November 2007
http//www.math.jussieu.fr/miw/