Title: Decipherment, Pseudodecipherment and the Phaistos Disk
1Decipherment, Pseudodecipherment and the Phaistos
Disk
- Richard Sproat
- September 6, 2007
- Linguistics Department Seminar
3The Phaistos Disk
- Discovered July 3, 1908, by the Italian excavation team at Phaistos (Φαιστός), Crete, headed by Luigi Pernier
- Found in a set of buildings off to the northwest end of the Phaistos palace site
- A tablet in Minoan Linear A was found nearby
- Thought to date from roughly 1800 BC (middle of the late Minoan bronze age)
4The text
- 241 tokens with 45 distinct glyphs
- Glyphs are all pictographic images of animals, people, various objects
- Text is on both sides of the disk, in a spiral working from the outside
- The Phaistos Disk is the world's first known printed document
- Text is broken into 61 (31/30) regions separated by vertical bars.
- There is no other artifact known to be written in the same script
6Decipherments
- There have been well over 20 published decipherments. Some of the proposed languages:
- Greek (most common)
- Basque
- Sanskrit
- Chinese (!)
- One published argument that it is pseudowriting
- A couple of suggestions that it was a calendar
- A few published arguments that it's a fake
- John Chadwick described the Disk as a permanent thorn in the flesh of Minoan epigraphists, and considered it to be undecipherable.
7The Disk in the popular press
1984 National Geographic honors Fischer with an
all-expenses-paid trip to Washington from
Germany for his decipherment of the Disk
8The nature of the script
- Most would-be decipherers have assumed the script is more or less of the same type as Linear A and Linear B: a V/CV syllabary
- More on how these work momentarily
- Arguments are based on
- the apparent number of symbols in the inscription, and
- the putative relationships with other scripts of the region
9Text of the disk, with Evans glyph numbers
Side A:
02-12-13-01-18/ 24-40-12 29-45-07/
29-29-34 02-12-04-40-33 27-45-07-12 27-44-08
02-12-06-18-? 31-26-35 02-12-41-19-35 01-41-40-07
02-12-32-23-38/ 39-11 02-27-25-10-23-18 28-01/
02-12-31-26/ 02-12-27-27-35-37-21 33-23
02-12-31-26/ 02-27-25-10-23-18 28-01/
02-12-31-26/ 02-12-27-14-32-18-27 06-18-17-19
31-26-12 02-12-13-01 23-19-35/ 10-03-38
02-12-27-27-35-37-21 13-01 10-03-38
Side B:
02-12-22-40-07 27-45-07-35 02-37-23-05/
22-25-27 33-24-20-12 16-23-18-43/ 13-01-39-33
15-07-13-01-18 22-37-42-25 07-24-40-35
02-26-36-40 27-25-38-01 29-24-24-20-35 16-14-18
29-33-01 06-35-32-39-33 02-09-27-01 29-36-07-08/
29-08-13 29-45-07/ 22-29-36-07-08/ 27-34-23-25
07-18-35 07-45-07/ 07-23-18-24 22-29-36-07-08/
09-30-39-18-07 02-06-35-23-07 29-34-23-25 45-07/
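The token and glyph counts quoted earlier, and the observation that many segments open with the same glyph pair, can be checked mechanically from a transcription like the one above. The following is a minimal sketch, not part of the original talk. It assumes that segments are separated by whitespace, that glyphs within a segment are joined by hyphens, and that a trailing "/" marks a stroke on the disk that can be ignored for counting. The function names and the five-segment sample are illustrative only.

```python
from collections import Counter

def parse_segments(text):
    """Split a transcription into segments, each a list of Evans glyph codes."""
    segments = []
    for chunk in text.split():
        # Assumption: a trailing "/" is a stroke mark, not a glyph, so drop it.
        codes = [c for c in chunk.rstrip("/").split("-") if c]
        segments.append(codes)
    return segments

def summarize(segments):
    """Return (token count, distinct glyph count, Counter of segment-initial pairs)."""
    # "?" stands for an illegible glyph and is left out of the counts.
    tokens = [c for seg in segments for c in seg if c != "?"]
    prefixes = Counter(tuple(seg[:2]) for seg in segments if len(seg) >= 2)
    return len(tokens), len(set(tokens)), prefixes

# Illustrative sample only: the first five segments of Side A as listed above.
sample = "02-12-13-01-18/ 24-40-12 29-45-07/ 29-29-34 02-12-04-40-33"
n_tokens, n_types, prefixes = summarize(parse_segments(sample))
print(n_tokens, n_types, prefixes.most_common(1))
```

Running the same functions over the full text of both sides would give the totals for the whole inscription, subject to how the illegible glyph and the stroke marks are treated.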
10Remainder of this talk
- Quick introduction to Linear B
- A couple of recent decipherments of the Disk
- Fischer
- Faucounau
- A proposal by Kevin Knight for autodecipherment
- Evaluation of this proposal on Linear B
- Application of a similar technique to the Disk
- The criteria for decipherment
11Linear B
13History
- Arthur Evans discovered the first Linear A and Linear B tablets at Knossos starting in 1900
- Linear B dates from around 1450 BC
- Evans was convinced that neither Linear A nor Linear B could be Greek
- Oddly, he came close to the opposite (correct) conclusion for Linear B when he decoded po-lo as a probable word for horse (Gk. polos, En. foal)
- But Carl Blegen's discovery in 1939 of Linear B tablets on the Greek mainland brought that assumption into question
14Decipherment
"Introducing the Minoan Language", M. G. F. Ventris, American Journal of Archaeology, Vol. 44, No. 4 (Oct.-Dec., 1940), pp. 494-520
Indeed, Ventris resisted the idea that Linear B was Greek almost up to the time of his eventual decipherment in 1952.
15Stages of decipherment
ru ki to / ru ki ti jo / ru ki ti ja: Luktos
a mi ni so / a mi ni si jo / a mi ni si ja: Amnisos
- Kober's triplets
- Ventris's grid
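The idea behind the triplets can be illustrated in a few lines: group sign sequences that share an opening stem, then look for the endings that recur across groups. This is only a sketch under stated assumptions. Kober worked with unread signs, whereas the example below uses the transliterated values from this slide for readability. The stem length of two signs, the extra word wa-na-ka, and the function names are all hypothetical choices made for the illustration.

```python
from collections import defaultdict

def find_paradigms(words, stem_len=2, min_forms=3):
    """Group words (tuples of signs) that share their first `stem_len` signs."""
    groups = defaultdict(list)
    for word in words:
        if len(word) > stem_len:
            groups[word[:stem_len]].append(word)
    # Keep only stems attested with at least `min_forms` different forms.
    return {stem: forms for stem, forms in groups.items() if len(forms) >= min_forms}

def shared_endings(paradigms):
    """Final signs found in every paradigm: candidate inflectional endings."""
    ending_sets = [{form[-1] for form in forms} for forms in paradigms.values()]
    return set.intersection(*ending_sets) if ending_sets else set()

words = [
    ("ru", "ki", "to"), ("ru", "ki", "ti", "jo"), ("ru", "ki", "ti", "ja"),
    ("a", "mi", "ni", "so"), ("a", "mi", "ni", "si", "jo"), ("a", "mi", "ni", "si", "ja"),
    ("wa", "na", "ka"),  # a word with no related forms in this toy list
]
paradigms = find_paradigms(words)
shared = shared_endings(paradigms)
print(sorted(paradigms), sorted(shared))
```

The endings that recur across stems are the kind of regularity that makes it possible to place signs in a grid before any sound values are known.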
16Linear B examples
17Confirmation
- The phonology of many words corresponded to what was suspected for Greek from the relevant period
- wa-na-ka (wanaks, later anax "ruler")
- i-qo (iqqwos, later hippos "horse")
- No definite articles
- Confirmation from new finds by Blegen
- ti ri po de
- qe to ro we: qwetrowes > tetr-
18Completeness of the decipherment
19The Disk: Fischer's decipherment (Greek)
"Hear ye, Cretans and Greeks my great, my quick!
Hear ye, Danaidans, the great the worthy! Hear
ye, all blacks, and hear ye, Pudaan and Libyan
immigrants! Hear ye, waters, yea earth Hellas
faces battle with the Carians. Hear ye all! Hear
ye, Gods of the Fleet, aye hear yea all faces
battle with the Carians. Hear ye all! Hear ye,
the multitudes of black people and all! Hear ye,
lords, yea freemen To Naxos! Hear ye, Lords of
the Fleet To Naxos!"
20Faucounau's decipherment (as Ionic Greek)
21Faucounau's arguments
- No need of a 2nd disk to confirm solution
- "...we initially thought that it would be possible only once a second disk was available, to which the phonetic values found could be applied. But the number and above all the quality of the proofs that we have been able to discover over the years lead us today to consider the discovery of a second disk superfluous."
- Internal proofs
- Coincidence of the sounds derived for the forms via statistical analysis and the acrophonic principle
- Evidence based on corrected typos in the disk
- External proofs
- General arguments to the effect that the Ionians could have been in Crete at the time
ka from kare "head", or maybe Kar "Carian" (Philistine)
22La lamproie (the lamprey)
X ka s la lae yi to
- "One can still clearly see, in the enlarged photos, the animal's tail with its scales."
- Why a lamprey? Many more obvious words, like lampas "lamp"
- And there's a slight problem with the biology
23Autodecipherment: Knight's proposal
- Assume a standard source-channel language modeling approach
- Script form is the observation S
- Language model L over sequences of phonemes in target language is the source
- Noisy channel C is the spelling rules mapping between the language and the script
- The decoding problem is to find the optimal solution in S∘C∘L
- Use Expectation-Maximization to solve this problem
- Solution will have the lowest cross-entropy
- Actually this is an old idea, dating back to the early work on HMMs at the IDA, and Shannon's work on codebreaking
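One way to write this setup down is sketched below. The notation is an assumption made for illustration and is not taken from the talk: s_1 ... s_n are the observed script symbols, p_1 ... p_n the hidden phonemes, P_L the language model, and θ the channel weights. The sketch also assumes the simplest possible channel, in which each phoneme emits exactly one symbol independently of its neighbours; real spelling rules need not be one-to-one.

```latex
\begin{align}
P_\theta(s_1 \dots s_n) &= \sum_{p_1 \dots p_n} P_L(p_1 \dots p_n) \prod_{i=1}^{n} P_\theta(s_i \mid p_i) \\
\hat{\theta} &= \arg\max_{\theta} \; P_\theta(s_1 \dots s_n) \\
\hat{p}_1 \dots \hat{p}_n &= \arg\max_{p_1 \dots p_n} \; P_L(p_1 \dots p_n) \prod_{i=1}^{n} P_{\hat{\theta}}(s_i \mid p_i) \\
H(\hat{\theta}) &= -\frac{1}{n} \log_2 P_{\hat{\theta}}(s_1 \dots s_n)
\end{align}
```

Under these assumptions, Expectation-Maximization adjusts θ to raise the likelihood in the second line, which is the same as lowering the cross-entropy in the last line, and the third line reads off the decoded phoneme sequence.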
24Example: Spanish
- Target ancient text is the first page of Don Quijote
- Language model is built over phonetically transcribed medical text
- Initial channel model allows any sound to map to any character with equal probability.
- Task is to learn the weights on the mappings
- Final result decodes 96% of the sounds
- 99% phoneme accuracy for Japanese kana
- 22% syllable accuracy for Chinese
- Has also shown that you can crack substitution ciphers, and find the correct language among a couple of handfuls of candidate languages.
25Issues
- In most real cases we don't necessarily know the underlying language
- or the form of the underlying language
- Ancient scripts often encode phonological information in complex ways
- Many scripts are mixed: they encode both sound and aspects of the meaning
26Application to Linear B: Could Ventris have been replaced by a computer?
Phonological simplification rules:
Loss of most voicing & aspiration distinctions:
bP -> p
T -> t
d -> dt
Kg -> k
Final consonant deletion:
Cons -> <epsilon> / __ <eos>
Son -> <epsilon> / __ Cons
s -> <epsilon> / __ Obs
right-to-left:
<epsilon> -> e / Obs _ Cons e
<epsilon> -> a / Obs _ Cons a
<epsilon> -> i / Obs _ Cons i
<epsilon> -> o / Obs _ Cons o
<epsilon> -> u / Obs _ Cons u
- Tri-syllable language model built on 1.6 million words of Greek from Homer to the Hellenistic period
- Mapped back to guesses about Mycenaean forms using two kinds of rules
- Historical reconstruction
- Phonological simplification
- Mycenaean data is 12,730 syllables from Ventris & Chadwick, Documents in Mycenaean Greek
- Removed all ideograms, uncertain cases, and phrases that contained the syllables pte, nwa
Example syllable strings (the Greek source forms are illegible in this transcript): me se ne wa te na i o; we ke re to me ta to wa ro
Historical reconstruction rules:
t -> q / __ e
p -> q / __ (ioua)
<epsilon> -> w / <bos> __ Vow
27Procedure
- Randomly select mapping between truth (Ventris/Chadwick syllables) and some permutation of the syllables.
- E.g. map to → ka, pe → ro, nu → mu
- 61 syllables so 61! possible permutations
- Measure the cross-entropy between the language model and the resulting permuted text
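The procedure can be sketched on toy data as follows. This is a minimal illustration under stated assumptions, not the experiment reported in the talk: it uses an add-one-smoothed syllable bigram model where the talk used a tri-syllable model, a ten-syllable invented inventory in place of the 61 Linear B syllables, and made-up training and test strings. All names are hypothetical.

```python
import math
import random
from collections import Counter

def train_bigram(syllables):
    """Collect bigram counts, context counts and the vocabulary from training data."""
    bigrams = Counter(zip(syllables, syllables[1:]))
    contexts = Counter(syllables[:-1])
    return bigrams, contexts, set(syllables)

def cross_entropy(text, model):
    """Bits per syllable of `text` under an add-one-smoothed bigram model."""
    bigrams, contexts, vocab = model
    v = len(vocab)
    log_prob = 0.0
    for prev, cur in zip(text, text[1:]):
        log_prob += math.log2((bigrams[(prev, cur)] + 1) / (contexts[prev] + v))
    return -log_prob / (len(text) - 1)

def permute(text, inventory, rng):
    """Relabel every syllable of `text` through a random permutation of `inventory`."""
    shuffled = inventory[:]
    rng.shuffle(shuffled)
    mapping = dict(zip(inventory, shuffled))
    return [mapping[s] for s in text]

# Toy data, invented for the illustration.
train = "to so jo to so ja ko no so ko no si jo a mi ni so a mi ni si jo".split() * 20
test = "a mi ni si ja ko no si ja to so jo".split()
model = train_bigram(train)
inventory = sorted(set(train) | set(test))

rng = random.Random(0)  # fixed seed so that the run is repeatable
h_truth = cross_entropy(test, model)
h_random = [cross_entropy(permute(test, inventory, rng), model) for _ in range(200)]
print(h_truth, sum(h_random) / len(h_random), math.factorial(61))
```

Comparing `h_truth` with the distribution of `h_random` is the toy analogue of asking how many standard deviations the true reading lies from the mean of the random permutations. The last printed value shows why exhaustive search over all 61! permutations is not an option.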
28Results
p < 1.69 x 10^-9 (5.52 s.d.)
8.58 x 10^75
Plot labels: truth, optimal
29Best cross-entropy
to to e ka o e ro i ke ta ko u jo o te we a
te na qe re ro ja ke pa me ta re ra pe pe ko me ra
Substitute most common syllable according to
language model training data for the most common
in the Ventris/Chadwick corpus, second most
common for second most common, and so forth
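The rank-for-rank substitution described here is simple enough to show directly. The sketch below is an illustration with invented data: the "cipher" symbols are written as Evans-style numbers purely as placeholders, and ties in frequency are broken alphabetically, which is an arbitrary choice not stated in the talk.

```python
from collections import Counter

def rank_by_frequency(symbols):
    """Symbols ordered from most to least frequent; ties broken alphabetically."""
    counts = Counter(symbols)
    return sorted(counts, key=lambda s: (-counts[s], s))

def frequency_match(cipher_text, training_text):
    """Pair the k-th most common cipher symbol with the k-th most common training syllable."""
    return dict(zip(rank_by_frequency(cipher_text), rank_by_frequency(training_text)))

# Toy data, invented for the illustration.
training = "to to to e e ka".split()
cipher = "02 12 02 13 02 12".split()

mapping = frequency_match(cipher, training)
# Symbols with no counterpart in the training inventory decode as "?".
decoded = [mapping.get(c, "?") for c in cipher]
print(mapping, decoded)
```

A mapping built this way uses no context at all, so it is best read as a baseline channel model that a language model then has to make sense of.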
30Application of autodecipherment to the Phaistos disk
- Use the same Greek language model training data as before.
- Build a bigram word language model L
- Assume an Aegean syllabary decoding for the Disk
- Use same simple frequency-based match between Disk glyphs and the LM-derived syllables for the channel model C
- For each segment of the disk, compose the text T with C∘L to get a set of possible translations
- Formally we define a possible translation t as an element of π2(T∘C∘L)
31First 11 glyphs
- to
- ka
- e
- i
- ta
- u
- o
- we
- te
- qe
- ro
32One possible decipherment
(Segment number, syllables, English gloss. The Greek readings are illegible in this transcript.)
1. to e ko te ta: therefore I certainly hold
2. pe ke e: the horns of the lyre
3. o ro ka: for I see
4. o o si: with (my) ears
5. to e qi ke me: therefore verily the horses of the earth
6. i ro ka e: for you are at Ilium
7. i mo no: him alone
8. to e ri ta: the strife
9. ra re u: do not babble (about)
10. to e ja na u: the ship!
11. te ja ke ka: godly sacrificial ooze
12. to e wo we a: in the spring at daybreak
13. tu su: the thyrsi
14. to i qe ne we ta: and for these, by Hestia!
15. wa te: O citizen!
16. to e ra re: this you have babbled
17. to e i i u qo sa: this I am not how much?
18. me we: not ??
19. to e ra re: this you have babbled
20. to i qe ne we ta: and for these, by Hestia!
21. wa te: O citizen!
22. to e ra re: this you have babbled
23. to e i pa wo ta i: you said it whenever
24. ri ta so na: tossing about
25. ra re e: you babbled
26. to e ko te: having this
27. we na u: ?? ship
28. ne ku a: the dead
29. to e i i u qo sa: this I am not how much?
30. ko te: having
31. ne ku a: the dead
38Translation of side A
- Yea I hold the horns of the lyre, for I see
with mine ears, oh horses of Gea! You are at
Ilium, yet do not go on about the strife, it
alone, but the ship! And the holy seepings of the
sacrifice in the Spring at daybreak, the thyrsi,
for these by Hestia! O citizen, you have gone on
about this! How much is this not me!, this you
have gone on about. For these at daybreak, by
Hestia!, O citizen, this you have gone on about.
You said this whenever (you wished?), you babbled
about those things tossed about, those things
holding this ship and the dead --- how much is
this not me those things that hold the dead.
39Characteristic Symptoms of Pathological Science
These are cases where there is no dishonesty
involved but where people are tricked into false
results by a lack of understanding about what
human beings can do to themselves in the way of
being led astray by subjective effects, wishful
thinking or threshold interactions. These are
examples of pathological science.
- Irving Langmuir, March 8, 1954
40The acrophonic principle
- ka < kare "head"
- o/u < outhar "udder"
- s < aspis "round shield"
- te < derma "hide"
- syu < ksulon "wood"
- i < iketerie "kind of branch"
- a < agrios "wild beast"
- po/pu < polos "axe"
- ko/ku < kouros "boy"
- re < reu-naus "row boat"
- ri < rhinos "hide"
- to < tolma "bold"
- ka < kalyx "bud"
- e < eptakotulos "seven cupped"
- i < iktide "weasel skin"
- ta < stathme "carpenter's line"
- u < hule "forest"
- o < ozo "smell" ?
- we < wergaleion "tool, instrument"
- te < theo "run"
- qe < kellomai "be driven ashore"
- ro < rhoe "river" ?
41And another, based on Linear B frequencies
(Each segment is followed by the scores of its n-best Greek candidates. The candidate strings themselves are illegible in this transcript.)
to o pe a ke: 24.5945835, 24.9219341, 25.0299015, 25.0973282, 25.2481556, 25.3572521, 25.4204693, 25.5755043
ra ja o: Infinity
j o re e: Infinity
jo jo de: Infinity
to o mi ja pa: 30.3845329, 30.7118835, 30.9829865, 31.2104225, 31.2393837, 31.3103371, 31.4106636, 31.4721298
ro re e o: 29.3632221, 29.3640156, 30.2413788, 30.2421703, 30.8604527, 30.8733959, 30.9695072, 30.9702988, 31.0441494
42English
te ta ra e re
they try l </s> 22.6965694
they dry air </s> 23.1506157
they try air </s> 24.0828629
they try hair </s> 24.4623528
they try error </s> 24.5423965
they try e.l </s> 25.0656433
they try hell </s> 25.5339279
they dry l </s> 26.6185932
day dry air </s> 26.9243355
ri ke ta
league the </s> 17.5926361
rick the </s> 17.7687664
league to </s> 18.4728813
link to </s> 18.6020393
risk the </s> 18.6125088
rick to </s> 18.6490097
leake the </s> 19.3437824
risk to </s> 19.4927521
linke the </s> 19.6933136
link the </s> 20.0400772
leake to </s> 20.2240276
linke to </s> 20.5735569
liquor the </s> 20.650898
se se ma
sense my </s> 23.0299835
sash my </s> 24.3089333
sensor my </s> 25.4414711
sensory my </s> 25.9548531
censure my </s> 25.9983902
sense) my </s> 27.027256
sash's my </s> 27.0793304
te ta no ke we
they to know curve </s> 30.5287495
they the no curve </s> 31.0522499
they to no curve </s> 31.9324932
they the know curve </s> 32.5392342
day to know curve </s> 32.8238297
delta no curve </s> 33.3763618
stay to know curve </s> 33.6906891
they dono curve </s> 33.9081154
ne ti i ta
nasty he the </s> 28.0462093
nasty he to </s> 28.9264545
nerdy he the </s> 30.1457424
natty he the </s> 30.1457787
nettie he the </s> 30.1457939
nerdy he to </s> 31.0259876
natty he to </s> 31.026022
nettie he to </s> 31.0260372
nasty hilda </s> 31.4050636
43Criteria for decipherment
- Plausibility of assumptions about the script/language mapping
- Yes, but this is only as reliable as one's prior knowledge: Linear B encoded Greek in a different way from the later Cypriot syllabary
- Internal structural analysis: repeated patterns at the beginnings of segments on the Disk, such as the frequent 02-12, suggest a language with a lot of prefixes.
- Perhaps, but what if the segments are not words?
- Plausible-sounding ideas about what the text might say
- But there are always too many ideas
- A complete account of everything in all texts?
- Certainly not
- The only criterion that can be applied is independent verification. No decipherment should ever be accepted until this can be established.
44A final thought
- "My own view, shared by all serious scholars, is that the Disk is undecipherable so long as it remains an isolated document. Only a large increase in the number of inscriptions will permit real progress towards a decipherment. Meanwhile, we must curb our impatience, and admit that if King Minos himself were to reveal to someone in a dream the true interpretation, it would be quite impossible for him to convince anyone else that his was the one and only possible solution."
- John Chadwick (1987) Linear B and Related Scripts. University of California Press, Berkeley.