Title: Guillaume Segerer CNRS - LLACAN - France segerer@vjf.cnrs.fr
1. Guillaume Segerer, CNRS - LLACAN - France, segerer_at_vjf.cnrs.fr
- Niger-Congo Languages as a playground for lexical comparison
Lyon, May 12-14, 2008, New Directions in Historical Linguistics
Paper presented May 14. Document revised May 16.
2. Languages of Africa
3. Niger-Congo Languages
4. Niger-Congo clusters
5. The experiment
- The present experiment consists in
- - testing the validity of the Niger-Congo phylum by measuring its homogeneity
- - doing real mass comparison: 506 languages examined (about 1/3 of all NC languages)
- - with only a few lexical roots, chosen intuitively from empirical experience
- It can be further refined by
- - considering more languages (more data is actually available)
- - choosing different lexical roots (but how?)
- - taking into account adjacent phyla (Nilo-Saharan, Afro-Asiatic, Khoisan)
6. Language Sample (506 lgs)
7. Ten supposedly common NC lexical roots
- 1 TU(P) 'to spit'
- 2 MED / MOD 'to swallow'
- 3 NYU 'to drink'
- 4 DUM 'to bite'
- 5 TE 'tree'
- 6 NYI(N) 'tooth'
- 7 TU 'ear'
- 8 DEM 'tongue'
- 9 DI 'to eat'
- 10 TAT 'three'
8. Distribution of root 1
9. Distribution of root 2
10. Distribution of root 3
11. Distribution of root 4
12. Distribution of root 5
13. Distribution of root 6
14. Distribution of root 7
15. Distribution of root 8
16. Distribution of root 9
17. Distribution of root 10
18. Weighted sample
19. Probabilities 1
Consonants:
- labial: symbol P
- dental/coronal: symbol T
- palatal: symbol C
- velar/uvular: symbol K
Vowels:
- front: symbol I
- central: symbol A
- back: symbol U
Example: DEM 'tongue', coded as TIP
- probability = 1/4 x 1/3 x 1/4 = 1/48
- tolerance I/A => new prob. = 1/24
Out of the 506 sample languages, 1/24 (about 21 languages) may by chance have a word for 'tongue' of the shape TIP / TAP.
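The coding and the chance estimate on this slide can be sketched in a few lines of Python. This is only an illustration, not the author's procedure: the four consonant classes, the three vowel classes and the 506-language sample come from the slide, but the assignment of individual segments to classes (`SEGMENT_TO_CLASS`) and the function names are assumptions made for the example. It follows the slide in treating every class as equally likely and every position as independent.

```python
from fractions import Fraction

# Sound classes as listed on the slide: 4 consonant classes, 3 vowel classes.
CONSONANT_CLASSES = {"P": "labial", "T": "dental/coronal",
                     "C": "palatal", "K": "velar/uvular"}
VOWEL_CLASSES = {"I": "front", "A": "central", "U": "back"}

# ASSUMPTION: this segment-to-class table is hypothetical and deliberately
# small; the slides do not spell out how each segment was classified.
SEGMENT_TO_CLASS = {
    "p": "P", "b": "P", "m": "P", "f": "P", "v": "P",
    "t": "T", "d": "T", "n": "T", "s": "T", "z": "T", "l": "T", "r": "T",
    "c": "C", "j": "C", "y": "C", "ny": "C",
    "k": "K", "g": "K",
    "i": "I", "e": "I",
    "a": "A",
    "u": "U", "o": "U",
}


def encode(root):
    """Turn a root such as 'DEM' into its class code, e.g. 'TIP'."""
    root = root.lower()
    code = []
    i = 0
    while i < len(root):
        pair = root[i:i + 2]
        # Try a two-letter segment (e.g. 'ny') before a single letter.
        if len(pair) == 2 and pair in SEGMENT_TO_CLASS:
            code.append(SEGMENT_TO_CLASS[pair])
            i += 2
        else:
            code.append(SEGMENT_TO_CLASS[root[i]])
            i += 1
    return "".join(code)


def shape_probability(code):
    """Chance of one exact shape, with equiprobable, independent classes."""
    p = Fraction(1)
    for symbol in code:
        if symbol in CONSONANT_CLASSES:
            p *= Fraction(1, len(CONSONANT_CLASSES))
        elif symbol in VOWEL_CLASSES:
            p *= Fraction(1, len(VOWEL_CLASSES))
        else:
            raise ValueError(f"unknown class symbol: {symbol!r}")
    return p


def chance_probability(codes):
    """Chance of matching any one of the tolerated shapes."""
    return sum(shape_probability(code) for code in set(codes))


SAMPLE_SIZE = 506  # languages in the sample, from the slides

print(encode("DEM"))                       # TIP
print(shape_probability("TIP"))            # 1/48
p_tongue = chance_probability(["TIP", "TAP"])
print(p_tongue)                            # 1/24
print(round(SAMPLE_SIZE * p_tongue))       # 21
```

Exact fractions are used so that the results can be compared directly with the 1/48 and 1/24 figures on the slide.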
20. Probabilities 2
Probabilities for each of the 10 roots:
- TAT 'three': TAT / TIT => 1/24
- DUM 'bite': TUP / TUT => 1/24
- DEM 'tongue': TIP / TAP => 1/24
- MED / MOD 'swallow': PIT / PUT => 1/24
- TU(P) 'spit': TU / CU => 1/6
- TU 'ear': TU / CU => 1/6
- NYI(N) 'tooth': CI / TI => 1/6
- TE 'tree': TI / TA / CI => 1/4
- DI 'eat': TI / CA / CI => 1/4
- NYU 'drink': CU / TU / KU => 1/3
Probability of having all 10 items: 1 / 143 327 232
4 languages in the sample have all 10 items: Akpafu (Kwa), Sukuma (Bantu F21), Runyankore (Bantu E13), Andoni (Benue-Congo)
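To put the two figures on this slide side by side, the following sketch computes how many of the 506 sample languages would be expected to show all 10 items by chance, and compares that with the 4 languages observed. The probability, the sample size and the observed count are taken from the slides as given; the variable names and the simple "sample size times probability" expectation are assumptions of this illustration.

```python
from fractions import Fraction

SAMPLE_SIZE = 506                         # languages in the sample (slides)
P_ALL_TEN = Fraction(1, 143_327_232)      # chance of all 10 items (slides)
OBSERVED = 4                              # languages with all 10 items (slides)

# Expected number of sample languages showing all 10 items purely by chance.
expected_by_chance = SAMPLE_SIZE * P_ALL_TEN

# How many times larger the observed count is than the chance expectation.
excess_ratio = OBSERVED / expected_by_chance

print(f"expected by chance: {float(expected_by_chance):.2e} languages")
print(f"observed / expected: {float(excess_ratio):,.0f}")
```

The expectation comes out at a few millionths of a language, so the four attested cases exceed it by a factor of more than a million.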
21. Probabilities 3
- probability to have at least 1 item: 18% of 1565 lgs => 286 lgs
- probability to have at least 2 items: 19-27% of 286 lgs => 55-79 lgs
- probability to have at least 3 items: 29-37% of 55-79 lgs => 16-29 lgs
22. Some questions...
- Can this method be used to classify a language?
- What is the minimal number of items needed to identify a language cluster?
- Is there a method (other than intuitive) to identify these items?
- Can this technique be applied to any language family / cluster?
- What are the implications of these phenomena?
- ...
23. A restricted distribution
GOP: possible Atlantic lexical innovation