Advances in Automated Language Classification. ASJP Consortium. Dik Bakker, Lancaster

1
Advances in Automated Language Classification
ASJP Consortium
Dik Bakker, Lancaster
5
Overview
Project ASJP members:
Sören Wichmann (Germany / Netherlands)
Viveka Velupillai (Germany)
André Müller (Germany)
Robert Mailhammer (Germany)
Hagen Jung (Germany)
Eric Holman (US)
Anthony Grant (UK)
Dmitry Egorov (Russia)
Pamela Brown (US)
Cecil Brown (US)
Dik Bakker (UK / Netherlands)
9
Overview
Project ASJP (Automated Similarity Judgment Program)
Overall goal: automatic reconstruction of language relationships
Basis: distance matrix between individual languages, on the basis of linguistic features
Method: lexicostatistics, i.e. mass comparison of basic lexical items, extended by typological data
17
Overview
OVERALL GOAL: Reconstruction of Language Relationships
Derived goals (among others):
- Critical assessment and refinement of existing classifications
- Classify newly described and unclassified languages
- Search for (ir)regularities in phylogenies
- Test hypotheses (e.g. Atkinson et al. 2008, elbow phenomenon)
- Experimentally find the best/optimal dating method
- Automatically detect borrowings
Today ...
22
Overview
1. The list of basic lexical items
2. Comparing languages
3. Some results: genetic and areal proximity
4. On Inheritance vs Borrowing
5. Conclusions
23
1. The list of basic lexical items
30
Lexical items
Word list: Swadesh, 100 basic meanings
- Word coined in most languages
- Collected in field work: lexicon / grammar
- Inherited rather than borrowed
- Culturally independent
- Stable over time
- Few synonyms

41
Lexical items: further reduction
  • Early analyses have shown:
  • - Optimal 40/100 item subset gives same results
  • → Less work
  • → Less missing data
  • → Faster processing (combinatorial explosion)
  •   40 vs 100 items: 3 × 10^7 vs 2 × 10^10


44
Lexical items: stability
Determine most stable items: iteratively throw out the most unstable item in terms of variation within genera (3500-4000 years; Dryer 2001, 2005). E.g. Germanic, Romance, ..., Mayan, ..., Sino-T
Formula: S = (E - U) / (100 - U) (weighted average of matches, Eq vs Uneq)
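
The stability formula on this slide can be tried out with a few lines of Python. This is only a sketch: the function name `stability` and the reading of E as the percentage of matches between words with equal meanings and U as the percentage of matches between words with unequal meanings (a chance baseline) are assumptions based on the slide's gloss, and the numbers are hypothetical.

```python
def stability(equal_pct: float, unequal_pct: float) -> float:
    """Chance-corrected stability S = (E - U) / (100 - U).

    equal_pct   (E): percentage of matches between words with equal meanings
    unequal_pct (U): percentage of matches between words with unequal
                     meanings, read here as the chance baseline (assumption)
    """
    if not 0.0 <= unequal_pct < 100.0:
        raise ValueError("U must lie in [0, 100)")
    return (equal_pct - unequal_pct) / (100.0 - unequal_pct)


# Hypothetical percentages, not ASJP results.
print(round(stability(40.0, 10.0), 3))
```

With this correction an item whose matches are no more frequent than chance (E equal to U) scores 0, and an item that always matches scores 1.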

45
[Figure: Stability; Ethnologue (Goodman-Kruskal), WALS (Pearson)]
48
40 Most Stable
Home page
52
Lexical items: transcription
First phase of project (2007)
Problems with full IPA representation of words:
- data entry via keyboard
- simple programming language (Fortran, Pascal)
→ Recoding to simplified ASJPcode (ASCII only)

56
Lexical items: transcription
ASJPcode: 7 vowels, 34 consonants (closest sound)

57
Lexical items: transcription
ASJPcode: 7 vowels, 34 consonants
Operators for: nasalization, labialization, palatalization, aspiration, glottalization

60
Abaza (Caucasian)
Meaning   IPA            ASJPcode
PERSON    ʕʷɨʧʼʲʷʕʷɨs    Xw3Cw"yXw3s
LEAF      b???           bxy3
SKIN      ??az?          Cwazy
HORN      ?'????a        Cw"3Xwa
NOSE      p?n?'a         p3nc"a
TOOTH     p??            p3c
62
Lexical items
  • Collected to date:
  • - Over 2100 languages, dialects and proto-languages
  • - Mean number of items per language: 36.2 (of 40)


63
Lexical items
Areal distribution (not a sample!):
Americas 27%
Eurasia 23%
Australia/PNG 18%
Austronesia 15%
Africa 14%
Creoles 2%
Artificial 1%

64
Languages currently sampled
67
Lexical items: transcription
Second phase of project (2008)
Problems with full IPA representation solved:
1. automatic conversion IPA to integer (Python)
2. (semi-)automatic recoding to ASJPcode: transduction on the basis of a formal grammar

71
Lexical items: transcription
Abaza (Caucasian)
Meaning: PERSON
IPA: ʕʷɨʧʼʲʷʕʷɨs
Decimal: 661 695 616 679 700 690 695 661 695 616 115
ASJPcode: 88 119 126 51 67 34 121 119 126 88 119 126 51 115 (= Xw3Cw"yXw3s)
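
The IPA-to-integer step can be illustrated with Python's built-in `ord` and `chr`. This is a minimal sketch, not the project's converter: the helper names `to_decimal` and `from_decimal` are made up for illustration, and the example assumes the decimal values on the slide are Unicode code points. Decoding the ASJPcode integers as listed gives a string that also contains `~` characters, which the parenthesised form on the slide does not show.

```python
from typing import List


def to_decimal(word: str) -> List[int]:
    """Map each character of a transcription to its Unicode code point."""
    return [ord(ch) for ch in word]


def from_decimal(codes: List[int]) -> str:
    """Inverse mapping: code points back to a string."""
    return "".join(chr(c) for c in codes)


# Abaza PERSON, written with escapes so this source stays ASCII-only.
person_ipa = "\u0295\u02b7\u0268\u02a7\u02bc\u02b2\u02b7\u0295\u02b7\u0268s"
print(to_decimal(person_ipa))

# The ASJPcode integers as listed on the slide.
asjp_codes = [88, 119, 126, 51, 67, 34, 121, 119, 126, 88, 119, 126, 51, 115]
print(from_decimal(asjp_codes))
```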

72
Lexical items: transcription
Second phase of project (2008)
1. automatic conversion IPA to integer (Python)
2. (semi-)automatic recoding to ASJPcode: transduction on the basis of a formal grammar
Why not run on full IPA??

74
Lexical items: transcription
Second phase of project (2008)
1. automatic conversion IPA to integer (Python)
2. (semi-)automatic recoding to ASJPcode: transduction on the basis of a formal grammar
- correlations IPA / ASJP > 0.9
- but ASJP better fit with classifications
→ IPA too specific

76
Lexical items: transcription
IPA: ʕʷɨʧʼʲʷʕʷɨs
Decimal: 661 695 616 679 700 690 695 661 695 616 115
ASJPcode (any unicode subset)
Formal grammar: A → n661, n695, n616, P Q → A B C Z → P Q Z
Optimal level of abstraction for historical phonological reconstruction?

77
2. Comparing languages
83
Comparing words
LDi = 3
LDj = 4
LDk = 3
LDmean = 3.73
84
Comparing words
LDi = 4
LDj = 4
LDk = 4
LDmean = 4.37
86
Comparing words
3.73
4.37

89
Comparing words
Levenshtein Distance
a. Between 2 words: number of transformations to get from the shorter form to the longer one (changes, additions)
b. Between 2 languages: e.g. mean LD for overlapping set (< 40)
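
The word-level distance can be sketched with the textbook dynamic-programming version of Levenshtein distance. This is a generic implementation, not the ASJP code; it counts substitutions, insertions and deletions at cost 1 each and treats every ASCII character of an ASJPcode string as one symbol, which is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions and deletions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


# Avar / Agul word pairs for I and HORN, in ASJPcode as shown in the slides.
print(levenshtein("dun", "zun"))
print(levenshtein("tLar", 'k"arC'))
```

The distance is symmetric, so it does not matter which of the two words is taken as the starting point.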

94
Comparing words
  • Levenshtein Distance
  • Two problems:
  • 1. Value depends on length of longest word
  • → Normalize: LDN = LD / Lmax
  • 2. Differences between languages in phonological overlap
  • → Eliminate noise: LDND = LDN / LDNdifferent
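
The two corrections can be sketched as follows. The assumptions are labelled in the code: LDNdifferent is taken to be the mean LDN over all pairs of words with different meanings in the two lists, and the function names and the three-word lists are chosen for illustration only. With such a small list the baseline differs from the one behind the figures in the slides, so the values will not reproduce them.

```python
from typing import Dict


def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (0 if ca == cb else 1)))
        prev = curr
    return prev[-1]


def ldn(a: str, b: str) -> float:
    """LD normalised by the length of the longer word: LDN = LD / Lmax."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def ldnd(words_a: Dict[str, str], words_b: Dict[str, str]) -> Dict[str, float]:
    """Per-meaning LDND = LDN / (mean LDN of pairs with different meanings)."""
    shared = sorted(set(words_a) & set(words_b))
    # Assumption: the baseline uses every cross-meaning pair of shared items.
    different = [ldn(words_a[m], words_b[n])
                 for m in shared for n in shared if m != n]
    if not different or sum(different) == 0:
        raise ValueError("need at least two shared meanings with distinct words")
    baseline = sum(different) / len(different)
    return {m: ldn(words_a[m], words_b[m]) / baseline for m in shared}


# Three Avar / Agul items from the slides (ASJPcode).
avar = {"I": "dun", "YOU": "mun", "FIRE": 'c"a'}
agul = {"I": "zun", "YOU": "wun", "FIRE": 'c"a'}
for meaning, value in ldnd(avar, agul).items():
    print(meaning, round(value, 3))
```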


97
Comparing languages
  • Levenshtein Distance for Language Pair
  • Mean of all LDNDs of words in common
  • Synonyms (12):
  • take Minimum pair
  • take Mean

Experimental option
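
The two synonym options can be compared in a short sketch. To keep it small it averages plain LDN rather than LDND (the division by the different-meaning baseline is left out), and the function names and the `how` parameter are illustrative assumptions.

```python
from statistics import mean
from typing import Dict, List


def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (0 if ca == cb else 1)))
        prev = curr
    return prev[-1]


def ldn(a: str, b: str) -> float:
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def meaning_distance(syns_a: List[str], syns_b: List[str], how: str = "min") -> float:
    """Distance for one meaning when either language lists several synonyms."""
    scores = [ldn(a, b) for a in syns_a for b in syns_b]
    if how == "min":
        return min(scores)   # option 1: take the closest pair
    if how == "mean":
        return mean(scores)  # option 2: average over all pairs
    raise ValueError("how must be 'min' or 'mean'")


def language_distance(list_a: Dict[str, List[str]],
                      list_b: Dict[str, List[str]], how: str = "min") -> float:
    """Mean per-meaning distance over the meanings both lists contain."""
    shared = sorted(set(list_a) & set(list_b))
    if not shared:
        raise ValueError("no meanings in common")
    return mean(meaning_distance(list_a[m], list_b[m], how) for m in shared)


# FIRE and NEW for Avar / Agul from the slides; Agul has two words for NEW.
avar = {"FIRE": ['c"a'], "NEW": ['c"iya']}
agul = {"FIRE": ['c"a'], "NEW": ['c"EyEr', 'c"ayif']}
print(language_distance(avar, agul, "min"), language_distance(avar, agul, "mean"))
```

For this particular pair both Agul synonyms are equally far from the Avar word, so the two options give the same value; they diverge as soon as one synonym is closer than the other.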

98
Comparing languages
AVAR (AVA: NAKH-DAGHESTANIAN > AVAR-ANDIC-TSEZIC) / AGUL (AGL: NAKH-DAGHESTANIAN > LEZGIC)
Meaning   AVA      AGL      LDND
I         dun      zun      36.6
YOU       mun      wun      36.6
HORN      tLar     k"arC    66.0
FIRE      c"a      c"a       0.0
FULL      c"ura    ac"uf    66.0   (ALT AGL: ac"ar)
NEW       c"iya    c"EyEr   55.0   (ALT AGL: c"ayif)
COMMON (LDND < 70) AGL - AVA: 6 (15.8% of 38)
LD 4.01 / LDN 81.76 / LDND 89.87

104
Comparing languages
AVAR (AVA: NAKH-DAGHESTANIAN > AVAR-ANDIC-TSEZIC) / AGUL (AGL: NAKH-DAGHESTANIAN > LEZGIC)
Meaning   AVA      AGL      LDND
I         dun      zun      36.6
YOU       mun      wun      36.6
HORN      tLar     k"arC    66.0
FIRE      c"a      c"a       0.0
FULL      c"ura    ac"uf    66.0   (ALT AGL: ac"ar)
NEW       c"iya    c"ayif   55.0   (ALT AGL: c"EyEr)
COMMON (LDND < 70) AGL - AVA: 6 (15.8% of 38)
LD 4.01 / LDN 81.76 / LDND 89.87

107
Comparing languages

108
3. Some results genetic and areal proximity
109
Distance Matrix (0.5 × N × (N-1))
<Excel file>
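
The size of the matrix follows directly from the number of unordered language pairs. The sketch below counts the pairs and fills a pairwise table with a pluggable distance function; `n_pairs` and `distance_matrix` are illustrative names, and the figure 2100 is the approximate database size quoted in the slides.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple


def n_pairs(n: int) -> int:
    """Number of unordered pairs among n languages: 0.5 * n * (n - 1)."""
    return n * (n - 1) // 2


def distance_matrix(languages: List[str],
                    dist: Callable[[str, str], float]) -> Dict[Tuple[str, str], float]:
    """One entry per unordered pair; dist is any language-pair distance."""
    return {(a, b): dist(a, b) for a, b in combinations(languages, 2)}


print(n_pairs(2100))
```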
112
Tools for Trees
  • Run data using phylogenetic software such as
    SplitsTree (www.splitstree.org)
  • Choose the most appropriate algorithm (Neighbour
    Joining for distance data)


114
NeighborJoining
Salishan Languages (n = 30)
Existing Classifications
117
NeighborJoining
  • NeighborJoining
  • specifically meant for phylogenetic trees
  • does NOT assume equal rate of change
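
For readers who want to see the mechanics, here is a compact sketch of the standard Neighbor-Joining procedure (Saitou and Nei). It is not SplitsTree's implementation: the function name, the data layout (a dict keyed by `frozenset` pairs) and the five-taxon toy matrix are assumptions made for illustration. Because each join gets two separately computed branch lengths, sister branches may differ in length, which is how unequal rates of change show up.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple


def neighbor_joining(names: List[str],
                     dist: Dict[FrozenSet[str], float]) -> List[Tuple[str, str, float, float]]:
    """Return the joins as (node_i, node_j, branch_i, branch_j).

    The last tuple links the two remaining nodes; its full edge length is
    stored in the third field and the fourth is 0.0.
    """
    if len(names) < 2:
        raise ValueError("need at least two taxa")
    d = dict(dist)
    nodes = list(names)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # Row sums of the current distance matrix.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair minimising Q(i, j) = (n - 2) * d(i, j) - r_i - r_j.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        dij = d[frozenset((i, j))]
        len_i = dij / 2 + (r[i] - r[j]) / (2 * (n - 2))
        len_j = dij - len_i
        u = "(" + i + "," + j + ")"  # label of the new internal node
        for k in nodes:
            if k != i and k != j:
                d[frozenset((u, k))] = (d[frozenset((i, k))]
                                        + d[frozenset((j, k))] - dij) / 2
        nodes = [k for k in nodes if k != i and k != j] + [u]
        joins.append((i, j, len_i, len_j))
    a, b = nodes
    joins.append((a, b, d[frozenset((a, b))], 0.0))
    return joins


# Hypothetical toy distances between five taxa (not ASJP data).
toy = {("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
       ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
       ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3}
toy_dist = {frozenset(pair): float(value) for pair, value in toy.items()}
joins = neighbor_joining(list("abcde"), toy_dist)
print(joins[0])  # first join: a and b, with branch lengths 2.0 and 3.0
```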

122
Calibration of Method
  • Calibration: best options, parameters, factors
  • A. for pure classification
  • - existing classifications (Ethnologue, WALS; mainly the well-documented areas)
  • - expert knowledge of specific areas
  • → diversion 12 → if resistant niche!


125
Linguistically crucial events
Date Historical event
Linguistic event

128
Calibration of Method
  • Calibration: best options, parameters, factors
  • B. for dating
  • - linguistically crucial historic events
  • → Standard formula (Swadesh):
  •   TimeDepth = log(Similarity) / (2 log Retention)
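
The standard formula can be evaluated directly. In this sketch both similarity and retention are taken as proportions between 0 and 1, and the result is in whatever time unit the retention rate refers to (conventionally one millennium in glottochronology); the function name and the example values are assumptions for illustration.

```python
import math


def time_depth(similarity: float, retention: float) -> float:
    """TimeDepth = log(similarity) / (2 * log(retention)).

    similarity: shared proportion between two languages, in (0, 1]
    retention:  proportion retained per time unit by one language, in (0, 1)
    """
    if not 0.0 < similarity <= 1.0:
        raise ValueError("similarity must be in (0, 1]")
    if not 0.0 < retention < 1.0:
        raise ValueError("retention must be in (0, 1)")
    return math.log(similarity) / (2.0 * math.log(retention))


# Two languages that each retained 73% over one time unit share 0.73 ** 2.
print(round(time_depth(0.73 ** 2, 0.73), 6))
```

The factor 2 reflects that both languages change independently after the split, so the shared proportion decays with the square of the retention rate.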


131
Calibration of Method
  • Calibration: best options, parameters, factors
  • B. for dating
  • - linguistically crucial historic events
  • → Standard formula:
  •   TimeDepth = log(LDND) / (2 log Retention)


132
Linguistically crucial events

135
Calibration of Method
  • Calibration: best options, parameters, factors
  • B. for dating
  • - linguistically crucial historic events
  • - Standard formula:
  •   TimeDepth = log(LDND) / (2 log 73)
  •   73 < 75 < 85

136
Calibration of Method
  • Calibration: best options, parameters, factors
  • B. for dating
  • - linguistically crucial historic events
  • - Standard formula:
  •   TimeDepth = log(LDND) / (2 log 73)
  •   73 < 75

Deeper!
140
Glottochronology only?
Calibration of method: Glottochronology, all based on lexical distance
Add other linguistic domains: WALS typological database
Best result: 75% (40 lex) + 25% (40 Ph/M/S features)

141
4. On Inheritance vs Borrowing
145
Inherited or borrowed?
AVAR (AVA) / AGUL (AGL)
Meaning   AVA      AGL      LDND
I         dun      zun      36.6
YOU       mun      wun      36.6
HORN      tLar     k"arC    66.0
FIRE      c"a      c"a       0.0
FULL      c"ura    ac"uf    66.0
NEW       c"iya    c"EyEr   55.0
→ 6 items < 70.0
→ Genetically related !!
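
Counting the items below the threshold is a one-line filter. The sketch uses the six LDND values shown on the slide plus one clearly hypothetical item above the threshold, so the filter has something to reject; the function name and the default threshold of 70.0 follow the slide's "LDND < 70" but are otherwise illustrative.

```python
from typing import Dict, List


def common_items(ldnd: Dict[str, float], threshold: float = 70.0) -> List[str]:
    """Meanings whose LDND (in percent) falls below the threshold."""
    return sorted(m for m, value in ldnd.items() if value < threshold)


avar_agul = {
    "I": 36.6, "YOU": 36.6, "HORN": 66.0,
    "FIRE": 0.0, "FULL": 66.0, "NEW": 55.0,
    "EXAMPLE_ABOVE": 95.0,  # hypothetical item, not from the slides
}
hits = common_items(avar_agul)
# 38 is the number of items the two languages have in common per the slides.
print(len(hits), round(100 * len(hits) / 38, 1))
```

A count like this says only that similar forms exist; the Spanish / Chamorro slides show the same count arising from borrowing.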

149
Inherited or borrowed?
SPANISH (SPA) / CHAMORRO (CHA)
Meaning   SPA       CHA        LDND
ONE       uno       unu        36.9
TWO       dos       dos         0.0
PERSON    persona   petsona    55.3
STAR      estreya   estrecas   61.2
NIGHT     noCe      noces      68.2
NEW       nuevo     nueba      44.2
→ 6 items < 70.0
RELATED ???

150
Inherited or borrowed?
SPANISH (SPA) / CHAMORRO (CHA): same six word pairs (ONE, TWO, PERSON, STAR, NIGHT, NEW)
→ RELATED ???
NO!!!

151
Inherited or borrowed?
SPANISH (SPA) / CHAMORRO (CHA): same six word pairs
INDO-EUROPEAN <> AUSTRONESIAN

152
Inherited or borrowed?
SPANISH (SPA) / CHAMORRO (CHA): same six word pairs
CHANCE?

153
Inherited or borrowed?
SPANISH (SPA) / CHAMORRO (CHA): same six word pairs
CHANCE? → < 5% (i.e. 1-2 items)

154
Inherited or borrowed?
SPANISH (SPA) / CHAMORRO (CHA): same six word pairs
BORROWING through LANGUAGE CONTACT

160
Inherited or borrowed?
SPANISH (SPA): INDO-EUROPEAN (128) > ROMANCE
CHAMORRO (CHA): AUSTRONESIAN (310) > CHAMORRO
ONE: uno / unu, LDND = 36.9
SPA <?> CHA
fam/gen: 0.24/0.82 > 0.03/0.00
phon pattern fit: 12.00 > 0.67
>>

161
Borrowed!
SPANISH (SPA): INDO-EUROPEAN (128) > ROMANCE
CHAMORRO (CHA): AUSTRONESIAN (310) > CHAMORRO
ONE: uno / unu, LDND = 36.9
SPA > CHA
fam/gen: 0.24/0.82 > 0.03/0.00
phon pattern fit: 12.00 > 0.67


162
Borrowing
SPANISH (SPA): INDO-EUROPEAN (128) > ROMANCE
CHAMORRO (CHA): AUSTRONESIAN (310) > CHAMORRO
TWO: dos / dos, LDND = 0.0
SPA > CHA
f/g: 0.62/1.00 > 0.12/0.00
swF: 100.00 > 0.22

163
Borrowing
SPANISH (SPA): INDO-EUROPEAN (128) > ROMANCE
CHAMORRO (CHA): AUSTRONESIAN (310) > CHAMORRO
STAR: estreya / estrecas, LDND = 61.2
SPA > CHA
f/g: 0.17/0.82 > 0.00/0.00
swF: 100.00 > 4.44

164
Borrowing
SPANISH (SPA): INDO-EUROPEAN (128) > ROMANCE
CHAMORRO (CHA): AUSTRONESIAN (310) > CHAMORRO
NIGHT: noCe / noces, LDND = 68.2
SPA > CHA
f/g: 0.23/0.55 > 0.04/0.00
swF: 100.00 > 0.10

165
Borrowing
SPANISH (SPA): INDO-EUROPEAN (128) > ROMANCE
CHAMORRO (CHA): AUSTRONESIAN (310) > CHAMORRO
NEW: nuevo / nueba, LDND = 44.2
SPA > CHA
f/g: 0.50/0.64 > 0.04/0.00
swF: 4.27 > 0.03

167
Borrowing
SPANISH (SPA): INDO-EUROPEAN (128) > ROMANCE
CHAMORRO (CHA): AUSTRONESIAN (310) > CHAMORRO
PERSON: persona / petsona, LDND = 55.3
SPA > CHA
f/g: 0.20/0.64 > 0.01/0.00
swF: 32.40 > 0.13
ALT CHA: taotao (0.41/0.00)

169
5. Conclusions
176
Conclusions
- Method for automatic reconstruction of language relationships, using mass comparison of lexical and typological data
- Framework to discuss and correct existing classifications
- Test hypotheses about genetic distances in time
- Locate (and eliminate) potential borrowings
- CORE: incremental lexical database (> 35)
→ One day soon: online
→ Join and cooperate!!!
177
Holman et al. (forthc. 2008). Explorations in automated language classification. Folia Linguistica.
Brown et al. (forthc. 2008). Automated classification of the world's languages: a description of the method and preliminary results. Sprachtypologie und Universalienforschung.
Several working papers: email.eva.mpg.de./wichmann/ASJPHomePage
178
?