Title: Arabic Dialect Syntax and Tree Adjoining Grammar
1Arabic Dialect Syntax and Tree Adjoining Grammar
- Owen Rambow
- Columbia University
- rambow_at_cs.columbia.edu
2Overview
- Morphology and Syntax
- Phrase Structure for MSA
- Dialect Syntax
- Parsing Dialect Syntax
- Tree Adjoining Grammar
3Morphology and Syntax
- Rich morphology crosses into syntax
- Pro-drop / Subject conjugation
- Verb subcategorization and subject/object clitics
- Verb(transitive) + subject + object
- Verb(intransitive) + subject, but not Verb(intransitive) + subject + object
- Verb(transitive, passive) + subject, but not Verb(transitive, passive) + subject + object
- Verb(intransitive, passive), but not Verb(intransitive, passive) + subject
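The subcategorization patterns above can be read as a small lookup table. The following is a minimal sketch, not part of the original slides: the table name `ALLOWED_ARGS` and the function `is_licensed` are hypothetical, and the table treats each pattern as an upper bound on which subject/object clitics a verb may carry.

```python
# Argument slots a verb form may fill, keyed by (transitivity, voice).
# The four entries restate the four patterns on the slide; the names
# and the "upper bound" reading are assumptions made for illustration.
ALLOWED_ARGS = {
    ("transitive", "active"): {"subject", "object"},
    ("intransitive", "active"): {"subject"},
    ("transitive", "passive"): {"subject"},
    ("intransitive", "passive"): set(),
}


def is_licensed(transitivity, voice, args):
    """Return True if every argument in `args` is allowed for this verb type."""
    return set(args) <= ALLOWED_ARGS[(transitivity, voice)]


if __name__ == "__main__":
    print(is_licensed("transitive", "active", ["subject", "object"]))
    print(is_licensed("intransitive", "active", ["subject", "object"]))
```

A morphological analyzer could use such a table to reject analyses that attach an object clitic to an intransitive verb.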
4Morphology and Syntax
- Morphological interactions with syntax
- Agreement
- Full, e.g. Noun-Adjective on number, gender, and definiteness
- Partial, e.g. Verb-Subject on gender (in VSO order)
- Definiteness
- Noun compound formation, copular sentences, etc.
- Nouns + definite article, proper nouns, pronouns, etc.
5Morphology and Syntax
- Morphological interactions with syntax (continued)
- Case
- MSA is case marking: nominative, accusative, genitive
- Almost-free word order
- Case is often marked with optionally written short vowels
- This effectively limits the word-order freedom in published text
- Agglutination
- Attached prepositions create words that cross phrase boundaries
- liAlmaktabat (li + Almaktabat)
- for the-libraries: [PP li [NP Almaktabat]]
- Some morphological analysis (minimally segmentation) is necessary even for statistical approaches to parsing
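The segmentation point can be illustrated with a toy splitter over the transliterated example. This is only a sketch under stated assumptions: the prefix list `PROCLITIC_PREPOSITIONS`, the functions `segment` and `bracket`, and the rule "split only before the article Al" are all hypothetical simplifications; a real system would need a morphological analyzer, since many ordinary words begin with the same letters.

```python
# Illustrative subset of attached prepositions, in transliteration (assumed).
PROCLITIC_PREPOSITIONS = ("li", "bi", "ka")


def segment(token):
    """Split one attached preposition off a transliterated token, if present.

    Heuristic (assumption): only split when the preposition is directly
    followed by the definite article "Al" and some noun material.
    """
    for prep in PROCLITIC_PREPOSITIONS:
        if token.startswith(prep + "Al") and len(token) > len(prep) + 2:
            return [prep, token[len(prep):]]
    return [token]


def bracket(token):
    """Render the segmented token as a bracketed phrase."""
    parts = segment(token)
    if len(parts) == 2:
        return f"[PP {parts[0]} [NP {parts[1]}]]"
    return f"[NP {parts[0]}]"


if __name__ == "__main__":
    print(bracket("liAlmaktabat"))  # [PP li [NP Almaktabat]]
```

Without this step, a parser sees one token where the syntax has a preposition and a noun phrase.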
6Sentence Structure
- Traditional Arabic grammar: two types of Arabic sentences
- Verbal sentences
- Verb Subject Object (VSO)
- كتب الأولاد الأشعار: wrote the-boys the-poems, "The boys wrote the poems"
- Copular sentences
- Topic Complement
- الأولاد شعراء: the-boys poets, "The boys are poets"
7Verbal Sentences
- Verb agreement in VSO with gender only
- كتب الولد/الأولاد: wrote.3MascSing the-boy/the-boys
- كتبت البنت/البنات: wrote.3FemSing the-girl/the-girls
- Pronominal subjects are cliticized
- كتبتَ: wrote-you.MascSing
- كتبتم: wrote-you.MascPlur
- كتبوا: wrote-they.MascPlur
8VSO vs SVO vs OVS
- كتبت البنات الأشعار: wrote.fem the-girls the-poems, "The girls wrote the poems"
- كتبن الأشعار: wrote-they.fem the-poems, "They (fem.) wrote the poems"
- البنات كتبن الأشعار: the-girls wrote-they.fem the-poems, "The girls wrote the poems"
- الأشعار كتبتها البنات: the-poems wrote.fem-them the-girls, "The poems, the girls wrote them"
9VSO, VOS, SVO, OVS: Descriptive Generalization
- VSO or VOS: agreement with subject in gender only
- Subject pronoun is a clitic on verb and replaces agreement
- SVO order has preposed subject followed by verb with subject clitic
- Object pronoun is a clitic on verb (does not replace subject agreement)
- OVS order has preposed object followed by verb with object clitic
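The descriptive generalization above can be restated as a small function. This is a sketch for illustration only: the `Nominal` class, the function name `msa_verb_marking`, and the returned dictionary keys are hypothetical, person is ignored, and both arguments are assumed to be full noun phrases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Nominal:
    gender: str  # "masc" or "fem"
    number: str  # "sing" or "plur"


def msa_verb_marking(order, subject, obj):
    """Describe what the verb carries for a given order such as "VSO" or "OVS".

    Assumed simplification of the slide: a postverbal subject triggers
    gender-only agreement; a preposed subject or object is resumed by a
    clitic on the verb carrying its gender and number.
    """
    v, s, o = order.index("V"), order.index("S"), order.index("O")
    marking = {}
    if s < v:
        # Preposed subject: subject clitic replaces agreement.
        marking["subject_clitic"] = (subject.gender, subject.number)
    else:
        # VSO or VOS: agreement with the subject in gender only.
        marking["agreement"] = (subject.gender,)
    if o < v:
        # Preposed object: object clitic, in addition to subject marking.
        marking["object_clitic"] = (obj.gender, obj.number)
    return marking


if __name__ == "__main__":
    girls = Nominal("fem", "plur")
    poems = Nominal("fem", "sing")  # feature values chosen for illustration
    for order in ("VSO", "VOS", "SVO", "OVS"):
        print(order, msa_verb_marking(order, girls, poems))
```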
10VSO Phrase Structure
[Tree diagram: S dominates NP (the boys) and VP; VP dominates V (wrote) and NP (the poems)]
11VSO Phrase Structure
Penn Arabic Treebank
[Tree diagram: S dominates a single VP; VP directly dominates V (wrote) and two NPs (the boys, the poems)]
12VSO Phrase Structure
[Tree diagram: S dominates NP (the boys) and VP; VP dominates V (wrote) and NP (the poems)]
13VSO Phrase Structure
[Tree diagram: S′ dominates V (wrote) and S; S dominates NP (the boys) and VP; VP dominates an empty V position and NP (the poems)]
14VSO Phrase Structure
[Tree diagram, labeled "Head Movement": S′ dominates V (wrote) and S; S dominates NP (the boys) and VP; VP dominates an empty V position and NP (the poems)]
15SVO Phrase Structure
- Option 1: English phrase structure
- Problem: Arabic does not look like English (subject clitic on verb)
[Tree diagram: S dominates NP (the boys) and VP; VP dominates V (wrote) and NP (the poems)]
16SVO Phrase Structure
[Tree diagram: S dominates a single VP; VP directly dominates V (wrote) and two NPs (the boys, the poems)]
17SVO Phrase Structure
Penn Arabic Treebank
[Tree diagram: S dominates NP (the boys) and VP; VP dominates V (wrote), an empty NP, and NP (the poems)]
18SVO Phrase Structure
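[Tree diagram: S′ dominates NP (the boys) and a second S′; the second S′ dominates V (wrote) and S; S dominates an empty NP and VP; VP dominates an empty V position and NP (the poems)]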
19Copular sentences
- Topic Complement
- Definite Topic, Indefinite Complement
- الولد شاعر: the-boy poet, "The boy is a poet"
- Auxiliary Topic Complement
- Auxiliaries (kana and her sisters)
- Tense, Negation, Transformation, Persistence
- كان الولد شاعرا: was the-boy poet, "The boy was a poet"
- ليس الولد شاعرا: is-not the-boy poet, "The boy is not a poet"
20Copular Sentences
- Types of complements
- Noun/Adjective/Adverb
- الولد ذكي: the-boy smart, "The boy is smart"
- Prepositional Phrase
- الولد في المكتبة: the-boy in the-library, "The boy is in the library"
21SVO, OVS as Copular Sentence
- Verb-Sentence
- الأولاد كتبوا الأشعار: the-boys wrote-they poems, "The boys wrote the poems"
- الأشعار كتبتها البنات: the-poems wrote.fem-them the-girls, "The poems, the girls wrote them"
- Copular-Sentence
- الولد كتابه كبير: the-boy book-his big, "The boy, his book is big"
22Common Structural Ambiguities
- Third-person masculine/feminine singular verb forms are structurally ambiguous
- Verb(3MascSingular) Noun(Masc)
- Verb, subject = he, object = Noun
- Verb, subject = Noun
- Passive and active forms are often similar in standard orthography
- كتب /kataba/ "he wrote"
- كُتب /kutiba/ "it was written"
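The first ambiguity above (a third-person singular verb followed by a noun) can be enumerated mechanically. The sketch below is illustrative only: the function `analyses` and its reading labels are hypothetical, and it assumes that the noun can be the subject only when its gender matches the verb's, while the pro-drop reading with the noun as object is always available.

```python
def analyses(verb_features, noun_gender):
    """List candidate readings of a "Verb Noun" string.

    `verb_features` is (person, gender, number). Assumed simplification:
    only third-person singular verbs are considered ambiguous here.
    """
    person, gender, number = verb_features
    readings = []
    if person == 3 and number == "sing":
        if gender == noun_gender:
            # The noun is the overt subject; no object.
            readings.append({"subject": "Noun", "object": None})
        # The subject is dropped (he/she); the noun is the object.
        readings.append({"subject": "pro-drop", "object": "Noun"})
    return readings


if __name__ == "__main__":
    for reading in analyses((3, "masc", "sing"), "masc"):
        print(reading)
```

A parser has to keep both readings until context or subcategorization rules one out.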
23Overview
- Morphology and Syntax
- Phrase Structure for MSA
- Dialect Syntax
- Parsing Dialect Syntax
- Tree Adjoining Grammar
24Dialect Syntax
- MSA
- Verb Subject Object: كتب الأولاد الأشعار
- wrote.masc the-boys the-poems
- Subject Verb Object (full agreement)
- الأولاد كتبوا الأشعار
- the-boys wrote-they.masc the-poems
- LEV (Levantine), EGY (Egyptian)
- Subject Verb Object
- الأولاد كتبو الأشعار
- the-boys wrote.masc.pl the-poems
- Less frequent: Verb Subject Object
- كتبو الأولاد الأشعار
- wrote.masc.pl the-boys the-poems
- Full agreement (or clitic?) in both orders
25Dialect Syntax: Noun Phrases
- Possessives
- Idafa construction
- Noun1 of Noun2 encoded structurally
- ملك الأردن
- king Jordan
- "the king of Jordan" / "Jordan's king"
- Dialects have an additional common construct
- Noun1 <particle> Noun2
- LEV: الملك تبع الأردن, the-king belonging-to Jordan
- <particle> differs widely among dialects
- Pre/post-modifying demonstrative article
- MSA: هذا الرجل, this the-man, "this man"
- EGY: الراجل ده, the-man this, "this man"
26Code Switching
- MSA and Dialect mixing in speech
- phonology, morphology and syntax
[Arabic transcript excerpt, with MSA and LEV segments distinguished]
Aljazeera Transcript: http://www.aljazeera.net/programs/op_direction/articles/2004/7/7-23-1.htm
27Parsing Arabic Dialects: Problem
[Diagram: labels Dialect, MSA, Treebank, Parser; the step from the dialect sentence الأولاد كتبو الأشعار to its parse tree is marked "?"]
28Parsing Arabic Dialects
- Many different dialects
- Dialects are spoken, few written resources
- Code switching
- Conclusion: Can't assume we will get treebanks for each dialect
- What to do?
29Parsing Arabic Dialects
- Idea: use resources for MSA, apply them to dialects
- We will be investigating three approaches
30Parsing Arabic Dialects: Proposed Solution 1
[Diagram: labels Dialect, MSA, Treebank, Parser; the dialect sentence الأولاد كتبو الأشعار is shown together with the MSA sentence كتب الأولاد الأشعار]
31Parsing Arabic Dialects: Proposed Solution 2
[Diagram: labels Dialect, MSA, Treebank, Parser; dialect sentence الأولاد كتبو الأشعار]
32Parsing Arabic Dialects: Proposed Solution 3
[Diagram: labels Dialect, MSA, Treebank, Parser; dialect sentence الأولاد كتبو الأشعار]
33Overview
- Morphology and Syntax
- Phrase Structure for MSA
- Dialect Syntax
- Parsing Dialect Syntax
- Tree Adjoining Grammar