Title: Structures and strings
1Structures and strings
- Virginia Savova
- Johns Hopkins University
2Thank you
- My adviser Robert Frank.
- My committee members
- Paul Smolensky
- Fred Jelinek
- Sanjeev Khudanpur
- Robert Berwick (here all the way from Boston)
- All of you (for being here so early)
3Structures and strings
- What should structure tell us about ordered word sequences?
- grammaticality status
- interpretation
- agreement relations
- word order
Separate linearization component
4Dimensions of structure
interpretation agreement relations
Dominance
grammaticality
Precedence
word order
5Strict correspondence
Asymmetric c-command maps to precedence (Kayne, 1994)
6Why SC is a good idea
- Apparent parsimony
- No need for an external linearization component
- Strong claim
- dominance is all that matters, precedence is epiphenomenal
- cross-linguistic variation is confined to the dominance relation
7Why SC is a bad idea
- Gotta Move
- Difference in precedence ⇒ difference in dominance (even if no independent evidence exists)
- Can derive any word order unless movement is required to have other consequences.
- Non-canonical word order implies structural complexity
8Troubling consequences
- Some languages are structurally more complex than others at spell-out
- Latin requires movement in simple clauses
9 Troubling consequences
- Mapping at LF must be language-specific
LOVED (Brutus, Caesar)
10 Troubling consequences
- Tree-sequence representation with spell-out at
different levels
Spell-out English
Spell-out Latin
LF
11What is the alternative?
- Let structure represent only dominance.
- Determine word order through a linearization procedure on the basis of
- structure (i.e. dominance)
- discourse
- morpho-phonology
12Why is that a bad idea
- Must specify an external linearization component different for each language
- where is UG?
- Cross-linguistic variation is no longer confined to the dominance relation
- Can derive any word order
13Why is that a good idea
- Can simplify our notion of structure
- discourse features can be distinguished from syntactic features (no Top, Foc heads)
- word order difference need not imply structural difference
- precedence need not be defined on structures
- tree sequences are no longer required
- Cross-linguistic variation can be confined to the lexicon and the precedence relation
14Resolving the negative side
- I will specify an external linearization component which
- accounts for cross-linguistic variation in word order
- is restricted w.r.t. possible word orders
- (putting UG back into linearization)
15Exploiting the positive side
- I will err on the side of structural simplicity by
- defining a hierarchical component based on Dependency Grammars
- dispensing with the notion of sequence-type structures
- externalizing discourse features
16Dependency structure (Tesnière 1959, Hays 1964, Gaifman 1965)
- Simple structural representation
- a set of binary asymmetric relations among sentential elements.
- Preliminary conditions
- Every word is dependent on (modifies, is conditioned upon, is selected by) some other word.
- Exactly one word is (dependent on) ROOT.
17Constituent versus Dependency Structures
Head of VP
Head of NP
18Constituent versus Dependency Structures
19Conditions on Dependency Structure
- Widely-accepted formal conditions
- Every word depends on exactly one other word (its head).
- Exactly one word depends on ROOT.
- No cycles
- Projectivity: no crossing arrows
Dependency structures are trees
Hudson's Adjacency Principle: a word can be separated from its head only by a sister and subordinates
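The formal conditions above can be checked mechanically. The following is a minimal sketch, assuming a sentence is encoded as a mapping from each word position to the position of its head, with position 0 standing for ROOT. The encoding and the function names are illustrative assumptions, not part of the talk. The ROOT arc is ignored in the crossing test, which is a simplification.

```python
from typing import Dict

ROOT = 0  # hypothetical encoding: position 0 stands for the artificial ROOT node


def is_tree(heads: Dict[int, int]) -> bool:
    """heads maps each word position (1..n) to the position of its head.

    A dict gives every word exactly one head by construction, so only
    'exactly one word depends on ROOT' and 'no cycles' remain to be checked.
    """
    if sum(1 for h in heads.values() if h == ROOT) != 1:
        return False
    for word in heads:
        seen = set()
        node = word
        while node != ROOT:
            if node in seen or node not in heads:
                return False  # a cycle, or a head that is not a word
            seen.add(node)
            node = heads[node]
    return True


def is_projective(heads: Dict[int, int]) -> bool:
    """True if no two arcs cross when the words are laid out in surface order."""
    arcs = [(min(d, h), max(d, h)) for d, h in heads.items() if h != ROOT]
    for a, b in arcs:
        for c, d in arcs:
            if a < c < b < d:
                return False
    return True


# "George bought a camel": George <- bought, bought <- ROOT, a <- camel, camel <- bought
sentence = {1: 2, 2: ROOT, 3: 4, 4: 2}
print(is_tree(sentence), is_projective(sentence))
```

A structure such as `{1: 3, 2: 4, 3: 0, 4: 3}` passes `is_tree` but fails `is_projective`, because the arc between positions 1 and 3 crosses the arc between positions 2 and 4.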
20 Conditions on Constituent Structure
- Every node has a single mother
- Only one root
- No cycles
- No crossing branches
21Word order in dependency structure
- Head-Dependent ordering constraints
- Spec Head Comp (English)
- Spec Comp Head (Latin)
- Linearization procedure
- bought
- George bought camel
- George bought a camel
[Figure: dependency tree for "George bought a camel". bought takes George as spec and camel as comp; camel takes a as spec.]
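The linearization procedure on this slide can be sketched as a recursive walk that places each head relative to its specifier and complement. This is a minimal illustration, assuming each node has at most one spec and one comp. The `Node` class and the `linearize` function are hypothetical names, not something from the talk.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    word: str
    spec: Optional["Node"] = None
    comp: Optional["Node"] = None


def linearize(node: Optional[Node], order: Tuple[str, str, str]) -> List[str]:
    """Emit words by visiting spec, head and comp in the language's order."""
    if node is None:
        return []
    out: List[str] = []
    for slot in order:
        if slot == "head":
            out.append(node.word)
        else:
            out.extend(linearize(getattr(node, slot), order))
    return out


ENGLISH = ("spec", "head", "comp")  # Spec Head Comp
LATIN = ("spec", "comp", "head")    # Spec Comp Head

tree = Node("bought", spec=Node("George"), comp=Node("camel", spec=Node("a")))
print(" ".join(linearize(tree, ENGLISH)))  # George bought a camel
print(" ".join(linearize(tree, LATIN)))    # George a camel bought
```

The second line applies the Latin-type ordering to English words purely to show that the same dependency structure yields a different string once the ordering constraint changes.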
22Word order in Constituent Structure
- Head Comp
- Spec Head
- Spec Comp
- Kayne's Antisymmetry: these fall out of dominance
- P&P: these are binary parameters
23Dependency grammar
- Lexicon: (ROOT, buy, John, the, camel, sleep)
- Syntactic categories (PoS)
- (N: NSG, NPL, NPROP; V: VTR, VINTR; D, P, Adv, Adj, ROOT)
- Subcategorization frames for PoS
- Subcat frame: ordered list of obligatory deps (type: arguments)
- VTR: 1N, 2N; VINTR: 1N; NSG: 1D; NPROP: (none); ROOT: 1V
- Optional modifiers for PoS
- V: Adv, P; N: Adj, P
24Dependency (tree) structure Generation
- A possible generative procedure
- Insert ROOT
- Recursively
- For each unmarked node in the structure, insert dependents that satisfy some subcat frame for that node and mark it.
- For each marked node, insert a finite number of allowed modifiers
- For each PoS label, insert a lexical entry
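The generative procedure above can be sketched as a recursive expansion from ROOT. This is a simplified illustration under several assumptions: the tables below only loosely follow the grammar on the previous slide, NPROP is taken to have an empty frame, NPL is left out because the lexicon lists no plural noun, and optional modifiers are skipped. All names are hypothetical.

```python
import random
from typing import Dict, List

# Assumed grammar tables (a subset of the slide's categories)
SUBTYPES: Dict[str, List[str]] = {"N": ["NSG", "NPROP"], "V": ["VTR", "VINTR"]}
FRAMES: Dict[str, List[str]] = {
    "ROOT": ["V"],
    "VTR": ["N", "N"],
    "VINTR": ["N"],
    "NSG": ["D"],
    "NPROP": [],
    "D": [],
}
LEXICON: Dict[str, List[str]] = {
    "ROOT": ["ROOT"],
    "VTR": ["buy"],
    "VINTR": ["sleep"],
    "NSG": ["camel"],
    "NPROP": ["John"],
    "D": ["the"],
}


def generate(pos: str, rng: random.Random) -> dict:
    """Expand one node: choose a concrete PoS, insert the dependents required
    by its subcat frame, then insert a lexical entry for the PoS label."""
    pos = rng.choice(SUBTYPES.get(pos, [pos]))
    deps = [generate(arg, rng) for arg in FRAMES[pos]]
    return {"pos": pos, "word": rng.choice(LEXICON[pos]), "deps": deps}


structure = generate("ROOT", random.Random(0))
print(structure["deps"][0]["pos"])  # the PoS of the main verb
```

Because none of the assumed frames is recursive, the expansion always terminates. The result is an unordered dependency structure; word order is left to a separate linearization step.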
25Dependency structures are not trees
- Subject is dependent on both auxiliary and verb
- Heads compete for adjacency
George doesn't buy camels
26Dependency structures are not projective
- Discourse-related word order variation
- Locality competes with discourse
Projectivity doesn't hold (arrows cross)
27The structural sequence solution
- One solution: structure is a graph-tree sequence (Hudson, 1995)
- subcategorization (and modification) relations
- morphosyntactic feature-driven relations
- trees related through transformation
preserve M links; remove conflicting S links
Linearize
George doesn't buy camels
28The conflict and competition solution
- Evidence of competition in word order
- locality against discourse-motivated order
- multiple heads against one another
- Conflicts are universal
- Resolutions are language-specific
Optimality Theory
29Separating structures from strings
- Generalized acyclic graphs
- Linearization: optimization over conflicting constraints
Other factors
30Word Order Optimization
- Underlying representation
- structure
- discourse features
- Candidate surface representations
- all possible permutations of elements
- Constraints
- Local (Head-Dep) ordering
- Discourse constraints
31Conditions on Dependency Structure
- New (less-restrictive) formal conditions
- Every word depends on exactly one other word (head). (no longer required: a word may have more than one head)
- Exactly one word depends on ROOT.
- No cycles
- Projectivity: no crossing arrows (no longer required)
Dependency structures are no longer trees but directed acyclic graphs
32Dependency structure II Generation
- A possible generative procedure
- Insert ROOT
- Recursively
- For each unmarked node in the structure, insert dependents that satisfy some subcat frame for that node and mark it. If any such dependent is already appropriately inserted, create a secondary dependent
- For each marked node, insert a finite number of allowed modifiers
- For each PoS label, insert a lexical entry
[Figure: example structure with nodes labeled ROOT, VAUX, VTR, NPL, NPROP, Adv, D]
33Word order in compound tense clauses
preserve M links; remove conflicting S links
Linearize
George doesn't buy camels
34Typology of simple-tense word order
- Four basic word orders are uncontroversial
- SOV
- SVO
- VSO
- VOS
- Find a set of four constraints such that each
word order violates at most one
36Cognitive basis of constraints
- S precedes O
- V precedes O
- Adjacent(O,V)
- Adjacent(V, Edge)
Agent, Patient; Compute function inside out; Consistency, boundary marking
Tomlin 1986
37Generalized constraints
- HdCmp (precedence): The head must precede its complement. Observed in SVO, VSO, VOS.
- SpcCmp: The specifier must precede the complement. Observed in SOV, SVO, VSO.
- HdCmp (adjacency): The head must be adjacent to its complement. Observed in SOV, SVO, VOS.
- HdEdge: The head must be at the edge. Observed in SOV, VSO, VOS.
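The four constraints can be evaluated over all permutations of S, O and V to see which orders can win under some strict ranking. The sketch below is an illustration: the constraint names (`HdCmp-precede`, `HdCmp-adjacent`) and the implementation of strict ranking as lexicographic comparison of violation vectors are choices made here for the example, not taken from the talk.

```python
from itertools import permutations
from typing import Callable, Dict, List, Sequence

# Each function returns True when the order VIOLATES the constraint.
CONSTRAINTS: Dict[str, Callable[[str], bool]] = {
    "HdCmp-precede": lambda o: o.index("V") > o.index("O"),
    "SpcCmp": lambda o: o.index("S") > o.index("O"),
    "HdCmp-adjacent": lambda o: abs(o.index("V") - o.index("O")) != 1,
    "HdEdge": lambda o: o.index("V") not in (0, len(o) - 1),
}

CANDIDATES: List[str] = ["".join(p) for p in permutations("SOV")]


def violations(order: str) -> List[str]:
    """Names of the constraints that this word order violates."""
    return [name for name, violated in CONSTRAINTS.items() if violated(order)]


def winner(ranking: Sequence[str]) -> str:
    """Optimal candidate under a strict ranking: the lexicographically
    smallest violation vector, read from highest- to lowest-ranked."""
    return min(
        CANDIDATES,
        key=lambda o: tuple(int(CONSTRAINTS[c](o)) for c in ranking),
    )


# Factorial typology: winners over all 4! = 24 rankings of the constraints
typology = {winner(r) for r in permutations(CONSTRAINTS)}

for order in CANDIDATES:
    print(order, violations(order))
print(sorted(typology))
```

Each of SOV, SVO, VSO and VOS violates exactly one constraint, so each wins whenever that constraint is ranked lowest. OVS and OSV each violate three constraints and lose under every ranking.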
39Word order of compound subclauses
- Subordinate clause structure
- complementizer that
- dependency between subject and auxiliary
- subject is specifier of Aux and V
- V complement of Aux
[Figure: dependency graph of the subordinate clause, with arcs labeled comp and spec]
40Word order of compound subclauses
- Candidate set
- 5! = 120 permutations
- Four constraints
- 6 possible winners
- 114 permutations are universally banned!
- (inferior under any strict ranking)
41Word order of compound subclauses
5! = 120 permutations, 6 possible winners
42Word order of main clauses with auxiliaries
- Must derive one additional word order
- SAuxOV
- German
- SVO/SAuxOV/SOV/SOVAux
- Dinka
- SVO/SAuxOV/VSO/AuxSOV
- harmonically bounded by SOVAux, AuxSOV and SAuxVO
43Word order of main clauses with auxiliaries
- Head Movement: empty head (C) must merge with its complement
- Consequence: Long Head Movement is banned!
- One additional constraint
- cEdge
- (Empty) C should not be aligned with the edge of the utterance
C
44Word order of main clauses with auxiliaries
45Predicted cross-linguistic word order typology
4 unattested types
46Rare word orders
47Rare word orders
- What is the role of frequency in linguistic theory?
- rare languages might be an experimental error
- different frequencies might be due to historical accident
- might be due to the relative fitness of a language type
48Constraints as soft biases in iterative learning
Kirby 2001
[Figure: iterated learning chain in which each learner (learner1, learner2) forms a hypothesis from data and produces data for the next learner]
Griffiths and Kalish 2005
- Iterated Bayesian learning
- finite discrete hypothesis space
- prior over hypotheses
- speakers generate samples from hypothesis
49Prior of satisfied constraints
Equal weight (word order, constraints satisfied, prior):
SOV 3 22%
SVO 3 22%
VSO 3 22%
VOS 3 22%
OVS 1 6%
OSV 1 6%
- HdCmp (VO)
- HdEdge (...V or V...)
- HdCmp (VO)
- SpcCmp (SO)
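The slide does not state how the number of satisfied constraints is turned into a prior. One mapping that reproduces the listed figures of 22 and 6 (read here as percentages) is to weight each order by 2 raised to the number of constraints it satisfies and then normalize. This is an inference from the numbers, not a formula given in the talk. A weight proportional to the raw count would instead give roughly 21 and 7.

```python
from fractions import Fraction

# Number of satisfied constraints per word order, as listed on the slide
satisfied = {"SOV": 3, "SVO": 3, "VSO": 3, "VOS": 3, "OVS": 1, "OSV": 1}

# Assumed weighting: 2 ** (number of satisfied constraints)
weights = {order: Fraction(2) ** k for order, k in satisfied.items()}
total = sum(weights.values())
prior = {order: w / total for order, w in weights.items()}

percent = {order: round(float(p) * 100) for order, p in prior.items()}
print(percent)
```

Under this assumption the four basic orders each receive 8/36 of the prior mass and the two remaining orders 2/36 each.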
50Consequences for frequency
- In the limit, the distribution of language types converges to the prior
- Before convergence
- 70% of the time after 15 iterations, the distribution of language types features the four high-prior languages outnumbering the low-prior languages
- the frequencies of high-prior languages differ
- ⇒ the real-world distribution is an instance of such a distribution
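The convergence claim can be illustrated with a small exact computation. The sketch below assumes learners who sample a hypothesis from the posterior after seeing a single utterance, a prior of 8/36 for the four basic orders and 2/36 for the other two, and a noise rate `EPS`. These modelling choices are hypothetical and are not the simulation reported in the talk. It tracks the expected distribution over language types, not a finite sampled population.

```python
from fractions import Fraction
from typing import Dict

ORDERS = ["SOV", "SVO", "VSO", "VOS", "OVS", "OSV"]
PRIOR: Dict[str, Fraction] = {
    o: Fraction(w, 36) for o, w in zip(ORDERS, [8, 8, 8, 8, 2, 2])
}
EPS = Fraction(1, 10)  # hypothetical probability of producing a non-native order


def likelihood(d: str, h: str) -> Fraction:
    """P(observed order d | language h): mostly h's own order, else uniform noise."""
    return 1 - EPS if d == h else EPS / (len(ORDERS) - 1)


def posterior(d: str) -> Dict[str, Fraction]:
    """Bayes' rule over the finite hypothesis space."""
    joint = {h: likelihood(d, h) * PRIOR[h] for h in ORDERS}
    z = sum(joint.values())
    return {h: p / z for h, p in joint.items()}


POSTERIORS = {d: posterior(d) for d in ORDERS}


def step(dist: Dict[str, Fraction]) -> Dict[str, Fraction]:
    """One generation: a speaker of h emits d, the learner samples h2 from P(h2 | d)."""
    new = {h: Fraction(0) for h in ORDERS}
    for h, ph in dist.items():
        for d in ORDERS:
            for h2 in ORDERS:
                new[h2] += ph * likelihood(d, h) * POSTERIORS[d][h2]
    return new


def tv(p: Dict[str, Fraction], q: Dict[str, Fraction]) -> Fraction:
    """Total variation distance between two distributions."""
    return sum(abs(p[o] - q[o]) for o in ORDERS) / 2


uniform = {o: Fraction(1, 6) for o in ORDERS}
dist = uniform
for _ in range(15):
    dist = step(dist)

print(step(PRIOR) == PRIOR)  # the prior is a fixed point of the chain
print(float(tv(uniform, PRIOR)), float(tv(dist, PRIOR)))
```

Exact fractions are used so that the fixed-point check is an equality, not an approximation. The distance to the prior shrinks with each generation, which is the sense in which the distribution of language types converges to the prior in the limit.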
51Typical pre-convergence distribution
52Summary of this talk
- Structure should be order-free.
- How can word order variation and universal tendencies be captured outside of the structural component?
- An account of cross-linguistic variation of basic word order
- simple-tense clauses
- complex-tense clauses
- subordinate clauses
- main clauses
- A possible link to frequency
53Summary of things not in this talk
- An account of within-language word order variation
- questions versus declaratives
- inversion in yes-no questions
- wh-fronting
- clitic versus non-clitic pronouns
- restrictions on collocations
54The big picture
- Cross-linguistic variation
- is surprising if language is evolutionarily selected for/innate
- can be explained if individual languages are viewed as different solutions to an optimization problem
- Word order variation is due to multiple solutions to the mapping from dependency structure to string under cognitive constraints