Formal Typology: Explanation in Optimality Theory - PowerPoint PPT Presentation

Slides: 77
Provided by: paulsmo

Transcript and Presenter's Notes

Title: Formal Typology: Explanation in Optimality Theory


1
Formal Typology: Explanation in Optimality Theory
  • Paul Smolensky
  • Cognitive Science Department
  • Johns Hopkins University

with
Géraldine Legendre, Donald Mathis, Melanie Soderstrom,
Alan Prince, Suzanne Stevenson, Peter Jusczyk
2
Advertisement
The Harmonic Mind: From neural computation to
optimality-theoretic grammar
Paul Smolensky & Géraldine Legendre
  • Blackwell 2002 (??)
  • Develop the Integrated Connectionist/Symbolic
    (ICS) Cognitive Architecture
  • Apply to the theory of grammar

3
Chomsky 1988
  • 1. What is the system of knowledge?
  • 2. How does this system of knowledge arise in
    the mind/brain?
  • 3. How is this knowledge put to use?
  • 4. What are the physical mechanisms that serve
    as the material basis for this system of
    knowledge and for the use of this knowledge? (p.
    3)

4
Responsibilities of Grammatical Theory
Chomsky's "Big 4" questions concerning knowledge
of grammar
OT
Structure
Nativist hypothesis
Acquisition
Processing
Neuro-genetics
Not new to Chomsky or generative grammar
5
Jakobson's Program
  • Linguistic theory is not just for theoretical
    linguists
  • The same principles that explain formal
    cross-linguistic and language-internal
    distributional patterns can also explain
  • Acquisition
  • Processing
  • Neurological breakdown

6
Jakobson's Program
  • Markedness enables a Grand Unified Theory for the
    cognitive science of language: Avoid α
  • Structure
  • Inventories lack α
  • Alternations eliminate α
  • Acquisition
  • α is acquired late
  • Processing
  • α is processed poorly
  • Neural
  • Brain damage most easily disrupts α

7
Talk Plan
OT Explanation
  • Structure
  • Acquisition
  • Processing
  • Neuro-genetics

8
Responsibilities of Grammatical Theory
Chomsky's "Big 4" questions concerning knowledge
of grammar
Structure
Structure of UG: Captured in a general formalism
for grammars and their variation
Acquisition
Processing
Neuro-genetics
9
From Markedness to OT
  • Formalizing markedness → OT
  • Markedness constraints
  • Faithfulness constraints
  • Competition
  • Strict domination
  • Strong universality: Richness of the Base

10
Structure. Formal Result: Formalizing
Markedness, Two Problems
  • Goal: Change the epiphenomenal explanatory status of
    markedness
  • "Markedness explains grammars (e.g., rules)":
    informal commentary about grammar, vs.
  • "Markedness IS grammar": markedness-grammars
    formally determine languages

11
Structure. Formal Result: Formalizing
Markedness, Two Problems
  • Problem 1: Multidimensional integration
  • Each dimension of linguistic structure
    independently has its own marked pole, but how do
    these dimensions combine?
  • Turns out to be related to another fundamental
    problem

12
Structure. Formal Result: Formalizing
Markedness, Two Problems
  • α is marked ⇒ Avoid α
  • But when and how does avoidance happen? Problem
    2: Pervasive variability in avoidance
  • Inventories: If a segment is absent in French because
    it is marked, how can it be present in English
    despite being marked?
  • The grammar of every language turns on or off
    "No α" (*α), a markedness constraint. OT: a more
    subtle version that also solves
  • Alternations: If in environment E, α → β because
    α is more marked than β, how do we explain that
    in E′, α ↛ β even though α is more marked than
    β?

13
Structure. Formal Result: Formalizing Markedness
  • Most crudely: Why aren't marked elements always
    avoided?
  • Something must oppose markedness forces.
  • Markedness cannot be the sole basis of a formal
    grammatical theory: it is only one half of the
    complete story.

14
Structure. Formal Result: The Great Dialectic
  • Phonological representations serve two masters:

FAITHFULNESS
MARKEDNESS
Locked in eternal conflict
15
Structure. Formal Result: The Core Constraints
of Con
  • MARKEDNESS: *α (minimize effort; maximize
    distinctiveness)
  • constraint *α ∈ Con ⇔ α meets empirical
    criteria for "marked"
  • Freedom? Empirically constrained by universal
    patterns
  • FAITHFULNESS (be this invariant form)
  • /input/ → [output] is the identity map, i.e.,
  • elements /x/ and [x] are in one-to-one
    correspondence and identical (McCarthy & Prince
    95)
  • Constraints: MAX(x), DEP(x), IDENT(x), …
  • Essentially determined by elements x of
    representation
  • Freedom? Representations, as always, are empirically
    constrained to allow statement of markedness
    constraints

"In OT you can invent any constraint you want"
16
Structure. Formal Result: Conflict
  • Dialectic: MARK vs. FAITH conflict
  • Why aren't marked elements always avoided?
  • Because sometimes MARK is overruled by FAITH
  • Why aren't words always pronounced in their
    invariant, lexical form?
  • Because sometimes FAITH is overruled by MARK
  • ℂ1 overrules (dominates) ℂ2: ℂ1 ≫ ℂ2
  • Whether M gets violated (whether marked elements
    fail to be avoided) varies by
  • Language (in some, M ≫ F; in others, F ≫ M)
  • Context (in some, M ≫ F2; in others, F1 ≫ M)

17
Structure. Formal Result: Conflict
  • Dialectic: MARK vs. FAITH conflict
  • Whether M gets violated (whether marked elements
    fail to be avoided) varies by
  • Language (in some, M ≫ F; in others, F ≫ M)
  • Context (in some, M ≫ F2; in others, F1 ≫ M)
  • Why is there cross-linguistic variation?
  • The Phonetic vs. Lexical (MARK vs. FAITH) dialectic
    gets resolved differently
  • Typology by re-ranking: Factorial Typology
  • possible human languages ↔ rankings of Con
  • (n constraints give n! rankings; many are
    equivalent)
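
The re-ranking idea can be tried out on a toy case. The following is a minimal sketch, assuming a three-constraint Con (NOCODA, MAX, DEP) and three hand-picked candidates for the input /CVC/; the candidate set and the violation counts are illustrative assumptions, not material from the talk. It enumerates all n! rankings and groups them by the output they select.

```python
from itertools import permutations

# Hypothetical violation profiles for three candidate outputs of /CVC/.
CANDIDATES = {
    "CVC":   {"NOCODA": 1, "MAX": 0, "DEP": 0},  # faithful, keeps the coda
    "CV":    {"NOCODA": 0, "MAX": 1, "DEP": 0},  # deletes the final C
    "CV.CV": {"NOCODA": 0, "MAX": 0, "DEP": 1},  # epenthesizes a vowel
}
CON = ("NOCODA", "MAX", "DEP")


def optimal(ranking):
    """Return the candidate whose violation vector, read in ranking order,
    is lexicographically smallest (strict domination)."""
    return min(CANDIDATES, key=lambda c: tuple(CANDIDATES[c][k] for k in ranking))


def factorial_typology():
    """Map each predicted output to the list of rankings that select it."""
    typology = {}
    for ranking in permutations(CON):
        typology.setdefault(optimal(ranking), []).append(ranking)
    return typology


if __name__ == "__main__":
    for output, rankings in factorial_typology().items():
        print(output, [" >> ".join(r) for r in rankings])
```

In this toy setting the six rankings collapse into three distinct patterns, one per candidate, which illustrates the remark that many rankings are equivalent.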

18
Structure. Formal Result: Formalizing Markedness
  • Problem 2: Avoidance of the marked is
    pervasively variable: exactly where does marked
    material appear?
  • Solution: Constraint ranking of MARK w.r.t.
    FAITH
  • Will now see this also solves
  • Problem 1: Multidimensional markedness
  • Solution: a single constraint ranking for all
    constraints in a given language

19
Structure. Formal Result: Formalizing Markedness
  • Markedness is multidimensional
  • Each dimension has its universally marked pole
  • How do dimensions combine? (✓M1, *M2) vs. (*M1,
    ✓M2)
  • CV́C.CV (✓STRESSHEAVY, *MAINSTRESSRIGHT) vs.
    CVC.CV́
  • Integrate via a common markedness currency:
    Harmony
  • Numerical: M1 −3.2, M2 −2.8
  • Symbolic: M1 absolutely worse than M2
  • OT
  • For a given language, there is a single
    constraint ranking for all constraints
  • Strict domination hierarchy: markedness on
    higher-ranked constraints can never be
    compensated for by unmarkedness on lower-ranked
    ones

20
Structure. Formal Result: Competition for
Optimality
  • Given an input, an OT grammar does not provide a
    procedure for how to construct the output but
    rather a description of the output: the structure
    that best satisfies the constraint ranking
  • "Best-satisfies" is a comparative criterion:
    outputs compete and the grammar identifies the
    winner, the optimal (= grammatical = highest-
    Harmony) output for that input

21
Structure. Formal Result: Harmonic Competition
  • Numerical Harmony
  • Stress is on the initial heavy syllable iff the
    number of light syllables n obeys
    [inequality on n; figure not preserved]

Pathological grammars
  • Grammars can't count
22
Structure. Formal Result: Harmonic Competition
  • Symbolic Harmony: Strict domination
  • STRESSHEAVY ≫ MAINSTRESSRIGHT

Stress the initial heavy syllable
  • MAINSTRESSRIGHT ≫ STRESSHEAVY

Stress the final syllable
  • Strict domination ⇒ Grammars can't count
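
The contrast between numerical and symbolic Harmony can be sketched in a few lines. This is a rough illustration under stated assumptions: words have the shape heavy syllable plus n light syllables (n ≥ 1), MAINSTRESSRIGHT is violated once per syllable separating the stress from the right edge, and the weights 3.2 and 1.0 are hypothetical values chosen for the demonstration.

```python
STRESS_OPTIONS = ("initial", "final")


def violations(stress, n_light):
    """Assumed violation counts for a word H L^n (n_light >= 1)."""
    if stress == "initial":
        # Stress on the heavy syllable, n_light syllables away from the right edge.
        return {"STRESSHEAVY": 0, "MAINSTRESSRIGHT": n_light}
    # Final stress leaves the heavy syllable unstressed.
    return {"STRESSHEAVY": 1, "MAINSTRESSRIGHT": 0}


def weighted_winner(weights, n_light):
    """Numerical Harmony: H = -(weighted sum of violations); highest H wins."""
    def harmony(stress):
        marks = violations(stress, n_light)
        return -sum(weights[c] * marks[c] for c in weights)
    return max(STRESS_OPTIONS, key=harmony)


def strict_winner(ranking, n_light):
    """Strict domination: compare violation vectors lexicographically."""
    def profile(stress):
        marks = violations(stress, n_light)
        return tuple(marks[c] for c in ranking)
    return min(STRESS_OPTIONS, key=profile)


if __name__ == "__main__":
    weights = {"STRESSHEAVY": 3.2, "MAINSTRESSRIGHT": 1.0}  # hypothetical
    for n in range(1, 7):
        print(n,
              weighted_winner(weights, n),
              strict_winner(("STRESSHEAVY", "MAINSTRESSRIGHT"), n),
              strict_winner(("MAINSTRESSRIGHT", "STRESSHEAVY"), n))
```

With these weights the numerical grammar flips from initial to final stress once n exceeds 3, so it effectively counts syllables, whereas either strict ranking gives the same answer for every n.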

23
Structure. Formal Result: OT, Formal definition
  • Gen: Specifies candidate outputs for any given
    input
  • Con: The constraint set
  • A grammar: A hierarchical ranking of Con
  • H-Eval: Given two candidates and a ranking, a
    formal definition (employing strict domination) of
    which has higher Harmony, i.e., which better
    satisfies the ranking
  • I → O mapping: I ↦ the maximal-Harmony
    candidates in Gen(I)
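
The four components above can be written down as a small sketch. The names `h_eval`, `optimal_outputs`, `toy_gen` and `TOY_CON` are hypothetical, and the two-candidate Gen and two-constraint Con are placeholders assumed only to make the pairwise comparison concrete; a real Gen is far richer.

```python
from typing import Callable, Dict, List, Sequence

Violations = Dict[str, int]


def h_eval(a: Violations, b: Violations, ranking: Sequence[str]) -> int:
    """Return 1 if a has higher Harmony than b, -1 if lower, 0 if tied."""
    for constraint in ranking:  # walk down the hierarchy
        if a[constraint] != b[constraint]:
            # The highest-ranked constraint that distinguishes them decides.
            return 1 if a[constraint] < b[constraint] else -1
    return 0


def optimal_outputs(inp: str,
                    gen: Callable[[str], List[str]],
                    con: Dict[str, Callable[[str, str], int]],
                    ranking: Sequence[str]) -> List[str]:
    """Return every candidate in gen(inp) that no other candidate beats."""
    cands = gen(inp)
    profiles = {c: {name: f(inp, c) for name, f in con.items()} for c in cands}
    return [c for c in cands
            if all(h_eval(profiles[c], profiles[d], ranking) >= 0 for d in cands)]


def toy_gen(inp: str) -> List[str]:
    """Assumed candidate set: the faithful form and final-segment deletion."""
    return [inp, inp[:-1]]


TOY_CON = {
    "NOCODA": lambda inp, out: int(out.endswith("C")),  # word-final C only
    "MAX": lambda inp, out: len(inp) - len(out),        # deleted segments
}

if __name__ == "__main__":
    print(optimal_outputs("CVC", toy_gen, TOY_CON, ["NOCODA", "MAX"]))
    print(optimal_outputs("CVC", toy_gen, TOY_CON, ["MAX", "NOCODA"]))
```

Returning a list rather than a single form reflects the wording "the maximal-Harmony candidates": ties are possible in principle.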

24
Structure. Formal Result: Richness of the Base
  • Universality: All systematic cross-linguistic
    variation arises from differences in constraint
    ranking
  • Therefore:
  • Con is universal; H-Eval is universal
  • Gen is universal, including the space of possible
    inputs as well as possible outputs
  • i.e.: No systematic cross-linguistic variation is
    due to differences in inputs
  • e.g.: Languages with no surface codas cannot get
    this property from limitations on the lexicon
    (e.g., a morpheme structure constraint *C]wd)
    but rather from the ranking
  • i.e.: The grammar must have the property that
    even if there were C-final inputs, there would
    still be no surface codas
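
The coda example can be checked mechanically on a toy grammar. This is a sketch under several assumptions that are not in the talk: Gen only deletes segments, a consonant counts as a coda when it is not immediately followed by a vowel, and the "rich base" is a hypothetical list of four inputs that includes C-final ones.

```python
from itertools import combinations


def gen(inp):
    """Deletion-only Gen (an assumption): every subsequence of the input."""
    outputs = set()
    for keep in range(len(inp) + 1):
        for positions in combinations(range(len(inp)), keep):
            outputs.add("".join(inp[i] for i in positions))
    return outputs


def nocoda(out):
    """Count consonants not immediately followed by a vowel (treated as codas)."""
    return sum(1 for i, seg in enumerate(out)
               if seg == "C" and (i + 1 == len(out) or out[i + 1] != "V"))


def surface(inp, ranking):
    """Optimal output for inp under the given ranking of NOCODA and MAX."""
    marks = {"NOCODA": nocoda, "MAX": lambda out: len(inp) - len(out)}
    return min(sorted(gen(inp)),
               key=lambda out: tuple(marks[c](out) for c in ranking))


RICH_BASE = ["CV", "CVC", "CVCCV", "CVCCVC"]  # hypothetical inputs

if __name__ == "__main__":
    for inp in RICH_BASE:
        print(inp,
              surface(inp, ("NOCODA", "MAX")),
              surface(inp, ("MAX", "NOCODA")))
```

Under NOCODA ≫ MAX every input, including the C-final ones, surfaces without a coda, so the absence of codas follows from the ranking and not from any restriction on the inputs.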

25
Aside
  • Richness of the Base is a principle for inducing
    a grammar (generalizing) from a set of
    grammatical items
  • It can be justified by the central principle of
    John Goldsmith's presentation:
  • Maximize the probability of the data

26
Structure. Conceptual Question: Explanatory
Power
  • "OT is as unexplanatory as extrinsically ordered
    rule theory"
  • Stipulating ranking = stipulating ordering

27
Structure. Conceptual Question: Analytic
Restrictiveness
  • "You can make up any constraint you want in OT"

28
Structure. Explanatory Goal: Consequences of
ℂ ∈ Con, I: The Subordination Pattern
  • E.g., ℂ = NOCODA
  • Recall:
  • If "No codas" is in UG, why do codas ever appear?
  • Conflict
  • With faithfulness constraints
  • With other markedness constraints (other
    dimensions of markedness)
  • Cross-linguistic variation: codas are less and
    less restricted as NOCODA is subordinated to more
    and more conflicting constraints (i.e.,
    dimensions of markedness)

29
Structure. Empirical Application: Subordination
Pattern, Codas
NOCODA
No codas at all
Codas only in stressed syllables
Geminate codas
Codas unrestricted, except prohibited inter-vocalically (V.CV)
30
Structure. Conceptual Question: Multiplicity
of Constraints
  • For the second pervasive pattern generated by
    ℂ ∈ Con
  • "Any framework which leads to the morass of
    constraints found in OT analyses in phonology
    cannot possibly be explanatorily adequate."

31
Structure. Explanatory Goal: Consequences of
ℂ ∈ Con, II: Factorial Interaction
  • Factorial interaction: with varying interaction
    (re-ranking), n simple modular constraints
    correspond to
  • Multiplicity of rules (many more than n)
  • Complex, non-modular rules
  • Rules + representational/notational tricks
  • Rules + constraints
  • E.g., ℂ = NOCODA

32
Structure. Empirical Application: Factorial
Interaction, Codas
  • Consider Con ∋ MAX → MAX, DEP
  • Number of constraints increases by 1
  • Number of corresponding rules doubles, as the set of
    repairs now includes epenthesis as well as
    deletion
  • NOCODA ≫ MAX: C→Ø/__]σ
  • NOCODA ≫ DEP: Ø→V/C__]σ
  • ONSET ≫ MAX: V→Ø/σ[__
  • ONSET ≫ DEP: Ø→C/σ[__V

33
Structure. Empirical Application: Factorial
Interaction, Codas
In general, the number of comparable rules
increases much faster than the number of
constraints
35
Structure. Empirical Application: Factorial
Interaction, Codas
  • STRESS-TO-WEIGHT ≫ NOCODA
  • Codas only in stressed syllables
  • C→Ø/__]σ, σ unstressed: segmental rule sensitive
    to foot structure (non-modular rule)
  • ANCHOR-R ≫ NOCODA
  • Codas only word-finally
  • C→Ø/__]σ plus final-C extrametricality
    (representational trick)
  • MAXµ ≫ NOCODA
  • Only geminate codas /Cµ/
  • C→Ø/__]σ plus Hayes' exclusivity of
    association (notational trick)

36
Structure. Empirical Application: Factorial
Interaction
  • STRESS-TO-WEIGHT ≫ NOCODA: Codas only in stressed
    syllables
  • STRESS-TO-WEIGHT ≫ *Cµ
  • Geminates only after stressed V
  • µ→Ø/σ, σ unstressed
  • ANCHOR-R ≫ NOCODA: Codas only word-finally
  • ANCHOR-R ≫ *[+voi, −son]
  • Obstruent devoicing except word-finally
  • +voi→−voi/[__, −son] plus ?? to block
    word-finally
  • MAXµ ≫ NOCODA: Only geminate codas /Cµ/
  • MAXµ ≫ WEIGHT-TO-STRESS
  • Geminates are the only codas in unstressed
    syllables
  • C→Ø/__]σ, σ unstressed, plus exclusivity of
    association

37
Structure. Jakobson's Program: Markedness,
Faithfulness, Harmony
  • In summary:
  • Jakobson's key insight concerning linguistic
    structure: the central organizing principle of
    grammar is Minimize Markedness
  • OT formalizes this as Maximize Harmony
  • OT formalizes Markedness via violable constraints
  • OT adds the crucial notion of Faithfulness: the
    other (lexical) half of the phonological
    dialectic
  • OT Harmony combines Markedness with Faithfulness;
    their conflict is adjudicated via ranking
  • Ranking unifies multiple dimensions of markedness

38
Structure. Summary
  • OT achieves the explanatory goals of
  • Changing the epiphenomenal status of markedness
    in grammatical theory: markedness is now in
    grammar, not about grammar
  • A strongly universalist formalism exhibiting
    Inherent Typology
  • Robust falsifiability

39
Responsibilities of Grammatical Theory
Chomsky's "Big 4" questions concerning knowledge
of grammar
OT
Structure
Acquisition
Processing
Neuro-genetics
40
Acquisition. Formal Result I: Learning Theory
  • Learning algorithm
  • Provably correct and efficient (when part of a
    general decomposition of the grammar learning
    problem)
  • Sources:
  • Tesar 1995 et seq.
  • Tesar & Smolensky 1993, …, 2000
  • See … for how to exploit the analogy to weighted
    OT (Goldsmith, today)
  • If you hear A when you expected to hear E,
    increase the Harmony of A above that of E by
    minimally demoting each constraint violated by A
    below a constraint violated by E

41
Acquisition. Formal Result I: Constraint
Demotion Algorithm
If you hear A when you expected to hear E,
increase the Harmony of A above that of E by
minimally demoting each constraint violated by A
below a constraint violated by E
Correctly handles the difficult case: multiple
violations in E
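
One way to read the demotion step is the sketch below. It assumes the stratified-hierarchy formulation (a list of sets of constraints, highest stratum first) and mark cancellation before demotion; the function name and the example constraints are hypothetical, and this is an illustration of the stated rule, not a reproduction of the cited implementation.

```python
def constraint_demotion(strata, winner, loser):
    """Return a new stratified hierarchy in which `winner` beats `loser`.

    strata: list of sets of constraint names, highest-ranked stratum first.
    winner: violations of the heard form A (constraint name -> count).
    loser:  violations of the expected form E (constraint name -> count).
    """
    constraints = set().union(*strata)
    # Mark cancellation: only constraints on which the two forms differ matter.
    winner_marks = {c for c in constraints if winner.get(c, 0) > loser.get(c, 0)}
    loser_marks = {c for c in constraints if loser.get(c, 0) > winner.get(c, 0)}
    if not loser_marks:
        raise ValueError("no ranking can make the winner more harmonic")
    rank = {c: i for i, stratum in enumerate(strata) for c in stratum}
    pivot = min(rank[c] for c in loser_marks)  # highest-ranked loser mark
    new = [set(stratum) for stratum in strata]
    if pivot + 1 == len(new):
        new.append(set())
    for c in winner_marks:
        if rank[c] <= pivot:                   # not yet dominated by the pivot
            new[rank[c]].discard(c)
            new[pivot + 1].add(c)              # minimal demotion: just below it
    return [stratum for stratum in new if stratum]


if __name__ == "__main__":
    # Initial state M >> F; the learner hears [CVC] but expected [CV].
    print(constraint_demotion([{"NOCODA"}, {"MAX"}], {"NOCODA": 1}, {"MAX": 1}))
```

When E violates several constraints, the winner's constraints are placed just below the highest-ranked of them, which is what keeps the demotion minimal.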
42
Acquisition. Conceptual Question: Large
Grammar Space
  • "Huge number of grammars: OT is too
    unrestrictive"

43
Acquisition. Formal Result II: Learnability,
the Initial State
  • M ≫ F is learnable with /in+possible/ → impossible
  • "not" = in- except when followed by …
  • "exception that proves the rule": M = NPA
  • M ≫ F is not learnable from data if there are no
    "exceptions" (alternations) of this sort, e.g.,
    if there are no affixes and all underlying morphemes
    have mp: ✓M and ✓F, so no M vs. F conflict and no
    evidence for their ranking
  • Thus M ≫ F must hold in the initial state, H0

44
Acquisition. Empirical Application: Initial
State, Experimental Test
  • Collaborators
  • Peter Jusczyk
  • Theresa Allocco
  • (Elliott Moreton, Karen Arnold)
  • Here, only a thumbnail sketch (more in the OT
    Workshop Thursday)

45
Acquisition. Empirical Application: Initial
State, Experimental Test
  • Linking hypothesis:
  • More harmonic phonological stimuli ⇒ Longer
    listening time
  • More harmonic:
  • ✓M ≻ *M, when equal on F
  • ✓F ≻ *F, when equal on M
  • When one must choose one or the other, it is more
    harmonic to satisfy M: M ≫ F
  • M = Nasal Place Assimilation (NPA)

46
Acquisition. Empirical Application: 4.5 Months (NPA)
47
Acquisition. Empirical Application: 4.5 Months (NPA)
48
Acquisition. Empirical Application: 4.5 Months (NPA)
49
Acquisition. Empirical Application: 4.5 Months (NPA)
50
Acquisition. Jakobson's Program: Markedness and
Distance from Initial State
  • X is universally more marked than Y
  • In addition to the constraints M1, M2, …, Mk
    violated by Y, X also violates markedness
    constraints M′1, M′2, …, M′n
  • Y will be acquired (become admitted into the
    child's inventory) after M1, M2, …, Mk are all
    demoted below relevant faithfulness constraints
  • These demotions are all necessary for X to be
    acquired, and additional demotions of M′1, M′2,
    …, M′n are also required
  • X will require more time to be acquired

51
Responsibilities of Grammatical Theory
Chomsky's "Big 4" questions concerning knowledge
of grammar
OT
Structure
Nativist hypothesis
Acquisition
Processing
Neuro-genetics
52
Processing. Formal Results: Context-Free Parsing
Algorithm
  • Theorem (Tesar 1994, 1995a, b, 1996). Suppose
  • Gen: parses a string of input symbols into
    structures specified via a context-free grammar
  • Con: constraints meet a tree-locality condition
    and penalize empty structure
  • Then a given dynamic programming algorithm is
  • Left-to-right
  • General (any such Gen, Con)
  • Guaranteed to find the optimal outputs
  • As efficient as parsers for conventional
    context-free grammars.

53
Processing. Formal Results: Finite-State Parsing
Algorithm
  • Theorem (Ellison 1994). Suppose
  • Gen(I) is representable as a (non-deterministic)
    finite-state transducer (particular to I) mapping
    the input string to a set of output candidates
  • Con: constraints are reducible to
    multiply-violable binary constraints, each
    representable as a finite-state transducer
    mapping an output candidate to a sequence of
    violation marks
  • Then composing the Gen(I) and rank-sequenced
    constraint transducers yields a transducer that
  • Directly maps I to its optimal outputs
  • Can be efficiently pruned by dynamic programming

54
Processing. Formal Results: Complexity of
Violable Constraints
  • Theorem (Frank and Satta 1998). Suppose
  • Gen is representable as a (non-deterministic)
    finite-state transducer mapping an input string
    to a set of output candidates
  • Con: the set of structures incurring n violations
    of each constraint is generable by a finite-state
    machine, and n can be finitely bounded for each
    constraint
  • Then the mapping from inputs to optimal outputs
    has the complexity of a finite-state transducer.
  • Theorem (Hiller 1996, Smolensky 1997).
  • If n is unbounded, there are (extremely simple) OT
    grammars with greater computational complexity.

55
Processing. Conceptual Question: Processing
(Symbolic) Theory
  • "Infinite candidate set ⇒ uncomputable"

56
Processing. Empirical Application: Sentence
Processing
  • Because an OT grammar assigns a parse to any
    input, no additional principles (e.g., parsing
    heuristics) are needed for parsing the initial,
    incomplete segment of a sentence
  • Linking hypothesis
  • Processing difficulty arises when previously
    established structure needs to be abandoned in
    the face of further input

57
Processing. Empirical Application: PP Attachment
The servant of the actress who … (Cuetos &
Mitchell 88)
Assuming who is ambiguous for Case:
Violates NOM, LOCALITY2
Violates NOM, AGRCASE
Violates GEN
  • LOCALITY: If XP c-commands YP, then XP precedes
    YP.
  • AGRCASE: A relative pronoun must agree in Case
    with the modified NP.
  • CASE: GEN ≫ DAT ≫ ACC ≫ NOM (universal)

58
Processing. Empirical Application: PP Attachment
The servant of the actress who … (Cuetos &
Mitchell 88)
  • If GEN, AGRCASE ≫ LOCALITY2, then
    attach high
  • If LOCALITY2 ≫ GEN or AGRCASE, then
    attach low

59
Processing. Empirical Application: PP Attachment
  • Preliminary result: A cross-linguistic typology
    of PP attachment patterns (across differences in
    case and embedding depth)
  • Empirically promising, but not perfect
  • Unclear yet how rankings determining parsing
    preferences relate to rankings in the pure
    competence grammar

60
Processing. Jakobson's Program: Processing and
Markedness
  • Phonological analogy: Incrementally parse CVC…
  • /C/ → C
  • /CV/ → CV
  • /CVC/ → CV.C
  • Now expect a V: if we get it, no reanalysis
  • But if we get a C, need reanalysis ⇒ difficulty
  • /CVCC/ → CVC.C
  • Processing marked material (coda C) creates
    difficulty because it is initially analyzed as
    unmarked (as an onset)
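
The linking hypothesis in this analogy can be sketched as counting revisions between the parses of successive prefixes. The role assignment below is a hand-written stand-in for the optimal parse, assumed for illustration: a prefix-final C is taken to be an onset, and a C followed by another C is a coda. The function names are hypothetical and only C/V strings beginning with CV are intended as inputs.

```python
def parse(prefix):
    """Assign a syllable role to each segment of a C/V prefix (assumed rule)."""
    roles = []
    for i, seg in enumerate(prefix):
        if seg == "V":
            roles.append("nucleus")
        elif i + 1 == len(prefix) or prefix[i + 1] == "V":
            # Unmarked analysis: a C that is prefix-final or prevocalic is an onset.
            roles.append("onset")
        else:
            roles.append("coda")
    return roles


def reanalyses(word):
    """Return (prefix_length, position) pairs where an earlier role is revised."""
    revisions = []
    previous = []
    for end in range(1, len(word) + 1):
        current = parse(word[:end])
        for pos, old_role in enumerate(previous):
            if current[pos] != old_role:
                revisions.append((end, pos))
        previous = current
    return revisions


if __name__ == "__main__":
    print(reanalyses("CVCV"))  # no earlier role has to change
    print(reanalyses("CVCC"))  # the third segment is revised from onset to coda
```

On this sketch, difficulty is predicted exactly where the list of revisions is non-empty, i.e., when the fourth segment turns out to be a C.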

61
Processing. Conceptual Question: Processing
(Symbolic) Theory
  • "OT not psychologically plausible"

62
Responsibilities of Grammatical Theory
Chomsky's "Big 4" questions concerning knowledge
of grammar
OT
Structure
Nativist hypothesis
Acquisition
Processing
Neuro-genetics
63
Neuro-genetics. Formal Results: Neural
Representations (Gen)
64
OT and Connectionism
  • OT derives from the numerical formalism, derived
    from connectionist Harmony maximization, of
    Harmonic Grammar (Legendre, Miyata & Smolensky,
    1990)

65
Neuro-genetics. Formal Results: Neural
Constraints (Con)
NOCODA: A syllable has no coda
H(a[σ k æ t]) = −s_NOCODA < 0
66
Neuro-genetics. Formal Results: UGenome for CV
Theory
  • The game: take a first shot at a concrete example
    of a genetic encoding of UG in a Language
    Acquisition Device
  • Proteins ↔ Universal grammatical principles?
  • Case study: Basic CV Syllable Theory
  • Introduce an "abstract genome" notion parallel to
    (and encoding) the "abstract neural network"
  • Collaborators
  • Melanie Soderstrom
  • Donald Mathis

67
Neuro-genetics. Formal Results: Network
Architecture
  • /C1 C2/ → [C1 V C2]

68
Neuro-genetics. Formal Results: PARSE
  • All connection coefficients are 2

69
Neuro-genetics. Formal Results: ONSET
  • All connection coefficients are −1

70
Neuro-genetics. Formal Results: Connectivity
geometry
  • Assume 3-d grid geometry

71
Neuro-genetics. Formal Results: Constraint PARSE
  • Input units grow south and connect
  • Output units grow east and connect
  • Correspondence units grow north & west and
    connect with input & output units.

72
Neuro-genetics. Formal Results: Connectivity
Genome
  • Contributions from ONSET and PARSE
  • Key

73
Neuro-genetics. Formal Results: Processing
74
Neuro-genetics. Formal Results: Learning
75
Neuro-genetics. Formal Results: Learning Behavior
  • A simplified system can be solved analytically
  • Learning algorithm turns out to
  • Δs_i(?) = ε [violations of constraint_i P?]

76
Conclusion
  • OT is enabling progress on several explanatory
    goals for linguistic theory
  • Inherent typology
  • General learning theory
  • General processing theory
  • General biological realization

Often, OT formalizes Jakobson's program
Thank you for your attention