Optimality in Cognition and Grammar

1
Optimality in Cognition and Grammar
  • Paul Smolensky
  • Cognitive Science Department, Johns Hopkins
    University
  • Plan of lectures
  • Cognitive architecture: Symbols & optimization
    in neural networks
  • Optimization in grammar: HG → OT. From numerical
    to algebraic optimization in grammar
  • OT and nativism: The initial state &
    neural/genomic encoding of UG

2
The ICS Hypothesis
  • The Integrated Connectionist/Symbolic Cognitive
    Architecture (ICS)
  • In higher cognitive domains, representations and
    functions are well approximated by symbolic
    computation
  • The Connectionist Hypothesis is correct
  • Thus, cognitive theory must supply a
    computational reduction of symbolic functions to
    PDP computation

3
Levels
4
The ICS Architecture
5
Representation
6
Tensor Product Representations
  • Representations

[Figure: panel labeled "Depth 0"]
7
Local tree realizations
  • Representations

8
The ICS Isomorphism
Tensor product representations ⇔ Tensorial networks
9
Tensor Product Representations
10
Binding by Synchrony ?
[Figure: unit activity over time for give(John, book, Mary) (Shastri & Ajjanagadde 1993)]
  • $s = r_1 \otimes [f_{book} + f_{give\text{-}obj}] + r_3 \otimes [f_{Mary} + f_{recipient}] + r_2 \otimes [f_{giver} + f_{John}]$

[Tesar & Smolensky 1994]
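
A minimal sketch of tensor product binding for give(John, book, Mary), assuming NumPy, random illustrative filler vectors, and orthonormal role vectors r1, r2, r3. The vector sizes and the `unbind` helper are hypothetical and not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical filler vectors (random values, for illustration only).
names = ["book", "give-obj", "Mary", "recipient", "giver", "John"]
fillers = {name: rng.normal(size=4) for name in names}

# Three orthonormal role vectors (assumed: rows of the 3x3 identity).
r1, r2, r3 = np.eye(3)

# s = r1 (x) [f_book + f_give-obj] + r3 (x) [f_Mary + f_recipient]
#   + r2 (x) [f_giver + f_John], with (x) realized as an outer product.
s = (np.outer(r1, fillers["book"] + fillers["give-obj"])
     + np.outer(r3, fillers["Mary"] + fillers["recipient"])
     + np.outer(r2, fillers["giver"] + fillers["John"]))

def unbind(structure, role):
    """Recover the filler bound to `role`; exact when roles are orthonormal."""
    return role @ structure

recovered = unbind(s, r1)  # equals f_book + f_give-obj
```

With orthonormal roles, unbinding is a single inner product, so each role's superposed fillers are recovered without interference from the other bindings.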
11
The ICS Architecture
12
Two Fundamental Questions
Harmony maximization is satisfaction of
parallel, violable constraints
  • 2. What are the constraints?
  • Knowledge representation
  • Prior question:
  • 1. What are the activation patterns (data
    structures, mental representations) evaluated
    by these constraints?

13
Representation
14
Two Fundamental Questions
Harmony maximization is satisfaction of
parallel, violable constraints
  • 2. What are the constraints?
  • Knowledge representation
  • Prior question:
  • 1. What are the activation patterns (data
    structures, mental representations) evaluated
    by these constraints?

15
Constraints
NOCODA: A syllable has no coda [Maori/French/English]
$H(\mathbf{a}_{[\sigma\, \text{kæt}]}) = -s_{\text{NOCODA}} < 0$
16
The ICS Architecture
[Figure: ICS architecture diagram; labels: kæt, σ kæt, A]
17
The ICS Architecture
[Figure: ICS architecture diagram; labels: kæt, σ kæt, A]
18
Constraint Interaction I
  • ICS ⇒ Grammatical theory
  • Harmonic Grammar
  • Legendre, Miyata, & Smolensky 1990 et seq.

19
Constraint Interaction I
The grammar generates the representation that
maximizes H: this best-satisfies the constraints,
given their differential strengths.
Any formal language can be so generated.
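
A minimal sketch of Harmony maximization over weighted, violable constraints. It assumes Harmony is the negative strength-weighted sum of violations; the candidate set, the FAITH constraint, and the numerical strengths are hypothetical choices for illustration.

```python
def harmony(violations, strengths):
    """H = -(sum over constraints of strength * number of violations)."""
    return -sum(strengths[c] * n for c, n in violations.items())

# Hypothetical strengths and candidate violation profiles.
strengths = {"NOCODA": 1.0, "FAITH": 3.0}
candidates = {
    "kæt": {"NOCODA": 1, "FAITH": 0},  # keeps the coda
    "kæ": {"NOCODA": 0, "FAITH": 1},   # drops the final consonant
}

def optimal(candidates, strengths):
    """Return the candidate with maximal Harmony."""
    return max(candidates, key=lambda c: harmony(candidates[c], strengths))

winner = optimal(candidates, strengths)
```

Swapping the two strengths reverses the winner, which is how differential strength decides between conflicting constraints.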
20
The ICS Architecture
[Figure: ICS architecture diagram; labels: G, kæt, σ kæt, A]
21
Harmonic Grammar Parser
  • Simple, comprehensible network
  • Simple grammar G
  • X → A B, Y → B A
  • Language

Processing: Completion
22
The ICS Architecture
23
Simple Network Parser
  • Fully self-connected, symmetric network
  • Like previously shown network

Except with 12 units; representations and
connections shown below
24
Harmonic Grammar Parser
H(Y, A) > 0; H(Y, B) > 0
  • Weight matrix for Y → B A

25
Harmonic Grammar Parser
  • Weight matrix for X → A B

26
Harmonic Grammar Parser
  • Weight matrix for entire grammar G
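
A toy sketch of how per-rule weight matrices might combine into one grammar matrix. It assumes Harmony H(a) = ½ aᵀWa for a symmetric network and that the grammar's matrix is the sum of the rule matrices. The six-unit local encoding below is hypothetical and much simpler than the 12-unit network on the slides.

```python
import numpy as np

# Hypothetical local encoding: one unit per root symbol and per
# (child symbol, position) pair.
UNITS = ["X", "Y", "A_left", "B_left", "A_right", "B_right"]
IDX = {u: i for i, u in enumerate(UNITS)}

def rule_weights(parent, left, right, w=1.0):
    """Symmetric weights rewarding co-activation of a parent with its children."""
    W = np.zeros((len(UNITS), len(UNITS)))
    for child in (f"{left}_left", f"{right}_right"):
        i, j = IDX[parent], IDX[child]
        W[i, j] = W[j, i] = w
    return W

W_X = rule_weights("X", "A", "B")  # X -> A B
W_Y = rule_weights("Y", "B", "A")  # Y -> B A
W_G = W_X + W_Y                    # assumed: grammar = sum of its rules

def harmony(a, W):
    return 0.5 * a @ W @ a

def state(*active):
    a = np.zeros(len(UNITS))
    for u in active:
        a[IDX[u]] = 1.0
    return a

def best_root(left, right):
    """Bottom-up completion: pick the root that maximizes Harmony."""
    scores = {root: harmony(state(root, f"{left}_left", f"{right}_right"), W_G)
              for root in ("X", "Y")}
    return max(scores, key=scores.get)
```

Given the children A B, the X root scores higher Harmony than Y; given B A, the preference reverses.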

27
Bottom-up Processing
28
Top-down Processing
29
Scaling up
  • Not yet
  • Still conceptual obstacles to surmount

30
Explaining Productivity
  • Approaching full-scale parsing of formal
    languages by neural-network Harmony maximization
  • Have other networks (like PassiveNet) that
    provably compute recursive functions
  • ⇒ productive competence
  • How to explain?

31
1. Structured representations
32
2. Structured connections
33
Proof of Productivity
  • Productive behavior follows mathematically from
    combining:
  • the combinatorial structure of the vectorial
    representations encoding inputs & outputs
  • and
  • the combinatorial structure of the weight
    matrices encoding knowledge

34
Explaining Productivity I
Intra-level decomposition (PSA & ICS): [A B] → {A, B}
Inter-level decomposition (ICS): [A B] → {1, 0, −1, …, 1}
35
Explaining Productivity II
Functions & Semantics
Intra-level decomposition (ICS & PSA): G → {X→AB, Y→BA}
Inter-level decomposition: W(G) → {1, 0, −1, 0}
36
The ICS Architecture
37
The ICS Architecture
38
Constraint Interaction II OT
  • ICS ⇒ Grammatical theory
  • Optimality Theory
  • Prince & Smolensky 1991, 1993/2004

39
Constraint Interaction II OT
  • Differential strength encoded in strict
    domination hierarchies (≫)
  • Every constraint has complete priority over all
    lower-ranked constraints (combined)
  • Approximate numerical encoding employs special
    (exponentially growing) weights
  • "Grammars can't count"
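
A small sketch contrasting strict domination with weighted summation. It assumes strict domination can be modeled as lexicographic comparison of violation vectors; the constraint names C1 to C3, the candidates, and the weights are hypothetical.

```python
def ot_winner(candidates, ranking):
    """Strict domination: compare violation vectors lexicographically."""
    return min(candidates, key=lambda c: tuple(candidates[c][k] for k in ranking))

def exponential_weights(ranking, base):
    """Weights base**k, growing exponentially with rank (top-ranked is largest)."""
    n = len(ranking)
    return {k: base ** (n - 1 - i) for i, k in enumerate(ranking)}

def hg_winner(candidates, weights):
    """Numerical optimization: maximize H = -(weighted violation sum)."""
    return max(candidates,
               key=lambda c: -sum(weights[k] * v for k, v in candidates[c].items()))

ranking = ["C1", "C2", "C3"]  # C1 >> C2 >> C3
candidates = {
    "a": {"C1": 0, "C2": 3, "C3": 3},  # many low-ranked violations
    "b": {"C1": 1, "C2": 0, "C3": 0},  # one top-ranked violation
}

strict = ot_winner(candidates, ranking)
exponential = hg_winner(candidates, exponential_weights(ranking, base=4))
linear = hg_winner(candidates, {"C1": 3, "C2": 2, "C3": 1})
```

With base 4, larger than any violation count in this example, the exponential weights pick the same winner as the strict ranking. With linearly growing weights, the lower-ranked violations gang up and reverse the outcome.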

40
Constraint Interaction II OT
  • "Grammars can't count"
  • Stress is on the initial heavy syllable iff the
    number of light syllables n obeys

No way, man
41
Constraint Interaction II OT
  • Differential strength encoded in strict
    domination hierarchies (≫)
  • Constraints are universal (Con)
  • Candidate outputs are universal (Gen)
  • Human grammars differ only in how these
    constraints are ranked
  • "factorial typology"
  • First true contender for a formal theory of
    cross-linguistic typology
  • 1st innovation of OT: constraint ranking
  • 2nd innovation: "Faithfulness"
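
A minimal sketch of a factorial typology: every permutation of a fixed constraint set is a possible grammar, and each grammar selects a winner by strict domination. The split of Faithfulness into MAX (no deletion) and DEP (no insertion), the candidate set, and the violation counts are assumptions made for illustration.

```python
from itertools import permutations

CONSTRAINTS = ["NOCODA", "MAX", "DEP"]

# Hypothetical candidates for the input /kat/.
CANDIDATES = {
    "kat": {"NOCODA": 1, "MAX": 0, "DEP": 0},    # faithful, has a coda
    "ka": {"NOCODA": 0, "MAX": 1, "DEP": 0},     # deletes t
    "ka.ta": {"NOCODA": 0, "MAX": 0, "DEP": 1},  # inserts a vowel
}

def winner(ranking):
    """Optimal candidate under strict domination (lexicographic order)."""
    return min(CANDIDATES, key=lambda c: tuple(CANDIDATES[c][k] for k in ranking))

def factorial_typology():
    """Map each predicted output to the rankings that generate it."""
    typology = {}
    for ranking in permutations(CONSTRAINTS):
        typology.setdefault(winner(ranking), []).append(ranking)
    return typology

typology = factorial_typology()
```

Here the 3! = 6 rankings collapse into three predicted language types, one per candidate, because only the lowest-ranked constraint determines which candidate survives.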

42
The Faithfulness/Markedness Dialectic
  • cat: /kat/ → kæt violates NOCODA: why?
  • FAITHFULNESS requires pronunciation = lexical
    form
  • MARKEDNESS often opposes it
  • Markedness-Faithfulness dialectic ⇒ diversity
  • English: FAITH ≫ NOCODA
  • Polynesian: NOCODA ≫ FAITH (French)
  • Another markedness constraint M:
  • Nasal Place Agreement [Assimilation] (NPA)

ŋg ≻ ŋb, ŋd (velar)
nd ≻ md, ŋd (coronal)
mb ≻ nb, ŋb (labial)
43
The ICS Architecture
44
Optimality Theory
  • Diversity of contributions to theoretical
    linguistics
  • Phonology & phonetics
  • Syntax
  • Semantics & pragmatics
  • e.g., following lectures. Now:
  • Can strict domination be explained by
    connectionism?

45
Case study
  • Syllabification in Berber
  • Plan
  • Data, then:

OT grammar → Harmonic Grammar → Network
46
Syllabification in Berber
  • Dell & Elmedlaoui, 1985: Imdlawn Tashlhit Berber
  • Syllable nucleus can be any segment
  • But driven by universal preference for nuclei to
    be highest-sonority segments

47
Berber syllable nuclei have maximal sonority
48
OT Grammar: BrbrOT
  • HNUC: A syllable nucleus is sonorous
  • ONSET: A syllable has an onset

Strict Domination
Prince & Smolensky '93/04
49
Harmonic Grammar: BrbrHG
  • HNUC: A syllable nucleus is sonorous
  • Nucleus of sonority s: Harmony = $2^s - 1$
  • s ∈ {1, 2, …, 8} ~ {t, d, f, z, n, l, i, a}
  • ONSET: *VV: Harmony = $-2^8$
  • Theorem. The global Harmony maxima are the
    correct Berber core syllabifications
    of Dell & Elmedlaoui (no sonority plateaux, as
    in the OT analysis, here & henceforth)
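
A brute-force sketch of BrbrHG. It reads the slide's garbled formulas as Harmony 2^s − 1 for a nucleus of sonority s and −2^8 for each pair of adjacent nuclei (*VV); that reading is an assumption. The binary nucleus encoding, the `syllabify` helper, and the sonority values assigned to the example segments are also assumptions.

```python
from itertools import product

def harmony(sonority, nuclei):
    """BrbrHG Harmony of a parse; nuclei[i] == 1 marks segment i as a nucleus."""
    h = sum(2 ** s - 1 for s, v in zip(sonority, nuclei) if v)      # HNUC
    h -= 2 ** 8 * sum(a and b for a, b in zip(nuclei, nuclei[1:]))  # ONSET: *VV
    return h

def best_parse(sonority):
    """Exhaustively search all 2**n nucleus assignments for the Harmony maximum."""
    return max(product((0, 1), repeat=len(sonority)),
               key=lambda v: harmony(sonority, v))

def syllabify(segments, nuclei):
    """Render a parse, assuming each nucleus takes a preceding non-nucleus as onset."""
    breaks = set()
    for i, v in enumerate(nuclei):
        if v:
            breaks.add(i - 1 if i > 0 and not nuclei[i - 1] else i)
    return "".join(("." if i in breaks and i > 0 else "") + seg
                   for i, seg in enumerate(segments))

# /txznt/ with assumed sonorities t=1, x=3, z=4, n=5, t=1.
parse = best_parse([1, 3, 4, 5, 1])
print(syllabify("txznt", parse))
```

Under these assumptions the exhaustive search selects x and n as nuclei, giving tx.znt. Exhaustive search is only feasible for short strings, which is why the slides turn to network dynamics.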

50
BrbrNet realizes BrbrHG
51
BrbrNet's Global Harmony Maximum is the correct
parse
  • Contrasts with Goldsmith's Dynamic Linear Models
    (Goldsmith & Larson '90; Prince '93)
  • For a given input string, a state of BrbrNet is
    a global Harmony maximum if and only if it
    realizes the syllabification produced by the
    serial Dell-Elmedlaoui algorithm

52
BrbrNet's Search Dynamics
  • Greedy local optimization
  • at each moment, make a small change of state so
    as to maximally increase Harmony
  • (gradient ascent: mountain climbing in fog)
  • guaranteed to construct a local maximum
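
A discrete simplification of greedy Harmony ascent, not the network's actual continuous dynamics. It assumes binary nucleus units, Harmony 2^s − 1 per nucleus of sonority s, and −2^8 per pair of adjacent nuclei; at each step it flips the single unit that most increases Harmony and stops when no flip helps.

```python
def harmony(sonority, nuclei):
    """Assumed BrbrHG Harmony: nucleus reward minus the *VV penalty."""
    h = sum(2 ** s - 1 for s, v in zip(sonority, nuclei) if v)
    h -= 2 ** 8 * sum(a and b for a, b in zip(nuclei, nuclei[1:]))
    return h

def greedy_ascent(sonority):
    """Flip the best single unit until no flip raises Harmony (a local maximum)."""
    state = [0] * len(sonority)
    while True:
        current = harmony(sonority, state)
        gains = []
        for i in range(len(state)):
            flipped = state.copy()
            flipped[i] ^= 1
            gains.append(harmony(sonority, flipped) - current)
        best = max(range(len(state)), key=gains.__getitem__)
        if gains[best] <= 0:
            return state
        state[best] ^= 1

result = greedy_ascent([1, 2, 3, 7, 8])  # sonority profile 12378
```

Because Harmony strictly increases with every accepted flip and the state space is finite, the loop always terminates at a state that no single flip can improve.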

53
/txznt/ → tx.znt 'you (sing.) stored'
[Figure: plot of Harmony H]
54
The Hardest Case: 12378 / t.bx.ya
(hypothetical, but compare t.bx.la.kkw 'she
even behaved as a miser' [tbx.lakkw])
55
Subsymbolic Parsing
[Figure: eight units, each labeled V]
56
Parsing sonority profile 8121345787
a.tb.kf.zn.yay
Finds best of infinitely many representations;
1024 corners/parses
57
BrbrNet has many Local Harmony Maxima
  • An output pattern in BrbrNet is a local Harmony
    maximum if and only if it realizes a sequence of
    legal Berber syllables (i.e., an output of Gen)
  • That is, every activation value is 0 or 1, and
    the sequence of values is that realizing a
    sequence of substrings taken from the syllable
    inventory {CV, CVC, #V, #VC},
  • where C = 0, V = 1, and # = word edge
  • Greedy optimization avoids local maxima: why?

58
HG → OT's Strict Domination
  • Strict Domination: Baffling from a connectionist
    perspective?
  • Explicable from a connectionist perspective?
  • Exponential BrbrNet escapes local H maxima
  • Linear BrbrNet does not

59
Linear BrbrNet makes errors
  • (Goldsmith-Larson network)
  • Error: /12378/ → .123.78. (correct: .1.23.78.)

60
Subsymbolic Harmony optimization can be stochastic
  • The search for an optimal state can employ
    randomness
  • Equations for units' activation values have
    random terms
  • $\mathrm{pr}(a) \propto e^{H(a)/T}$
  • T (temperature) ~ randomness → 0 during search
  • Boltzmann Machine (Hinton & Sejnowski 1983,
    1986); Harmony Theory (Smolensky 1983, 1986)
  • Can guarantee computation of global optimum in
    principle
  • In practice: how fast? Exponential vs. linear
    BrbrNet
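
A sketch of stochastic Harmony optimization by simulated annealing. It assumes a Metropolis acceptance rule, whose stationary distribution at fixed T is proportional to e^{H(a)/T}, over binary nucleus units with the assumed Harmony 2^s − 1 per nucleus and −2^8 per adjacent nucleus pair. The geometric cooling schedule and all parameter values are hypothetical.

```python
import math
import random

def harmony(sonority, nuclei):
    """Assumed BrbrHG Harmony: nucleus reward minus the *VV penalty."""
    h = sum(2 ** s - 1 for s, v in zip(sonority, nuclei) if v)
    h -= 2 ** 8 * sum(a and b for a, b in zip(nuclei, nuclei[1:]))
    return h

def anneal(sonority, steps=2000, t_start=50.0, t_end=0.5, seed=0):
    """Metropolis sampling while the temperature T is lowered toward t_end."""
    rng = random.Random(seed)
    state = [0] * len(sonority)
    h = harmony(sonority, state)
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        i = rng.randrange(len(state))
        state[i] ^= 1                      # propose flipping one unit
        h_new = harmony(sonority, state)
        delta = h_new - h
        if delta >= 0 or rng.random() < math.exp(delta / t):
            h = h_new                      # accept the move
        else:
            state[i] ^= 1                  # reject: undo the flip
    return state, h

final_state, final_h = anneal([1, 2, 3, 7, 8])
```

High temperature early on lets the search accept Harmony-lowering moves and leave poor states; as T falls, the dynamics approach greedy ascent. This sketch makes no claim about how quickly the global maximum is reached.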

61
Stochastic BrbrNet: Exponential can succeed fast
  • 5-run average

62
Stochastic BrbrNet: Linear can't succeed fast
63
Stochastic BrbrNet (Linear)
5-run average
64
The ICS Architecture