Title: Paul Smolensky
1. The Harmonic Mind
- Paul Smolensky
- Cognitive Science Department
- Johns Hopkins University
with
- Géraldine Legendre, Donald Mathis, Melanie Soderstrom
- Alan Prince, Suzanne Stevenson, Peter Jusczyk
2. Advertisement
The Harmonic Mind: From Neural Computation to Optimality-Theoretic Grammar
- Paul Smolensky & Géraldine Legendre
- Blackwell 2003 (?)
- Develop the Integrated Connectionist/Symbolic (ICS) Cognitive Architecture
- Case study in formalist multidisciplinary cognitive science (point out imports/exports of ICS)
3. Cognitive Science 101
- Computation is cognition
- But what type?
- Fundamental question of research on the human
cognitive architecture
4. Table of Contents
5. Table of Contents
- Implications of architecture for nativism
- Learnability
- Initial state
- Experimental test: infants
- (Genomic encoding of UG)
6. Processing Algorithm: Activation
- Computational neuroscience → ICS
- Key sources
- Hopfield 1982, 1984
- Cohen and Grossberg 1983
- Hinton and Sejnowski 1983, 1986
- Smolensky 1983, 1986
- Geman and Geman 1984
- Golden 1986, 1988
Processing (spreading activation) is optimization: Harmony maximization
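The claim that spreading activation maximizes Harmony can be sketched in a few lines of Python. This is a toy Hopfield-style network with illustrative weights (not a network from the talk): asynchronous unit updates never decrease H(a) = ½·aᵀWa.

```python
import numpy as np

def harmony(W, a):
    """Harmony H(a) = 1/2 * a^T W a for symmetric weights W (zero diagonal)."""
    return 0.5 * a @ W @ a

def settle(W, a, sweeps=10):
    """Asynchronous +/-1 updates; each update moves uphill (or flat) in H."""
    a = a.copy()
    for _ in range(sweeps):
        for i in range(len(a)):          # fixed sweep order for determinism
            a[i] = 1.0 if W[i] @ a >= 0 else -1.0
    return a

# Toy network: units 0 and 1 excite each other; both inhibit unit 2.
W = np.array([[ 0.,  1., -1.],
              [ 1.,  0., -1.],
              [-1., -1.,  0.]])
a0 = np.array([-1., 1., 1.])
a = settle(W, a0)
print(a, harmony(W, a))   # Harmony rises from -1.0 to 3.0
```

The settled state is a local Harmony maximum, which is the sense in which activation spreading is optimization.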
7. Function: Optimization
- Cognitive psychology → ICS
- Key sources
- Hinton & Anderson 1981
- Rumelhart, McClelland, & the PDP Research Group 1986
CONFLICT!!
8. Representation
- Symbolic theory → ICS
- Complex symbol structures
- Generative linguistics → ICS
- Particular linguistic representations
- PDP connectionism → ICS
- Distributed activation patterns
- ICS
- Realization of (higher-level) complex symbolic structures in distributed patterns of activation over (lower-level) units (tensor product representations, etc.)
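The tensor-product idea can be illustrated directly: bind each filler (symbol) to a role (position) by an outer product and superpose the results. This is a minimal sketch with assumed two-dimensional orthonormal filler and role vectors, so unbinding is exact.

```python
import numpy as np

# Filler and role vectors (assumed orthonormal, so unbinding is exact).
fillers = {"A": np.array([1., 0.]), "B": np.array([0., 1.])}
roles = {"r1": np.array([1., 0.]), "r2": np.array([0., 1.])}

def bind(pairs):
    """Tensor product representation: sum of filler (x) role outer products."""
    return sum(np.outer(fillers[f], roles[r]) for f, r in pairs)

def unbind(T, role):
    """Recover the filler bound to a role from the superposed tensor."""
    return T @ roles[role]

T = bind([("A", "r1"), ("B", "r2")])   # distributed pattern for the string "A B"
print(unbind(T, "r1"))                 # → the filler vector for A
```

The key point is that the symbolic structure is fully present in one distributed pattern T, yet each constituent remains recoverable.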
9. Representation
10. Knowledge: Constraints
NOCODA: A syllable has no coda
H(a[σ k æ t]) = -s_NOCODA < 0
11. Constraint Interaction I
- ICS → Grammatical theory
- Harmonic Grammar
- Legendre, Miyata, & Smolensky 1990 et seq.
12. Constraint Interaction I
The grammar generates the representation that maximizes H: this best-satisfies the constraints, given their differential strengths.
Any formal language can be so generated.
13. Harmonic Grammar Parser
- Simple, comprehensible network
- Simple grammar G
- X → A B; Y → B A
- Language
- Completion
14. Harmonic Grammar Parser
(network figures, slides 14-17)
18. Harmonic Grammar Parser
- H(Y, B) > 0; H(Y, A) > 0
- Weight matrix for Y → B A
19. Harmonic Grammar Parser
- Weight matrix for entire grammar G
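As a hedged sketch of how such a weight matrix supports completion, the following Python brute-forces Harmony maximization over a tiny unit set for G. The unit names, weight values, and mutual-inhibition links are illustrative assumptions, not the talk's actual network.

```python
import itertools
import numpy as np

# Units: [X, Y, A1, B1, A2, B2] — root symbol; symbol in position 1; position 2.
names = ["X", "Y", "A1", "B1", "A2", "B2"]
W = np.zeros((6, 6))
def link(i, j, w):
    W[i, j] = W[j, i] = w

link(0, 2, 2); link(0, 5, 2)   # rule X -> A B: X supports A1 and B2
link(1, 3, 2); link(1, 4, 2)   # rule Y -> B A: Y supports B1 and A2
link(0, 1, -3); link(2, 3, -3); link(4, 5, -3)   # rivals inhibit each other

def complete(clamped):
    """Brute-force Harmony maximization with the given units clamped on."""
    best, best_h = None, -np.inf
    for bits in itertools.product([0, 1], repeat=6):
        a = np.array(bits, dtype=float)
        if any(a[i] != 1 for i in clamped):
            continue
        h = 0.5 * a @ W @ a
        if h > best_h:
            best, best_h = a, h
    return [n for n, v in zip(names, best) if v == 1]

print(complete({3}))   # clamp B in position 1 → ['Y', 'B1', 'A2']
```

Given only "B in position 1", the Harmony maximum fills in the rest of the parse licensed by Y → B A, which is the completion task the slides describe.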
20. Bottom-up Processing
21. Top-down Processing
22. Scaling up
- Not yet
- Still conceptual obstacles to surmount
23. Constraint Interaction II: OT
- ICS → Grammatical theory
- Optimality Theory
- Prince & Smolensky 1993
24. Constraint Interaction II: OT
- Differential strength is encoded in strict domination hierarchies
- Every constraint has complete priority over all lower-ranked constraints (combined)
- Approximate numerical encoding employs special (exponentially growing) weights
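A small sketch of that numerical encoding: if violation counts are bounded, weights that grow faster than the bound (here powers of 10 for constraints with at most 9 violations; the numbers are illustrative assumptions) make the weighted Harmony comparison agree with strict domination.

```python
def hg_harmony(violations, weights):
    """Harmonic Grammar: negative weighted sum of violation counts."""
    return -sum(v * w for v, w in zip(violations, weights))

def ot_better(v1, v2):
    """Strict domination: lexicographic comparison of violation vectors,
    highest-ranked constraint first (Python tuple order is lexicographic)."""
    return v1 < v2

weights = [100, 10, 1]   # exponentially growing: each outweighs all lower ones
a = (0, 5, 9)            # perfect on the top constraint, messy below
b = (1, 0, 0)            # one violation of the top constraint
assert ot_better(a, b)                                    # OT prefers a
assert hg_harmony(a, weights) > hg_harmony(b, weights)    # so do the weights
```

One violation of the top constraint (cost 100) can never be compensated by the lower constraints, whose maximal combined cost is 99.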
25. Constraint Interaction II: OT
- Stress is on the initial heavy syllable iff the number of light syllables n obeys
No way, man
26. Constraint Interaction II: OT
- Constraints are universal
- Human grammars differ only in how these constraints are ranked: factorial typology
- First true contender for a formal theory of cross-linguistic typology
27. The Faithfulness / Markedness Dialectic
- cat: /kat/ → kæt violates NOCODA. Why?
- FAITHFULNESS requires identity
- MARKEDNESS often opposes it
- Markedness/Faithfulness dialectic → diversity
- English: NOCODA ≪ FAITH
- Polynesian: FAITH ≪ NOCODA (French)
- Another markedness constraint M
- Nasal Place Agreement ('Assimilation') (NPA)
- mb ≻ nb, ŋb (labial)
- nd ≻ md, ŋd (coronal)
- ŋg ≻ mg, ng (velar)
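The dialectic above can be run as a tiny OT evaluation. This sketch assumes just two candidates for /kat/ and one violation mark each; swapping the ranking of FAITH and NOCODA flips the winner, which is exactly how the typology arises.

```python
# Candidates for input /kat/, with violation marks per constraint.
violations = {
    "kæt": {"NOCODA": 1, "FAITH": 0},   # keeps the coda, fully faithful
    "kæ":  {"NOCODA": 0, "FAITH": 1},   # deletes /t/ to avoid a coda
}

def optimal(candidates, ranking):
    """Strict domination: compare violation vectors lexicographically,
    highest-ranked constraint first."""
    return min(candidates, key=lambda c: tuple(candidates[c][r] for r in ranking))

print(optimal(violations, ["FAITH", "NOCODA"]))   # English-type → kæt
print(optimal(violations, ["NOCODA", "FAITH"]))   # Polynesian-type → kæ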
28. Nativism I: Learnability
- Learning algorithm
- Provably correct and efficient (under strong assumptions)
- Sources
- Tesar 1995 et seq.
- Tesar & Smolensky 1993, 2000
- If you hear A when you expected to hear E, minimally demote each constraint violated by A below a constraint violated by E
29. Constraint Demotion Learning
- If you hear A when you expected to hear E, minimally demote each constraint violated by A below a constraint violated by E
Correctly handles a difficult case: multiple violations in E
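One demotion step can be sketched as follows. This is a simplified version over a total ranking (Tesar & Smolensky's algorithm actually works over strata); the function and variable names are mine.

```python
def demote(ranking, winner_viol, loser_viol):
    """One Constraint Demotion step (simplified sketch, total ranking).
    ranking: constraint names, highest-ranked first.
    winner_viol: violations of A, the form actually heard.
    loser_viol: violations of E, the form the current grammar expected."""
    prefers_winner = {c for c in ranking if loser_viol[c] > winner_viol[c]}
    prefers_loser = {c for c in ranking if winner_viol[c] > loser_viol[c]}
    # Pivot: highest-ranked constraint violated (more) by the expected loser E.
    pivot = next(c for c in ranking if c in prefers_winner)
    p = ranking.index(pivot)
    kept = [c for c in ranking[:p] if c not in prefers_loser]
    demoted = [c for c in ranking[:p] if c in prefers_loser]
    return kept + [pivot] + demoted + ranking[p + 1:]

# Heard kæt (violates NOCODA) where the grammar expected kæ (violates FAITH):
new_ranking = demote(["NOCODA", "FAITH"],
                     {"NOCODA": 1, "FAITH": 0},
                     {"NOCODA": 0, "FAITH": 1})
print(new_ranking)   # → ['FAITH', 'NOCODA']
```

The constraints violated by the heard form end up minimally below a constraint violated by the expectation, so the heard form now wins.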
30. Nativism I: Learnability
- M ≫ F is learnable with /inpossible/ → impossible
- "not" = in-, except when followed by a labial
- the exception that proves the rule: M = NPA
- M ≫ F is not learnable from data if there are no exceptions (alternations) of this sort; e.g., if there are no affixes and all underlying morphemes have mp, then ✓M and ✓F: no M vs. F conflict, no evidence for their ranking
- Thus the initial state H0 must have M ≫ F
31. Nativism II: Experimental Test
- Linking hypothesis
- More harmonic phonological stimuli → longer listening time
- More harmonic:
- ✓M ≻ *M, when equal on F
- ✓F ≻ *F, when equal on M
- When one must choose, it is more harmonic to satisfy M: M ≫ F
- M = Nasal Place Assimilation (NPA)
- Collaborators
- Peter Jusczyk
- Theresa Allocco
- (Elliott Moreton, Karen Arnold)
32. Experimental Paradigm
- Headturn Preference Procedure (Kemler Nelson et al. 1995; Jusczyk 1997)
- X/Y/XY paradigm (P. Jusczyk)
- un...bɛ...umbɛ (✓NPA, *FAITH)
- un...bɛ...unbɛ (✓FAITH, *NPA)
- Main result: the ✓NPA (*FAITH) sequences are preferred, p = .006
- Highly general paradigm
33-36. 4.5 Months (NPA) (results figures)
37. Nativism III: UGenome
- Can we combine
- the connectionist realization of harmonic grammar and
- OT's characterization of UG
- to examine the biological plausibility of UG as innate knowledge?
- Collaborators
- Melanie Soderstrom
- Donald Mathis
- Oren Schwartz
38. Nativism III: UGenome
- The game: take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device
- Introduce an abstract genome notion parallel to (and encoding) the abstract neural network
- Is connectionist empiricism clearly more biologically plausible than symbolic nativism?
- No!
39. Summary
- Described an attempt to integrate
- Connectionist theory of mental processes
- (computational neuroscience, cognitive psychology)
- Symbolic theory of
- Mental functions (philosophy, linguistics)
- Representations
- General structure (philosophy, AI)
- Specific structure (linguistics)
- Informs theory of UG
- Form, content
- Genetic encoding
40. The Problem
- No concrete examples of such a LAD exist
- Even highly simplified cases pose a hard problem
- How can genes, which regulate production of proteins, encode symbolic principles of grammar?
- Test preparation: Syllable Theory
41. Approach: Multiple Levels of Encoding
Biological Genome
42. Basic syllabification: Function
- /underlying form/ → surface form
- Plural form of dish
- /dɪʃ+z/ → .dɪ.ʃəz.
- /CVCC/ → .CV.CVC. (the second V is epenthetic)
43. Basic syllabification: Function
- /underlying form/ → surface form
- Plural form of dish
- /dɪʃ+z/ → .dɪ.ʃəz.
- /CVCC/ → .CV.CVC. (the second V is epenthetic)
- Basic CV Syllable Structure Theory
- Prince & Smolensky 1993: Chapter 6
- Basic: no more than one segment per syllable position: .(C)V(C).
44. Syllabification: Constraints (Con)
- PARSE: Every element in the input corresponds to an element in the output
- FILL^V/C: Every output V/C segment corresponds to an input V/C segment
- ONSET: No V without a preceding C
- NOCODA: No C without a following V
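These four constraints can be exercised on the /CVCC/ example. This is a hedged sketch: the candidate set, the violation counts (angle brackets mark unparsed segments, V' an epenthetic vowel), and the illustrative ranking are my assumptions chosen so the winner matches the /CVCC/ → .CV.CVC. mapping above.

```python
# Candidate parses of /CVCC/ with assumed violation counts.
candidates = {
    ".CVC.<C>":     {"PARSE": 1, "ONSET": 0, "FILL": 0, "NOCODA": 1},
    ".CV.CV'C.":    {"PARSE": 0, "ONSET": 0, "FILL": 1, "NOCODA": 1},
    ".CV.CV'.CV'.": {"PARSE": 0, "ONSET": 0, "FILL": 2, "NOCODA": 0},
    ".CV.<C><C>":   {"PARSE": 2, "ONSET": 0, "FILL": 0, "NOCODA": 0},
}
ranking = ["PARSE", "ONSET", "FILL", "NOCODA"]   # illustrative ranking

# Strict domination = lexicographic comparison of violation tuples.
winner = min(candidates, key=lambda c: tuple(candidates[c][k] for k in ranking))
print(winner)   # → .CV.CV'C.
```

Under this ranking, epenthesizing one vowel (one FILL mark) beats leaving a segment unparsed (PARSE) or epenthesizing twice.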
45. SAnet architecture
/C1 C2 /
C1 V C2
46. Connection substructure
47. PARSE
- All connection coefficients are 2
48. ONSET
- All connection coefficients are -1
49. Crucial Open Question (Truth in Advertising)
- Relation between strict domination and neural networks?
- Apparently not a problem in the case of the CV Theory
50. To be encoded
- How many different kinds of units are there?
- What information is necessary (from the source unit's point of view) to identify the location of a target unit, and the strength of the connection with it?
- How are constraints initially specified?
- How are they maintained through the learning process?
51. Unit types
- Input units: C, V
- Output units: C, V, x
- Correspondence units: C, V
- 7 distinct unit types
- Each represented in a distinct sub-region of the abstract genome
- Help ourselves to implicit machinery to spell out these sub-regions as distinct cell types, located in a grid as illustrated
52. Connectivity geometry
53. Constraint: PARSE
- Input units grow south and connect
- Output units grow east and connect
- Correspondence units grow north and west and connect with input and output units
54. Constraint: ONSET
- Short connections grow north-south between adjacent V output units,
- and between the first V node and the first x node
55. Direction of projection growth
- Topographic organizations widely attested throughout neural structures
- Activity-dependent growth: a possible alternative
- Orientation information (axes)
- Chemical gradients during development
- Cell age: a possible alternative
56. Projection parameters
- Direction
- Extent
- Local
- Non-local
- Target unit type
- Strength of connections encoded separately
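The projection parameters listed above can be pictured as the fields of one abstract-genome entry. The field names and example values below are illustrative assumptions mirroring that parameter list, not the talk's actual encoding.

```python
from dataclasses import dataclass

@dataclass
class Projection:
    """One abstract-genome entry for a growing projection (sketch)."""
    source_type: str   # unit type that grows the projection
    direction: str     # growth direction, e.g. "N", "S", "E", "W"
    extent: str        # "local" or "non-local"
    target_type: str   # unit type to connect to
    # Connection strengths are encoded separately, per the slide above.

# Example entries in the spirit of the PARSE connectivity description:
genome = [
    Projection("input", "S", "non-local", "correspondence"),
    Projection("output", "E", "non-local", "correspondence"),
]
print(genome[0])
```

Keeping strength out of the entry reflects the slide's point that connection strengths are encoded separately from geometry.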
57. Connectivity Genome
- Contributions from ONSET and PARSE
58. ONSET
(genome sub-sequence: unit types x0, VO; growth directions N, S)
59. Learning Behavior
- Simplified system can be solved analytically
- Learning algorithm turns out to give
- Δs_i ∝ ε · (violations of constraint_i)
60. Possible Conclusions
- Empiricist connectionism is not more biologically plausible than nativist connectionism
- (except possibly local vs. distributed representations)
- It might be possible to do evolutionary simulation (not mathematical analysis) in a space including bona fide LADs
- I must have too much time on my hands
61. Thanks for your attention