Title: Markedness Optimization in Grammar and Cognition
1 Markedness Optimization in Grammar and Cognition
- Paul Smolensky
- Cognitive Science Department
- Johns Hopkins University
with
Géraldine Legendre, Alan Prince, Peter Jusczyk, Suzanne Stevenson,
Elliott Moreton, Karen Arnold, Donald Mathis, Melanie Soderstrom
2 Grammar and Cognition
- 1. What is the system of knowledge?
- 2. How does this system of knowledge arise in the mind/brain?
- 3. How is this knowledge put to use?
- 4. What are the physical mechanisms that serve as the material basis for this system of knowledge and for the use of this knowledge?
- (Chomsky 88, p. 3)
3 Jakobson's Program
- A Grand Unified Theory for the cognitive science of language is enabled by Markedness:
-  Avoid α
- Structure
- Alternations eliminate α
- Typology: Inventories lack α
- Acquisition
- α is acquired late
- Processing
- α is processed poorly
- Neural
- Brain damage most easily disrupts α
Formalize through OT?
4 Advertisement
- The complete story, forthcoming (2003), Blackwell:
- The Harmonic Mind: From Neural Computation to Optimality-Theoretic Grammar
- Smolensky & Legendre
Overview 
5 Structure, Acquisition, Use, Neural Realization
- Theoretical. OT (Prince & Smolensky 91, 93)
- Construct formal grammars directly from markedness principles
- General formalism/framework for grammars: phonology, syntax, semantics; GB/LFG/
- Strongly universalist: inherent typology
- Empirical. OT
- Allows completely formal markedness-based explanation of highly complex data
6 Structure, Acquisition, Use, Neural Realization
- Theoretical: Formal structure enables OT-general
- Learning algorithms
- Constraint Demotion: Provably correct and efficient (when part of a general decomposition of the grammar learning problem)
- Tesar 1995 et seq.
- Tesar & Smolensky 1993, , 2000
- Gradual Learning Algorithm
-  Boersma 1998 et seq.
- Empirical
- Initial state predictions explored through behavioral experiments with infants
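To make the learning idea concrete, here is a minimal sketch in the spirit of Constraint Demotion, not the published algorithm in full. It assumes the learner is handed winner/loser pairs with known violation counts, starts with all constraints in a single top stratum (an assumption of this sketch), and uses hypothetical constraint names `M_alpha` and `F_alpha`.

```python
def constraint_demotion(constraints, pairs, max_passes=100):
    """Return {constraint: stratum}, where stratum 0 is the highest rank.

    pairs is a list of (winner, loser) dicts mapping constraint -> violations.
    """
    stratum = {c: 0 for c in constraints}  # assumption: all start top-ranked
    for _ in range(max_passes):
        changed = False
        for winner, loser in pairs:
            # Mark cancellation: keep only constraints where the two differ.
            winner_marks = [c for c in constraints
                            if winner.get(c, 0) > loser.get(c, 0)]
            loser_marks = [c for c in constraints
                           if loser.get(c, 0) > winner.get(c, 0)]
            if not winner_marks:
                continue  # nothing needs to be demoted for this pair
            if not loser_marks:
                raise ValueError("winner is never better than loser")
            top = min(stratum[c] for c in loser_marks)  # highest loser mark
            for c in winner_marks:
                if stratum[c] <= top:       # not yet dominated by that mark
                    stratum[c] = top + 1    # demote to just below it
                    changed = True
        if not changed:
            return stratum
    raise RuntimeError("no convergence; the data may be inconsistent")


# Toy datum: the language tolerates α, so the faithful form must win.
toy_pairs = [({"M_alpha": 1}, {"F_alpha": 1})]
learned = constraint_demotion(["M_alpha", "F_alpha"], toy_pairs)
print(learned)
```

In this toy run the markedness constraint is demoted below faithfulness, which is the ranking under which α is tolerated.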
7 Structure, Acquisition, Use, Neural Realization
- Theoretical
- Theorems regarding the computational complexity of algorithms for processing with OT grammars
- Tesar 94 et seq.
- Ellison 94
- Eisner 97 et seq.
- Frank & Satta 98
- Karttunen 98
8 Structure, Acquisition, Use, Neural Realization
- Theoretical: OT derives from the theory of abstract neural (connectionist) networks
- via Harmonic Grammar (Legendre, Miyata, Smolensky 90)
-  For moderate complexity, we now have general formalisms for realizing
- complex symbol structures as distributed patterns of activity over abstract neurons
- structure-sensitive constraints/rules as distributed patterns of strengths of abstract synaptic connections
- optimization of Harmony
Construction of a miniature, concrete LAD
9 Program
- Structure
- OT
- Constructs formal grammars directly from markedness principles
- Strongly universalist: inherent typology
- OT allows completely formal markedness-based explanation of highly complex data
- Acquisition
- Initial state predictions explored through behavioral experiments with infants
- Neural Realization
- Construction of a miniature, concrete LAD
11 The Great Dialectic
- Phonological representations serve two masters:
FAITHFULNESS
MARKEDNESS
Locked in conflict
12 OT from Markedness Theory
- MARKEDNESS constraints: *α: No α
- FAITHFULNESS constraints
- Fα demands that /input/ → output leave α unchanged (McCarthy & Prince 95)
- Fα controls when α is avoided (and how)
- Interaction of violable constraints: Ranking
- α is avoided when *α ≫ Fα
- α is tolerated when Fα ≫ *α
- M1 ≫ M2: combines multiple markedness dimensions
13 OT from Markedness Theory
- MARKEDNESS constraints: *α
- FAITHFULNESS constraints: Fα
- Interaction of violable constraints: Ranking
- α is avoided when *α ≫ Fα
- α is tolerated when Fα ≫ *α
- M1 ≫ M2: combines multiple markedness dimensions
- Typology: All cross-linguistic variation results from differences in ranking: in how the dialectic is resolved (and in how multiple markedness dimensions are combined)
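The ranking logic above can be sketched in a few lines. This is an illustrative toy, assuming a single input /α/ with just two candidates and hypothetical constraint names `M_alpha` (the markedness constraint against α) and `F_alpha` (the matching faithfulness constraint); strict domination is modeled as lexicographic comparison of violation counts.

```python
from itertools import permutations

# Violation profiles for the input /α/ (toy candidate set, an assumption).
CANDIDATES = {
    "faithful": {"M_alpha": 1, "F_alpha": 0},   # keeps α, violates markedness
    "repaired": {"M_alpha": 0, "F_alpha": 1},   # removes α, violates faithfulness
}


def optimal(ranking, candidates=CANDIDATES):
    """Pick the winner: compare violations constraint by constraint, top down."""
    return min(candidates, key=lambda name: [candidates[name][c] for c in ranking])


def factorial_typology(constraints, candidates=CANDIDATES):
    """Map every total ranking of the constraints to its optimal candidate."""
    return {" >> ".join(r): optimal(r, candidates)
            for r in permutations(constraints)}


typology = factorial_typology(["M_alpha", "F_alpha"])
for ranking, winner in typology.items():
    print(ranking, "->", winner)
```

Each ranking yields one language type: markedness on top avoids α, faithfulness on top tolerates it, so the two-constraint typology has exactly two members.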
14 OT from Markedness Theory
- MARKEDNESS constraints
- FAITHFULNESS constraints
- Interaction of violable constraints: Ranking
- Typology: All cross-linguistic variation results from differences in ranking: in resolution of the dialectic
- Harmony = MARKEDNESS + FAITHFULNESS
- A formally viable successor to "Minimize Markedness" is OT's "Maximize Harmony" (among competitors)
15 Structure
- Explanatory goals achieved by OT
- Individual grammars are literally and formally constructed directly from universal markedness principles
- Inherent Typology:
-  Within the analysis of phenomenon Φ in language L is inherent a typology of Φ across all languages
17 Markedness and Inventories
- Theoretical part
- An inventory structured by markedness
- An inventory I is harmonically complete (HC) iff
- x ∈ I and y is (strictly) less marked than x
- implies
- y ∈ I
- A typology structured by markedness
- A typology T is strongly Harmonically complete (SHarC) iff
- L ∈ T if and only if L is harmonically complete
- (Prince & Smolensky 93, Ch. 9)
- Are OT inventories harmonically complete?
- Are OT typologies SHarC?
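The HC definition can be checked mechanically. The sketch below makes one simplifying assumption that is not in the slides: an element is represented by the set of markedness constraints it violates, and y counts as strictly less marked than x when y's marks are a proper subset of x's. The four toy segments and their marks are likewise illustrative.

```python
def less_marked(y, x):
    """Assumed ordering: y's marks are a proper subset of x's marks."""
    return y < x  # proper-subset comparison on frozensets


def is_harmonically_complete(inventory, universe):
    """True iff everything strictly less marked than a member is also a member."""
    return all(y in inventory
               for x in inventory
               for y in universe
               if less_marked(y, x))


# Toy obstruents described by the marks they incur (illustrative only).
T = frozenset()                      # e.g. t: neither velar nor continuant
K = frozenset({"velar"})             # e.g. k
S = frozenset({"cont"})              # e.g. s
X = frozenset({"velar", "cont"})     # e.g. x: the worst of the worst
UNIVERSE = {T, K, S, X}

print(is_harmonically_complete({T, K, S}, UNIVERSE))  # bans only X
print(is_harmonically_complete({T, X}, UNIVERSE))     # skips K and S
```

Under these assumptions an inventory that bans only the doubly marked segment is HC, while one that admits it without the singly marked segments is not.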
18 Harmonic Completeness
-  English obstruent inventory is HC w.r.t. Place/continuancy
Inventory Bans Only the Worst Of the Worst (BOWOW)
but is not generable by ranking {*velar, *cont, FPlace, Fcont}
19 Local Conjunction
- Crucial to distinguish
- taxi
- ?saki
*x w.r.t. segment inventory: *cont, *velar fatal in same segment
Local conjunction *cont &seg *velar: violated when both are violated in the same segment
20 Basic Inventories/Typologies
- Formal analysis of HC/SHarC in OT: Definitions
- Basic inventory I_Φ of elements of type T, where Φ = {φk}
- Candidates: X = {[±φ1, ±φ2, ±φ3, ±φ4, …]}
- Con: MARK = {*[+φ1], *[−φ2], …}
-  FAITH = {Fφ1, Fφ2, …}
- I_Φ: a ranking of Con
- Basic typology T_Φ: all rankings of Con
- Basic typology w/ Local Conjunction, T^LC_Φ: all rankings of Con_LC = Con ∪ {all conjunctions of constraints in MARK, local to T}
21 SHarC Theorem
- SHarC Theorem
- Basic typology (all rankings of Con):
- each language is HC
- SHarC property does not hold
- Basic typology with Local Conjunction:
- each language is HC
- SHarC property holds
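The contrast stated by the theorem can be illustrated by brute force on a toy case. The sketch assumes a four-segment universe t, k, s, x described by two binary features (velar, cont), the four basic constraints named on the Harmonic Completeness slide, and one local conjunction; the function names are hypothetical, and this is a demonstration on one small example, not a proof.

```python
from itertools import permutations

# (velar, cont) feature values; this four-segment universe is a toy assumption.
SEGMENTS = {"t": (0, 0), "k": (1, 0), "s": (0, 1), "x": (1, 1)}


def star_velar(inp, out):
    return out[0]


def star_cont(inp, out):
    return out[1]


def f_place(inp, out):
    return int(inp[0] != out[0])


def f_cont(inp, out):
    return int(inp[1] != out[1])


def cont_and_velar(inp, out):
    """Local conjunction: violated only when one segment is both velar and cont."""
    return int(out[0] == 1 and out[1] == 1)


def inventory(ranking):
    """Outputs that surface when every segment is a possible input."""
    surfacing = set()
    for inp in SEGMENTS.values():
        def profile(name):
            return [c(inp, SEGMENTS[name]) for c in ranking]
        surfacing.add(min(SEGMENTS, key=profile))
    return frozenset(surfacing)


def typology(con):
    """Set of inventories generated by all total rankings of con."""
    return {inventory(r) for r in permutations(con)}


CON = [star_velar, star_cont, f_place, f_cont]
basic = typology(CON)
with_lc = typology(CON + [cont_and_velar])
english_like = frozenset("tks")  # bans only the worst of the worst
print(english_like in basic, english_like in with_lc)
```

With these toy definitions the script prints `False True`: the BOWOW inventory is missing from the basic typology but is generated once the conjoined constraint is available.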
22 Empirical Relevance
- Empirical part
- Local conjunction has seen many empirical applications; here, vowel harmony
- Lango (Nilotic, Uganda) ATR harmony
- Woock & Noonan 79
- Archangeli & Pulleyblank 91 et seq., esp. 94
- Markedness
- *[+ATR, −hi/fr]
- *[−ATR, +hi/fr]
- *A/σclosed
- HD-L[ATR]
Rather than imposing a parametric superstructure on spreading rules (A&P 94), we build the grammar directly from these markedness constraints
23 Lango ATR Harmony
- Inventory of ATR domains D ATR ( tiers)
- Vowel harmony renders many possibilities ungrammatical: 'your (sing./plur.) stew'
-  d?k Cí → dèkkí (not d?kkí, d?kk?)
-  d?kwú → d?kwú (not dèkwú, d?kw?)
- Critical difference: i [+fr] vs. u [−fr]; [−fr] is a worse source for ATR spread: it violates *[+ATR, −fr], i.e., it is marked w.r.t. ATR
- Complex system: interaction of 6 dimensions (2^6 = 64 distinct environments)
25 d?k Cí → dèkkí
26 d?kwú → d?kwú
28 The Challenge
- Need a grammatical framework able to handle this nightmarish descriptive complexity
- while staying strictly within the confines of rigidly universal principles
29 Lango rules
- Archangeli & Pulleyblank 94
[Figure: rule diagrams spreading ATR across V C V and V (C)C V contexts, with conditions on hi and fr]
 32cont seg velar
A/sclosed DA ?hi,A/HDA No ?ATR 
spread into a closed syllable from a ?hi 
source  
 33BOWOW ?hi, ?A  HD-L?A No regressive ?ATR 
spread from a ?hi source 
 34X,Y,Z ?A 1,2,3 A  AGREE  FA 
36 Inherent Typology
- Method applicable to related African languages, where the same markedness constraints govern the inventory (Archangeli & Pulleyblank 94), but with different interactions: different rankings and active conjunctions
- Part of a larger typology including a range of vowel harmony systems
37 Structure: Summary
- OT builds formal grammars directly from markedness: MARK, with FAITH
- Inventories consistent with markedness relations are formally the result of OT with local conjunction (basic typology with local conjunction, SHarC theorem)
- Even highly complex patterns can be explained purely with simple markedness constraints: all complexity is in the constraints' interaction through ranking and conjunction (Lango ATR harmony)
39 The Initial State
- OT-general: MARKEDNESS ≫ FAITHFULNESS
-  Learnability demands (Richness of the Base)
-  (Alan Prince, p.c., 93; Smolensky 96a)
- Child production: restricted to the unmarked
- Child comprehension: not so restricted
-  (Smolensky 96b)
40 Experimental Exploration of the Initial State
- Collaborators
- Peter Jusczyk, Theresa Allocco: Language Acquisition 2002
- Karen Arnold, Elliott Moreton: in progress
- Grammar at 4.5 months?
41 Experimental Paradigm
-  Headturn Preference Procedure (Kemler Nelson et al. 95; Jusczyk 97)
-  X/Y/XY paradigm (P. Jusczyk)
-  un...b?...umb? 
-  un...b?...umb?
FNP
R 
p  .006
?FAITH
- Highly general paradigm: Main result
42 Linking Hypothesis
- Experimental results challenging to explain
- Suppose stimuli A and B differ w.r.t. φ.
-  Child: MARKφ ≫ FAITHφ (M ≫ F). Then:
- If A is consistent with M ≫ F and B is consistent with F ≫ M, then prefer (attend longer to) A: A > B
- MARKφ = Nasal Place Agreement
43 Experimental Results
If A is consistent with M ≫ F and B is consistent with F ≫ M, then prefer (attend longer to) A: A > B
>
mb → mb
nb → nb
>
?
>
?
nb → nd
nb → mb
p < .05 ?MARK
p < .001 nb → mb M ≫ F
p < .05 n ? m detectable
p > .40 /nb/ nd ?UG mb
p > .30 UG ? unreliability
45 A LAD for OT
- Acquisition
- Hypothesis: Universals are genetically encoded; learning is search among UG-permitted grammars.
- Question: Is this even possible?
- Collaborators
-  Melanie Soderstrom, Donald Mathis
46 UGenomics
- The game: Take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device
-  Proteins ? Universal grammatical principles ?
Time to willingly suspend disbelief
47 UGenomics
- The game: Take a first shot at a concrete example of a genetic encoding of UG in a Language Acquisition Device
-  Proteins ? Universal grammatical principles ?
- Case study: Basic CV Syllable Theory (Prince & Smolensky 93)
- Innovation: Introduce a new level, an "abstract genome" notion parallel to, and encoding, the "abstract neural network"
48 UGenome for CV Theory
- Three levels
- Abstract symbolic: Basic CV Theory
- Abstract neural: CVNet
- Abstract genomic: CVGenome
49 UGenomics: Symbolic Level
- Three levels
- Abstract symbolic: Basic CV Theory
- Abstract neural: CVNet
- Abstract genomic: CVGenome
50 Basic Syllabification: Function
- Basic CV Syllable Structure Theory
- "Basic": No more than one segment per syllable position: .(C)V(C).
-  /underlying form/ → surface form
- /CVCC/ → .CV.C V C.   /pædd/ → pæd?d
- Correspondence Theory
- McCarthy & Prince 1995 (M&P)
- /C1V2C3C4/ → .C1V2.C3 V C4
51 Syllabification Constraints (Con)
- PARSE: Every element in the input corresponds to an element in the output
- ONSET: No V without a preceding C
- etc.
52 UGenomics: Neural Level
- Three levels
- Abstract symbolic: Basic CV Theory
- Abstract neural: CVNet
- Abstract genomic: CVGenome
53 CVNet Architecture
/ C1 C2 /
 C1 V C2
54 Connections: PARSE
- All connection coefficients are 2
55 Connections: ONSET
- All connection coefficients are ?1
56 CVNet Dynamics
- Boltzmann machine/Harmony network
- Hinton & Sejnowski 83 et seq.; Smolensky 83 et seq.
- stochastic activation-spreading algorithm: higher Harmony ⇒ more probable
- CVNet innovation: connections realize fixed symbol-level constraints with variable strengths
- learning: modification of Boltzmann machine algorithm to new architecture
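The link between Harmony and probability can be sketched for a very small network. The code below does not simulate the stochastic activation-spreading algorithm itself; it enumerates all binary states and computes the equilibrium distribution such a network approaches, p(a) ∝ exp(H(a)/T). The three-unit network, its weights, and its biases are hypothetical and are not CVNet's.

```python
import math
from itertools import product


def harmony(state, weights, bias):
    """H(a) = sum over connections of w_ij * a_i * a_j, plus sum of b_i * a_i."""
    h = sum(b * a for b, a in zip(bias, state))
    for (i, j), w in weights.items():
        h += w * state[i] * state[j]
    return h


def boltzmann(weights, bias, temperature=1.0):
    """Exact equilibrium distribution over all binary activation states."""
    states = list(product((0, 1), repeat=len(bias)))
    scores = [math.exp(harmony(s, weights, bias) / temperature) for s in states]
    z = sum(scores)  # normalizing constant
    return {s: v / z for s, v in zip(states, scores)}


# Hypothetical network: one excitatory and one inhibitory connection.
WEIGHTS = {(0, 1): 2.0, (1, 2): -1.0}
BIAS = [-0.5, -0.5, -0.5]

warm = boltzmann(WEIGHTS, BIAS, temperature=1.0)
cool = boltzmann(WEIGHTS, BIAS, temperature=0.1)
best = max(warm, key=warm.get)
print(best, round(warm[best], 3), round(cool[best], 3))
```

Lowering the temperature concentrates probability on the maximum-Harmony state, which is the sense in which the stochastic dynamics performs Harmony optimization.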
57 UGenomics: Genome Level
- Three levels
- Abstract symbolic: Basic CV Theory
- Abstract neural: CVNet
- Abstract genomic: CVGenome
58 Connectivity geometry
- Assume 3-d grid geometry (e.g., gradients)
59 Connectivity: PARSE
- Correspondence units grow north & west and connect with input & output units.
- Output units grow east and connect
- Input units grow south and connect
60 Connectivity: ONSET
x0 segment  S S VO
 N S x0
61 Connectivity: Genome
- Contributions from ONSET and PARSE
62 CVGenome: Connectivity
63 Abstract Gene Map
[Figure: abstract gene map. Labels: General Developmental Machinery; Connectivity; Constraint Coefficients; C-I, V-I, C-C; direction, extent, target; CORRESPOND, RESPOND; COVx B 1; CCVC B ?2; CC CICO 1; VC VIVO 1]
64 CVGenome: Connection Coefficients
65 UGenomics
- Realization of processing and learning algorithms in "abstract molecular biology", using the types of interactions known to be biologically possible and genetically encodable
66 UGenomics
- Host of questions to address
- Will this really work?
- Can it be generalized to distributed nets?
- Is the number of genes 770.26 plausible?
- Are the mechanisms truly biologically plausible?
- Is it evolvable?
How is strict domination to be handled?
67 Hopeful Conclusion
- Progress is possible toward a Grand Unified Theory of the cognitive science of language
- addressing the structure, acquisition, use, and neural realization of knowledge of language
- strongly governed by universal grammar
- with markedness as the unifying principle
- as formalized in Optimality Theory at the symbolic level
- and realized via Harmony Theory in abstract neural nets which are potentially encodable genetically
68 Hopeful Conclusion
- Progress is possible toward a Grand Unified Theory of the cognitive science of language
Thank you for your attention (and indulgence)
Still lots of promissory notes, but all in a common currency: Harmony ≈ unmarkedness. Hopefully this will promote further progress by facilitating integration of the sub-disciplines of cognitive science