Symbolic vs Subsymbolic, Connectionism (an Introduction)

1
Symbolic vs Subsymbolic, Connectionism (an Introduction)

H. Bowman
(CCNCS, Kent)

2
Overview

Follow-up to the first symbolic/subsymbolic talk
Motivation:
  clarify why connectionist networks are (typically) not compositional
  introduce connectionism:
    link to biology
    activation dynamics
    learning algorithms

3
Recap
4
A (Rather Naïve) Reading Model
[Figure: reading model mapping ORTHOGRAPHY (input) to PHONOLOGY (output)]
5
Compositionality

Plug constituents in according to rules
Structure of expressions indicates how they should be interpreted
Semantic compositionality:
  "the semantic content of a (molecular) representation is a function of the semantic contents of its syntactic parts, together with its constituent structure" (Fodor & Pylyshyn, 1988)
Symbolists argue that compositionality is a defining characteristic of cognition

6
Semantic Compositionality in Symbol Systems

Meanings of items plugged in as defined by syntax

M[X] denotes the meaning of X
M[John loves Jane] = M[loves](M[John], M[Jane])
7
Semantic Compositionality Continued

Meanings of atoms constant across different
compositions

M[Jane loves John] = M[loves](M[Jane], M[John])
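The two slides above describe meanings being plugged together according to syntax, with atom meanings held constant. A minimal sketch of that idea follows. The lexicon, the tuple representation and the `meaning` function are illustrative assumptions, not part of the original talk.

```python
# Hypothetical toy semantics: every atom has one fixed meaning in the lexicon.
LEXICON = {
    "John": "john",
    "Jane": "jane",
    # The meaning of "loves" is modelled as a function of its two arguments.
    "loves": lambda subject, obj: ("LOVES", subject, obj),
}


def meaning(sentence):
    """Compose the meaning of a 'Subject verb Object' sentence from its parts."""
    subject, verb, obj = sentence.split()
    return LEXICON[verb](LEXICON[subject], LEXICON[obj])


print(meaning("John loves Jane"))
print(meaning("Jane loves John"))
```

The same three atom meanings are reused in both sentences; only the structure in which they are plugged together differs, so the two sentence meanings differ.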
8
The Sub-symbolic Tradition
9
Rate Coding Hypothesis

Biological neurons fire spikes (pulses of current)
In artificial neural networks:
  nodes reflect populations of biological neurons acting together, i.e. cell assemblies
  activation reflects the rate of spiking of the underlying biological neurons

10
Activation in Classic Artificial Neural Network Model
Positive weights: excitation
Negative weights: inhibition
[Figure: node j with net input h_j, activation value y_j and output y_j]
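As a rough sketch of the net input to node j, assume the usual weighted sum of incoming activations. The slide's own formula did not survive, so the function name, the example values and the weighted-sum form are assumptions.

```python
def net_input(inputs, weights):
    """Weighted sum of incoming activations (assumed form of h_j)."""
    return sum(x * w for x, w in zip(inputs, weights))


# Hypothetical example: two active senders, one excitatory and one inhibitory.
inputs = [1.0, 1.0, 0.0]
weights = [0.8, -0.5, 0.3]  # positive = excitation, negative = inhibition

h_j = net_input(inputs, weights)
print(h_j)  # approximately 0.3: the inhibitory input partly cancels the excitatory one
```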
11
Sigmoidal Activation Function
Saturation: unresponsive at high net inputs
Threshold: unresponsive at low net inputs
Responsive around net input of 0
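The three properties above can be checked numerically. The sketch below assumes the logistic function as the sigmoid; the slide does not say which sigmoidal function is meant. Its slope measures how responsive the node is to a small change in net input.

```python
import math


def sigmoid(h):
    """Logistic activation (an assumed choice of sigmoidal function)."""
    return 1.0 / (1.0 + math.exp(-h))


def slope(h):
    """Derivative of the logistic function: sensitivity to changes in net input."""
    y = sigmoid(h)
    return y * (1.0 - y)


for h in (-6.0, 0.0, 6.0):
    print(f"h={h:+.1f}  y={sigmoid(h):.3f}  slope={slope(h):.3f}")
```

The slope is largest at a net input of 0 and close to zero at both extremes, which is the threshold and saturation behaviour listed on the slide.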
12
Characteristics

Nodes homogeneous and essentially dumb
Input weights characterize what a node represents
/ detects
Sophisticated (intelligent?) behaviour emerges
from interaction amongst nodes

13
Learning

Directed weight adjustment
Two basic approaches:
  Hebbian learning
    unsupervised
    extracting regularities from the environment
  error-driven learning
    supervised
    learn an input-to-output mapping
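To illustrate the unsupervised case, here is a minimal sketch of a plain Hebbian update for one linear node. The learning rate, the starting weights and the tiny "environment" are hypothetical. The plain rule also grows without bound over long training, which practical variants correct.

```python
def hebbian_step(weights, inputs, rate=0.1):
    """One Hebbian update: change each weight in proportion to input * output."""
    output = sum(w * x for w, x in zip(weights, inputs))
    return [w + rate * x * output for w, x in zip(weights, inputs)]


# Hypothetical environment: features 0 and 1 co-occur often, feature 2 is rare.
environment = [[1, 1, 0], [1, 1, 0], [1, 1, 0], [0, 0, 1]]

weights = [0.1, 0.1, 0.1]
for pattern in environment:
    weights = hebbian_step(weights, pattern)

print(weights)
```

No target output is supplied. The weights for the frequently co-occurring features simply grow faster, which is one sense in which the node extracts a regularity from its inputs.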

14
Example: Simple Feedforward Network
Use term PDP (Parallel Distributed Processing)

weights initially set randomly
trained on a set of input-to-output patterns
error-driven:
  for each input, adjust the weights according to the extent to which the output is in error

[Figure: feedforward network with Input, Hidden and Output layers]
15
Error-driven Learning

can, in principle, learn any (computable) input-output mapping (modulo local minima)
delta rule and back-propagation
network learning is completely determined by the patterns presented to it
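A minimal sketch of error-driven learning with the delta rule follows. It assumes a single sigmoid output node with a bias weight, trained on logical OR. The task, learning rate, epoch count and random seed are illustrative assumptions. Back-propagation extends the same idea to hidden layers and is not shown.

```python
import math
import random


def sigmoid(h):
    return 1.0 / (1.0 + math.exp(-h))


def predict(weights, inputs):
    """Output of one sigmoid node; the last weight acts as a bias."""
    x = list(inputs) + [1.0]
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def train_delta(patterns, epochs=2000, rate=0.5, seed=0):
    """Delta rule: move each weight in proportion to (target - output) * input."""
    rng = random.Random(seed)
    n_inputs = len(patterns[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
    for _ in range(epochs):
        for inputs, target in patterns:
            x = list(inputs) + [1.0]
            error = target - predict(weights, inputs)
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    return weights


# Hypothetical pattern set: logical OR.
OR_PATTERNS = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights = train_delta(OR_PATTERNS)

for inputs, target in OR_PATTERNS:
    print(inputs, target, round(predict(weights, inputs), 3))
```

Everything the node ends up knowing comes from the four patterns it was shown, which is the sense in which learning is determined by the pattern set.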

16
Example Connectionist Model

"Jane loves John" is difficult to represent in PDP models
Word reading as an example:
  orthography to phonology
  words of four letters or fewer
Need to represent the order of letters, otherwise, e.g., "slot" and "lots" would be the same
Slot coding
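The slot/lots problem and the slot-coding fix can be sketched as follows. The encoding details are assumptions: one unit per letter per position, a 26-letter alphabet, four slots, and shorter words left-aligned.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
SLOTS = 4  # words of four letters or fewer


def bag_of_letters(word):
    """Order-free code: one unit per letter, regardless of position."""
    return [1 if letter in word else 0 for letter in ALPHABET]


def slot_code(word):
    """Slot code: a separate bank of letter units for each position."""
    vector = [0] * (SLOTS * len(ALPHABET))
    for position, letter in enumerate(word):
        vector[position * len(ALPHABET) + ALPHABET.index(letter)] = 1
    return vector


print(bag_of_letters("slot") == bag_of_letters("lots"))  # True: order is lost
print(slot_code("slot") == slot_code("lots"))            # False: order is kept
```

Note that the unit for a letter in slot 1 is a different unit from the one for the same letter in slot 4. The later slides turn on this point.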

17
A (Rather Naïve) Reading Model
[Figure: reading model mapping ORTHOGRAPHY (input) to PHONOLOGY (output)]
18
Pronunciation of "a" as an Example

Illustration 1: assume a realistic pattern set
"a" is pronounced differently
  in different positions
  with different surrounding letters (context), e.g. mint vs. pint
  both are built into the patterns
frequency asymmetries:
  how often "a" appears at each position throughout the language determines how effectively it is pronounced at that position
strange prediction: if a child has only seen "a" in positions 1 to 3, the model reaches a state in which it can (broadly) pronounce "a" in positions 1 to 3 but not at all in position 4; that is, it cannot even guess at the pronunciation, i.e. it produces random garbage!
labelling is externally imposed: there is no requirement that the label "a" be interpreted the same way in different slots
in symbol systems, every occurrence of "a" is interpreted identically
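The strange prediction can be reproduced in a deliberately small simulation. Everything here is an assumption made for illustration: a three-letter alphabet, a single-layer network trained with the delta rule, and one output unit per slot that should switch on when that slot holds "a". Training never shows "a" in the fourth slot.

```python
import itertools
import math
import random

LETTERS = "abt"  # tiny hypothetical alphabet
SLOTS = 4
N_IN = SLOTS * len(LETTERS)


def encode(word):
    """Slot-coded input: one unit per letter per position."""
    vec = [0.0] * N_IN
    for pos, ch in enumerate(word):
        vec[pos * len(LETTERS) + LETTERS.index(ch)] = 1.0
    return vec


def targets(word):
    """One output unit per slot: 1 if that slot contains 'a'."""
    return [1.0 if ch == "a" else 0.0 for ch in word]


def forward(weights, biases, x):
    return [1.0 / (1.0 + math.exp(-(b + sum(w * xi for w, xi in zip(row, x)))))
            for row, b in zip(weights, biases)]


rng = random.Random(0)
weights = [[rng.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(SLOTS)]
biases = [0.0] * SLOTS
initial = [row[:] for row in weights]  # copy kept for comparison after training

# Training set: every four-letter string that does NOT have 'a' in the last slot.
train_words = ["".join(w) for w in itertools.product(LETTERS, repeat=SLOTS)
               if w[3] != "a"]

for _ in range(200):
    for word in train_words:
        x, t = encode(word), targets(word)
        y = forward(weights, biases, x)
        for j in range(SLOTS):
            err = t[j] - y[j]
            biases[j] += 0.5 * err
            for i in range(N_IN):
                weights[j][i] += 0.5 * err * x[i]  # no change when x[i] is 0

A_SLOT4 = 3 * len(LETTERS) + LETTERS.index("a")  # input unit for 'a' in slot 4
seen = forward(weights, biases, encode("abbt"))
unseen = forward(weights, biases, encode("tbba"))
print("'a' in slot 1 detected:", seen[0] > 0.5)
print("'a' in slot 4 detected:", unseen[3] > 0.5)
print("slot-4 'a' weights untouched:",
      all(weights[j][A_SLOT4] == initial[j][A_SLOT4] for j in range(SLOTS)))
```

Because the input unit for "a" in slot 4 is never active during training, its outgoing weights keep their random initial values. Whatever the network does with "a" in that slot reflects chance and the other letters, not anything learned about "a" elsewhere.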

contextual influences can be beneficial, for example:
  reflecting irregularities, e.g. mint vs. pint
  pronouncing non-words, e.g. wug
Nonetheless, highly non-compositional: there is no sense in which constituent representations are plugged in
can only recognise (and pronounce) "a" in specific contexts, but not at all in others
surely there is a sense in which we learn individual (substitutable) grapheme-phoneme mappings and then plug them in (modulo contextual influences)

Illustration 2: assume an artificial pattern set in which "a" is mapped to the same representation in each position
(assuming enough training) in a sense, "a" is represented similarly in all positions
but not actually identically:
  random initial weight settings imply different (although similar) hidden-layer representations
  perhaps glossed over by thresholding at the output
still a strange learning prediction: the model reaches states in which it can recognise "a" in some positions but not at all in others
also, the amount of training needed in each position is exorbitant
the fact that the model can pronounce "a" in position i does not help it learn "a" in position j: it starts from scratch in each position, each of which is different and separately learned

21
Connectionism and Compositionality

Principle:
  with PDP nets, contextual influence is inherent and compositionality the exception
  with symbol systems, compositionality is inherent and contextual influence the exception
in some respects neural nets generalise well, but in other respects they generalise badly
  appropriate: global regularities across patterns are extracted (similar patterns treated similarly)
  inappropriate: with slot coding, component representations are not reused

22
Connectionism and Compositionality

alternative connectionist models may do better, but it is not clear that any is truly systematic in the sense of symbolic processing
alternative approaches:
  localist models, e.g. Interactive Activation or Activation Gradient models
  O'Reilly's spatial invariance model of word reading?
  Elman nets: recurrence for learning sequences

23
References

Anderson, J. R. (1993). Rules of the Mind. Hillsdale, NJ: Erlbaum.
Bowers, J. S. (2002). Challenging the widespread assumption that connectionism and distributed representations go hand-in-hand. Cognitive Psychology, 45, 413-445.
Evans, J. S. B. T. (2003). In Two Minds: Dual-Process Accounts of Reasoning. Trends in Cognitive Sciences, 7(10), 454-459.
Fodor, J. A., & Pylyshyn, Z. W. (1988). Connectionism and Cognitive Architecture: A Critical Analysis. Cognition, 28, 3-71.
Hinton, G. E. (Ed.) (1990). Special Issue of the journal Artificial Intelligence on Connectionist Symbol Processing. Artificial Intelligence, 46(1-4).
O'Reilly, R. C., & Munakata, Y. (2000). Computational Explorations in Cognitive Neuroscience: Understanding the Mind by Simulating the Brain. MIT Press.
McClelland, J. L. (1992). Can Connectionist Models Discover the Structure of Natural Language? In R. Morelli, W. Miller Brown, D. Anselmi, K. Haberlandt, & D. Lloyd (Eds.), Minds, Brains and Computers: Perspectives in Cognitive Science and Artificial Intelligence (pp. 168-189). Norwood, NJ: Ablex Publishing Company.
McClelland, J. L. (1995). A Connectionist Perspective on Knowledge and Development. In J. J. Simon & G. S. Halford (Eds.), Developing Cognitive Competence: New Approaches to Process Modelling (pp. 157-204). Mahwah, NJ: Lawrence Erlbaum.
Page, M. P. A. (2000). Connectionist Modelling in Psychology: A Localist Manifesto. Behavioral and Brain Sciences, 23, 443-512.
Pinker, S., Ullman, M. T., McClelland, J. L., & Patterson, K. (2002). The Past-Tense Debate (Series of Opinion Articles). Trends in Cognitive Sciences, 6(11), 456-474.