Title: A brief history of connectionism and information processing
1. A brief history of connectionism and information processing
2. The Role of the Brain
- Neural inspirations
- Neurons are the basic computational tools of the brain: simple and dumb processors
- Basic structure
- Dendrite (carries information in)
- Cell body (integrates the information)
- Axon (carries information out)
- Synapse
- The near-contact area between an axon and a dendrite
3. Basic operation
- Intercell communication via the synapse
- Can be excitatory (making a receiving neuron more likely to fire) or inhibitory (making it less likely to fire)
- Typically communicate via neurotransmitters
- Released on the axon side; trigger electrical changes on the dendrite side
- Neurologists believed that the basic unit of information is the rate of firing of a neuron
- This is usually discussed in terms of a neuron's activation level
4. Representing info in our wetware
- Method 1: Assume that each neuron is a "grandmother cell"
- This is a local representation
- The pattern of activation tells you what is currently being thought of
- Note that we haven't dealt with how those thoughts connect up
- e.g. grandma, blue, dress, glasses, apple pie
5. Representing info in our wetware, cont.
- Method 2: Patterns of activation
- Assume that your grandmother is instead represented across a number of cells
- e.g. the pattern 110011010010 across 12 neurons represents grandma
- This is a distributed representation (see the toy example below)
- Patterns of connectivity
- May be the method by which associations are encoded
- When one pattern is active, it may trigger a different pattern
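A toy illustration of the two coding schemes (the code and the choice of unit are my own; the 12-bit pattern is the one from the slide):

    import numpy as np

    # Local code: one dedicated "grandmother cell" per concept.
    local = np.zeros(12, dtype=int)
    local[3] = 1  # unit 3 alone stands for "grandma" (arbitrary choice)

    # Distributed code: the slide's 12-neuron pattern for "grandma".
    distributed = np.array([int(b) for b in "110011010010"])

    # In the distributed scheme no single unit means "grandma";
    # only the whole pattern of activation does.
    print(local)        # [0 0 0 1 0 0 0 0 0 0 0 0]
    print(distributed)  # [1 1 0 0 1 1 0 1 0 0 1 0]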
6. McCulloch & Pitts (1943)
- They explored the formal properties of neuron-like devices
- What logical operations could neurons compute?
- Five assumptions based on then-current knowledge of neurons:
- 1. The activity of a neuron is all-or-none (binary coding).
- 2. Each neuron has a fixed threshold on the required number of synapses that must be excited before the neuron itself will be excited. Weights are identical.
- 3. Synaptic action causes a time delay before firing.
- 4. Inhibition is absolute.
- 5. The physical structure of a network of neurons doesn't change with time: connections and their strengths are static.
7. McCulloch/Pitts neurons
- McCulloch/Pitts neurons can then be used to compute any (finite) logical function, as in the sketch below
- BUT, McCulloch/Pitts networks can't learn
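A minimal sketch of such a unit under assumptions 2 and 4 above (identical weights, absolute inhibition); the function names and threshold settings are my own illustrations, not notation from the lecture:

    # A McCulloch-Pitts unit: binary inputs and output, identical
    # excitatory weights, and any active inhibitory input vetoes firing.
    def mp_neuron(excitatory, inhibitory, threshold):
        """Fire (1) iff no inhibitory input is active and the number of
        active excitatory inputs reaches the fixed threshold."""
        if any(inhibitory):
            return 0  # inhibition is absolute
        return 1 if sum(excitatory) >= threshold else 0

    # Logical operations fall out of threshold choices:
    AND = lambda x, y: mp_neuron([x, y], [], threshold=2)
    OR  = lambda x, y: mp_neuron([x, y], [], threshold=1)
    NOT = lambda x:    mp_neuron([],     [x], threshold=0)  # fires unless inhibited

    assert [AND(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 0, 0, 1]
    assert [OR(a, b)  for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 1, 1, 1]
    assert [NOT(x) for x in (0, 1)] == [1, 0]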
8. Hebb (1949)
- Aimed to set out the psychological implications of particular neural models; he was also very interested in developing a physiological theory of learning.
9. Learning in a Hebbian network
- "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."
10. Hebbian learning, more formally
- Δw_ij = η · a_i · a_j
- where the a's are activation values (−1 or +1) and η is a learning rate parameter
- The equation is applied until weights saturate (typically at 1) and do not keep increasing as inputs are presented
- Think of Hebbian learning as picking up on correlations between features in the environment (see the sketch below)
- Features that co-occur will have strong positive weights, features that never occur together will have strong negative weights, and random pairings produce zero weights
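A minimal sketch of this update, assuming ±1 activation coding and an arbitrary learning rate η = 0.1 (my choices); it reproduces the three correlation cases in the last bullet:

    import numpy as np

    def hebbian_update(w, a_pre, a_post, eta=0.1, cap=1.0):
        """One application of Δw = η · a_pre · a_post, clipped so the
        weight saturates at ±cap instead of growing without bound."""
        return float(np.clip(w + eta * a_pre * a_post, -cap, cap))

    rng = np.random.default_rng(0)
    w_corr = w_anti = w_rand = 0.0
    for _ in range(200):
        a = rng.choice([-1, 1])
        w_corr = hebbian_update(w_corr, a, a)                    # always co-occur -> saturates at +1
        w_anti = hebbian_update(w_anti, a, -a)                   # never co-occur  -> saturates at -1
        w_rand = hebbian_update(w_rand, a, rng.choice([-1, 1]))  # random pairing  -> hovers near 0

    print(w_corr, w_anti, w_rand)  # ~1.0, ~-1.0, near 0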
11. Perceptron (Rosenblatt, 1958, 1962)
- Rosenblatt explored the properties of networks of McCulloch-Pitts neurons (linear-threshold units) with connections that could be modified by learning
12. Perceptron
- Most commonly discussed architecture
- Only the connections between the feature units and the output unit were modifiable (the w_i's). The input feature unit values (x_i) were set by hand.
- [Figure: feature units x_i feeding a single output unit through modifiable weights w_0 … w_n]
13. Multiple Perceptrons
14. How were the connections learned?
- Start with random connections
- Present an input pattern
- Propagate activation through the network to the output
- If the output is correct, then don't change anything
- If incorrect, then change weights only on connections between active feature units and the output units
15. Change weights how? How much?
- Rule:
- If the output unit is on when it should be off, then decrease the weights from those active feature units by some constant amount
- If the output unit is off when it should be on, then increase the weights from those active feature units by some constant amount
- The perceptron was a very powerful method for learning various relationships (see the sketch below)
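A minimal sketch of the procedure on slides 14-15, assuming 0/1 feature values, a constant step of 0.1, and logical OR as an illustrative (linearly separable) target; these specifics are my own, not from the lecture:

    import numpy as np

    def train_perceptron(X, targets, epochs=25, step=0.1):
        """Perceptron rule as on the slides: leave weights alone when the
        output is correct, otherwise raise/lower the weights on the active
        feature units (x_i = 1) by a constant amount."""
        X = np.hstack([np.ones((len(X), 1)), X])  # prepend always-on bias input for w_0
        w = np.zeros(X.shape[1])                  # (slides say random; zeros also work here)
        for _ in range(epochs):
            for x, t in zip(X, targets):
                out = 1 if w @ x > 0 else 0
                if out > t:       # on when it should be off: decrease active weights
                    w -= step * x
                elif out < t:     # off when it should be on: increase active weights
                    w += step * x
        return w

    # OR is linearly separable, so the rule converges:
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    w = train_perceptron(X, targets=np.array([0, 1, 1, 1]))
    preds = [(1 if w @ np.r_[1, x] > 0 else 0) for x in X]
    print(preds)  # [0, 1, 1, 1]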
16. Minsky & Papert (1969)
- Presented a formal analysis of the properties of perceptrons and revealed several fundamental limitations
- Limitations:
- Can't learn nonlinearly separable problems like XOR
- More…
17. Linearly separable
18. Nonlinearly separable
19. Minsky & Papert, cont.
- Limitations:
- So… can't learn nonlinearly separable problems like XOR
- Although including hidden layers allows one to hand-design a network that can represent XOR and related problems (see the sketch below), they showed that the perceptron learning rule can't learn the required weights
- They also showed that even those functions that can be learned by perceptron-rule learning may require huge amounts of learning time
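A sketch of the kind of hand-designed hidden-layer solution mentioned above; the particular units and thresholds are a standard textbook choice, not from the lecture. Each unit is a McCulloch-Pitts style linear-threshold neuron:

    def step(z):
        return 1 if z > 0 else 0

    def xor(x1, x2):
        h_or  = step(x1 + x2 - 0.5)       # hidden unit computing OR(x1, x2)
        h_and = step(x1 + x2 - 1.5)       # hidden unit computing AND(x1, x2)
        return step(h_or - h_and - 0.5)   # output: OR but not AND = XOR

    assert [xor(0, 0), xor(0, 1), xor(1, 0), xor(1, 1)] == [0, 1, 1, 0]

Note that the hidden-unit weights here are exactly what the perceptron rule cannot find, since it only adjusts the connections into the output unit.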
20. Fallout of Minsky & Papert's analysis
- This paper was nearly the death of this budding field
- Subsequent research was largely done "in garages", i.e., only in obscure academic circles
21. Connectionist (subsymbolic) vs. symbolic processing
- Newell (1980) articulated the role of the mathematical theory of symbolic processing
- Cognition involves the manipulation of symbols analogous to words, concepts, schemas, etc.
- What are symbols?
- The definition is hard to pin down
- Roughly, a symbol is like the value of a categorical variable (male, female, red, blue, dog, cat)
- Operators on those symbols would then be things like is-a, a-kind-of, purpose, shape, part-of, object
22. McClelland & Rumelhart's alternative: subsymbolic processing
- Cognition involves the spreading of activation, relaxation, and statistical correlation
- Represents a method for how symbolic systems might be implemented
- Hypothesized that apparently symbolic processing is an emergent property of subsymbolic operations
- The subsymbolic elements of computation are numbers
- Philosophers of mind continue to debate the distinction between symbolic and subsymbolic, and which is fundamentally correct
23. Should we toss out symbolic approaches?
- No: they do offer a different level of analysis and can be very helpful, especially when your interest is in high-level cognition
- Example: do you want to build a connectionist model of chess playing? Very complex.
- But how would you build a symbolic model of vision?
24. Terms you may encounter
- Distributed vs. local representations
- Symbolic: typically local
- Connectionist: typically distributed
- Parallel vs. serial processing
- Symbolic: typically serial
- Connectionist: typically parallel
25. Why use connectionist models?
- Strong generalization
- Fault tolerance
- Can be used to model learning
- More naturally capture nonlinear relationships
- Fuzzy information retrieval
- The gap between neural processing and connectionist models is smaller (but still large)
26. Next week
- Refresher on linear techniques for associating input(s) with an output: x → y
- Simple regression (single predictor)
- Multiple regression (multiple predictors)