Secrets of Neural Network Models - PowerPoint PPT Presentation

1
Secrets of Neural Network Models
Note: These slides have been provided online for
the convenience of students attending the 2003
Merck summer school, and for individuals who have
explicitly been given permission by Ken Norman.
Please do not distribute these slides to third
parties without permission from Ken (which is
easy to get: just email Ken at
knorman_at_princeton.edu).
  • Ken Norman
  • Princeton University
  • July 24, 2003

2
  • The Plan, and Acknowledgements
  • The Plan
  • I will teach you all of the secrets of neural
    network models in 2.5 hours
  • Lecture for the first half
  • Hands-on workshop for the second half
  • Acknowledgements
  • Randy O'Reilly
  • my lab: Greg Detre, Ehren Newman, Adler
    Perotte, and Sean Polyn

3
  • The Big Question
  • How does the gray glop in your head give rise to
    cognition?
  • We know a lot about the brain, and we also know a
    lot about cognition
  • The real challenge is to bridge between these two
    levels

4
  • Complexity and Levels of Analysis
  • The brain is very complex: billions of neurons,
    trillions of synapses, all changing every
    nanosecond
  • Each neuron is a very complex entity unto itself
  • We need to abstract away from this complexity!
  • Is there some simpler, higher level for
    describing what the brain does during cognition?

5
  • We want to draw on neurobiology for ideas about
    how the brain performs a particular kind of task
  • Our models should be consistent with what we know
    about how the brain performs the task
  • But at the same time, we want to include only
    aspects of neurobiology that are essential for
    explaining task performance

6
  • Learning and Development
  • Neural network models provide an explicit,
    mechanistic account of how the brain changes as a
    function of experience
  • Goals of learning
  • To acquire an internal representation (a model)
    of the world that allows you to predict what will
    happen next, and to make inferences about
    unseen aspects of the environment
  • The system must be robust to
    noise/degradation/damage
  • Focus of workshop Use neural networks to
    explore how the brain meets these goals

7
  • Outline of Lecture
  • What is a neural network?
  • Principles of learning in neural networks
  • Hebbian learning: Simple learning rules that are
    very good at extracting the statistical structure
    of the environment (i.e., what things are there
    in the world, and how are they related to one
    another)
  • Shortcomings of Hebbian learning: It's good at
    acquiring coarse category structure (prototypes)
    but it's less good at learning about atypical
    stimuli and arbitrary associations
  • Error-driven learning: Very powerful rules that
    allow networks to learn from their mistakes

8
  • Outline, Continued
  • The problem of interference in neocortical
    networks, and how the hippocampus can help
    alleviate this problem
  • Brief discussion of PFC and how networks can
    support active maintenance in the face of
    distracting information
  • Background information for the hands-on portion
    of the workshop

9
  • Overall Philosophy
  • The goal is to give you a good set of intuitions
    for how neural networks function
  • I will simplify and gloss over lots of things.
  • Please ask questions if you don't understand
    what I'm saying...

10
What is a neural network?
  • Neurons measure how much input they receive from
    other neurons; they fire (send a signal) if
    input exceeds a threshold value
  • Input is a function of firing rate and connection
    strength
  • Learning in neural networks involves adjusting
    connection strength
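In code, the unit just described can be sketched in a few lines (a minimal Python illustration, not code from the workshop; the function name and threshold value are my own):

```python
# Minimal sketch of a neural network unit: sum each input's firing
# rate times its connection strength, and "fire" if the total exceeds
# a threshold.

def unit_activity(inputs, weights, threshold=1.0):
    """inputs: firing rates of sending units; weights: connection strengths."""
    net_input = sum(x * w for x, w in zip(inputs, weights))
    return 1.0 if net_input > threshold else 0.0

# Two active inputs push the unit over threshold; one alone does not.
print(unit_activity([1.0, 1.0, 0.0], [0.6, 0.6, 0.6]))  # fires: 1.0
print(unit_activity([1.0, 0.0, 0.0], [0.6, 0.6, 0.6]))  # silent: 0.0
```

Learning, in these terms, is just adjusting the entries of `weights`.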

11
What is a neural network?
  • Key simplifications
  • We reduce all of the complexity of neuronal
    firing to a single number, the activity of the
    neuron, that reflects how often the neuron is
    spiking
  • We reduce all of the complexity of synaptic
    connections between neurons to a single number,
    the synaptic weight, that reflects how strong the
    connection is

12
  • Neurons are Detectors
  • Each neuron is detecting some set of conditions
    (e.g., smoke detector). Representation is what
    is detected.

13
Understanding Neural Components in Terms of the
Detector Model
14
  • Detector Model
  • Neurons feed on each other's outputs, forming
    layers of ever more complicated detectors
  • Things can get very complex in terms of content,
    but each neuron is still carrying out the basic
    detector function

15
Two-layer Attractor Networks
Hidden Layer (Internal Representation)
Input/Output Layer
  • Model of processing in neocortex
  • Circles = units (neurons); lines = connections
    (synapses)
  • Unit brightness = activity; line thickness =
    synaptic weight
  • Connections are symmetric

16
Two-layer Attractor Networks
I
Hidden Layer (Internal Representation)
Input/Output Layer
  • Units within a layer compete to become active.
  • Competition is enforced by inhibitory
    interneurons that sample the amount of activity
    in the layer and send back a proportional amount
    of inhibition
  • Inhibitory interneurons prevent epilepsy in the
    network
  • Inhibitory interneurons are not pictured in
    subsequent diagrams

17
Two-layer Attractor Networks
I
Hidden Layer (Internal Representation)
Input/Output Layer
  • These networks are capable of sustaining a stable
    pattern of activity on their own.
  • Attractor: a fancy word for a stable pattern of
    activity
  • Real networks are much larger than this; also, >
    1 unit is active in the hidden layer...

18
  • Properties of Two-Layer Attractor Networks
  • I will show that these networks are capable of
    meeting the learning goals outlined
  • Given partial information (e.g., seeing something
    that has wings and feathers), the networks can
    make a guess about other properties of that
    thing (e.g., it probably flies)
  • Networks show graceful degradation
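A toy sketch of both properties, under my own simplifying assumptions (binary units, hand-set symmetric weights binding the four bird features into one attractor):

```python
# Toy sketch of pattern completion in a small symmetric network.
# Feature order: wings, beak, feathers, flies. Every pair of bird
# features supports every other via symmetric weights.

FEATURES = ["wings", "beak", "feathers", "flies"]
W = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

def settle(state, steps=5, threshold=0.5):
    """Repeatedly recompute each unit from its weighted input
    until the pattern of activity is stable (the attractor)."""
    for _ in range(steps):
        state = [1 if sum(W[i][j] * state[j] for j in range(4)) > threshold
                 else 0
                 for i in range(4)]
    return state

# Partial input: only wings and feathers observed...
print(settle([1, 0, 1, 0]))  # ...the network fills in beak and flies
```

The same settling process gives graceful degradation: knock out one unit's input and the surviving features still pull the network back to the full pattern.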

19
Pattern Completion in two layer networks
wings
beak
feathers
flies
23
Networks are Robust to Damage, Noise
wings
beak
feathers
flies
24
Networks are Robust to Damage, Noise
wings
feathers
flies
28
  • Learning Overview
  • Learning = changing connection weights
  • Learning rules How to adjust weights based on
    local information (presynaptic and postsynaptic
    activity) to produce appropriate network behavior
  • Hebbian learning: building a statistical model of
    the world, without an explicit teacher...
  • Error-driven learning: rules that detect
    undesirable states and change weights to
    eliminate these undesirable states...

29
  • Building a Statistical Model of the World
  • The world is inhabited by things with relatively
    stable sets of features
  • We want to wire detectors in our brains to detect
    these things. How can we do this?
  • Answer: Leverage correlation
  • The features of a particular thing tend to appear
    together, and to disappear together; a thing is
    nothing more than a correlated cluster of
    features
  • Learning mechanisms that are sensitive to
    correlation will end up representing useful things

30
  • Hebbian Learning
  • How does the brain learn about correlations?
  • Donald Hebb proposed the following mechanism:
  • When the pre-synaptic neuron and post-synaptic
    neuron are active at the same time, strengthen
    the connection between them
  • neurons that fire together, wire together

31
Hebbian Learning
34
  • Hebbian Learning
  • Proposed by Donald Hebb
  • When the pre-synaptic (sending) neuron and
    post-synaptic (receiving) neuron are active at
    the same time, strengthen the connection between
    them
  • neurons that fire together, wire together
  • When two neurons are connected, and one is active
    but the other is not, reduce the connections
    between them
  • neurons that fire apart, unwire
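Both halves of the rule fit in a few lines (an illustrative sketch; the function name and learning-rate value are my own):

```python
# Sketch of Hebb's rule as stated above: strengthen the connection
# when pre- and post-synaptic neurons are active together, weaken it
# when one is active and the other is not.

def hebb_update(w, pre, post, lr=0.1):
    if pre > 0 and post > 0:        # fire together -> wire together
        return w + lr
    if (pre > 0) != (post > 0):     # fire apart -> unwire
        return w - lr
    return w                        # both silent: no change

w = 0.5
w = hebb_update(w, pre=1, post=1)   # strengthened toward 0.6
w = hebb_update(w, pre=1, post=0)   # weakened back toward 0.5
```

Note that the update uses only local information: the activity on each side of the synapse.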

37
Biology of Hebbian Learning: NMDA-Mediated
Long-Term Potentiation
38
  • Biology of Hebbian Learning
  • Long-Term Depression
  • When the postsynaptic neuron is depolarized, but
    presynaptic activity is relatively weak, you get
    weakening of the synapse

39
  • What Does Hebbian Learning Do?
  • Hebbian learning tunes units to represent
    correlated sets of input features.
  • Here is why:
  • Say that a unit has 1,000 inputs
  • In this case, turning on and off a single input
    feature won't have a big effect on the unit's
    activity
  • In contrast, turning on and off a large cluster
    of 900 input features will have a big effect on
    the unit's activity

42
Hebbian Learning
  • Because small clusters of inputs do not reliably
    activate the receiving unit, the receiving unit
    does not learn much about these inputs

46
Hebbian Learning
Big clusters of inputs reliably activate the
receiving unit, so the network learns more about
big (vs. small) clusters (the gang effect).
48
  • What Does Hebbian Learning Do?
  • Hebbian learning finds the thing in the world
    that most reliably activates the unit, and tunes
    the unit to like that thing even more!

49
Hebbian Learning
scaly
slithers
wings
beak
feathers
flies
58
  • What Does Hebbian Learning Do?
  • Hebbian learning finds the thing in the world
    that most reliably activates the unit, and tunes
    the unit to like that thing even more!
  • The outcome of Hebbian learning is a function of
    how well different inputs activate the unit, and
    how frequently they are presented

59
  • Self-Organizing Learning
  • One detector can only represent one thing (i.e.,
    pattern of correlated features)
  • Goal: We want to present input patterns to the
    network and have different units in the network
    specialize for different things, such that each
    thing is represented by at least one unit
  • Random weights (different initial receptive
    fields) and competition are important for
    achieving this goal
  • What happens without competition ...
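Competition of this kind is often approximated in code by a k-winners-take-all step standing in for the inhibitory interneurons (my own simplification, not the workshop's implementation):

```python
# Sketch of competition among hidden units: instead of simulating
# inhibitory interneurons, simply let only the k most strongly
# excited units stay active ("k-winners-take-all").

def compete(net_inputs, k=1):
    """Return activities where only the k strongest units are active."""
    winners = set(sorted(range(len(net_inputs)),
                         key=lambda i: net_inputs[i], reverse=True)[:k])
    return [1.0 if i in winners else 0.0 for i in range(len(net_inputs))]

# The "bird" unit gets more net input than the "snake" unit, so it
# wins and (via inhibition) silences the other.
print(compete([2.4, 0.9]))  # [1.0, 0.0]
```

Combined with different random initial weights, this is what lets different units specialize for different things.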

60
No Competition
lives under water
scaly
slithers
wings
beak
feathers
flies
65
No Competition
wings
beak
feathers
flies
scaly
slithers
lives under water
Without competition, all units end up
representing the same gang of features; other,
smaller correlations get ignored
66
Competition is important
lives under water
scaly
slithers
wings
beak
feathers
flies
73
Competition is important
When units have different initial receptive
fields and they compete to represent input
patterns, units end up representing different
things
74
  • Hebbian Learning Summary
  • Hebbian learning finds the thing in the world
    that most reliably activates the unit, and tunes
    the unit to like that thing even more
  • When:
  • There are multiple hidden units competing to
    represent input patterns
  • Each hidden unit starts out with a distinct
    receptive field
  • Then:
  • Hebbian learning will tune these units so that
    each thing in the world (i.e., each cluster of
    correlated features) is represented by at least
    one unit

75
Problems with Penguins
slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
85
  • Problems with Hebb, and Possible Solutions
  • Self-organizing Hebbian learning is capable of
    discovering the high-level (coarse) categorical
    structure of the inputs
  • However, it sometimes collapses across more
    subtle (but important) distinctions, and the
    learning rule does not have any provisions for
    fixing these errors once they happen

86
  • Problems with Hebb, and Possible Solutions
  • In the penguin problem, if we want the network to
    remember that typical birds fly, but penguins
    don't, then penguins and typical birds need to
    have distinct (non-identical) hidden
    representations
  • Hebbian learning assigns the same hidden unit to
    penguins and typical birds
  • We need to supplement Hebbian learning with
    another learning rule that is sensitive to when
    the network makes an error (e.g., saying that
    penguins fly) and corrects the error by pulling
    apart the hidden representations of penguins vs.
    typical birds.

87
  • What is an error, exactly?
  • One common way of conceptualizing error is in
    terms of predictions and outcomes
  • If you give the network a partial version of a
    studied pattern, the network will make a
    prediction as to the missing features of that
    pattern (e.g., given something that has
    feathers, the network will guess that it
    probably flies)
  • Later, you learn what the missing features are
    (the outcome). If the network's guess about the
    missing features is wrong, we want the network to
    be able to change its weights based on the
    difference between the prediction and the
    outcome.
  • Today, I will present the GeneRec error-driven
    learning rule developed by Randy O'Reilly.

88
Error-Driven Learning
  • Prediction phase
  • Present a partial pattern
  • The network makes a guess about the missing
    features.

slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
94
Error-Driven Learning
  • Prediction phase
  • Present a partial pattern
  • The network makes a guess about the missing
    features.
  • Outcome phase
  • Present the full pattern
  • Let the network settle

slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
98
Error-Driven Learning
  • We now need to compare these two activity
    patterns and figure out which weights to change.

slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
slithers
lives in Antarctica
wings
beak
feathers
flies
waddles
99
  • Motivating the Learning Rule
  • The goal of error-driven learning is to discover
    an internal representation for the item that
    activates the correct answer.
  • Basically, we want to find hidden units that are
    associated with the correct answer (in this case,
    waddles).
  • The best way to do this is to examine how
    activity changes when "waddles" is clamped on
    during the outcome phase.
  • Hidden units that are associated with "waddles"
    should show an increase in activity in the
    outcome (vs. prediction) phase.
  • Hidden units that are not associated with
    "waddles" should show a decrease in activity in
    the outcome phase (because of increased
    competition from other units that are associated
    with "waddles").

100
  • Motivating the Learning Rule
  • Hidden units that are associated with "waddles"
    should show an increase in activity in the
    outcome (vs. prediction) phase.
  • Hidden units that are not associated with
    "waddles" should show a decrease in activity in
    the outcome phase
  • Here is the learning rule:
  • If a hidden unit shows increased activity (i.e.,
    it's associated with the correct answer),
    increase its weights to the input pattern
  • If a hidden unit shows decreased activity (i.e.,
    it's not associated with the correct answer),
    reduce its weights to the input pattern
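This rule can be sketched as a contrastive, GeneRec-style weight update (an illustration with my own variable names, not Randy O'Reilly's actual implementation):

```python
# Sketch of the rule just described: the weight change is driven by
# the difference between a hidden unit's outcome-phase ("plus") and
# prediction-phase ("minus") activity, scaled by the input activity.

def generec_update(w, x, h_minus, h_plus, lr=0.1):
    """w: weight from input unit (activity x) to a hidden unit.
    h_minus / h_plus: hidden activity in prediction / outcome phase."""
    return w + lr * (h_plus - h_minus) * x

# A hidden unit tied to the correct answer is MORE active in the
# outcome phase, so its weight to the active input grows...
w_up = generec_update(0.5, x=1.0, h_minus=0.2, h_plus=0.8)
# ...while a unit that loses the competition is LESS active in the
# outcome phase, so its weight shrinks.
w_down = generec_update(0.5, x=1.0, h_minus=0.8, h_plus=0.2)
```

Like Hebbian learning, the update is local: it only needs the two activity values at each end of the connection.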

101
Error-Driven Learning
slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
slithers
lives in Antarctica
wings
beak
feathers
flies
waddles
105
Error-Driven Learning
  • Hebb and error have opposite effects on weights
    here!
  • Error increases the extent to which penguin is
    linked to the right-hand unit, whereas Hebb
    reinforced penguin's tendency to activate the
    left-hand unit

slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
slithers
lives in Antarctica
wings
beak
feathers
flies
waddles
106
Error-Driven Learning
slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
107
Error-Driven Learning
slithers
lives in Antarctica
wings
beak
feathers
flies
waddles
117
  • Catastrophic Interference
  • If you change the weights too strongly in
    response to penguin, then the network starts to
    behave as if all birds waddle. New learning
    interferes with stored knowledge...
  • The best way to avoid this problem is to make
    small weight changes, and to interleave penguin
    learning trials with typical bird trials
  • The typical bird trials serve to remind the
    network to retain the association between
    wings/feathers/beak and flies...
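A toy illustration of why interleaving plus a small learning rate works (my own sketch; a single scalar weight stands in for the whole network):

```python
import random

# Toy sketch of interleaved training: one weight encodes the
# "bird features -> flies" association. Typical-bird trials pull it
# toward 1, rare penguin trials pull it toward 0. Small steps plus
# interleaving let the penguin be learned without the network coming
# to behave as if all birds waddle (i.e., without overwriting the
# general rule).

random.seed(0)
trials = ["typical"] * 20 + ["penguin"] * 2
random.shuffle(trials)                    # interleave the penguin trials

w_flies, lr = 1.0, 0.1
for trial in trials:
    target = 1.0 if trial == "typical" else 0.0
    w_flies += lr * (target - w_flies)    # small weight change per trial

print(w_flies)  # stays close to 1.0: the "flies" rule is retained
```

Raise `lr` toward 1.0 and a single penguin trial drags the weight most of the way to 0: that is catastrophic interference.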

118
Interleaved Training
slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
119
Interleaved Training
slithers
lives in Antarctica
wings
beak
feathers
flies
waddles
129
  • Gradual vs. One-Trial Learning
  • Problem: It appears that the solution to the
    catastrophic interference problem is to learn
    slowly.
  • But we also need to be able to learn quickly!

130
  • Gradual vs. One-Trial Learning
  • Put another way: There appears to be a trade-off
    between learning rate and interference in the
    cortical network
  • Our claim is that the brain avoids this trade-off
    by having two separate networks
  • A slow-learning cortical network that gradually
    develops internal representations that support
    generalization, prediction, categorization, etc.
  • A fast-learning hippocampal network that is
    specialized for rapid memorization (but does not
    support generalization, categorization, etc.)

131
[Diagram: the hippocampus (Dentate Gyrus, CA3,
CA1) sits atop the Entorhinal Cortex (input and
output layers), which connects to lower-level
neocortex]
132
  • Interactions Between Hippo and Cortex
  • According to the Complementary Learning Systems
    theory (McClelland et al., 1995), hippocampus
    rapidly memorizes patterns of cortical activity.
  • The hippocampus manages to learn rapidly without
    suffering catastrophic interference because it
    has a built-in tendency to assign distinct,
    minimally overlapping representations to input
    patterns, even when they are very similar. Of
    course this hurts its ability to categorize.

133
  • Interactions Between Hippo and Cortex
  • The theory states that, when you are asleep, the
    hippocampus plays back stored patterns in an
    interleaved fashion, thereby allowing cortex to
    weave new facts and experiences into existing
    knowledge structures.
  • Even if something just happens once in the real
    world, hippocampus can keep re-playing it to
    cortex, interleaved with other events, until it
    sinks in...
  • Detailed theory:
  • slow-wave sleep: hippocampal playback to cortex
  • REM sleep: cortex randomly activates stored
    representations; this strengthens pre-existing
    knowledge and protects it against interference

134
Role of the Hippocampus
hippocampus
slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
135
Role of the Hippocampus
hippocampus
slithers
lives in Antarctica
wings
beak
feathers
waddles
flies
142
  • Error-Driven Learning Summary
  • Error-driven learning algorithms are very
    powerful: So long as the learning rate is small,
    and training patterns are presented in an
    interleaved fashion, algorithms like GeneRec can
    learn internal representations that support good
    pattern completion of missing features.
  • Error-driven learning is not meant to be a
    replacement for Hebbian learning: The two
    algorithms can co-exist!
  • Hebbian learning actually improves the
    performance of GeneRec by ensuring that hidden
    units represent meaningful clusters of features

143
  • Error-Driven Learning Summary
  • Theoretical issues to resolve with error-driven
    learning: The algorithm requires that the network
    know whether it is in a "prediction" phase or an
    "outcome" phase; how does the network know this?
  • For that matter, the whole "phases" idea is
    sketchy
  • GeneRec based on prediction/outcome differences
    is not the only way to do error-driven
    learning...
  • Backpropagation
  • Learning by reconstruction
  • Adaptive Resonance Theory (Grossberg & Carpenter)

144
  • Learning by Reconstruction
  • Instead of doing error-driven learning by
    comparing predictions and outcomes, you can also
    do error-driven learning as follows:
  • First, you clamp the correct, full pattern onto
    the network and let it settle.
  • Then, you erase the input pattern and see whether
    the network can reconstruct the input pattern
    based on its internal representation
  • The algorithm is basically the same: you are
    still comparing two phases...

145
Learning by Reconstruction
  • Clamp the to-be-learned pattern onto the input
    and let the network settle

slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
146
Learning by Reconstruction
  • Clamp the to-be-learned pattern onto the input
    and let the network settle
  • Next, wipe the input layer clean (but not the
    hidden layer) and let the network settle

slithers
lives in Antarctica
wings
beak
feathers
flies
waddles
slithers
lives in Antarctica
wings
beak
feathers
flies
waddles
150
Learning by Reconstruction
  • Compare hidden activity in the two phases and
    adjust weights accordingly (i.e., if activation
    was higher with the correct answer clamped,
    increase weights; if activation was lower,
    decrease weights)

slithers
lives in Antarctica
wings
beak
feathers
flies
waddles
slithers
lives in Antarctica
wings
beak
feathers
flies
waddles
152
Adaptive Resonance Theory
slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
157
Adaptive Resonance Theory
MISMATCH!
slithers
lives in Antarctica
waddles
wings
beak
feathers
flies
161
Spreading Activation vs. Active Maintenance
  • Spreading activation is generally very useful...
    it lets us make predictions/inferences/etc.
  • But sometimes you just want to hold on to a
    pattern of activation without letting activation
    spread (e.g., a phone number, or a person's
    name).
  • How do we maintain specific patterns of activity
    in the face of distraction?

162
Spreading Activation vs. Active Maintenance
  • As you will see in the hands-on part of the
    workshop, the networks we have been discussing
    are not very robust to noise/distraction.
  • Thus, there appears to be another tradeoff:
  • Networks that are good at
    generalization/prediction are lousy at holding
    on to phone numbers/plans/ideas in the face of
    distraction

163
Spreading Activation vs. Active Maintenance
  • Solution: we have evolved a network that is
    optimized for active maintenance, the prefrontal
    cortex! This complements the rest of cortex,
    which is good at generalization but not so good
    at active maintenance.
  • PFC uses isolated representations to prevent
    spread of activity...
  • Evidence for isolated stripes in PFC
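A toy numpy sketch of why isolation helps: with dense lateral weights, activity spreads off the maintained pattern onto other units, while with zero lateral weights (isolated "stripes") the pattern stays put. This is my own illustration, not the workshop model; the weight values are arbitrary.

```python
import numpy as np

def spread_step(act, W_lateral):
    """One step of activation spreading through lateral connections,
    with activations clipped to [0, 1]."""
    return np.clip(act + W_lateral @ act, 0.0, 1.0)

n = 4
W_dense = np.full((n, n), 0.5) - 0.5 * np.eye(n)  # interconnected cortex
W_isolated = np.zeros((n, n))                      # isolated PFC-like stripes

pattern = np.array([1.0, 0.0, 0.0, 0.0])  # the pattern to be maintained
act_dense, act_isolated = pattern.copy(), pattern.copy()
for _ in range(3):
    act_dense = spread_step(act_dense, W_dense)           # activity spreads
    act_isolated = spread_step(act_isolated, W_isolated)  # pattern holds
```

After a few steps the densely connected network has lit up every unit, while the isolated one still holds exactly the original pattern.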

164
Tripartite Functional Organization
  • PC = posterior perceptual and motor cortex
  • FC = prefrontal cortex
  • HC = hippocampus and related structures

165
Tripartite Functional Organization
  • PC: incremental learning about the structure of
    the environment
  • FC: active maintenance, cognitive control
  • HC: rapid memorization
  • Roles are defined by functional tradeoffs

166
Key Trade-offs
  • Extracting what is generally true (across events)
    vs. memorizing specific events
  • Inference (spreading activation) vs. robust
    active maintenance

167
Hands-On Exercises
  • The goal of the hands-on part of the workshop is
    to get a feel for the kinds of representations
    that are acquired by Hebbian vs. error-driven
    learning, and for network dynamics more generally.

168
  • Here is the network that we will be using
  • Activity constraints: only 10% of the hidden
    units can be strongly active at once; in the
    input layer, only one unit per row can be active
  • Think of each row in the input as a feature
    dimension (e.g., shape), where the units in that
    row are mutually exclusive features along that
    dimension (square, circle, etc.)
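As a sketch, input patterns of this form (one active unit per row) can be built as below. The row and unit counts are placeholders of my choosing, since the slide does not give the exact layer dimensions.

```python
import numpy as np

def make_input(feature_choices, units_per_row):
    """Build an input pattern with exactly one active unit per row.
    feature_choices[r] picks the winning unit in row r (e.g., which
    shape, if row r is the 'shape' dimension)."""
    pattern = np.zeros((len(feature_choices), units_per_row))
    for row, unit in enumerate(feature_choices):
        pattern[row, unit] = 1.0
    return pattern

# e.g., a 3-row pattern with 4 mutually exclusive features per row
p = make_input([0, 2, 3], units_per_row=4)
```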

169
  • This diagram illustrates the connectivity of the
    network
  • Each hidden unit is connected to 50% of the input
    units; there are also recurrent connections from
    each hidden unit to all of the other hidden units
  • Weights are symmetric
  • Initial weight values were set randomly
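A sketch of how such connectivity might be initialized (my own illustration; the slides say only that initial weights were random, so the distributions here are assumptions):

```python
import numpy as np

def init_network(n_hidden, n_input, p_connect=0.5, seed=0):
    """Random initial weights: each hidden unit is connected to roughly
    half of the input units, and the hidden units are fully recurrently
    connected with symmetric weights and no self-connections."""
    rng = np.random.default_rng(seed)
    mask = rng.random((n_hidden, n_input)) < p_connect
    W_in = np.where(mask, rng.uniform(0.0, 0.5, (n_hidden, n_input)), 0.0)
    R = rng.uniform(0.0, 0.5, (n_hidden, n_hidden))
    W_rec = (R + R.T) / 2          # enforce symmetric weights
    np.fill_diagonal(W_rec, 0.0)   # no self-connections
    return W_in, W_rec
```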

170
  • I trained up the network on the following 8
    patterns

Typical Bird Number 1
Typical Bird Number 2
Typical Fish Number 2
Typical Fish Number 1
Typical Bird Number 3
Atypical Bird (duck)
Atypical Fish (flying fish)
Typical Fish Number 3
  • In each pattern, the bottom 16 rows encode
    prototypical features that tend to be shared
    across patterns within a category; the top 8 rows
    encode item-specific features that are unique to
    each pattern
  • Each category has 3 typical items and one
    atypical item
  • During training, the network studied typical
    patterns 90% of the time and atypical
    patterns 10% of the time
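The 90/10 study schedule above can be sketched as a simple sampler (illustrative only; the item names are made up):

```python
import random

def sample_study_item(typical, atypical, p_typical=0.9, rng=random):
    """Pick one training pattern per study trial: a typical item 90%
    of the time, an atypical item 10% of the time (interleaved)."""
    pool = typical if rng.random() < p_typical else atypical
    return rng.choice(pool)
```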

171
  • To save time, the networks you will be using have
    been pre-trained on the 8 patterns (by presenting
    them repeatedly, in an interleaved fashion)
  • For some of the simulations, you will be using a
    network that was trained with (purely) Hebbian
    learning

172
  • For other simulations, you will be using a
    network that was trained with a combination of
    error-driven (GeneRec) and Hebbian learning.
    Training of this network used a three-phase
    design:
  • First, there was a prediction (minus) phase
    where a partial pattern was presented
  • Second, there was an outcome (plus) phase where
    the full version of the pattern was presented
  • Finally, there was a nothing phase where the
    input pattern was erased (but not the hidden
    pattern)
  • Error-driven learning occurred based on the
    difference in activity between the minus and plus
    patterns, and based on the difference in
    activity between the plus and nothing patterns
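A rough sketch of that three-phase trial structure follows. The ToyNet class, its settle method, and the update signs are my own simplifications for illustration, not the GeneRec equations used in the actual simulator.

```python
import numpy as np

class ToyNet:
    """Minimal hypothetical stand-in for the workshop network."""
    def __init__(self, n_in, n_hid, seed=0):
        rng = np.random.default_rng(seed)
        self.n_in = n_in
        self.W = rng.normal(0.0, 0.1, (n_hid, n_in))

    def settle(self, inp):
        """Return (input, hidden) activity after 'settling'. An input
        of None models the nothing phase (input erased)."""
        x = np.zeros(self.n_in) if inp is None else np.asarray(inp, float)
        h = 1.0 / (1.0 + np.exp(-self.W @ x))
        return x, h

def train_trial(net, partial, full, lr=0.2):
    x_m, h_m = net.settle(partial)  # minus phase: partial pattern presented
    x_p, h_p = net.settle(full)     # plus phase: full pattern clamped
    x_n, h_n = net.settle(None)     # nothing phase: input erased
    # contrastive updates from the phase differences (signs illustrative)
    net.W += lr * (np.outer(h_p, x_p) - np.outer(h_m, x_m))
    net.W += lr * (np.outer(h_p, x_p) - np.outer(h_n, x_n))
```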

173
  • When you get to the computer room, the
    simulation should already be open on the
    computer (some of you may have to double up;
    I think there are slightly fewer computers than
    students), and there will be a handout on the
    desk explaining what to do
  • You can proceed at your own pace
  • I will be there to answer questions (about the
    lecture and about the computer exercises) and my
    two grad students Ehren Newman and Sean Polyn
    will also be there to answer questions.

174
Your Helpers
Ehren Sean
me