Title: The Hopfield Network
Slide 1: The Hopfield Network
- The nodes of a Hopfield network can be updated synchronously or asynchronously.
- Synchronous updating means that at time step (t+1) every neuron is updated based on the network state at time step t.
- In asynchronous updating, a random node k1 is picked and updated, then a random node k2 is picked and updated (already using the new value of k1), and so on.
- The synchronous mode can be problematic because it may never lead to a stable network state.
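The two update modes can be sketched as follows. This is a minimal NumPy illustration, assuming bipolar (+1/−1) states, zero thresholds, and sgn(0) = +1; the function names are ours, not from the textbook.

```python
import numpy as np

def sgn(z):
    # Sign function with the convention sgn(0) = +1, for bipolar states.
    return np.where(z >= 0, 1.0, -1.0)

def synchronous_step(W, x):
    # All nodes are updated at once, each based on the state at time t.
    return sgn(W @ x)

def asynchronous_step(W, x, rng):
    # A single randomly picked node is updated; subsequent calls
    # already see this new value.
    k = rng.integers(len(x))
    x = x.copy()
    x[k] = sgn(W[k] @ x)
    return x
```

A full asynchronous pass would simply call `asynchronous_step` repeatedly until no node changes its state anymore.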
Slide 2: Asynchronous Hopfield Network
- Current network state O, attractors (stored patterns) X and Y.
[Figure: current state O lying between the two attractors X and Y]
Slide 3: Asynchronous Hopfield Network
- After the first update, this could happen:
[Figure: O has moved after the first asynchronous update]
Slide 4: Asynchronous Hopfield Network
[Figure: after further asynchronous updates, O coincides with one of the attractors]
Slide 5: Synchronous Hopfield Network
- What happens for synchronous updating?
[Figure: current state O lying between the two attractors X and Y]
Slide 6: Synchronous Hopfield Network
- Something like the situation shown below could occur. And then?
[Figure: O has jumped to a different position between X and Y]
Slide 7: Synchronous Hopfield Network
- The network may oscillate between these two states forever.
[Figure: O jumping back and forth between two positions, never reaching X or Y]
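This oscillation is easy to reproduce. The following sketch uses a hypothetical two-node network with a mutually inhibitory weight (our own toy example, not one of the slides' figures); under synchronous updating, both nodes flip on every step:

```python
import numpy as np

# Two nodes connected by a negative (inhibitory) weight.
W = np.array([[0., -1.],
              [-1., 0.]])

def synchronous_step(x):
    # Every node is updated from the same old state; sgn(0) = +1.
    return np.where(W @ x >= 0, 1., -1.)

x = np.array([1., 1.])
trajectory = [x.tolist()]
for _ in range(4):
    x = synchronous_step(x)
    trajectory.append(x.tolist())

# The state flips back and forth: (1,1) -> (-1,-1) -> (1,1) -> ...
print(trajectory)
```

Updating one node at a time instead settles immediately: from (+1, +1), flipping either single node yields a state such as (−1, +1), which neither node then wants to leave.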
Slide 8: The Hopfield Network
- The previous illustration shows that the synchronous updating rule may never lead to a stable network state.
- However, is the asynchronous updating rule guaranteed to reach such a state within a finite number of iterations?
- To find out, we have to characterize the effect of the network dynamics more precisely.
- In order to do so, we need to introduce an energy function.
Slide 9: The Energy Function
- Updating rule (as used in the textbook):
  x_k(t+1) = sgn( Σ_{j≠k} w_{kj} x_j(t) + i_k − θ_k ),
  where i_k is the external input to node k, θ_k is the node's threshold, and sgn(z) = +1 for z ≥ 0 and −1 otherwise.
- Often, θ_k = 0, which is what we will assume in the following.
Slide 10: The Energy Function
- Given the way we determine the weight matrix W (but also for iterative learning methods), we expect the weight from node j to node l to be proportional to
  w_{jl} ∝ Σ_{p=1}^{P} x_j^{(p)} x_l^{(p)}
  for P stored input patterns.
- In other words, if two units are often activated (+1) together in the given input patterns, we expect them to be connected by large, positive weights. If one of them is activated whenever the other one is not, we expect large, negative weights between them.
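The Hebbian construction of W sketched above can be written directly in NumPy. This assumes bipolar patterns stored as the rows of an array, zero self-connections, and a 1/P normalization; these conventions are our choices where the slide leaves the details open.

```python
import numpy as np

def hebbian_weights(patterns):
    # patterns: (P, n) array of bipolar (+1/-1) stored patterns.
    # w_jl is proportional to the sum over patterns of x_j * x_l.
    P, n = patterns.shape
    W = patterns.T @ patterns / P
    np.fill_diagonal(W, 0.0)   # no self-connections
    return W
```

Note that the resulting matrix is symmetric (w_jl = w_lj), a property the energy argument on the following slides relies on.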
Slide 11: The Energy Function
- Since the above formula applies to all weights in the network, we expect the following expression to be positive and large for each stored pattern (attractor pattern):
  Σ_j Σ_l w_{jl} x_j x_l
- We would still expect a large, positive value for input patterns that are very similar to any of the attractor patterns. The lower the similarity, the lower the value of this expression we expect to find.
Slide 12: The Energy Function
- This motivates the following approach to an energy function, which we want to decrease with greater similarity of the network's current activation pattern to any of the attractor patterns (similar to the error function in the BPN):
  E = −a Σ_j Σ_l w_{jl} x_j x_l,  with a > 0
- If the value of this expression is minimized (possibly by some form of gradient descent along activation patterns), the resulting activation pattern will be close to one of the attractors.
Slide 13: The Energy Function
- However, we do not want the activation pattern to reach an arbitrary one of the attractor patterns.
- Instead, we would like the final activation pattern to be the attractor that is most similar to the initial input to the network.
- We can achieve this by adding a term that penalizes deviation of the current activation pattern from the initial input i.
- The resulting energy function has the following form:
  E = −a Σ_j Σ_l w_{jl} x_j x_l − b Σ_j i_j x_j,  with a, b > 0
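As a sketch, this energy function can be computed in one line of NumPy. The parameter defaults a = 0.5, b = 1 anticipate the choice made in the derivation on the following slides; the function name and signature are ours.

```python
import numpy as np

def energy(W, x, i, a=0.5, b=1.0):
    # E = -a * sum_{j,l} w_jl x_j x_l  -  b * sum_j i_j x_j
    # First term: agreement of x with the correlations stored in W.
    # Second term: agreement of x with the initial input i.
    return -a * x @ W @ x - b * i @ x
```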
Slide 14: The Energy Function
- How does this network energy change with every application of the asynchronous updating rule?
- When updating node k, x_j(t+1) = x_j(t) for every node j ≠ k, so only the terms involving node k contribute to the change in energy:
  ΔE = E(t+1) − E(t)
     = −a (x_k(t+1) − x_k(t)) Σ_{j≠k} (w_{kj} + w_{jk}) x_j(t) − b i_k (x_k(t+1) − x_k(t))
Slide 15: The Energy Function
- Since w_{kj} = w_{jk}, if we set a = 0.5 and b = 1 we get
  ΔE = −(x_k(t+1) − x_k(t)) ( Σ_{j≠k} w_{kj} x_j(t) + i_k )
- This means that in order to reduce energy, the k-th node should change its state if and only if
  x_k(t) ≠ sgn( Σ_{j≠k} w_{kj} x_j(t) + i_k )
- In other words, the state of a node should change whenever it differs from the sign of its net input.
Slide 16: The Energy Function
- And this is exactly what our asynchronous updating rule does!
- Consequently, every state-changing update reduces the network's energy.
- By definition, every possible network state (activation pattern) is associated with a specific energy.
- Since there is a finite number of states that the network can assume (2^n for an n-node network), and every state-changing update leads to a state of lower energy, there can only be a finite number of such updates.
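This argument can be checked numerically. The following end-to-end sketch (our own toy example: Hebbian weights, a = 0.5, b = 1, random asynchronous updates with the convention sgn(0) = +1) records the energy after each update and confirms that it never increases:

```python
import numpy as np

rng = np.random.default_rng(42)

# Store two bipolar patterns with the Hebbian rule (zero diagonal).
patterns = np.array([[1., 1., -1., -1.],
                     [1., -1., 1., -1.]])
W = patterns.T @ patterns / len(patterns)
np.fill_diagonal(W, 0.0)

a, b = 0.5, 1.0

def energy(x, i):
    # E = -a * sum_{j,l} w_jl x_j x_l - b * sum_j i_j x_j
    return -a * x @ W @ x - b * i @ x

i = np.array([-1., 1., -1., -1.])   # noisy copy of the first pattern
x = i.copy()
energies = [energy(x, i)]
for _ in range(50):
    k = rng.integers(len(x))                      # pick a random node k
    x[k] = 1. if W[k] @ x + i[k] >= 0 else -1.    # asynchronous update
    energies.append(energy(x, i))

# Energy never increases, so the dynamics must reach a stable state.
assert all(e0 >= e1 for e0, e1 in zip(energies, energies[1:]))
```

Each run ends in a fixed point whose energy is a (possibly only local) minimum of E, which is precisely the situation described on the next slide.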
Slide 17: The Energy Function
- Therefore, we have shown that the network reaches a stable state after a finite number of iterations.
- This state is likely to be one of the network's stored patterns.
- It is possible, however, that we get stuck in a local energy minimum and never reach the absolute minimum (just like in BPNs).
- In that case, the final pattern will usually be very similar to one of the attractors, but not identical.