Title: Outline
1Outline
- Announcement
- Midterm review
- Widrow-Hoff learning - continued
2Announcement
- The first midterm exam will be on this coming Monday, Oct. 4, 2004
- Please come to class a bit earlier so that we can start on time
- I will be here about 10 minutes before class
- When most of you are present, we will start so that you can have some extra time if you need it
- You need to bring a calculator
3A Single-Input Neuron
Note that w and b are adjustable parameters of the neuron; w is called the weight and b is called the bias
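As a minimal illustration of this model (the weight, bias, input value, and choice of hard limit transfer function below are assumptions for demonstration, not values from the lecture), the single-input neuron computes a = f(wp + b):

def hardlim(n):
    # Hard limit transfer function: 1 if n >= 0, else 0
    return 1.0 if n >= 0 else 0.0

w, b = 3.0, -1.5       # adjustable parameters: weight and bias (hypothetical values)
p = 1.0                # scalar input (hypothetical)
n = w * p + b          # net input
a = hardlim(n)         # neuron output a = f(wp + b)
print(n, a)            # 1.5 1.0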
4McCulloch and Pitts Model
5McCulloch and Pitts Model cont.
6A Single-Input Neuron cont.
- Transfer functions
- They can be linear or nonlinear
- Commonly used transfer functions
- Hard limit transfer function
- Linear transfer function
- Sigmoid function
7A Single-Input Neuron cont.
- Transfer functions
- a = f(n)
8A Single-Input Neuron cont.
- Hard limit transfer function
- Also known as the step function
- Symmetrical hard limit transfer function
9A Single-Input Neuron cont.
10A Single-Input Neuron cont.
- Log-sigmoid transfer function
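A minimal sketch of these transfer functions (the hardlim/hardlims/purelin/logsig names follow common usage and are assumptions here, not necessarily the lecture's notation):

import numpy as np

def hardlim(n):
    # Hard limit (step) function: 0 for n < 0, 1 for n >= 0
    return np.where(n >= 0, 1.0, 0.0)

def hardlims(n):
    # Symmetrical hard limit: -1 for n < 0, +1 for n >= 0
    return np.where(n >= 0, 1.0, -1.0)

def purelin(n):
    # Linear transfer function: a = n
    return n

def logsig(n):
    # Log-sigmoid: squashes the net input into (0, 1)
    return 1.0 / (1.0 + np.exp(-n))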
11Multiple-Input Neuron
12Multiple-Input Neuron cont.
13Neural Network Architecture
- A layer of neurons
- Consists of a set of neurons
- The inputs are connected to each of the neurons
14A Layer of Neurons
15A Layer of Neurons
16Multiple Layers of Neurons
17Multiple Layers of Neurons
18Multiple Layers of Neurons cont.
- Multiple-layer neural networks
- A network with several layers
- Multi-layer networks with nonlinear transfer functions are more powerful than single-layer networks (see the sketch below)
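A minimal sketch of a forward pass through such a network (the 2-3-1 layer sizes, random weights, and logsig/linear transfer functions below are assumptions for illustration):

import numpy as np

def logsig(n):
    return 1.0 / (1.0 + np.exp(-n))

# Hypothetical 2-3-1 network: 2 inputs, 3 hidden neurons, 1 output neuron
W1, b1 = np.random.randn(3, 2), np.random.randn(3, 1)
W2, b2 = np.random.randn(1, 3), np.random.randn(1, 1)

p = np.array([[0.5], [-1.0]])   # input vector
a1 = logsig(W1 @ p + b1)        # first layer (nonlinear transfer function)
a2 = W2 @ a1 + b2               # second (output) layer, linear here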
19Neural Network Architecture cont.
20Neural Network Architecture cont.
- Recurrent neural networks
21Network Architecture cont.
- How to determine a network architecture
- Number of network inputs is given by the number of problem inputs
- Number of neurons in the output layer is given by the number of problem outputs
- The transfer function is partly determined by the output specifications
22Perceptron Architecture
23Single-Neuron Perceptron
- The output of a single-neuron perceptron is given by a = hardlim(w1,1 p1 + w1,2 p2 + b)
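A minimal sketch of that computation (the specific weight and bias values are hypothetical):

import numpy as np

def hardlim(n):
    return 1.0 if n >= 0 else 0.0

w = np.array([1.0, -1.0])   # [w1,1, w1,2] (hypothetical values)
b = 0.5
p = np.array([2.0, 1.0])    # input [p1, p2]
a = hardlim(w @ p + b)      # a = hardlim(w1,1 p1 + w1,2 p2 + b) -> 1.0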
24Decision Boundary
25Single-Neuron Perceptron cont.
- Design a perceptron graphically
- Select a decision boundary
- Choose a weight vector that is orthogonal to the decision boundary
- Find the bias b
- When there are multiple decision boundaries,
which one is the best?
26Learning Rules
- A learning rule is a procedure for modifying the weights and biases of a neural network
- It is often called a training algorithm
- The purpose of the learning rule is to train the
network to perform some task
27Learning Rules cont.
- Three categories of learning algorithms
- Supervised learning
- Training set: {p1, t1}, {p2, t2}, ..., {pQ, tQ}
- Here tq is called the target for input pq
- Reinforcement learning
- No correct output is provided for each network input, but a measure of the performance is provided
- It is not very common
- Unsupervised learning
- There are no target outputs available
28Perceptron Learning Rule cont.
29Multiple-Neuron Perceptrons
To update the ith row of the weight matrix
Matrix form
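A minimal sketch of the standard perceptron rule in matrix form (W_new = W_old + e p^T and b_new = b_old + e, with error e = t - a); the dimensions here are arbitrary:

import numpy as np

def hardlim(n):
    return np.where(n >= 0, 1.0, 0.0)

def perceptron_update(W, b, p, t):
    # One learning step for a multiple-neuron perceptron
    a = hardlim(W @ p + b)     # current network output
    e = t - a                  # error vector
    W = W + np.outer(e, p)     # update every row at once: W_new = W_old + e p^T
    b = b + e                  # b_new = b_old + e
    return W, b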
30Apple/Banana Example
Training Set
Initial Weights
First Iteration
31Second Iteration
32Check
33Perceptron Learning Rule cont.
- Perceptron convergence theorem
- Given that a solution exists, the perceptron learning rule converges to a solution in a finite number of steps
34Limitations of Perceptrons
- Linear separability
- Perceptron networks can only solve problems that are linearly separable, which means the decision boundary is linear
- Many problems are not linearly separable
- XOR is a well-known example
35Linearly Inseparable Problems
36Hebb's Postulate
- Hebb's postulate
- When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased
37Hebb Rule
- If two neurons on either side of a synapse are
activated simultaneously, the strength of the
synapse will increase
38Associative Memory
- Associative memory problem
- The task is to learn Q pairs of prototype input/output vectors: {p1, t1}, {p2, t2}, ..., {pQ, tQ}
- In other words, if the network receives an input p = pq, then it should produce the output a = tq
- In addition, for an input similar to a prototype, the network should produce an output similar to the corresponding output
39Linear Associator cont.
40Linear Associator cont.
- Hebb rule for the linear associator
41Batch Operation
(Zero Initial Weights)
Matrix Form
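With zero initial weights, the batch Hebb rule for the linear associator gives W = t1 p1^T + ... + tQ pQ^T = T P^T, where the columns of P and T are the prototype inputs and targets. A minimal sketch (the prototype vectors below are placeholders):

import numpy as np

# Columns of P are prototype inputs p_q; columns of T are targets t_q (hypothetical)
P = np.array([[1.0, 0.0],
              [0.0, 1.0]])
T = np.array([[ 1.0, -1.0],
              [-1.0,  1.0]])

W = T @ P.T              # batch Hebb rule, zero initial weights: W = T P^T
a = W @ P[:, [0]]        # recalling the first prototype returns its target t_1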
42Performance Analysis
Case I: the input patterns are orthonormal.
Therefore the network output equals the target.
Case II: the input patterns are normalized, but not orthogonal.
The output contains an error term.
Case III: the input patterns are orthogonal, but not normalized.
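In equations (standard derivation, assuming the Hebb-rule weights W = T P^T), recalling prototype p_k gives

a = W p_k = T P^T p_k = \sum_{q=1}^{Q} t_q \left( p_q^T p_k \right).

Case I (orthonormal patterns): p_q^T p_k = 1 for q = k and 0 otherwise, so a = t_k exactly.
Case II (normalized but not orthogonal): a = t_k + \sum_{q \ne k} t_q \left( p_q^T p_k \right), where the second sum is the error term.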
43Pseudoinverse Rule
Performance Index
Matrix Form
44Pseudoinverse Rule cont.
Minimize
If an inverse exists for P, F(W) can be made zero
When an inverse does not exist, F(W) can be minimized using the pseudoinverse
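A minimal numpy sketch of the pseudoinverse rule W = T P^+ (using numpy's built-in pseudoinverse; the prototype matrices below are placeholders):

import numpy as np

P = np.array([[ 1.0,  1.0],
              [-1.0,  1.0],
              [-1.0, -1.0]])      # hypothetical prototype inputs as columns
T = np.array([[-1.0, 1.0]])       # hypothetical targets as columns

W = T @ np.linalg.pinv(P)         # pseudoinverse rule: W = T P^+
print(W @ P)                      # reproduces T when an exact solution exists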
45Relationship to the Hebb Rule
Hebb Rule
Pseudoinverse Rule
If the prototype patterns are orthonormal, the pseudoinverse rule reduces to the Hebb rule (the pseudoinverse of P is then simply its transpose)
46Autoassociative Memory
- Autoassociative memory tries to memorize the input patterns
- The desired output is the input vector
- We focus on binary patterns consisting of 1s and
-1s
47Hebb Rule for Autoassociative Memory
(Zero Initial Weights)
Matrix Form
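Since the desired output equals the input, the batch Hebb rule becomes W = P P^T, and recall uses the hardlims nonlinearity. A minimal sketch with one made-up ±1 prototype:

import numpy as np

def hardlims(n):
    return np.where(n >= 0, 1.0, -1.0)

p1 = np.array([[1.0], [-1.0], [1.0], [1.0], [-1.0], [1.0]])  # hypothetical prototype
W = p1 @ p1.T                  # Hebb rule with target = input

p_noisy = p1.copy()
p_noisy[0] = -p_noisy[0]       # corrupt one element
a = hardlims(W @ p_noisy)      # recalled pattern
print(np.array_equal(a, p1))   # True: the prototype is recovered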
48Autoassociative Memory cont.
- Linear autoassociator
- For orthogonal patterns with elements equal to 1 or -1, the prototypes are eigenvectors of the weight matrix given by the Hebb rule
- A given prototype is not recalled exactly but is amplified by a constant factor
49Autoassociative Memory cont.
50Autoassociative Memory
51Tests
50% Occluded
67% Occluded
Noisy Patterns (7 pixels)
52Hopfield Network
- A closely related neural network is the Hopfield network
- It is a recurrent network
- It is more powerful than one-layer feed-forward networks as an associative memory
53Hopfield Network Architecture cont.
- Here each neuron is a simple perceptron neuron with the hardlims transfer function
54Hopfield Network Architecture
- It is a single-layer recurrent network
- Each neuron is a perceptron unit with a hardlims
transfer function
55Hopfield Network as Associative Memory
- One pattern p1
- The condition for the pattern to be stable is
- This can be satisfied by
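For ±1 patterns and hardlims neurons, the stability condition and one standard way to satisfy it can be written as follows (a sketch of the usual single-pattern construction; the exact form on the slide may differ):

\mathrm{hardlims}(W p_1 + b) = p_1, \qquad W = p_1 p_1^T,\; b = 0 \;\Rightarrow\; W p_1 = (p_1^T p_1)\, p_1 = R\, p_1,

and since R > 0, hardlims(R p_1) = p_1, so p_1 is a stable point.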
56Hopfield Network cont.
- Three cases when presenting a pattern p to a Hopfield network that has stored only one pattern p1
- p1 will be recalled perfectly if h(p, p1) < R/2
- -p1 will be recalled if h(p, p1) > R/2
- What will happen if h(p, p1) = R/2?
- Here h(p, p1) is the Hamming distance between p and p1
57Hopfield Network as Associative Memory
- Many patterns
- Matrix form
58Multilayer Perceptron
R - S1 - S2 - S3 Network
59Example
60Total Network
61Function Approximation Example
Nominal Parameter Values
62Parameter Variations
63Multiple Layer Networks
- Note that nonlinearity is essential for multiple-layer neural networks
- A multiple-layer network with linear transfer functions is equivalent to a single-layer network
- Why? (see the equation below)
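The equation behind the "Why?": composing two linear layers is itself linear,

a^2 = W^2 (W^1 p + b^1) + b^2 = (W^2 W^1)\, p + (W^2 b^1 + b^2),

which is a single-layer network with weight matrix W^2 W^1 and bias W^2 b^1 + b^2.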
64Multilayer Network
65Performance Index
Training Set
Mean Square Error
Vector Case
Approximate Mean Square Error (Single Sample)
Approximate Steepest Descent
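In the usual notation (standard LMS-style development; the symbols here are conventional rather than copied from the slide):

F(\mathbf{x}) = E\big[(t - a)^2\big] \approx \big(t(k) - a(k)\big)^2 = e^2(k)

w^m_{i,j}(k+1) = w^m_{i,j}(k) - \alpha \frac{\partial \hat F}{\partial w^m_{i,j}}, \qquad b^m_{i}(k+1) = b^m_{i}(k) - \alpha \frac{\partial \hat F}{\partial b^m_{i}}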
66Steepest Descent
67Jacobian Matrix
68Backpropagation (Sensitivities)
The sensitivities are computed by starting at the
last layer, and then propagating backwards
through the network to the first layer.
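In the standard notation for the sensitivities s^m = \partial \hat F / \partial n^m, this backward recursion can be written as (a sketch of the usual result; the slide's symbols may differ):

\mathbf{s}^{M} = -2\, \dot{\mathbf{F}}^{M}(\mathbf{n}^{M})\,(\mathbf{t} - \mathbf{a}), \qquad \mathbf{s}^{m} = \dot{\mathbf{F}}^{m}(\mathbf{n}^{m})\,(\mathbf{W}^{m+1})^{T}\,\mathbf{s}^{m+1}, \quad m = M-1, \ldots, 1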
69Initialization (Last Layer)
70Derivative of Transfer Functions cont.
- Three commonly used transfer functions with backpropagation
- Linear
- Log-sigmoid
- Hyperbolic tangent sigmoid
71Derivative of Transfer Functions cont.
72Derivative of Transfer Functions cont.
- Hyperbolic tangent sigmoid
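For reference, these derivatives can be expressed in terms of the outputs themselves (standard results):

\text{linear: } f(n) = n,\ f'(n) = 1 \qquad \text{log-sigmoid: } a = \frac{1}{1 + e^{-n}},\ f'(n) = a(1 - a) \qquad \text{tansig: } a = \frac{e^{n} - e^{-n}}{e^{n} + e^{-n}},\ f'(n) = 1 - a^{2}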
73Backpropagation Algorithm Summary
Forward Propagation
Backpropagation
Weight Update
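A minimal Python sketch of one forward/backward/update cycle for a 1-2-1 network with a log-sigmoid hidden layer and a linear output layer (the initial weights, learning rate, and training pair below are made-up values, not the ones used in class):

import numpy as np

def logsig(n):
    return 1.0 / (1.0 + np.exp(-n))

# Hypothetical initial parameters for a 1-2-1 network
W1, b1 = np.array([[0.2], [-0.3]]), np.array([[0.1], [0.4]])
W2, b2 = np.array([[0.5, -0.2]]), np.array([[0.3]])
alpha = 0.1                                  # learning rate (assumed)
p, t = np.array([[1.0]]), np.array([[1.5]])  # one training pair (made up)

# Forward propagation
a1 = logsig(W1 @ p + b1)
a2 = W2 @ a1 + b2                  # linear output layer

# Backpropagation of sensitivities
e = t - a2
s2 = -2 * e                        # last layer: linear transfer, derivative 1
s1 = a1 * (1 - a1) * (W2.T @ s2)   # hidden layer: logsig derivative is a(1 - a)

# Weight update (approximate steepest descent)
W2, b2 = W2 - alpha * s2 @ a1.T, b2 - alpha * s2
W1, b1 = W1 - alpha * s1 @ p.T, b1 - alpha * s1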
74Example Function Approximation
(Figure: 1-2-1 network used for function approximation; input p, network output a, target t, error e = t - a)
75Network
76Initial Conditions
77Forward Propagation
78Transfer Function Derivatives
79Backpropagation
80Weight Update