Transcript and Presenter's Notes

Title: Outline


1
Outline
  • Announcement
  • Midterm review
  • Widrow-Hoff learning - continued

2
Announcement
  • The first midterm exam will be on this coming
    Monday, Oct. 4, 2004
  • Please come to class a bit earlier so that we can
    start on time
  • I will be here about 10 minutes before class
  • When most of you are present, we will start so
    that you can have some extra time if you need it
  • You need to bring a calculator

3
A Single-Input Neuron
Note that w and b are adjustable parameters of
the neuron: w is called the weight and b is called
the bias
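A minimal sketch of that computation (not from the slides; the sample values are made up): the neuron forms the net input n = wp + b and passes it through a transfer function f, so a = f(wp + b).

```python
def single_input_neuron(p, w, b, f):
    """Output of a single-input neuron: a = f(w*p + b)."""
    n = w * p + b          # net input formed from weight w and bias b
    return f(n)            # pass through the transfer function

# Hypothetical values: w = 3, b = -1.5, hard limit transfer function
hardlim = lambda n: 1.0 if n >= 0 else 0.0
print(single_input_neuron(p=1.0, w=3.0, b=-1.5, f=hardlim))   # prints 1.0
```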
4
McCulloch and Pitts Model
5
McCulloch and Pitts Model cont.
6
A Single-Input Neuron cont.
  • Transfer functions
  • They can be linear or nonlinear
  • Commonly used transfer functions
  • Hard limit transfer function
  • Linear transfer function
  • Sigmoid function

7
A Single-Input Neuron cont.
  • Transfer functions
  • a = f(n)

8
A Single-Input Neuron cont.
  • Hard limit transfer function
  • Also known as the step function
  • Symmetrical hard limit transfer function

9
A Single-Input Neuron cont.
  • Linear transfer function

10
A Single-Input Neuron cont.
  • Log-sigmoid transfer function
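A sketch of the transfer functions described above; the vectorized numpy forms are my phrasing, not the slides' notation:

```python
import numpy as np

def hardlim(n):            # hard limit (step) function: 0 for n < 0, 1 otherwise
    return np.where(n >= 0, 1.0, 0.0)

def hardlims(n):           # symmetrical hard limit: -1 or +1
    return np.where(n >= 0, 1.0, -1.0)

def purelin(n):            # linear transfer function: a = n
    return n

def logsig(n):             # log-sigmoid: squashes n into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-n))
```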

11
Multiple-Input Neuron
12
Multiple-Input Neuron cont.
13
Neural Network Architecture
  • A layer of neurons
  • Consists of a set of neurons
  • The inputs are connected to each of the neurons
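A minimal sketch of such a layer, assuming S neurons, R inputs, an S x R weight matrix W, and an elementwise transfer function (the example numbers are placeholders):

```python
import numpy as np

def layer_output(W, b, p, f):
    """Output of a layer of S neurons with R inputs: a = f(W p + b).

    W : (S, R) weight matrix, one row per neuron
    b : (S,) bias vector
    p : (R,) input vector shared by every neuron in the layer
    """
    return f(W @ p + b)

# Hypothetical layer: 3 log-sigmoid neurons, 2 inputs
logsig = lambda n: 1.0 / (1.0 + np.exp(-n))
W = np.array([[ 1.0, -0.5],
              [ 0.2,  0.3],
              [-1.0,  2.0]])
b = np.array([0.1, -0.2, 0.0])
p = np.array([1.0, 2.0])
print(layer_output(W, b, p, logsig))
```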

14
A Layer of Neurons
15
A Layer of Neurons
16
Multiple Layers of Neurons
17
Multiple Layers of Neurons
18
Multiple Layers of Neurons cont.
  • Multiple layer neural networks
  • A network with several layers
  • Multi-layer networks with nonlinear transfer
    functions are more powerful than single-layer
    networks

19
Neural Network Architecture cont.
  • Feed-forward networks

20
Neural Network Architecture cont.
  • Recurrent neural networks

21
Network Architecture cont.
  • How to determine a network architecture
  • Number of network inputs is given by the number
    of problem inputs
  • Number of neurons in output layer is given by the
    number of problem outputs
  • The transfer function is partly determined by the
    output specifications

22
Perceptron Architecture
23
Single-Neuron Perceptron
  • The output of a single-neuron perceptron is given
    by a = hardlim(w1,1 p1 + w1,2 p2 + b)
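A sketch of that computation for a two-input perceptron; the weight and bias values below are placeholders, not values from the slides:

```python
import numpy as np

def perceptron_output(w, b, p):
    """Single-neuron perceptron: a = hardlim(w . p + b)."""
    n = np.dot(w, p) + b                   # net input
    return 1 if n >= 0 else 0              # hard limit transfer function

# Hypothetical weights [w11, w12] and bias b
print(perceptron_output(w=np.array([1.0, 1.0]), b=-0.5, p=np.array([0.0, 1.0])))  # 1
```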

24
Decision Boundary
25
Single-Neuron Perceptron cont.
  • Design a perceptron graphically
  • Select a decision boundary
  • Choose a weight vector that is orthogonal to the
    decision boundary
  • Find the bias b
  • When there are multiple decision boundaries,
    which one is the best?

26
Learning Rules
  • A learning rule is a procedure for modifying the
    weights and biases of a neural network
  • It is often called a training algorithm
  • The purpose of the learning rule is to train the
    network to perform some task

27
Learning Rules cont.
  • Three categories of learning algorithms
  • Supervised learning
  • Training set {p1, t1}, {p2, t2}, ..., {pQ, tQ}
  • Here tq is called the target for input pq
  • Reinforcement learning
  • No correct output is provided for each network
    input but a measure of the performance is
    provided
  • It is not very common
  • Unsupervised learning
  • There are no target outputs available

28
Perceptron Learning Rule cont.
  • Perceptron learning rule
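The slide's equations are not in the transcript; as a hedged reconstruction, the standard unified form of the rule is e = t - a, W_new = W_old + e p^T, b_new = b_old + e. A minimal sketch for a single neuron:

```python
import numpy as np

def hardlim(n):
    return 1 if n >= 0 else 0

def perceptron_learning_step(w, b, p, t):
    """One step of the perceptron learning rule for a single neuron."""
    a = hardlim(np.dot(w, p) + b)   # current output for input p
    e = t - a                       # error relative to the target t
    return w + e * p, b + e         # updated weight vector and bias
```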

29
Multiple-Neuron Perceptrons
To update the ith row of the weight matrix
Matrix form
30
Apple/Banana Example
Training Set
Initial Weights
First Iteration
31
Second Iteration
32
Check
33
Perceptron Learning Rule cont.
  • Perceptron convergence theorem
  • Given that a solution exists, the perceptron
    learning rule converges to a solution in a finite
    number of steps

34
Limitations of Perceptrons
  • Linear separability
  • Perceptron networks can only solve problems that
    are linearly separable, which means the decision
    boundary is linear
  • Many problems are not linearly separable
  • XOR is a well-known example

35
Linearly Inseparable Problems
36
Hebb's Postulate
  • Hebb's postulate
  • When an axon of cell A is near enough to excite a
    cell B and repeatedly or persistently takes part
    in firing it, some growth process or metabolic
    change takes place in one or both cells such that
    A's efficiency, as one of the cells firing B, is
    increased

37
Hebb Rule
  • If two neurons on either side of a synapse are
    activated simultaneously, the strength of the
    synapse will increase
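In equation form (the supervised Hebb rule in its standard form; stated here as a reconstruction since the slide's equation did not survive the transcript):

```latex
% Supervised Hebb rule: for each training pair (p_q, t_q), the weight
% change is proportional to the product of output (target) and input activity.
W^{\mathrm{new}} = W^{\mathrm{old}} + t_q\, p_q^{T}
```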

38
Associative Memory
  • Associative memory problem
  • The task is to learn Q pairs of prototype
    input/output vectors
  • {p1, t1}, {p2, t2}, ..., {pQ, tQ}
  • In other words, if the network receives an input
    p = pq, then it should produce an output a = tq
  • In addition, for an input similar to a prototype
    the network should produce an output similar to
    the corresponding output

39
Linear Associator cont.
40
Linear Associator cont.
  • Hebb rule for the linear associator

41
Batch Operation
(Zero Initial Weights)
Matrix Form
42
Performance Analysis
Case I, input patterns are orthonormal.
Therefore the network output equals the target
Case II, input patterns are normalized, but not
orthogonal.
Error
Case III, input patterns are orthogonal but not
normalized
43
Pseudoinverse Rule
Performance Index
Matrix Form
44
Pseudoinverse Rule cont.
Minimize
If an inverse exists for P, F(W) can be made zero
When an inverse does not exist, F(W) can be
minimized using the pseudoinverse
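A minimal numpy sketch of W = T P+, where the columns of P are the prototype inputs and the columns of T are the targets (variable names and the sample values are placeholders):

```python
import numpy as np

def pseudoinverse_rule(P, T):
    """Weight matrix from the pseudoinverse rule: W = T P^+.

    P : (R, Q) matrix whose columns are the prototype input vectors
    T : (S, Q) matrix whose columns are the corresponding targets
    """
    return T @ np.linalg.pinv(P)      # P^+ is the Moore-Penrose pseudoinverse

# Hypothetical prototypes (columns of P) and targets (columns of T)
P = np.array([[ 1.0, 1.0],
              [-1.0, 1.0]])
T = np.array([[-1.0, 1.0]])
W = pseudoinverse_rule(P, T)
print(W @ P)   # reproduces T exactly when an exact solution exists
```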
45
Relationship to the Hebb Rule
Hebb Rule
Pseudoinverse Rule
If the prototype patterns are orthonormal
46
Autoassociative Memory
  • Autoassociative memory tries to memorize the
    input patterns
  • The desired output is the input vector
  • We focus on binary patterns consisting of 1s and
    -1s

47
Hebb Rule for Autoassociative Memory
(Zero Initial Weights)
Matrix Form
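A sketch of this rule (W = P P^T with zero initial weights) together with a binarized recall step; taking the sign of the linear output is an illustrative choice here, and the tiny bipolar patterns are made-up stand-ins for the images on the following slides:

```python
import numpy as np

def hebb_autoassociative(P):
    """Autoassociative Hebb rule with zero initial weights: W = P P^T.

    P : (R, Q) matrix whose columns are bipolar (+1 / -1) prototype patterns.
    """
    return P @ P.T

def recall(W, p):
    """Binarized recall: take the sign of the linear output W p."""
    return np.where(W @ p >= 0, 1, -1)

# Hypothetical 6-element bipolar prototypes (stand-ins for the slide images)
p1 = np.array([1, 1, 1, -1, -1, -1])
p2 = np.array([1, 1, -1, 1, -1, 1])
W = hebb_autoassociative(np.column_stack([p1, p2]))
noisy = np.array([1, 1, 1, -1, -1, 1])   # p1 with its last element flipped
print(recall(W, noisy))                  # recovers p1
```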
48
Autoassociative Memory cont.
  • Linear autoassociator
  • For orthogonal patterns with elements equal to 1 or
    -1, the prototypes are eigenvectors of the
    weight matrix given by the Hebb rule
  • A given prototype is not recalled exactly but
    is amplified by a constant factor

49
Autoassociative Memory cont.
50
Autoassociative Memory
51
Tests
50% Occluded
67% Occluded
Noisy Patterns (7 pixels)
52
Hopfield Network
  • A closely related neural network is Hopfield
    Network
  • It is a recurrent network
  • It is more powerful than one-layer
    feed-forward networks as an associative memory

53
Hopfield Network Architecture cont.
  • Here each neuron is a simple perceptron neuron
    with the hardlims transfer function

54
Hopfield Network Architecture
  • It is a single-layer recurrent network
  • Each neuron is a perceptron unit with a hardlims
    transfer function
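A minimal sketch of that recurrent operation: the output is fed back as the next input until it stops changing. The synchronous update and the convergence check are illustrative choices, not details taken from the slides:

```python
import numpy as np

def hardlims(n):
    return np.where(n >= 0, 1, -1)

def hopfield_recall(W, b, p, max_iters=100):
    """Iterate a(t+1) = hardlims(W a(t) + b), starting from a(0) = p."""
    a = np.asarray(p).copy()
    for _ in range(max_iters):
        a_next = hardlims(W @ a + b)
        if np.array_equal(a_next, a):    # reached a stable pattern
            break
        a = a_next
    return a
```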

55
Hopfield Network as Associative Memory
  • One pattern p1
  • The condition for the pattern to be stable is
  • This can be satisfied by

56
Hopfield Network cont.
  • Three cases when presenting a pattern p to a
    Hopfield network that stores only one pattern p1
  • p1 will be recalled perfectly if h(p, p1) < R/2
  • -p1 will be recalled if h(p, p1) > R/2
  • What will happen if h(p, p1) = R/2?
  • Here h(p, p1) is the Hamming distance between p
    and p1

57
Hopfield Network as Associative Memory
  • Many patterns
  • Matrix form

58
Multilayer Perceptron
R - S1 - S2 - S3 Network
59
Example
60
Total Network
61
Function Approximation Example
Nominal Parameter Values
62
Parameter Variations
63
Multiple Layer Networks
  • Note that nonlinearity is essential for multiple
    layer neural networks
  • A multiple layer network with linear transfer
    functions is equivalent to a single layer network
  • Why?
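One way to see why (a standard one-line argument consistent with the slide's claim): for a two-layer network with linear transfer functions,

```latex
a^{2} = W^{2}\bigl(W^{1}p + b^{1}\bigr) + b^{2}
      = \bigl(W^{2}W^{1}\bigr)\,p + \bigl(W^{2}b^{1} + b^{2}\bigr)
```

which is exactly a single linear layer with weight matrix W2 W1 and bias W2 b1 + b2.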

64
Multilayer Network
65
Performance Index
Training Set
Mean Square Error
Vector Case
Approximate Mean Square Error (Single Sample)
Approximate Steepest Descent
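The equations behind these labels did not survive the transcript; in their standard form (a reconstruction, with x denoting the vector of all weights and biases and alpha the learning rate) they are:

```latex
% Mean square error over the training set (scalar and vector cases)
F(\mathbf{x}) = E\bigl[(t - a)^{2}\bigr], \qquad
F(\mathbf{x}) = E\bigl[(\mathbf{t} - \mathbf{a})^{T}(\mathbf{t} - \mathbf{a})\bigr]

% Approximate (single-sample) mean square error at iteration k
\hat{F}(\mathbf{x}) = \bigl(\mathbf{t}(k) - \mathbf{a}(k)\bigr)^{T}
                      \bigl(\mathbf{t}(k) - \mathbf{a}(k)\bigr)

% Approximate steepest descent on \hat{F}
w^{m}_{i,j}(k+1) = w^{m}_{i,j}(k)
                   - \alpha \frac{\partial \hat{F}}{\partial w^{m}_{i,j}}, \qquad
b^{m}_{i}(k+1) = b^{m}_{i}(k)
                 - \alpha \frac{\partial \hat{F}}{\partial b^{m}_{i}}
```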
66
Steepest Descent
67
Jacobian Matrix
68
Backpropagation (Sensitivities)
The sensitivities are computed by starting at the
last layer, and then propagating backwards
through the network to the first layer.
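In the usual notation for this algorithm (a reconstruction; the slide's own equations are not in the transcript), the layer-m sensitivity is the derivative of the approximate performance index with respect to the net input n^m, initialized at the last layer M and propagated backwards:

```latex
\mathbf{s}^{M} = -2\,\dot{\mathbf{F}}^{M}(\mathbf{n}^{M})\,(\mathbf{t} - \mathbf{a}),
\qquad
\mathbf{s}^{m} = \dot{\mathbf{F}}^{m}(\mathbf{n}^{m})\,
                 \bigl(\mathbf{W}^{m+1}\bigr)^{T} \mathbf{s}^{m+1},
\quad m = M-1, \ldots, 1
```

Here the F-dot term denotes the diagonal matrix of transfer-function derivatives for layer m.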
69
Initialization (Last Layer)
70
Derivative of Transfer Functions cont.
  • Three commonly used transfer functions with
    backpropagation
  • Linear
  • Log-sigmoid
  • Hyperbolic tangent sigmoid

71
Derivative of Transfer Functions cont.
  • Log-sigmoid

72
Derivative of Transfer Functions cont.
  • Hyperbolic tangent sigmoid
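Written in terms of the layer output a (standard results; the slide's own equations are not in the transcript):

```latex
\frac{d}{dn}\,\mathrm{purelin}(n) = 1, \qquad
\frac{d}{dn}\,\mathrm{logsig}(n) = a\,(1 - a) \ \text{with } a = \frac{1}{1 + e^{-n}}, \qquad
\frac{d}{dn}\,\mathrm{tansig}(n) = 1 - a^{2} \ \text{with } a = \tanh(n)
```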

73
Backpropagation Algorithm Summary
Forward Propagation
Backpropagation
Weight Update
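A compact sketch of one forward-propagation / backpropagation / weight-update cycle for a 1-2-1 network like the one in the example that follows; the initial parameters, learning rate, and training pair below are placeholders, not the slides' nominal values:

```python
import numpy as np

# 1-2-1 network: 1 input, 2 log-sigmoid hidden neurons, 1 linear output neuron
logsig  = lambda n: 1.0 / (1.0 + np.exp(-n))
dlogsig = lambda a: a * (1.0 - a)            # derivative expressed via the output a

def train_step(W1, b1, W2, b2, p, t, alpha=0.1):
    # Forward propagation
    a1 = logsig(W1 * p + b1)                 # hidden layer outputs, shape (2,)
    a2 = float(W2 @ a1 + b2)                 # linear output neuron
    e  = t - a2                              # error
    # Backpropagation of sensitivities (linear output layer: derivative = 1)
    s2 = -2.0 * e
    s1 = dlogsig(a1) * (W2 * s2)             # hidden layer sensitivities
    # Approximate steepest descent weight update
    W2, b2 = W2 - alpha * s2 * a1, b2 - alpha * s2
    W1, b1 = W1 - alpha * s1 * p,  b1 - alpha * s1
    return W1, b1, W2, b2

# Hypothetical initial conditions and a single (input, target) pair
W1, b1 = np.array([-0.3, 0.4]), np.array([0.1, -0.2])
W2, b2 = np.array([0.2, 0.5]), 0.0
p, t = 1.0, 1.7                              # placeholder training pair
W1, b1, W2, b2 = train_step(W1, b1, W2, b2, p, t)
```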
74
Example Function Approximation
(Network diagram: input p, network output a, target t, error e)
1-2-1 Network
75
Network
76
Initial Conditions
77
Forward Propagation
78
Transfer Function Derivatives
79
Backpropagation
80
Weight Update