Title: 2L490 CNperceptrons 1
1 Disadvantages of Discrete Neurons
- Only Boolean-valued functions can be computed
- A simple learning algorithm for multi-layer discrete-neuron perceptrons is lacking
- The computational capabilities of single-layer discrete-neuron perceptrons are limited
- These disadvantages disappear when we consider multi-layer continuous-neuron perceptrons
2 Preliminaries
- A continuous-neuron perceptron with n inputs and m outputs computes
  - a function $\mathbb{R}^n \to (0,1)^m$ when the sigmoid activation function is used
  - a function $\mathbb{R}^n \to \mathbb{R}^m$ when a linear activation function is used
- The learning rules for continuous-neuron perceptrons are based on optimization techniques for error functions. This requires a continuous and differentiable error function.
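As a concrete illustration of these two cases, here is a minimal sketch in Python/NumPy (the names sigmoid and perceptron_output are illustrative, not from the slides) of a single-layer continuous-neuron perceptron with n inputs and m outputs:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def perceptron_output(W, b, x, activation="sigmoid"):
    """Single-layer continuous-neuron perceptron: m outputs from n inputs.
    W has shape (m, n), b has shape (m,)."""
    net = W @ x + b
    return sigmoid(net) if activation == "sigmoid" else net  # linear: identity

# Example: n = 3 inputs, m = 2 outputs
rng = np.random.default_rng(0)
W = rng.normal(size=(2, 3))
b = rng.normal(size=2)
x = np.array([0.5, -1.0, 2.0])
print(perceptron_output(W, b, x, "sigmoid"))  # values in (0,1)^2
print(perceptron_output(W, b, x, "linear"))   # values in R^2
```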
3 Sigmoid transfer function
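The body of this slide is not in the transcript; the standard (logistic) sigmoid referred to elsewhere in the deck is

```latex
\sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
\sigma'(x) = \sigma(x)\,\bigl(1 - \sigma(x)\bigr)
```

The derivative identity $\sigma' = \sigma(1 - \sigma)$ is what makes the sigmoid convenient in the delta rule below.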
4 Computational Capabilities
- Let $g : [0,1]^n \to \mathbb{R}$ be a continuous function and let $\varepsilon > 0$. Then there exists a two-layer perceptron with
  - first layer built from neurons with threshold and standard sigmoid activation function
  - second layer built from one neuron without threshold and linear activation function
- such that the function $G$ computed by this network satisfies $|G(x) - g(x)| < \varepsilon$ for all $x \in [0,1]^n$.
5 Single-layer networks
- Compute functions from $\mathbb{R}^n$ to $(0,1)^m$
- Sufficient to consider a single neuron
- It computes a function $f\!\left(w_0 + \sum_{1 \le j \le n} w_j x_j\right)$
- Assume $x_0 = 1$; then it computes a function $f\!\left(\sum_{0 \le j \le n} w_j x_j\right)$
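A minimal sketch of this bias-absorption trick (illustrative names; the logistic sigmoid is used here as an example f):

```python
import numpy as np

def neuron(w, x, f=lambda net: 1.0 / (1.0 + np.exp(-net))):
    """Single neuron: prepend x_0 = 1 so that w[0] plays the role of the bias w_0."""
    x_ext = np.concatenate(([1.0], x))   # x_0 = 1
    return f(w @ x_ext)                  # f(sum_{j=0..n} w_j x_j)

w = np.array([0.1, 0.5, -0.3])           # w_0, w_1, w_2
x = np.array([2.0, 1.0])                 # x_1, x_2
print(neuron(w, x))
```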
6 Error function
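The slide body is missing from the transcript; the error function normally used with the delta rule is the (halved) sum of squared errors over the training pairs, e.g. for a single output neuron:

```latex
E(\mathbf{w}) = \frac{1}{2} \sum_{q} \bigl( t^{(q)} - o^{(q)} \bigr)^2 ,
\qquad
o^{(q)} = f\!\left( \sum_{j=0}^{n} w_j x_j^{(q)} \right)
```

where $t^{(q)}$ is the target and $o^{(q)}$ the network output for training pair $q$.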
7 Gradient Descent
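Again only the title survives in the transcript; the generic gradient-descent step on the error function, with learning parameter $\alpha$, is

```latex
w_i \leftarrow w_i + \Delta w_i ,
\qquad
\Delta w_i = -\alpha \, \frac{\partial E}{\partial w_i}
```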
8 Update of Weight i by Training Pair q
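A standard way to write this per-pair update (the slide's own formula is missing): with $\mathrm{net}^{(q)} = \sum_j w_j x_j^{(q)}$ and $E^{(q)} = \tfrac{1}{2}\bigl(t^{(q)} - o^{(q)}\bigr)^2$,

```latex
\Delta_q w_i
= -\alpha \, \frac{\partial E^{(q)}}{\partial w_i}
= \alpha \, \bigl( t^{(q)} - o^{(q)} \bigr) \, f'\!\bigl( \mathrm{net}^{(q)} \bigr) \, x_i^{(q)}
```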
9 Delta Rule Learning (incremental version, arbitrary transfer function)
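A minimal Python sketch of the incremental (per-pattern) delta rule with a pluggable transfer function f and its derivative df; all names are illustrative, as the slide's own pseudocode is not in the transcript:

```python
import numpy as np

def delta_rule_incremental(X, t, f, df, alpha=0.1, epochs=100):
    """Incremental delta rule for a single neuron with transfer function f.
    X: (Q, n) inputs, t: (Q,) targets. A bias input x_0 = 1 is prepended."""
    X = np.hstack([np.ones((X.shape[0], 1)), X])    # x_0 = 1
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x_q, t_q in zip(X, t):                  # one training pair at a time
            net = w @ x_q
            o = f(net)
            w += alpha * (t_q - o) * df(net) * x_q  # delta rule update
    return w

# Example with the logistic sigmoid as transfer function
sigm = lambda z: 1.0 / (1.0 + np.exp(-z))
dsigm = lambda z: sigm(z) * (1.0 - sigm(z))
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
t = np.array([0., 0., 0., 1.])                      # AND-like targets
w = delta_rule_incremental(X, t, sigm, dsigm, alpha=0.5, epochs=2000)
print(w, sigm(np.hstack([np.ones((4, 1)), X]) @ w))
```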
10 Delta Rule Learning (incremental version, sigmoid transfer function)
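For the sigmoid, $f'(\mathrm{net}) = f(\mathrm{net})\bigl(1 - f(\mathrm{net})\bigr) = o(1 - o)$, so the update specializes to (standard form, not copied from the slide):

```latex
\Delta_q w_i = \alpha \, \bigl( t^{(q)} - o^{(q)} \bigr) \, o^{(q)} \bigl( 1 - o^{(q)} \bigr) \, x_i^{(q)}
```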
11 Delta Rule Learning (incremental version, linear transfer function)
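For a linear transfer function $f(\mathrm{net}) = \mathrm{net}$ we have $f'(\mathrm{net}) = 1$, which gives the classical Widrow-Hoff / LMS update:

```latex
\Delta_q w_i = \alpha \, \bigl( t^{(q)} - o^{(q)} \bigr) \, x_i^{(q)}
```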
12 Stop criteria
- The mean square error becomes small enough
- The mean square error does not decrease anymore, i.e. the gradient has become very small or even changes sign
- The maximum number of iterations has been exceeded
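These three criteria could be combined as in the following sketch (a hypothetical helper; eps, min_decrease and max_iter are illustrative thresholds):

```python
def should_stop(mse_history, eps=1e-4, min_decrease=1e-8, max_iter=10_000):
    """Return True if any of the three stop criteria from the slide holds."""
    if mse_history[-1] < eps:                      # error small enough
        return True
    if len(mse_history) > 1 and mse_history[-2] - mse_history[-1] < min_decrease:
        return True                                # error no longer decreases
    return len(mse_history) >= max_iter            # iteration budget exceeded
```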
13 Remarks
- Delta rule learning is also called L(east) M(ean) S(quare) learning or Widrow-Hoff learning
- Note that the incremental version of the delta rule is, strictly speaking, not a gradient descent algorithm, because in each step a different error function $E^{(q)}$ is used
- Convergence of the incremental version can only be guaranteed if the learning parameter $\alpha$ goes to 0 during learning
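The second remark can be made precise: the total error is the sum of the per-pattern errors, and each incremental step descends on only one term of this sum (standard decomposition, not from the slide):

```latex
E(\mathbf{w}) = \sum_{q} E^{(q)}(\mathbf{w}),
\qquad
E^{(q)}(\mathbf{w}) = \tfrac{1}{2} \bigl( t^{(q)} - o^{(q)} \bigr)^2
```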
14 Perceptron Learning Rule (batch version, arbitrary transfer function)
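In the batch version all training pairs are processed before the weights change; the accumulated update typically has the form (the slide's own formula is missing)

```latex
\Delta w_i = \alpha \sum_{q} \bigl( t^{(q)} - o^{(q)} \bigr) \, f'\!\bigl( \mathrm{net}^{(q)} \bigr) \, x_i^{(q)}
```

The sigmoid and linear variants on the next two slides follow by substituting $f'(\mathrm{net}) = o(1-o)$ and $f'(\mathrm{net}) = 1$, respectively.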
15 Perceptron Learning Delta Rule (batch version, sigmoidal transfer function)
16 Perceptron Learning Rule (batch version, linear transfer function)
17 Convergence of the batch version
For a small enough learning parameter the batch version of the delta rule always converges. The resulting weights, however, may correspond to a local minimum of the error function instead of the global minimum.
18 Linear Neurons and Least Squares
19 Linear Neurons and Least Squares
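The bodies of slides 18 and 19 are missing from the transcript; for a linear neuron the error is quadratic in the weights, and its minimizer is characterized by the least-squares normal equations. Writing $X$ for the matrix whose rows are the (extended) input vectors and $\mathbf{t}$ for the target vector, and assuming the matrix $C$ of slide 20 denotes $X^{\top}X$, a standard formulation is

```latex
E(\mathbf{w}) = \tfrac{1}{2} \lVert \mathbf{t} - X\mathbf{w} \rVert^2 ,
\qquad
C \, \mathbf{w}^{*} = X^{\top} \mathbf{t}
\quad \text{with} \quad C = X^{\top} X
```

so that when $C$ is non-singular the least-squares solution $\mathbf{w}^{*} = C^{-1} X^{\top}\mathbf{t}$ is unique, which is what the next slide addresses.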
20 C is non-singular
21 Linear Least Squares Convergence
23 Linear Least Squares Convergence
24 Find the line
25 Solution
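The "Find the line" and "Solution" slides presumably work a small line-fitting exercise; as an illustration with made-up data (not from the slides), a least-squares line $y = w_0 + w_1 x$ can be computed as follows:

```python
import numpy as np

# Hypothetical data points (not from the slides)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 1.9, 3.2, 3.9])

X = np.column_stack([np.ones_like(x), x])      # extended inputs: x_0 = 1
w, *_ = np.linalg.lstsq(X, y, rcond=None)      # solves the least-squares problem
print(f"y = {w[0]:.3f} + {w[1]:.3f} * x")
```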