1
Neural Networks - Adaline
  • L. Manevitz

2
Plan of Lecture
  • Perceptron: connected and convex examples
  • Adaline: square error
  • Gradient
  • Calculate for AND, XOR
  • Discuss limitations
  • LMS algorithm derivation.

3
What are the best weights?
  • Most classified correctly?
  • Least Square Error
  • Least Square Error Before Cut-Off!
  • To minimize Σk (d(k) − Σi wi xi(k))²  (the first sum runs over
    examples, the second over input dimensions); a small numeric
    sketch follows below.
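
A minimal sketch of this squared error, assuming NumPy and hypothetical AND-gate data (the names X, d, w and squared_error are illustrative, not from the slides):

    import numpy as np

    # Hypothetical data: rows of X are example inputs x(k), d holds desired outputs d(k).
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # AND inputs
    d = np.array([0., 0., 0., 1.])                           # AND targets
    w = np.zeros(2)                                          # weights (bias omitted for brevity)

    def squared_error(w, X, d):
        # Sum over examples k of (d(k) - sum_i w_i x_i(k))**2, taken before any cut-off.
        return np.sum((d - X @ w) ** 2)

    print(squared_error(w, X, d))   # error of the all-zero weight vector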

4
Least Square Minimization
  • Find gradient of error over all examples. Either
    calculate the minimum or move opposite to
    gradient.
  • Widrow-Hoff (LMS): use the instantaneous example as an
    approximation to the gradient.
  • Advantages: no memory; on-line; serves a similar function to
    noise in avoiding local problems.
  • Adjust by w(new) = w(old) + a·d·x for each x (sketched below).
  • Here d = (desired output − Σ w·x).
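
A one-step sketch of this adjustment, assuming NumPy; the function name lms_step and the learning-rate value are illustrative:

    import numpy as np

    def lms_step(w, x, desired, a=0.1):
        # d = (desired output - w.x), then w(new) = w(old) + a * d * x
        d = desired - np.dot(w, x)
        return w + a * d * x

    w = lms_step(np.zeros(2), np.array([1.0, 1.0]), 1.0)
    print(w)   # [0.1, 0.1] after one update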

5
LMS Derivation
  • Errsq = Σk (d(k) − W·x(k))²
  • Grad(Errsq) = 2 Σk (d(k) − W·x(k)) (−x(k))
  • W(new) = W(old) − m·Grad(Errsq)
  • To ease calculations, use Err(k) in place of Errsq
  • W(new) = W(old) + 2m·Err(k)·x(k)
  • Continue with next choice of k
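
A quick numeric check of the gradient formula above, assuming NumPy and randomly generated example data (all names here are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 3))    # five examples x(k), three dimensions
    d = rng.normal(size=5)         # desired outputs d(k)
    W = rng.normal(size=3)

    def errsq(W):
        # Errsq = sum_k (d(k) - W.x(k))**2
        return np.sum((d - X @ W) ** 2)

    # Grad(Errsq) = 2 sum_k (d(k) - W.x(k)) * (-x(k))
    grad_formula = 2 * np.sum((d - X @ W)[:, None] * (-X), axis=0)

    # Central finite differences should agree with the formula.
    eps = 1e-6
    grad_numeric = np.array([(errsq(W + eps * e) - errsq(W - eps * e)) / (2 * eps)
                             for e in np.eye(3)])
    print(np.allclose(grad_formula, grad_numeric))   # True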

6
Applications
  • Adaline has better convergence properties than
    Perceptron
  • Useful in noise correction
  • Adaline in every modem.

7
LMS (Least Mean Square Alg.)
  • 1. Apply the next input x(k) to the Adaline.
  • 2. Find the square error of the current input:
    Errsq(k) = (d(k) − W·x(k))²
  • 3. Approximate Grad(ErrorSquare) by differentiating Errsq and
    approximating the average Errsq by Errsq(k); this gives
    −2·Err(k)·x(k), where Err(k) = d(k) − W·x(k).
  • 4. Update W: W(new) = W(old) + 2m·Err(k)·x(k)
  • Repeat steps 1 to 4.
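
Steps 1-4 written out as a small training loop, assuming NumPy; the AND data, step size, and epoch count are illustrative choices:

    import numpy as np

    def lms_train(X, d, m=0.05, epochs=100):
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for x_k, d_k in zip(X, d):           # step 1: apply the next input
                err_k = d_k - np.dot(w, x_k)     # step 2: error of the current input
                w = w + 2 * m * err_k * x_k      # steps 3-4: gradient approximation and update
        return w

    # AND gate with a constant 1 appended as a bias input.
    X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
    d = np.array([0., 0., 0., 1.])
    print(lms_train(X, d))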

8
Comparison with Perceptron
  • Both use an updating rule that changes with each input
  • One fixes a binary error; the other minimizes a continuous error
  • Adaline always converges; see what happens with XOR (sketched
    below)
  • Both can REPRESENT linearly separable functions
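
A sketch of "what happens with XOR", using the same LMS loop: the weights still settle down, but no cut-off of the linear output can reproduce all four XOR targets (data and step size are illustrative):

    import numpy as np

    X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
    d = np.array([0., 1., 1., 0.])        # XOR targets

    w = np.zeros(3)
    for _ in range(200):
        for x_k, d_k in zip(X, d):
            w = w + 2 * 0.05 * (d_k - w @ x_k) * x_k

    print(w)                              # settles near [0, 0, 0.5]
    print((X @ w > 0.5).astype(int))      # never equals [0, 1, 1, 0] for any linear w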

9
Convergence Phenomenon
  • LMS converges depending on choice of m.
  • How to choose it?

10
Limitations
  • Limited to linearly separable problems
  • How can we get around it?
  • Use network of neurons?
  • Use a transformation of the data so that it becomes linearly
    separable (a sketch follows below)
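
One such transformation, sketched under the assumption that we may hand-pick an extra feature: adding the product x1·x2 makes XOR linearly separable in the new space (the weights shown are just one separating choice):

    import numpy as np

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    d = np.array([0, 1, 1, 0])                                 # XOR targets

    # Append x1*x2 and a constant bias input: [x1, x2, x1*x2, 1]
    X_new = np.column_stack([X, X[:, 0] * X[:, 1], np.ones(len(X))])

    w = np.array([1.0, 1.0, -2.0, 0.0])       # x1 + x2 - 2*x1*x2 separates XOR
    print((X_new @ w > 0.5).astype(int))      # [0, 1, 1, 0]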

11
Multi-level Neural Networks
  • Representability
  • Arbitrarily complicated decisions
  • Continuous approximation: arbitrary continuous functions (and
    more) (Cybenko's theorem)
  • Learnability
  • Change Mc-P (McCulloch-Pitts) neurons to sigmoid etc.
  • Derive backprop using the chain rule (like the LMS derivation;
    sketched below).
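
A sketch of that chain-rule step for a single sigmoid unit y = σ(Σj wj xj) with squared error E = (d − y)², using σ'(net) = σ(net)(1 − σ(net)):

\[
\frac{\partial E}{\partial w_j}
= \frac{\partial E}{\partial y}\,
  \frac{\partial y}{\partial \mathrm{net}}\,
  \frac{\partial \mathrm{net}}{\partial w_j}
= -2\,(d - y)\; y\,(1 - y)\; x_j .
\]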

12
Replacement of Threshold Neurons with Sigmoid or
Differentiable Neurons
  • [Figure: sigmoid vs. threshold activation functions]
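
A minimal sketch of the two activation functions, assuming NumPy (names are illustrative):

    import numpy as np

    def threshold(net):
        # McCulloch-Pitts style cut-off: not differentiable at the step
        return np.where(net >= 0.0, 1.0, 0.0)

    def sigmoid(net):
        # Differentiable replacement; its derivative is sigmoid(net) * (1 - sigmoid(net))
        return 1.0 / (1.0 + np.exp(-net))

    net = np.linspace(-5.0, 5.0, 5)
    print(threshold(net))
    print(sigmoid(net))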

13
Prediction
  • [Diagram: prediction setup with the NN, a delay element, the
    input/output signal, and a compare (error) block]
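
A sketch of one reading of this diagram: an Adaline sees delayed samples of a signal, predicts the current sample, and the compare block feeds the prediction error back through LMS (the signal, delay length, and step size are all illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(1)
    t = np.arange(500)
    s = np.sin(0.1 * t) + 0.05 * rng.normal(size=t.size)   # hypothetical noisy signal

    p, m = 4, 0.01                      # number of delays, step size
    w = np.zeros(p)
    for k in range(p, len(s)):
        x = s[k - p:k][::-1]            # delayed inputs s(k-1), ..., s(k-p)
        err = s[k] - w @ x              # compare predicted vs. actual sample
        w = w + 2 * m * err * x         # LMS update
    print(w)                            # learned one-step predictor coefficients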

14
Sample Feed forward Network (No loops)
  • [Diagram: layered feed-forward network with input layer, weight
    layers Wji and Vik, and output layer]
  • Each unit computes F(Σj wji xj)
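
A minimal sketch of such a two-layer feed-forward pass, assuming NumPy and a sigmoid F; the shapes and the orientation of Wji and Vik are illustrative guesses at the diagram's layout:

    import numpy as np

    def F(net):
        return 1.0 / (1.0 + np.exp(-net))   # sigmoid activation

    rng = np.random.default_rng(0)
    x = np.array([0.5, -1.0, 2.0])           # input layer
    W = rng.normal(size=(4, 3))              # weights W_ji: input -> hidden
    V = rng.normal(size=(2, 4))              # weights V_ik: hidden -> output

    h = F(W @ x)                             # hidden unit i computes F(sum_j W_ji x_j)
    y = F(V @ h)                             # output unit k computes F(sum_i V_ik h_i)
    print(y)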