Artificial Neural Networks - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Artificial Neural Networks


1
Artificial Neural Networks
Artificial Intelligence
  • Brian Talecki
  • CSC 8520
  • Villanova University

2
ANN - Artificial Neural Network
  • A set of algebraic equations and functions which
    determine the best output given a set of inputs.
  • An artificial neural network is modeled on a greatly
    simplified version of the neurons that make up the
    human nervous system.
  • Although the brain's neurons operate at roughly one
    millionth the speed of modern computers, the brain can
    outperform computers on many tasks because of the
    parallel processing structure of the nervous system.

3
Human Nerve Cell
  • picture from G5AIAI Introduction to AI by
    Graham Kendall
  • www.cs.nott.ac.uk/gxk/courses/g5aiai

4
  • At the synapse the nerve cell releases chemical
    compounds called neurotransmitters, which excite or
    inhibit a chemical/electrical discharge in the
    neighboring nerve cells.
  • The summation of the responses of the adjacent
    neurons will elicit the appropriate response in
    the neuron.

5
Brief History of ANN
  • McCulloch and Pitts (1943) designed the first
    neural network
  • Hebb (1949) developed the first learning rule: if two
    neurons are active at the same time, then the strength
    of the connection between them should be increased.
  • Rosenblatt (1958) introduced the concept of a
    perceptron, which performed pattern recognition.
  • Widrow and Hoff (1960) introduced the concept of the
    ADALINE (ADAptive Linear Element). Its training rule
    was based on the Least-Mean-Squares learning rule,
    which minimizes the error between the computed output
    and the desired output.
  • Minsky and Papert (1969) showed that the perceptron
    could only recognize classes that are separated by
    linear boundaries. "Neural net winter" followed.
  • Kohonen and Anderson independently developed neural
    networks that acted like memories.
  • Werbos (1974) developed the concept of back
    propagation of an error to train the weights of a
    neural network.
  • McClelland and Rumelhart (1986) published the paper on
    the back propagation algorithm: the rebirth of neural
    networks.
  • Today they are found everywhere a decision can be
    made.

Source: G5AIAI - Introduction to Artificial Intelligence, Graham Kendall
6
Basic Neural Network
[Diagram: the inputs p, weighted by W, are summed together with the bias b, and the sum passes through f() to produce the output]
  • Inputs: normally a vector of measured parameters
  • Bias: may or may not be added
  • f(): transfer or activation function
  • Output: o = f(W^T p + b)
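The output formula above can be sketched in a few lines of Python/NumPy (an illustrative addition, not part of the original slides); the weight, input, and bias values are made up:

```python
import numpy as np

def logsig(n):
    """Log-sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-n))

# Illustrative values (not from the slides)
W = np.array([0.5, -1.2, 0.8])   # one weight per input
p = np.array([1.0, 0.3, 2.0])    # input vector of measured parameters
b = 0.4                          # bias

n = W @ p + b      # weighted sum: W^T p + b
o = logsig(n)      # neuron output
print(n, o)
```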
7
Activation Functions
Source: Supervised Neural Network Introduction, CISC 873 Data Mining, Yabin Meng
8
Log Sigmoidal Function
Source: Artificial Neural Networks, Colin P. Fahey, http://www.colinfahey.com/2003apr20_neuron/index.htm
9
Hard Limit Function
[Plot: the hard limit (step) activation function, output y versus input x; y-axis marked at 1.0 and -1.0]
10
Log Sigmoid and Derivative
Source: The Scientist and Engineer's Guide to Digital Signal Processing by Steven W. Smith
11
Derivative of the Log Sigmoidal Function
s(x)  = (1 + e^-x)^-1

s'(x) = -(1 + e^-x)^-2 (-e^-x)
      = e^-x / (1 + e^-x)^2
      = [ e^-x / (1 + e^-x) ] [ 1 / (1 + e^-x) ]
      = [ (1 + e^-x - 1) / (1 + e^-x) ] [ 1 / (1 + e^-x) ]
      = [ 1 - 1/(1 + e^-x) ] [ 1 / (1 + e^-x) ]

s'(x) = (1 - s(x)) s(x)
Derivative is important for the back error
propagation algorithm used to train multilayer
neural networks.
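As a quick numerical check (an added sketch, not part of the deck), the closed form s'(x) = (1 - s(x)) s(x) can be compared against a finite-difference estimate of the derivative:

```python
import numpy as np

def s(x):
    """Log-sigmoid function."""
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5, 5, 11)
h = 1e-6
numeric = (s(x + h) - s(x - h)) / (2 * h)   # central finite difference
analytic = s(x) * (1 - s(x))                # derivative from the slide
print(np.max(np.abs(numeric - analytic)))   # ~1e-11: the two agree
```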
12
Example Single Neuron
  • Given W = 1.3, p = 2.0, b = 3.0
  • n = Wp + b = 1.3(2.0) + 3.0 = 5.6
  • Linear:
  •   f(5.6) = 5.6
  • Hard limit:
  •   f(5.6) = 1.0
  • Log sigmoid:
  •   f(5.6) = 1/(1 + exp(-5.6))
  •          = 1/(1 + 0.0037)
  •          = 0.9963

13
Simple Neural Network
One neuron with a linear activation function => a straight line. Recall the equation of a straight line: y = mx + b, where m is the slope (weight) and b is the y-intercept (bias).

[Plot in the (p1, p2) plane: the decision boundary is the line m·p1 + b = p2, separating the "Good" region from the "Bad" region (m·p1 + b > p2 on one side, m·p1 + b < p2 on the other)]
14
Perceptron Learning
  • Extend our simple perceptron to two inputs and a hard
    limit activation function

[Diagram: inputs p1 and p2 enter with weights W1 and W2, a bias is added, and the sum passes through the hard limit function f() to give the output]

o = f(W^T p + b), where W is the weight matrix, p is the input vector, and o is our scalar output.
15
Rules of Matrix Math
  • Addition / Subtraction (element by element)

      [1 2 3]   [9 8 7]   [10 10 10]
      [4 5 6] + [6 5 4] = [10 10 10]
      [7 8 9]   [3 2 1]   [10 10 10]

  • Multiplication by a scalar

      a [1 2] = [ a 2a]
        [3 4]   [3a 4a]

  • Transpose

      [1]^T
      [2]   = [1 2]

  • Matrix Multiplication

      [2 4] [5] = 18        [5] [2 4] = [10 20]
            [2]             [2]         [ 4  8]
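These rules are easy to verify with NumPy (an added sketch; the arrays mirror the examples above):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = np.array([[9, 8, 7], [6, 5, 4], [3, 2, 1]])
print(A + B)                             # element-wise sum: all entries are 10

a = 3
print(a * np.array([[1, 2], [3, 4]]))    # scalar multiple: [[3 6], [9 12]]

v = np.array([[1], [2]])
print(v.T)                               # transpose of a column vector: [[1 2]]

r = np.array([[2, 4]])
c = np.array([[5], [2]])
print(r @ c)                             # [[18]]  (row times column: 2*5 + 4*2)
print(c @ r)                             # [[10 20], [4 8]]  (column times row)
```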
16
Data Points for the AND Function
  • q1 = [0 0]^T, o1 = 0
  • q2 = [1 0]^T, o2 = 0
  • q3 = [0 1]^T, o3 = 0
  • q4 = [1 1]^T, o4 = 1

Truth Table
    P1  P2  |  O
     0   0  |  0
     1   0  |  0
     0   1  |  0
     1   1  |  1
17
Weight Vector and the Decision Boundary
  • W = [1.0 1.0]^T

The weight vector has a magnitude and a direction. The decision boundary is the line where W^T p = -b, or W^T p + b = 0. On one side of the boundary W^T p + b > 0; on the other, W^T p + b < 0.

As we adjust the weights and biases of the neural network, we change the magnitude and direction of the weight vector, or the slope and intercept of the decision boundary.
18
Perceptron Learning Rule
  • Adjusting the weights of the perceptron
  • Perceptron error: the difference between the desired
    and derived outputs.
  • e = Desired - Derived
  • When e = 1:   W_new = W_old + p
  • When e = -1:  W_new = W_old - p
  • When e = 0:   W_new = W_old
  • Simplifying:
  •   W_new = W_old + η e p
  •   b_new = b_old + e
  • η is the learning rate (η = 1 for the perceptron).
19
AND Function Example
  • Start with W1 = 1, W2 = 1, and b = -1

    W^T      p         b     t - a = e
    [1 1]   [0 0]^T   -1     0 - 0 =  0   N/C
    [1 1]   [0 1]^T   -1     0 - 1 = -1
    [1 0]   [1 0]^T   -2     0 - 0 =  0   N/C
    [1 0]   [1 1]^T   -2     1 - 0 =  1
20
    W^T      p         b     t - a = e
    [2 1]   [0 0]^T   -1     0 - 0 =  0   N/C
    [2 1]   [0 1]^T   -1     0 - 1 = -1
    [2 0]   [1 0]^T   -2     0 - 1 = -1
    [1 0]   [1 1]^T   -3     1 - 0 =  1

21
    W^T      p         b     t - a = e
    [2 1]   [0 0]^T   -2     0 - 0 =  0   N/C
    [2 1]   [0 1]^T   -2     0 - 0 =  0   N/C
    [2 1]   [1 0]^T   -2     0 - 1 = -1
    [1 1]   [1 1]^T   -3     1 - 0 =  1

22
    W^T      p         b     t - a = e
    [2 2]   [0 0]^T   -2     0 - 0 =  0   N/C
    [2 2]   [0 1]^T   -2     0 - 1 = -1
    [2 1]   [1 0]^T   -3     0 - 0 =  0   N/C
    [2 1]   [1 1]^T   -3     1 - 1 =  0   N/C

23
    W^T      p         b     t - a = e
    [2 1]   [0 0]^T   -3     0 - 0 =  0   N/C
    [2 1]   [0 1]^T   -3     0 - 0 =  0   N/C

    Done!

[Diagram of the trained AND perceptron: o = hardlim(2·p1 + 1·p2 - 3)]
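The hand iterations on the preceding slides can be verified with a short added script that applies the perceptron rule (learning rate 1) to the four AND points in the same order until every example is classified correctly; it ends at the same solution, W = [2, 1], b = -3:

```python
import numpy as np

def hardlim(n):
    """Hard limit activation: 1 if n >= 0, else 0."""
    return 1.0 if n >= 0 else 0.0

# AND data, in the order used on the slides
P = [np.array([0, 0]), np.array([0, 1]),
     np.array([1, 0]), np.array([1, 1])]
T = [0, 0, 0, 1]

W = np.array([1.0, 1.0])   # starting weights from slide 19
b = -1.0                   # starting bias

converged = False
while not converged:
    converged = True
    for p, t in zip(P, T):
        a = hardlim(W @ p + b)     # perceptron output
        e = t - a                  # error = desired - derived
        if e != 0:
            W = W + e * p          # W_new = W_old + e p
            b = b + e              # b_new = b_old + e
            converged = False

print(W, b)   # [2. 1.] -3.0, the same solution as the hand iterations
```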
24
XOR Function
  • Truth Table

    X  Y  |  Z = (X AND NOT Y) OR (NOT X AND Y)
    0  0  |  0
    0  1  |  1
    1  0  |  1
    1  1  |  0

No single decision boundary can separate the favorable and unfavorable outcomes.

[Circuit diagram realizing z = (x AND NOT y) OR (NOT x AND y)]

We will need a more complicated neural net to realize this function.
25
XOR Function Multilayer Perceptron
[Diagram: a two-layer network. Hidden neurons f1(W1·x + W4·y + b11) and f1(W2·x + W3·y + b12) feed an output neuron f() through weights W5 and W6 with bias b2]

z = f( W5·f1(W1·x + W4·y + b11) + W6·f1(W2·x + W3·y + b12) + b2 )

The weights of the neural net are independent of each other, so we can compute the partial derivatives of z with respect to the weights of the network, i.e. dz/dW1, dz/dW2, dz/dW3, dz/dW4, dz/dW5, dz/dW6.
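A sketch of this forward pass in Python (an added example). The weights below are one hand-picked set that realizes XOR with hard limit activations; they are illustrative and are not taken from the slides:

```python
def hardlim(n):
    return 1 if n >= 0 else 0

# One hand-picked set of weights that realizes XOR (illustrative only)
W1, W4, b11 = 1, -1, -0.5    # first hidden neuron  ~ (x AND NOT y)
W2, W3, b12 = -1, 1, -0.5    # second hidden neuron ~ (NOT x AND y)
W5, W6, b2  = 1, 1, -0.5     # output neuron        ~ OR of the two

def xor_net(x, y, f1=hardlim, f=hardlim):
    h1 = f1(W1 * x + W4 * y + b11)
    h2 = f1(W2 * x + W3 * y + b12)
    return f(W5 * h1 + W6 * h2 + b2)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, xor_net(x, y))   # reproduces the XOR truth table
```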
26
Back Propagation Diagram
Source: Neural Networks and Logistic Regression by Lucila Ohno-Machado, Decision Systems Group, Brigham and Women's Hospital, Department of Radiology
27
Back Propagation Algorithm
  • This algorithm to train artificial neural networks
    (ANNs) depends on two basic concepts:
  •   a) Reduce the sum squared error (SSE) to an
        acceptable value.
  •   b) Use reliable data to train your network under
        your supervision.
  • Simple case: a single-input neural net with no bias.

[Diagram: x --W1--> n1 --f1--> a1 --W2--> n2 --f2--> z; T = desired output]
28
BP Equations
  • n1 = W1 x
  • a1 = f1(n1) = f1(W1 x)
  • n2 = W2 a1 = W2 f1(n1) = W2 f1(W1 x)
  • z = f2(n2) = f2(W2 f1(W1 x))
  • SSE = 1/2 (z - T)^2
  • Let's now take the partial derivatives:
  • dSSE/dW2 = (z - T) d(z - T)/dW2 = (z - T) dz/dW2
  •          = (z - T) df2(n2)/dW2
  • Chain rule:
  •   df2(n2)/dW2 = (df2(n2)/dn2) (dn2/dW2)
  •               = (df2(n2)/dn2) a1
  • dSSE/dW2 = (z - T) (df2(n2)/dn2) a1
  • Define η as our learning rate (0 < η < 1, typical
    η = 0.2)
  • Compute our new weight:
  • W2(k+1) = W2(k) - η (dSSE/dW2)
  •         = W2(k) - η ((z - T) (df2(n2)/dn2) a1)
29
  • Sigmoid function:
  •   df2(n2)/dn2 = f2(n2)(1 - f2(n2)) = z(1 - z)
  • Therefore:
  •   W2(k+1) = W2(k) - η ((z - T) (z(1 - z)) a1)
  • Analysis for W1:
  • n1 = W1 x
  • a1 = f1(W1 x)
  • n2 = W2 f1(n1) = W2 f1(W1 x)
  • dSSE/dW1 = (z - T) d(z - T)/dW1 = (z - T) dz/dW1
  •          = (z - T) df2(n2)/dW1
  • df2(n2)/dW1 = (df2(n2)/dn2) (dn2/dW1)   -> chain rule
  • dn2/dW1 = W2 (df1(n1)/dW1)
  •         = W2 (df1(n1)/dn1) (dn1/dW1)    -> chain rule
  •         = W2 (df1(n1)/dn1) x
  • dSSE/dW1 = (z - T) (df2(n2)/dn2) W2 (df1(n1)/dn1) x
  • W1(k+1) = W1(k) - η ((z - T) (df2(n2)/dn2) W2
    (df1(n1)/dn1) x)
  • df2(n2)/dn2 = z(1 - z)  and  df1(n1)/dn1 = a1(1 - a1)
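These two update rules translate directly into code. Below is a minimal added sketch of one gradient-descent step for the single-input, no-bias network, using the log sigmoid for both f1 and f2 and made-up values for x, T, and the starting weights:

```python
import math

def sig(n):
    return 1.0 / (1.0 + math.exp(-n))

# Illustrative values (not from the slides)
W1, W2 = 0.4, 0.7      # current weights
x, T   = 1.0, 0.9      # training input and desired output
eta    = 0.2           # learning rate

# Forward pass
n1 = W1 * x
a1 = sig(n1)           # f1 = log sigmoid
n2 = W2 * a1
z  = sig(n2)           # f2 = log sigmoid

# Backward pass (the derivatives from the slides)
dSSE_dW2 = (z - T) * z * (1 - z) * a1
dSSE_dW1 = (z - T) * z * (1 - z) * W2 * a1 * (1 - a1) * x

# Gradient-descent updates
W2 = W2 - eta * dSSE_dW2
W1 = W1 - eta * dSSE_dW1
print(W1, W2)
```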

30
Gradient Descent
[Plot: error versus training time, showing a local minimum and the global minimum]

Source: Neural Networks and Logistic Regression by Lucila Ohno-Machado, Decision Systems Group, Brigham and Women's Hospital, Department of Radiology
31
2-D Diagram of Gradient Descent
Source: Back Propagation Algorithm by Olena Lobunets, www.essex.ac.uk/ccfea/Courses/workshops03-04/Workshop4/Workshop204.ppt
32
Learning by Example
  • Training algorithm: back propagation of errors using
    gradient descent training.
  • Colors:
  •   Red: current weights
  •   Orange: updated weights
  •   Black boxes: inputs and outputs to a neuron
  •   Blue: sensitivities at each layer

Source: A Brief Overview of Neural Networks by Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch, campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
33
First Pass
Gradient of the output neuron = slope of the transfer function × error
Error = 1 - 0.6508 = 0.3492
G3 = (1)(0.3492) = 0.3492

Gradient of a hidden neuron, G = slope of the transfer function × Σ [(weight to the next neuron) × (gradient of the next neuron)]
G2 = (0.6508)(1 - 0.6508)(0.3492)(0.5) = 0.0397
G1 = (0.6225)(1 - 0.6225)(0.0397)(0.5)(2) = 0.0093

[Figure: first forward pass, input 1, neuron outputs 0.6508]
34
Weight Update 1
New weight = old weight + (learning rate)(gradient)(prior output)

    W3:  0.5 + (0.5)(0.3492)(0.6508) = 0.6136
    W2:  0.5 + (0.5)(0.0397)(0.6225) = 0.5124
    W1:  0.5 + (0.5)(0.0093)(1)      = 0.5047

Source: A Brief Overview of Neural Networks by Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch, campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
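These numbers can be reproduced with a short added script. The sketch below assumes the architecture suggested by this worked example (one input, two sigmoid layers of two neurons each, a linear output node with slope 1, all weights initialized to 0.5, learning rate 0.5, target 1); that reading is inferred from the figures and is not stated explicitly in the transcript:

```python
import math

sig = lambda n: 1.0 / (1.0 + math.exp(-n))

w1 = w2 = w3 = 0.5     # initial weights at each of the three stages (assumed)
lr, target, x = 0.5, 1.0, 1.0

# Forward pass (each layer has two identical neurons, so sums double)
a1 = sig(w1 * x)               # 0.6225, first-layer neuron output
a2 = sig(2 * w2 * a1)          # 0.6508, second-layer neuron output
z  = 2 * w3 * a2               # 0.6508, linear output node

# Gradients: slope of the transfer function times back-propagated error
g3 = 1.0 * (target - z)                    # 0.3492
g2 = a2 * (1 - a2) * w3 * g3               # 0.0397
g1 = a1 * (1 - a1) * 2 * w2 * g2           # 0.0093

# Weight updates: new = old + lr * gradient * prior output
w3 += lr * g3 * a2      # 0.6136
w2 += lr * g2 * a1      # 0.5124
w1 += lr * g1 * x       # 0.5047
print(round(w1, 4), round(w2, 4), round(w3, 4))
```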
35
Second Pass
Error = 1 - 0.8033 = 0.1967
G3 = (1)(0.1967) = 0.1967
G2 = (0.6545)(1 - 0.6545)(0.1967)(0.6136) = 0.0273
G1 = (0.6236)(1 - 0.6236)(0.5124)(0.0273)(2) = 0.0066

[Figure: second forward pass with the updated weights, input 1, network output 0.8033]

Source: A Brief Overview of Neural Networks by Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch, campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
36
Weight Update 2
New weight = old weight + (learning rate)(gradient)(prior output)

    W3:  0.6136 + (0.5)(0.1967)(0.6545) = 0.6779
    W2:  0.5124 + (0.5)(0.0273)(0.6236) = 0.5209
    W1:  0.5047 + (0.5)(0.0066)(1)      = 0.508

Source: A Brief Overview of Neural Networks by Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch, campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
37
Third Pass
[Figure: third forward pass with the updated weights (0.508, 0.5209, 0.6779); intermediate values 0.6243 and 0.6504, network output 0.8909]

Source: A Brief Overview of Neural Networks by Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch, campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
38
Weight Update Summary
W1: weights from the input to the input layer
W2: weights from the input layer to the hidden layer
W3: weights from the hidden layer to the output layer

Source: A Brief Overview of Neural Networks by Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch, campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
39
ECG Interpretation
Source: Neural Networks and Logistic Regression by Lucila Ohno-Machado, Decision Systems Group, Brigham and Women's Hospital, Department of Radiology
40
Other Applications of ANN
  • Lip Reading Using Artificial Neural Network
      Ahmad Khoshnevis, Sridhar Lavu, Bahar Sadeghi, and
      Yolanda Tsang, ELEC502 Course Project
      www-dsp.rice.edu/lavu/research/doc/502lavu.ps
  • AI Techniques in Power Electronics and Drives
      Dr. Marcelo G. Simões, Colorado School of Mines
      egweb.mines.edu/msimoes/tutorial
  • Car Classification with Neural Networks
      Koichi Sato, Sangho Park
      hercules.ece.utexas.edu/course/ee380l/1999sp/present/carclass.ppt
  • Face Detection and Neural Networks
      Todd Wittman
      www.ima.umn.edu/whitman/faces/face_detection2.ppt
  • A Neural Network for Detecting and Diagnosing Tornadic
    Circulations
      V Lakshmanan, Gregory Stumpf, Arthur Witt
      www.cimms.ou.edu/lakshman/Papers/mdann_talk.ppt
41
Bibliography
  • A Brief Overview of Neural Networks
      Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and
      Donald C. Wunsch
      campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
  • Neural Networks and Logistic Regression
      Lucila Ohno-Machado
      Decision Systems Group, Brigham and Women's Hospital,
      Department of Radiology
      dsg.harvard.edu/courses/hst951/ppt/hst951_0320.ppt
  • G5AIAI Introduction to AI by Graham Kendall
      School of Computer Science and IT, University of
      Nottingham
      www.cs.nott.ac.uk/gxk/courses/g5aiai
  • The Scientist and Engineer's Guide to Digital Signal
    Processing
      Steven W. Smith, Ph.D., California Technical
      Publishing
      www.dspguide.com
  • Neural Network Design