Title: Artificial Neural Networks
1. Artificial Neural Networks
Artificial Intelligence
- Brian Talecki
- CSC 8520
- Villanova University
2. ANN - Artificial Neural Network
- A set of algebraic equations and functions which determine the best output for a given set of inputs.
- An artificial neural network is modeled on a highly simplified version of the neurons that make up the human nervous system.
- Although individual neurons operate at roughly one millionth the speed of modern computer circuits, the brain outperforms computers on many tasks because of the parallel processing structure of the nervous system.
3. Human Nerve Cell
- Picture from G5AIAI Introduction to AI by Graham Kendall
- www.cs.nott.ac.uk/gxk/courses/g5aiai
4.
- At the synapse the nerve cell releases chemical compounds called neurotransmitters, which excite or inhibit a chemical/electrical discharge in the neighboring nerve cells.
- The summation of the responses of the adjacent neurons elicits the appropriate response in the neuron.
5. Brief History of ANN
- McCulloch and Pitts (1943) designed the first neural network.
- Hebb (1949) developed the first learning rule: if two neurons are active at the same time, then the strength of the connection between them should be increased.
- Rosenblatt (1958) introduced the concept of a perceptron, which performed pattern recognition.
- Widrow and Hoff (1960) introduced the ADALINE (ADAptive LINear Element). Its training rule was based on the Least-Mean-Squares learning rule, which minimizes the error between the computed output and the desired output.
- Minsky and Papert (1969) showed that the perceptron could only recognize classes separated by linear boundaries. Neural net winter.
- Kohonen and Anderson independently developed neural networks that acted like memories.
- Werbos (1974) developed the concept of back propagation of an error to train the weights of a neural network.
- McClelland and Rumelhart (1986) published the paper on the back propagation algorithm. Rebirth of neural networks.
- Today they are everywhere a decision can be made.
Source: G5AIAI - Introduction to Artificial Intelligence, Graham Kendall
6. Basic Neural Network
(diagram: inputs p weighted by W, summed together with a bias b, and passed through a transfer function f() to produce the output)
- Inputs: normally a vector p of measured parameters
- Bias b may or may not be added
- f() is the transfer or activation function
- Output = f(W^T p + b)
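A minimal Python sketch of this computation (the weight, input, and bias values below are arbitrary illustrations, not taken from the slides):

import numpy as np

def neuron_output(W, p, b, f):
    # Single neuron: output = f(W^T p + b)
    return f(np.dot(W, p) + b)

W = np.array([0.5, -1.2, 0.3])                # weight vector
p = np.array([1.0, 2.0, 0.5])                 # input vector of measured parameters
b = 0.1                                       # bias

linear = lambda n: n                          # linear activation
logsig = lambda n: 1.0 / (1.0 + np.exp(-n))   # log-sigmoid activation

print(neuron_output(W, p, b, linear))         # -1.65
print(neuron_output(W, p, b, logsig))         # about 0.161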
7. Activation Functions
Source: Supervised Neural Network Introduction, CISC 873 Data Mining, Yabin Meng
8. Log Sigmoidal Function
Source: Artificial Neural Networks, Colin P. Fahey
http://www.colinfahey.com/2003apr20_neuron/index.htm
9. Hard Limit Function
(plot: hard limit activation function; the output y switches between -1.0 and 1.0 along the x axis)
10. Log Sigmoid and Derivative
Source: The Scientist and Engineer's Guide to Digital Signal Processing by Steven Smith
11. Derivative of the Log Sigmoidal Function
- s(x) = (1 + e^-x)^-1
- s'(x) = -(1 + e^-x)^-2 (-e^-x)
        = e^-x (1 + e^-x)^-2
        = ( e^-x / (1 + e^-x) ) ( 1 / (1 + e^-x) )
        = ( (1 + e^-x - 1) / (1 + e^-x) ) ( 1 / (1 + e^-x) )
        = ( 1 - 1/(1 + e^-x) ) ( 1/(1 + e^-x) )
- s'(x) = (1 - s(x)) s(x)

The derivative is important for the back error propagation algorithm used to train multilayer neural networks.
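The identity s'(x) = (1 - s(x)) s(x) can be checked numerically; a short Python sketch (the function and variable names are my own):

import numpy as np

def s(x):
    # Log-sigmoid function
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 11)
h = 1e-6
numeric  = (s(x + h) - s(x - h)) / (2 * h)   # central-difference derivative
analytic = s(x) * (1.0 - s(x))               # derivative from the identity above

print(np.max(np.abs(numeric - analytic)))    # tiny difference: the two agree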
12. Example: Single Neuron
- Given W = 1.3, p = 2.0, b = 3.0
- Wp + b = 1.3(2.0) + 3.0 = 5.6
- Linear: f(5.6) = 5.6
- Hard limit: f(5.6) = 1.0
- Log sigmoidal: f(5.6) = 1/(1 + exp(-5.6)) = 1/(1 + 0.0037) = 0.9963
13. Simple Neural Network
One neuron with a linear activation function => a straight line. Recall the equation of a straight line: y = mx + b, where m is the slope (weight) and b is the y-intercept (bias).
(plot in the p1-p2 plane: the line is the decision boundary; points where m p1 + b > p2 fall on one side ("Good") and points where m p1 + b < p2 fall on the other ("Bad"))
14. Perceptron Learning
- Extend our simple perceptron to two inputs and a hard limit activation function.
(diagram: inputs p1 and p2 with weights W1 and W2, plus a bias, feed a summation and a hard limit function F() that produces the output)
- o = f(W^T p + b), where W is the weight vector, p is the input vector, and o is our scalar output.
15. Rules of Matrix Math
- Addition/Subtraction (element by element):
  [1 2 3]   [9 8 7]   [10 10 10]
  [4 5 6] + [6 5 4] = [10 10 10]
  [7 8 9]   [3 2 1]   [10 10 10]
- Multiplication by a scalar:
  a [1 2] = [a  2a]
    [3 4]   [3a 4a]
- Transpose:
  [1 2]^T = [1]
            [2]
- Matrix multiplication:
  [2 4] [5] = 2(5) + 4(2) = 18      (row times column)
        [2]
  [5] [2 4] = [10 20]               (column times row)
  [2]         [ 4  8]
16. Data Points for the AND Function
- q1 = [0; 0], o1 = 0
- q2 = [1; 0], o2 = 0
- q3 = [0; 1], o3 = 0
- q4 = [1; 1], o4 = 1

Truth Table:
P1  P2  O
0   0   0
0   1   0
1   0   0
1   1   1
17. Weight Vector and the Decision Boundary
The weight vector has a magnitude and a direction.
The decision boundary is the line where W^T p = -b, or W^T p + b = 0.
As we adjust the weights and biases of the neural network, we change the magnitude and direction of the weight vector, i.e. the slope and intercept of the decision boundary.
On one side of the boundary W^T p > -b; on the other side W^T p < -b.
18. Perceptron Learning Rule
- Adjusting the weights of the perceptron.
- Perceptron error: the difference between the desired and derived outputs, e = Desired - Derived.
- When e = 1: W_new = W_old + p
- When e = -1: W_new = W_old - p
- When e = 0: W_new = W_old
- Simplifying: W_new = W_old + η e p and b_new = b_old + e,
  where η is the learning rate (η = 1 for the perceptron).
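A minimal Python sketch of this rule, trained on the AND data from the earlier slide (the variable names are my own; hardlim returns 1 for n >= 0 and 0 otherwise, and the learning rate is 1 as stated above). Presenting the patterns in the order [0 0], [0 1], [1 0], [1 1] reproduces the hand iterations on the following slides:

import numpy as np

def hardlim(n):
    # Hard limit activation: 1 if n >= 0, else 0
    return 1.0 if n >= 0 else 0.0

# AND function training data: inputs P (one row per pattern) and targets T
P = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([0, 0, 0, 1], dtype=float)

W = np.array([1.0, 1.0])               # initial weights, as in the example
b = -1.0                               # initial bias

for epoch in range(20):
    errors = 0
    for p, t in zip(P, T):
        a = hardlim(np.dot(W, p) + b)  # derived output
        e = t - a                      # error = desired - derived
        if e != 0:
            W = W + e * p              # W_new = W_old + e p
            b = b + e                  # b_new = b_old + e
            errors += 1
    if errors == 0:                    # stop once every pattern is classified correctly
        break

print(W, b)                            # converges to W = [2 1], b = -3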
19. AND Function Example
- Start with W1 = 1, W2 = 1, and b = -1.
- Each row shows the current W and b, the input p, the net input n = W^T p + b, the target t, the output a = hardlim(n), and the error e = t - a:

W = [1 1], b = -1, p = [0 0]: n = -1, t = 0, a = 0, e = 0, no change
W = [1 1], b = -1, p = [0 1]: n =  0, t = 0, a = 1, e = -1, so W = [1 0], b = -2
W = [1 0], b = -2, p = [1 0]: n = -1, t = 0, a = 0, e = 0, no change
W = [1 0], b = -2, p = [1 1]: n = -1, t = 1, a = 0, e = 1, so W = [2 1], b = -1
20. AND Function Example (continued)
W = [2 1], b = -1, p = [0 0]: n = -1, t = 0, a = 0, e = 0, no change
W = [2 1], b = -1, p = [0 1]: n =  0, t = 0, a = 1, e = -1, so W = [2 0], b = -2
W = [2 0], b = -2, p = [1 0]: n =  0, t = 0, a = 1, e = -1, so W = [1 0], b = -3
W = [1 0], b = -3, p = [1 1]: n = -2, t = 1, a = 0, e = 1, so W = [2 1], b = -2
21. AND Function Example (continued)
W = [2 1], b = -2, p = [0 0]: n = -2, t = 0, a = 0, e = 0, no change
W = [2 1], b = -2, p = [0 1]: n = -1, t = 0, a = 0, e = 0, no change
W = [2 1], b = -2, p = [1 0]: n =  0, t = 0, a = 1, e = -1, so W = [1 1], b = -3
W = [1 1], b = -3, p = [1 1]: n = -1, t = 1, a = 0, e = 1, so W = [2 2], b = -2
22. AND Function Example (continued)
W = [2 2], b = -2, p = [0 0]: n = -2, t = 0, a = 0, e = 0, no change
W = [2 2], b = -2, p = [0 1]: n =  0, t = 0, a = 1, e = -1, so W = [2 1], b = -3
W = [2 1], b = -3, p = [1 0]: n = -1, t = 0, a = 0, e = 0, no change
W = [2 1], b = -3, p = [1 1]: n =  0, t = 1, a = 1, e = 0, no change
23. AND Function Example (continued)
W = [2 1], b = -3, p = [0 0]: n = -3, t = 0, a = 0, e = 0, no change
W = [2 1], b = -3, p = [0 1]: n = -2, t = 0, a = 0, e = 0, no change

Done! All four patterns are now classified correctly.

(final network diagram: p1 with weight 2 and p2 with weight 1 feed a summation with bias -3, followed by the hardlim() activation)
24. XOR Function
- Truth Table: Z = (X and not Y) or (not X and Y)
  X  Y  Z
  0  0  0
  0  1  1
  1  0  1
  1  1  0
(plot: the four input points in the x-y plane; circuit diagram of the XOR function)
No single decision boundary can separate the favorable and unfavorable outcomes. We will need a more complicated neural net to realize this function.
25. XOR Function: Multilayer Perceptron
(diagram: inputs x and y feed two hidden neurons, one with weights W1, W4 and bias b11, the other with weights W2, W3 and bias b12, both with activation f1(); their outputs feed an output neuron with weights W5, W6, bias b2, and activation f(), producing z)

z = f( W5 f1(W1 x + W4 y + b11) + W6 f1(W2 x + W3 y + b12) + b2 )

The weights of the neural net are independent of each other, so we can compute the partial derivatives of z with respect to each weight of the network,
i.e. dz/dW1, dz/dW2, dz/dW3, dz/dW4, dz/dW5, dz/dW6.
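A small Python sketch of this forward computation. The weight values below are one hand-picked set that happens to realize XOR (with log-sigmoid hidden units and a hard limit output); they are illustrative only and are not given on the slide:

import math

def logsig(n):                         # hidden activation f1
    return 1.0 / (1.0 + math.exp(-n))

def hardlim(n):                        # output activation f
    return 1 if n >= 0 else 0

# Hand-picked weights (illustrative): the first hidden neuron acts like OR,
# the second like AND, and the output computes "OR and not AND".
W1, W4, b11, W5 = 10.0, 10.0,  -5.0,  1.0
W2, W3, b12, W6 = 10.0, 10.0, -15.0, -2.0
b2 = -0.5

def xor_net(x, y):
    h1 = logsig(W1 * x + W4 * y + b11)
    h2 = logsig(W2 * x + W3 * y + b12)
    return hardlim(W5 * h1 + W6 * h2 + b2)

for x, y in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, y, xor_net(x, y))         # prints 0, 1, 1, 0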
26. Back Propagation Diagram
Source: Neural Networks and Logistic Regression by Lucila Ohno-Machado, Decision Systems Group, Brigham and Women's Hospital, Department of Radiology
27. Back Propagation Algorithm
- This algorithm for training Artificial Neural Networks (ANN) depends on two basic concepts:
  a) Reduce the Sum Squared Error (SSE) to an acceptable value.
  b) Reliable data to train your network under your supervision.
- Simple case: a single-input, no-bias neural net.
(diagram: x -> weight W1 -> n1 -> f1 -> a1 -> weight W2 -> n2 -> f2 -> z; T = desired output)
28. BP Equations
- n1 = W1 x
- a1 = f1(n1) = f1(W1 x)
- n2 = W2 a1 = W2 f1(n1) = W2 f1(W1 x)
- z = f2(n2) = f2(W2 f1(W1 x))
- SSE = 1/2 (z - T)^2
- Let's now take the partial derivatives:
- dSSE/dW2 = (z - T) d(z - T)/dW2 = (z - T) dz/dW2 = (z - T) df2(n2)/dW2
- Chain rule: df2(n2)/dW2 = (df2(n2)/dn2)(dn2/dW2) = (df2(n2)/dn2) a1
- dSSE/dW2 = (z - T)(df2(n2)/dn2) a1
- Define η as our learning rate (0 < η < 1, typical η = 0.2)
- Compute our new weight:
- W2(k+1) = W2(k) - η (dSSE/dW2) = W2(k) - η ((z - T)(df2(n2)/dn2) a1)
29.
- Sigmoid function: df2(n2)/dn2 = f2(n2)(1 - f2(n2)) = z(1 - z)
- Therefore W2(k+1) = W2(k) - η ((z - T) z(1 - z) a1)
- Analysis for W1:
- n1 = W1 x
- a1 = f1(W1 x)
- n2 = W2 f1(n1) = W2 f1(W1 x)
- dSSE/dW1 = (z - T) d(z - T)/dW1 = (z - T) dz/dW1 = (z - T) df2(n2)/dW1
- df2(n2)/dW1 = (df2(n2)/dn2)(dn2/dW1) -> chain rule
- dn2/dW1 = W2 (df1(n1)/dW1) = W2 (df1(n1)/dn1)(dn1/dW1) -> chain rule
          = W2 (df1(n1)/dn1) x
- dSSE/dW1 = (z - T)(df2(n2)/dn2) W2 (df1(n1)/dn1) x
- W1(k+1) = W1(k) - η ((z - T)(df2(n2)/dn2) W2 (df1(n1)/dn1) x)
- df2(n2)/dn2 = z(1 - z) and df1(n1)/dn1 = a1(1 - a1)
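A minimal Python sketch of these update equations, assuming both f1 and f2 are log-sigmoids and a single training pair (x, T); the input, target, and initial weights below are illustrative, not from the slide:

import math

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

x, T = 1.0, 0.8                        # illustrative input and desired output
W1, W2 = 0.5, 0.5                      # illustrative initial weights
eta = 0.2                              # learning rate, as suggested above

for step in range(5000):
    # Forward pass
    n1 = W1 * x
    a1 = logsig(n1)                    # f1(n1)
    n2 = W2 * a1
    z  = logsig(n2)                    # f2(n2)

    # Gradients from the derivations above
    dSSE_dW2 = (z - T) * z * (1 - z) * a1
    dSSE_dW1 = (z - T) * z * (1 - z) * W2 * a1 * (1 - a1) * x

    # Gradient-descent weight updates
    W2 -= eta * dSSE_dW2
    W1 -= eta * dSSE_dW1

print(W1, W2, z)                       # z moves toward the target T as training proceeds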
30. Gradient Descent
(plot: error versus training time, showing a local minimum and the global minimum)
Source: Neural Networks and Logistic Regression by Lucila Ohno-Machado, Decision Systems Group, Brigham and Women's Hospital, Department of Radiology
31. 2-D Diagram of Gradient Descent
Source: Back Propagation algorithm by Olena Lobunets
www.essex.ac.uk/ccfea/Courses/workshops03-04/Workshop4/Workshop204.ppt
32. Learning by Example
- Training algorithm: backpropagation of errors using gradient descent training.
- Colors:
  - Red: current weights
  - Orange: updated weights
  - Black boxes: inputs and outputs to a neuron
  - Blue: sensitivities at each layer
Source: A Brief Overview of Neural Networks, Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch
campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
33. First Pass
Network output = 0.6508 for an input of 1. Error = 1 - 0.6508 = 0.3492.
Gradient of the output neuron = slope of the transfer function x error:
G3 = (1)(0.3492) = 0.3492
Gradient of a hidden neuron G = slope of its transfer function x Σ(weight to the next neuron x gradient of the next neuron):
G2 = (0.6508)(1 - 0.6508)(0.3492)(0.5) = 0.0397
G1 = (0.6225)(1 - 0.6225)(0.0397)(0.5)(2) = 0.0093
34. Weight Update 1
New weight = old weight + (learning rate)(gradient)(prior output):
- W3 (hidden layer to output): 0.5 + (0.5)(0.3492)(0.6508) = 0.6136
- W2 (input layer to hidden layer): 0.5 + (0.5)(0.0397)(0.6225) = 0.5124
- W1 (input to input layer): 0.5 + (0.5)(0.0093)(1) = 0.5047
Source: A Brief Overview of Neural Networks, Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch
campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
35. Second Pass
Network output = 0.8033. Error = 1 - 0.8033 = 0.1967.
G3 = (1)(0.1967) = 0.1967
G2 = (0.6545)(1 - 0.6545)(0.1967)(0.6136) = 0.0273
G1 = (0.6236)(1 - 0.6236)(0.5124)(0.0273)(2) = 0.0066
Source: A Brief Overview of Neural Networks, Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch
campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
36. Weight Update 2
New weight = old weight + (learning rate)(gradient)(prior output):
- W3 (hidden layer to output): 0.6136 + (0.5)(0.1967)(0.6545) = 0.6779
- W2 (input layer to hidden layer): 0.5124 + (0.5)(0.0273)(0.6236) = 0.5209
- W1 (input to input layer): 0.5047 + (0.5)(0.0066)(1) = 0.508
Source: A Brief Overview of Neural Networks, Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch
campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
37. Third Pass
(diagram: forward pass with the updated weights 0.508, 0.5209, and 0.6779; layer values 0.6243 and 0.6504, and network output 0.8909, now much closer to the target of 1)
Source: A Brief Overview of Neural Networks, Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch
campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
38. Weight Update Summary
- W1: weights from the input to the input layer
- W2: weights from the input layer to the hidden layer
- W3: weights from the hidden layer to the output layer
Source: A Brief Overview of Neural Networks, Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch
campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
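The three passes above can be reproduced with a short Python script. The sketch below assumes what the figures imply but the slides do not state explicitly: a 1-2-2-1 network with no biases, one shared weight value per layer (all initialized to 0.5), log-sigmoid neurons in the two middle layers, a linear output neuron, a target output of 1 for an input of 1, and learning rate 0.5:

import math

def logsig(n):
    return 1.0 / (1.0 + math.exp(-n))

x, target, lr = 1.0, 1.0, 0.5
W1 = W2 = W3 = 0.5                     # one shared weight per layer (assumed)

for p in range(3):
    # Forward pass: input -> 2 identical neurons -> 2 identical neurons -> linear output
    a1 = logsig(W1 * x)                # each input-layer neuron
    a2 = logsig(2 * W2 * a1)           # each hidden neuron sums two identical inputs
    z  = 2 * W3 * a2                   # linear output neuron

    # Backward pass: gradient = slope of transfer function * back-propagated error
    G3 = 1.0 * (target - z)            # output neuron (linear, slope 1)
    G2 = a2 * (1 - a2) * W3 * G3       # hidden-layer gradient
    G1 = a1 * (1 - a1) * 2 * W2 * G2   # input-layer gradient (each feeds 2 hidden neurons)

    # Weight updates: new = old + (learning rate)(gradient)(prior output)
    W3 += lr * G3 * a2
    W2 += lr * G2 * a1
    W1 += lr * G1 * x

    print(p + 1, round(z, 4), round(W1, 4), round(W2, 4), round(W3, 4))

Under these assumptions the script reproduces, up to rounding, the outputs 0.6508, 0.8033, and 0.891 and the weight updates shown on the preceding slides.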
39. ECG Interpretation
Source: Neural Networks and Logistic Regression by Lucila Ohno-Machado, Decision Systems Group, Brigham and Women's Hospital, Department of Radiology
40. Other Applications of ANN
- Lip Reading Using Artificial Neural Network
  Ahmad Khoshnevis, Sridhar Lavu, Bahar Sadeghi and Yolanda Tsang, ELEC502 Course Project
  www-dsp.rice.edu/lavu/research/doc/502lavu.ps
- AI Techniques in Power Electronics and Drives
  Dr. Marcelo G. Simões, Colorado School of Mines
  egweb.mines.edu/msimoes/tutorial
- Car Classification with Neural Networks
  Koichi Sato, Sangho Park
  hercules.ece.utexas.edu/course/ee380l/1999sp/present/carclass.ppt
- Face Detection and Neural Networks
  Todd Wittman
  www.ima.umn.edu/whitman/faces/face_detection2.ppt
- A Neural Network for Detecting and Diagnosing Tornadic Circulations
  V. Lakshmanan, Gregory Stumpf, Arthur Witt
  www.cimms.ou.edu/lakshman/Papers/mdann_talk.ppt
41. Bibliography
- A Brief Overview of Neural Networks
  Rohit Dua, Samuel A. Mulder, Steve E. Watkins, and Donald C. Wunsch
  campus.umr.edu/smartengineering/EducationalResources/Neural_Net.ppt
- Neural Networks and Logistic Regression
  Lucila Ohno-Machado, Decision Systems Group, Brigham and Women's Hospital, Department of Radiology
  dsg.harvard.edu/courses/hst951/ppt/hst951_0320.ppt
- G5AIAI Introduction to AI by Graham Kendall
  School of Computer Science and IT, University of Nottingham
  www.cs.nott.ac.uk/gxk/courses/g5aiai
- The Scientist and Engineer's Guide to Digital Signal Processing
  Steven W. Smith, Ph.D., California Technical Publishing
  www.dspguide.com
- Neural Network Design