1
Learning with Neural Networks
  • Artificial Intelligence
  • CMSC 25000
  • February 19, 2002

2
Agenda
  • Neural Networks
  • Biological analogy
  • Review single-layer perceptrons
  • Perceptron Pros & Cons
  • Neural Networks: Multilayer perceptrons
  • Neural net training: Backpropagation
  • Strengths & Limitations
  • Conclusions

3
Neurons: The Concept

[Figure: neuron diagram labeling dendrites, axon, nucleus, and cell body]

Neurons receive inputs from other neurons (via synapses). When the input exceeds a threshold, the neuron fires, sending output along its axon to other neurons. Brain: ~10^11 neurons, ~10^16 synapses.
4
Perceptron Structure
Single neuron-like element: binary inputs and output; fires when the weighted sum of inputs > threshold

[Figure: perceptron with inputs x0 = -1, x1, x2, x3, ..., xn, weights w0, w1, w2, w3, ..., wn, and output y]

Training rule (sketched in code below): until the perceptron gives the correct output for all samples:
  • If the perceptron is correct, do nothing
  • If it incorrectly says yes, subtract the input vector from the weight vector
  • Otherwise, add the input vector to the weight vector
The fixed input x0 = -1 with weight w0 compensates for the threshold
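A minimal sketch of this update rule in Python (the helper name train_perceptron and the toy AND dataset are illustrative, not from the slides):

    # Perceptron training: subtract the input when the unit wrongly fires,
    # add it when the unit wrongly stays silent; x0 = -1 absorbs the threshold.
    def train_perceptron(samples, n_inputs, max_epochs=100):
        w = [0.0] * (n_inputs + 1)               # w[0] is the threshold weight
        for _ in range(max_epochs):
            all_correct = True
            for x, target in samples:
                xv = [-1.0] + list(x)            # prepend the fixed input x0 = -1
                fired = 1 if sum(wi * xi for wi, xi in zip(w, xv)) > 0 else 0
                if fired == target:
                    continue                     # correct: do nothing
                all_correct = False
                sign = -1.0 if fired == 1 else 1.0   # wrong yes: subtract; wrong no: add
                w = [wi + sign * xi for wi, xi in zip(w, xv)]
            if all_correct:
                break
        return w

    # AND is linearly separable, so training is guaranteed to converge.
    print(train_perceptron([((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)], 2))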
5
Perceptron Learning
  • Perceptrons learn linear decision boundaries
  • E.g., classes separable by a line in the (x1, x2) plane
  • Guaranteed to converge, if linearly separable
  • Many simple functions NOT learnable

[Figure: a linearly separable arrangement of + and 0 points in the (x1, x2) plane, but not XOR, whose + and 0 points cannot be split by any single line]
6
Neural Nets
  • Multi-layer perceptrons
  • Inputs: real-valued
  • Intermediate: hidden nodes
  • Output(s): one (or more) discrete-valued

[Figure: feedforward network with inputs X1-X4, two hidden layers, and outputs Y1 and Y2]
7
Neural Nets
  • Pro: More general than perceptrons
  • Not restricted to linear discriminants
  • Multiple outputs: one classification each
  • Con: No simple, guaranteed training procedure
  • Use greedy, hill-climbing procedure to train
  • Gradient descent: Backpropagation

8
Solving the XOR Problem
Network topology: 2 hidden nodes (o1, o2), 1 output (y); each unit has a fixed -1 threshold input

[Figure: x1 and x2 feed hidden nodes o1 and o2 via weights w11, w12, w21, w22 with threshold weights w01, w02; o1 and o2 feed output y via weights w13, w23 with threshold weight w03]

Desired behavior:

x1  x2 | o1  o2 | y
 0   0 |  0   0 | 0
 1   0 |  0   1 | 1
 0   1 |  0   1 | 1
 1   1 |  1   1 | 0

Weights: w11 = w12 = 1, w21 = w22 = 1, w01 = 3/2, w02 = 1/2, w03 = 1/2, w13 = -1, w23 = 1
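A quick Python check (a sketch; the step function follows the -1 threshold-input convention from slide 4) confirms that these weights make o1 an AND, o2 an OR, and y their XOR combination:

    # Verify the XOR network: y fires iff o2 (OR) fires and o1 (AND) does not.
    def step(z, threshold):
        return 1 if z > threshold else 0

    def xor_net(x1, x2):
        o1 = step(1 * x1 + 1 * x2, 3 / 2)      # w11 = w21 = 1, w01 = 3/2 -> AND
        o2 = step(1 * x1 + 1 * x2, 1 / 2)      # w12 = w22 = 1, w02 = 1/2 -> OR
        return step(-1 * o1 + 1 * o2, 1 / 2)   # w13 = -1, w23 = 1, w03 = 1/2

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, '->', xor_net(a, b))   # prints the XOR truth table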
9
Backpropagation
  • Greedy, hill-climbing procedure
  • Weights are the parameters to change
  • Original hill-climbing changes one parameter per step
  • Slow
  • If the function is smooth, change all parameters per step
  • Gradient descent
  • Backpropagation: computes the current output, then works backward to correct the error

10
Producing a Smooth Function
  • Key problem
  • Pure step threshold is discontinuous
  • Not differentiable
  • Solution
  • Sigmoid (squashed S function): the logistic function, shown below
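The logistic function and its derivative, for reference (standard forms):

    s(z) = 1 / (1 + e^(-z))
    ds/dz = s(z) (1 - s(z))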

11
Neural Net Training
  • Goal:
  • Determine how to change the weights to get the correct output
  • A large change in a weight should produce a large reduction in the error
  • Approach:
  • Compute the actual output o
  • Compare it to the desired output d
  • Determine the effect of each weight w on the error (d - o)
  • Adjust the weights

12
Neural Net Example
Notation: x^i = i-th sample input vector; w = weight vector; y^i = desired output for the i-th sample

Sum-of-squares error over training samples:

    E(w) = Σ_i ( y^i - o(x^i, w) )²

where o(x^i, w) is the full expression of the output in terms of the inputs and weights, chaining the sigmoid units layer by layer.
(From 6.034 notes, Lozano-Pérez)
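For a network with one hidden layer, for example, that chained expression has the form (notation illustrative, not verbatim from the slides):

    o(x, w) = s( Σ_j w_j · s( Σ_k w_jk · x_k ) )

where s is the sigmoid, w_jk are input-to-hidden weights, and w_j are hidden-to-output weights.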
13
Gradient Descent
  • Error: sum-of-squares error of the inputs with the current weights
  • Compute the rate of change of the error w.r.t. each weight
  • Which weights have the greatest effect on the error?
  • Effectively, the partial derivatives of the error w.r.t. the weights
  • These in turn depend on other weights => chain rule

14
Gradient Descent
  • E = G(w)
  • Error as a function of the weights
  • Find the rate of change of the error, dG/dw
  • Follow the steepest rate of change
  • Change the weights so that the error is minimized

[Figure: error curve G(w) over weight w, descending from w0 toward w1; gradient descent can get stuck in local minima]
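A minimal sketch of the descent loop on a one-dimensional error function (the quadratic G and the rate r = 0.1 are illustrative):

    # Gradient descent on G(w) = (w - 3)^2: repeatedly step against the slope dG/dw.
    def descend(dG_dw, w, r=0.1, steps=50):
        for _ in range(steps):
            w = w - r * dG_dw(w)                  # move opposite the gradient
        return w

    print(descend(lambda w: 2 * (w - 3), w=0.0))  # converges near the minimum w = 3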
15
Gradient of Error
[Equation: partial derivatives of the sum-of-squares error w.r.t. each weight, expanded by the chain rule]

Note: derivative of the sigmoid:

    ds(z1)/dz1 = s(z1) (1 - s(z1))

(From 6.034 notes, Lozano-Pérez)
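For a single output-layer weight w_j, for instance, the chain rule expands as follows (a standard derivation, not verbatim from the slides):

    ∂E/∂w_j = -2 (y - o) · s(z)(1 - s(z)) · o_j,   where z = Σ_j w_j o_j and o = s(z)

Each factor is local: the error term, the sigmoid derivative at the node, and the input o_j carried by the weight.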
16
From Effect to Update
  • Gradient computation:
  • How each weight contributes to performance
  • To train:
  • Need to determine how to CHANGE each weight based on its contribution to performance
  • Need to determine how MUCH change to make per iteration
  • Rate parameter r
  • Large enough to learn quickly
  • Small enough to reach, but not overshoot, the target values

17
Backpropagation Procedure
[Figure: nodes in successive layers i -> j -> k, from inputs toward the output]

  • Pick rate parameter r
  • Until performance is good enough:
  • Do forward computation to calculate the output
  • Compute Beta in the output node (output rule below)
  • Compute Beta in all other nodes (backprop rule below)
  • Compute the change for all weights (descent rule below)
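The per-node equations were images in the original deck and did not survive the transcript; the standard rules in Winston's notation (referenced on the final slide) are:

    Beta_z = d_z - o_z                                  (output node: desired minus actual)
    Beta_j = Σ_k w_(j->k) · o_k (1 - o_k) · Beta_k      (other nodes: sum over successors k)
    Δw_(i->j) = r · o_i · o_j (1 - o_j) · Beta_j        (weight change, by the descent rule)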

18
Backprop Example
Forward prop: compute z_i and y_i given x_k and w_l
(From 6.034 notes, Lozano-Pérez)
19
Backpropagation Observations
  • Procedure is (relatively) efficient
  • All computations are local
  • Use inputs and outputs of current node
  • What is good enough?
  • Rarely reach target (0 or 1) outputs
  • Typically, train until within 0.1 of target

20
Neural Net Summary
  • Training
  • Backpropagation procedure
  • Gradient descent strategy (usual problems)
  • Prediction
  • Compute outputs based on input vector & weights
  • Pros: very general; fast prediction
  • Cons: training can be VERY slow (1000s of epochs); overfitting

21
Training Strategies
  • Online training
  • Update weights after each sample
  • Offline (batch) training
  • Compute the error over all samples, then update the weights
  • Online training is noisy
  • Sensitive to individual instances
  • However, it may escape local minima; see the sketch below
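A schematic contrast of the two schedules (grad here is a stand-in that returns dE/dw over the given data; the scalar weight keeps the sketch minimal):

    # Online: one gradient step per sample; offline/batch: one step per full pass.
    def online_epoch(w, samples, grad, r):
        for sample in samples:
            w = w - r * grad(w, [sample])   # noisy per-sample step
        return w

    def batch_epoch(w, samples, grad, r):
        return w - r * grad(w, samples)     # single step on the total error

    # Example: squared error on y = 2x, so grad(w, data) = sum of -2x(y - wx).
    data = [(x, 2.0 * x) for x in (0.1, 0.4, 0.7, 1.0)]
    g = lambda w, d: sum(-2 * x * (y - w * x) for x, y in d)
    w = 0.0
    for _ in range(100):
        w = online_epoch(w, data, g, r=0.1)
    print(w)   # approaches 2.0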

22
Training Strategy
  • To avoid overfitting:
  • Split the data into training, validation, and test sets
  • Also, avoid excess weights (fewer weights than samples)
  • Initialize with small random weights
  • Small changes then have a noticeable effect
  • Use offline training
  • Train until the validation-set error reaches its minimum
  • Then evaluate on the test set
  • No more weight changes after that (see the sketch below)
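An early-stopping loop in this style, on a toy one-weight problem (make_split, the rate r = 0.02, and the target function y = 2x are all illustrative):

    import random

    random.seed(0)

    def make_split(n):   # noisy samples of y = 2x
        return [(x, 2.0 * x + random.gauss(0, 0.1)) for x in [random.random() for _ in range(n)]]

    train, valid, test = make_split(20), make_split(10), make_split(10)

    def error(w, data):                       # sum-of-squares error
        return sum((y - w * x) ** 2 for x, y in data)

    def train_epoch(w, data, r=0.02):         # one offline (batch) gradient step
        return w - r * sum(-2 * x * (y - w * x) for x, y in data)

    w, best_w, best_val = 0.0, 0.0, float('inf')
    for _ in range(1000):
        w = train_epoch(w, train)
        val = error(w, valid)
        if val >= best_val:
            break                             # validation minimum passed: stop
        best_w, best_val = w, val
    print('test error:', error(best_w, test))  # evaluate once; no more weight changes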

23
Classification
  • Neural networks are best for classification tasks
  • Single output -> binary classifier
  • Multiple outputs -> multiway classification
  • Applied successfully to learning pronunciation
  • The sigmoid pushes outputs toward binary classification
  • Not good for regression

24
Neural Net Conclusions
  • Simulation based on neurons in the brain
  • Perceptrons (single neuron)
  • Guaranteed to find a linear discriminant, IF one exists -> problem: XOR
  • Neural nets (multi-layer perceptrons)
  • Very general
  • Backpropagation training procedure
  • Gradient descent -> local minima and overfitting issues

25
Backpropagation
An efficient method of implementing gradient
descent for neural networks
Descent rule and backprop rule: the slide's equations are given in Winston's notation (see the sketch below); y_i is x_i for the input layer
  1. Initialize weights to small random values
  2. Choose a random sample input feature vector
  3. Compute the total input (z_i) and output (y_i) for each unit (forward prop)
  4. Compute Beta for the output layer
  5. Compute Beta for the preceding layer by the backprop rule (repeat for all layers)
  6. Compute the weight change by the descent rule (repeat for all weights)

Notation as in Winston's book
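A compact, runnable sketch of the whole procedure: Winston-style Beta propagation with sigmoid units on a 2-2-1 network, trained on XOR (the topology, rate r = 0.5, and iteration count are illustrative; backprop can land in a local minimum, so other seeds may need more iterations):

    import math
    import random

    random.seed(1)
    s = lambda z: 1.0 / (1.0 + math.exp(-z))     # sigmoid; ds/dz = s(z)(1 - s(z))

    # 2-2-1 network; each unit gets a fixed -1 threshold input, as on slide 4.
    w_hid = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
    w_out = [random.uniform(-0.5, 0.5) for _ in range(3)]
    data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    r = 0.5

    for _ in range(20000):
        x, d = random.choice(data)               # step 2: random sample
        xs = [-1.0, x[0], x[1]]
        h = [s(sum(w * v for w, v in zip(ws, xs))) for ws in w_hid]      # step 3
        hs = [-1.0] + h
        o = s(sum(w * v for w, v in zip(w_out, hs)))
        beta_o = d - o                           # step 4: Beta at the output
        beta_h = [w_out[j + 1] * o * (1 - o) * beta_o for j in range(2)] # step 5
        for j in range(3):                       # step 6: descent rule
            w_out[j] += r * hs[j] * o * (1 - o) * beta_o
        for j in range(2):
            for k in range(3):
                w_hid[j][k] += r * xs[k] * h[j] * (1 - h[j]) * beta_h[j]

    for x, d in data:                            # trained outputs approach XOR targets
        xs = [-1.0, x[0], x[1]]
        hs = [-1.0] + [s(sum(w * v for w, v in zip(ws, xs))) for ws in w_hid]
        print(x, '->', round(s(sum(w * v for w, v in zip(w_out, hs))), 2), 'target', d)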