Title: Learning with Perceptrons and Neural Networks
1 Learning with Perceptrons and Neural Networks
- Artificial Intelligence
- CMSC 25000
- February 14, 2002
2 Agenda
- Neural Networks
- Biological analogy
- Perceptrons: Single-layer networks
- Perceptron training: Perceptron convergence theorem
- Perceptron limitations
- Neural Networks: Multilayer perceptrons
- Neural net training: Backpropagation
- Strengths & Limitations
- Conclusions
3 Neurons: The Concept
[Diagram: a neuron, with the dendrites, cell body, nucleus, and axon labeled]
- Neurons receive inputs from other neurons (via synapses)
- When input exceeds a threshold, the neuron fires
- It sends output along its axon to other neurons
- Brain: ~10^11 neurons, ~10^16 synapses
4 Artificial Neural Nets
- Simulated Neuron
- Node connected to other nodes via links
- Links play the role of axon + synapse
- Links associated with a weight (like a synapse)
- Weight is multiplied by the output of the source node
- Node combines its inputs via an activation function
- E.g., sum of weighted inputs passed through a threshold
- Simpler than real neuronal processes
5 Artificial Neural Net
[Diagram: inputs x1 ... xn, each multiplied by its weight w1 ... wn, summed, and passed through a threshold to give the output; a minimal code sketch follows]
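A minimal sketch of the weighted-sum-and-threshold unit pictured above; the function name and sample values are illustrative, not from the lecture:

```python
def threshold_unit(inputs, weights, threshold=0.0):
    """Fire (output 1) when the weighted sum of inputs exceeds the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0

# With equal weights and threshold 1.5, two binary inputs act like logical AND
print(threshold_unit([1, 1], [1.0, 1.0], threshold=1.5))  # 1
print(threshold_unit([1, 0], [1.0, 1.0], threshold=1.5))  # 0
```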
6 Perceptrons
- Single neuron-like element
- Binary inputs
- Binary outputs
- Weighted sum of inputs > threshold
- (Possibly logic box between inputs and weights)
7 Perceptron Structure
[Diagram: inputs x1, x2, x3, ..., xn with weights w1, w2, w3, ..., wn feeding a single unit that outputs y; an extra input x0 = -1 with weight w0 compensates for the threshold]
8 Perceptron Convergence Procedure
- Straightforward training procedure
- Learns linearly separable functions
- Until the perceptron yields the correct output for all samples:
- If the perceptron is correct, do nothing
- If the perceptron is wrong:
- If it incorrectly says "yes", subtract the input vector from the weight vector
- Otherwise, add the input vector to the weight vector
9 Perceptron Convergence Example
- LOGICAL-OR
- Sample: x1 x2 x3 -> Desired output
- 1: 0 0 1 -> 0
- 2: 0 1 1 -> 1
- 3: 1 0 1 -> 1
- 4: 1 1 1 -> 1
- Initial w = (0 0 0); after S2, w = w + s2 = (0 1 1)
- Pass 2: S1: w = w - s1 = (0 1 0); S3: w = w + s3 = (1 1 1)
- Pass 3: S1: w = w - s1 = (1 1 0)
- (A code check of this run appears below)
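A sketch of the convergence procedure run on the LOGICAL-OR data above, assuming the unit outputs 1 when w · x > 0 and that x3 = 1 serves as a bias input:

```python
samples = [((0, 0, 1), 0),   # sample 1
           ((0, 1, 1), 1),   # sample 2
           ((1, 0, 1), 1),   # sample 3
           ((1, 1, 1), 1)]   # sample 4

w = [0, 0, 0]
changed = True
while changed:                              # until correct on all samples
    changed = False
    for x, desired in samples:
        output = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
        if output == desired:
            continue                        # correct: do nothing
        sign = 1 if desired == 1 else -1    # wrong "no": add input; wrong "yes": subtract
        w = [wi + sign * xi for wi, xi in zip(w, x)]
        changed = True

print(w)  # [1, 1, 0], matching the final weights in the trace above
```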
10 Perceptron Convergence Theorem
- If there exists a weight vector v that classifies all the examples correctly, perceptron training will find such a vector
- Sketch (assume |v| = 1 and inputs normalized so |x| <= 1):
- Assume v · x > δ for all positive examples x, for some margin δ > 0
- After k mistakes, w = x1 + x2 + ... + xk, so v · w > δk
- |w|^2 increases by at most 1 per mistake (|w + x|^2 <= |w|^2 + 1 on a mislabel), so |w|^2 <= k
- v · w / |w| > δk / sqrt(k) = δ sqrt(k), but v · w / |w| <= 1
- Converges in k < (1/δ)^2 steps
11 Perceptron Learning
- Perceptrons learn linear decision boundaries
- E.g., a line in the (x1, x2) plane separating the two classes
[Diagram: points in the x1-x2 plane split by a linear boundary. But not XOR:]
- x1 = -1, x2 = -1: need w1x1 + w2x2 < 0 (output false)
- x1 = 1, x2 = -1: need w1x1 + w2x2 > 0 => with the first constraint, implies w1 > 0
- x1 = -1, x2 = 1: need w1x1 + w2x2 > 0 => with the first constraint, implies w2 > 0
- x1 = 1, x2 = 1: then w1x1 + w2x2 = w1 + w2 > 0, so the unit says true, but XOR should be false
12 Neural Nets
- Multi-layer perceptrons
- Inputs: real-valued
- Intermediate "hidden" nodes
- Output(s): one (or more) discrete-valued
[Diagram: inputs X1-X4 feeding two layers of hidden nodes, which feed outputs Y1 and Y2]
13 Neural Nets
- Pro: More general than perceptrons
- Not restricted to linear discriminants
- Multiple outputs: one classification each
- Con: No simple, guaranteed training procedure
- Use a greedy, hill-climbing procedure to train
- Gradient descent, backpropagation
14 Solving the XOR Problem
[Diagram: network topology with 2 hidden nodes (o1, o2) and 1 output node (y); x1 and x2 feed both hidden nodes, the hidden nodes feed y, and each node has a bias input of -1 with weight w01, w02, or w03]
- Desired behavior:
- x1 x2 | o1 o2 | y
- 0  0  | 0  0  | 0
- 1  0  | 0  1  | 1
- 0  1  | 0  1  | 1
- 1  1  | 1  1  | 0
- Weights: w11 = w12 = 1, w21 = w22 = 1, w01 = 3/2, w02 = 1/2, w03 = 1/2, w13 = -1, w23 = 1
- (A code check of this network appears below)
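A sketch checking the hand-built XOR network above, assuming each node is a hard-threshold unit and the -1 bias input is weighted by w01, w02, and w03 respectively:

```python
def step(z):
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    # Hidden units: o1 fires only when both inputs are on; o2 fires when at least one is
    o1 = step(1 * x1 + 1 * x2 - (3 / 2))   # w11 = w21 = 1, bias weight w01 = 3/2
    o2 = step(1 * x1 + 1 * x2 - (1 / 2))   # w12 = w22 = 1, bias weight w02 = 1/2
    # Output: fires when o2 is on and o1 is off, i.e. exactly one input is on
    y = step(-1 * o1 + 1 * o2 - (1 / 2))   # w13 = -1, w23 = 1, bias weight w03 = 1/2
    return o1, o2, y

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))  # reproduces the desired-behavior table above
```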
15 Backpropagation
- Greedy, hill-climbing procedure
- The weights are the parameters to change
- Original hill-climbing changes one parameter per step
- Slow
- If the function is smooth, change all parameters per step
- Gradient descent
- Backpropagation: computes the current output, then works backward to correct the error
16 Producing a Smooth Function
- Key problem
- Pure step threshold is discontinuous
- Not differentiable
- Solution
- Sigmoid ("squashed S" function): the logistic function s(z) = 1 / (1 + e^-z), sketched below
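A brief sketch of the logistic activation and its derivative, assuming the standard form named on this slide:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_deriv(z):
    # ds/dz = s(z) * (1 - s(z)), the identity used on the Gradient of Error slide
    s = sigmoid(z)
    return s * (1.0 - s)

print(sigmoid(0.0), sigmoid_deriv(0.0))  # 0.5 0.25
```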
17 Neural Net Training
- Goal
- Determine how to change weights to get correct output
- Large change in weight to produce large reduction in error
- Approach
- Compute actual output o
- Compare to desired output d
- Determine effect of each weight w on the error d - o
- Adjust weights
18 Neural Net Example
- Notation: x^i = ith sample input vector, w = weight vector, y^i = desired output for the ith sample
- Sum-of-squares error over the training samples: E(w) = Σ_i (y^i - o(x^i, w))^2
- The full expression of the output o(x^i, w) in terms of the inputs and weights is a nested composition: sigmoids of weighted sums of the layer below
19 Gradient Descent
- Error: sum-of-squares error of the inputs with the current weights
- Compute the rate of change of the error with respect to each weight
- Which weights have the greatest effect on the error?
- Effectively, partial derivatives of the error with respect to the weights
- These in turn depend on other weights => chain rule
- (A minimal gradient-descent sketch follows)
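A minimal sketch of the gradient-descent step itself, on a toy one-weight error function E(w) = (w - 3)^2 (the target value 3 is purely illustrative):

```python
def error_gradient(w):
    return 2.0 * (w - 3.0)   # dE/dw for E(w) = (w - 3)^2

w = 0.0
rate = 0.1                   # the rate parameter r discussed on a later slide
for _ in range(100):
    w -= rate * error_gradient(w)   # step downhill along the gradient

print(round(w, 4))           # ~3.0, the minimum of E
```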
20 Gradient of Error
Note: derivative of the sigmoid: ds(z1)/dz1 = s(z1)(1 - s(z1))
21 From Effect to Update
- Gradient computation
- How each weight contributes to performance
- To train
- Need to determine how to CHANGE a weight based on its contribution to performance
- Need to determine how MUCH change to make per iteration
- Rate parameter r
- Large enough to learn quickly
- Small enough to reach, but not overshoot, the target values
22 Backpropagation Procedure
[Diagram: three connected nodes labeled i, j, k, with node i feeding node j and node j feeding node k]
- Pick rate parameter r
- Until performance is good enough:
- Do a forward computation to calculate the outputs
- Compute beta in the output node
- Compute beta in all other nodes
- Compute the change for all weights
- (A sketch of one common formulation of these updates follows)
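A sketch of one pass of this procedure for a tiny 2-2-1 sigmoid network trained on XOR. The slide's beta and weight-change formulas were given as equations; the ones below (beta_output = d - o; beta_j = Σ_k w_jk · o_k(1 - o_k) · beta_k; Δw_ij = r · o_i · o_j(1 - o_j) · beta_j) are a standard formulation and should be read as an assumed reconstruction:

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

random.seed(0)
# w_hidden[j][i]: weight from input i (plus bias) to hidden node j
# w_out[j]: weight from hidden node j (plus bias) to the single output node
w_hidden = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(3)]
r = 0.5                                        # rate parameter
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]   # XOR

for _ in range(20000):                         # "until performance is good enough"
    for (x1, x2), d in samples:
        inputs = [x1, x2, -1]                  # -1 is the bias input
        # Forward computation
        h = [sigmoid(sum(w * v for w, v in zip(w_hidden[j], inputs))) for j in range(2)]
        h_in = h + [-1]
        o = sigmoid(sum(w * v for w, v in zip(w_out, h_in)))
        # Beta in the output node, then in the hidden nodes
        beta_o = d - o
        beta_h = [w_out[j] * o * (1 - o) * beta_o for j in range(2)]
        # Weight changes
        for j in range(3):
            w_out[j] += r * h_in[j] * o * (1 - o) * beta_o
        for j in range(2):
            for i in range(3):
                w_hidden[j][i] += r * inputs[i] * h[j] * (1 - h[j]) * beta_h[j]

for (x1, x2), d in samples:
    inputs = [x1, x2, -1]
    h = [sigmoid(sum(w * v for w, v in zip(w_hidden[j], inputs))) for j in range(2)]
    o = sigmoid(sum(w * v for w, v in zip(w_out, h + [-1])))
    print(x1, x2, round(o, 2))   # usually within ~0.1 of the XOR targets; an unlucky start can stall
```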
23 Backpropagation Observations
- Procedure is (relatively) efficient
- All computations are local
- Use inputs and outputs of current node
- What is good enough?
- Rarely reach target (0 or 1) outputs
- Typically, train until within 0.1 of target
24 Neural Net Summary
- Training
- Backpropagation procedure
- Gradient descent strategy (usual problems)
- Prediction
- Compute outputs based on the input vector and weights
- Pros: Very general; fast prediction
- Cons: Training can be VERY slow (1000s of epochs); overfitting
25 Training Strategies
- Online training
- Update the weights after each sample
- Offline (batch) training
- Compute the error over all samples
- Then update the weights
- Online training is noisy
- Sensitive to individual instances
- However, it may escape local minima
- (A sketch contrasting the two follows)
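A small sketch contrasting the two strategies on a generic one-weight model; the gradient(w, sample) function and the toy data are illustrative placeholders:

```python
def online_epoch(w, samples, rate, gradient):
    # Online: adjust the weight after every individual sample
    for s in samples:
        w -= rate * gradient(w, s)
    return w

def batch_epoch(w, samples, rate, gradient):
    # Offline (batch): accumulate the gradient over all samples, then update once
    total = sum(gradient(w, s) for s in samples)
    return w - rate * total

# Toy usage: squared error against each sample target, dE/dw = 2 * (w - target)
samples = [2.0, 4.0, 6.0]
grad = lambda w, target: 2 * (w - target)
print(online_epoch(0.0, samples, 0.1, grad), batch_epoch(0.0, samples, 0.1, grad))
```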
26 Training Strategy
- To avoid overfitting
- Split the data into training, validation, and test sets
- Also, avoid excess weights (fewer weights than samples)
- Initialize with small random weights
- Small changes have a noticeable effect
- Use offline training
- Train until the error on the validation set reaches its minimum
- Then evaluate on the test set
- No more weight changes after that
- (A sketch of this recipe follows)
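A sketch of this early-stopping recipe; train_epoch, compute_error, and the data splits are assumed placeholders, not functions from the lecture:

```python
def train_with_early_stopping(weights, train_set, validation_set,
                              train_epoch, compute_error, max_epochs=1000):
    best_weights, best_error = weights, float("inf")
    for _ in range(max_epochs):
        weights = train_epoch(weights, train_set)        # one offline (batch) update
        error = compute_error(weights, validation_set)   # error on held-out data
        if error < best_error:                           # validation minimum so far
            best_weights, best_error = weights, error
    return best_weights                                  # evaluate these on the test set
```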
27 Classification
- Neural networks are best suited to classification tasks
- Single output -> binary classifier
- Multiple outputs -> multiway classification
- Applied successfully to learning pronunciation
- The sigmoid pushes outputs toward binary classification
- Not good for regression
28 Neural Net Conclusions
- Simulation based on neurons in the brain
- Perceptrons (single neuron)
- Guaranteed to find a linear discriminant IF one exists -> problem: XOR
- Neural nets (multi-layer perceptrons)
- Very general
- Backpropagation training procedure
- Gradient descent: local minima, overfitting issues