Title: Artificial Neural Networks
- Artificial neural networks are another technique for supervised machine learning
[Diagram: training data is fed to a learner (k-nearest neighbor, decision tree, logic statements, or neural network), which then classifies test data]
Human neuron
- Dendrites pick up signals from other neurons
- When the combined signals from the dendrites reach a threshold, a signal is sent down the axon to the synapse
Connection with AI
- Most modern AI
  - Systems that act rationally
- Implementing neurons in a computer
  - Systems that think like humans
- Why artificial neural networks, then?
  - Universal function fitter
  - Potential for massive parallelism
  - Some amount of fault tolerance
  - Trainable by inductive learning, like other supervised learning techniques
Perceptron Example
[Diagram: a perceptron for tumor classification. Three input units (# of tumors, avg area, avg density) feed an output unit through weights w1 = -0.1, w2 = 0.9, w3 = 0.1; the output is 1 for malignant, 0 for benign. A worked example of the computation follows below]
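As a quick worked example of what this unit computes (the input values here are hypothetical, chosen only for illustration), suppose the inputs are $x = (2, 0.8, 0.5)$:

\[
\sum_{i=1}^{3} w_i x_i = (-0.1)(2) + (0.9)(0.8) + (0.1)(0.5) = -0.2 + 0.72 + 0.05 = 0.57
\]

The output unit then turns this weighted sum into a 0/1 decision, as described on the following slides.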
The Perceptron: Input Units
- Input units correspond to features in the original problem
- If numeric, often scaled between -1 and 1
- If discrete, often create one input node for each category
  - Can also assign values to a single node (imposes an ordering); a sketch of both encodings follows below
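A minimal sketch of both encodings in Python (the feature values, ranges, and categories are made up for illustration):

    # Scale a numeric feature into [-1, 1] using its observed min/max
    def scale(x, lo, hi):
        return 2 * (x - lo) / (hi - lo) - 1

    # One input node per category for a discrete feature
    def one_hot(value, categories):
        return [1.0 if value == c else 0.0 for c in categories]

    print(scale(7.0, 0.0, 10.0))                      # 0.4
    print(one_hot("red", ["red", "green", "blue"]))   # [1.0, 0.0, 0.0]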
The Perceptron: Weights
- Weights represent the importance of each input unit
- Combined with the input units to feed the output unit
- The output unit receives as input the weighted sum $\sum_i w_i x_i$
The Perceptron: Output Unit
- The output unit uses an activation function to decide what the correct output is
- Sample activation function: a step function that outputs 1 if $\sum_i w_i x_i \geq t$ for some threshold $t$, and 0 otherwise (a code sketch follows below)
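A minimal sketch of this forward pass in Python, reusing the weights from the earlier tumor example (the input values and the threshold of 0.5 are assumptions for illustration):

    # Step-activation perceptron: output 1 if the weighted sum reaches the threshold
    def perceptron(inputs, weights, threshold):
        weighted_sum = sum(w * x for w, x in zip(weights, inputs))
        return 1 if weighted_sum >= threshold else 0

    weights = [-0.1, 0.9, 0.1]   # w1, w2, w3 from the tumor example
    features = [2.0, 0.8, 0.5]   # hypothetical (# of tumors, avg area, avg density)
    print(perceptron(features, weights, threshold=0.5))   # 1 -> malignant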
Simplifying the threshold
- Managing the threshold $t$ separately is cumbersome
- Incorporate it as a virtual weight: add a constant input $x_0 = -1$ with weight $w_0 = t$, so that the test $\sum_{i=1}^{n} w_i x_i \geq t$ becomes $\sum_{i=0}^{n} w_i x_i \geq 0$ (a code sketch follows below)
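The same trick in code (one convention among several; here the bias input is fixed at -1, so the old threshold becomes weight $w_0$):

    # Fold the threshold into the weights: prepend a constant -1 input
    def perceptron_bias(inputs, weights):
        # weights[0] plays the role of the old threshold t
        augmented = [-1.0] + list(inputs)
        weighted_sum = sum(w * x for w, x in zip(weights, augmented))
        return 1 if weighted_sum >= 0 else 0

    # Equivalent to perceptron(features, [-0.1, 0.9, 0.1], threshold=0.5)
    print(perceptron_bias([2.0, 0.8, 0.5], [0.5, -0.1, 0.9, 0.1]))   # 1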
How do we compute weights?
- Initialize all weights randomly
  - Usually between -0.5 and 0.5
- Put the first point through the network
  - Compute the Actual Output $O$
- Define Error = Correct Output $-$ Actual Output
Perceptron Example
[Diagram: the tumor perceptron after the threshold trick, with weights w0 = 0, w1 = -0.3, w2 = -0.2, w3 = 0.4; for the current point, Actual Output = 0 and Correct Output = 1, so Error = 1]
- If an input is positive, we want its weight to become more positive
- If an input is negative, we want its weight to become more negative
The Perceptron Learning Rule: How do we compute weights?
- Put the first point through the network
- Compute the Actual Output $O$
- Define Error = Correct Output $-$ Actual Output
- Update all weights: $w_i \leftarrow w_i + \alpha \cdot \text{Error} \cdot x_i$
  - $\alpha$ is a learning rate
- Repeat with all points, then all points again, and again, until all are correct or a stopping criterion is reached (a training-loop sketch follows below)
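A minimal sketch of the full training loop in Python, under the conventions above (bias input fixed at -1; the dataset, learning rate, and epoch cap are made up for illustration). It learns logical AND, which is linearly separable:

    import random

    def train_perceptron(points, alpha=0.1, max_epochs=100):
        n = len(points[0][0])
        # Initialize all weights randomly, usually between -0.5 and 0.5
        w = [random.uniform(-0.5, 0.5) for _ in range(n + 1)]
        for _ in range(max_epochs):
            all_correct = True
            for x, target in points:
                aug = [-1.0] + list(x)                               # bias input
                o = 1 if sum(wi * xi for wi, xi in zip(w, aug)) >= 0 else 0
                error = target - o                                   # Correct - Actual
                if error != 0:
                    all_correct = False
                    # Perceptron learning rule: w_i <- w_i + alpha * error * x_i
                    w = [wi + alpha * error * xi for wi, xi in zip(w, aug)]
            if all_correct:                                          # stopping criterion
                break
        return w

    # Logical AND is linearly separable, so this converges
    data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
    print(train_perceptron(data))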
Can appropriate weights always be found?
- ONLY IF the data is linearly separable (see the XOR argument below)
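For example (a standard argument, not from the slide), no single perceptron can compute XOR. Suppose weights $w_1, w_2$ and threshold $t$ existed; the four input cases would require:

\[
\begin{aligned}
(0,0)\mapsto 0 &: \quad 0 < t \\
(1,0)\mapsto 1 &: \quad w_1 \geq t \\
(0,1)\mapsto 1 &: \quad w_2 \geq t \\
(1,1)\mapsto 0 &: \quad w_1 + w_2 < t
\end{aligned}
\]

Adding the middle two constraints gives $w_1 + w_2 \geq 2t$, but the last requires $w_1 + w_2 < t$, so $2t < t$, i.e. $t < 0$, contradicting the first constraint. Hence XOR is not linearly separable.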
What if data is not linearly separable? Neural Network.
[Diagram: a multilayer network in which hidden units $V_j$ feed an output unit $O$]
- Each hidden unit is a perceptron
- The output unit is another perceptron with the hidden units as input (see the XOR sketch below)
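As a sketch of why layering helps, here is a hand-wired two-layer network that computes XOR (the weights are a standard construction, not from the slides): one hidden unit computes OR, the other AND, and the output unit fires when OR is true but AND is not.

    def step(s):
        return 1 if s >= 0 else 0

    def xor_network(x1, x2):
        v1 = step(x1 + x2 - 0.5)   # hidden unit 1: OR
        v2 = step(x1 + x2 - 1.5)   # hidden unit 2: AND
        return step(v1 - v2 - 0.5) # output unit: OR and not AND

    for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        print(a, b, xor_network(a, b))   # 0, 1, 1, 0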
How to compute weights for a multilayer neural network?
- Need to redefine the perceptron
- The step function is no good: we need something differentiable
- Replace it with a sigmoid approximation
Sigmoid function
- $\sigma(x) = \dfrac{1}{1 + e^{-bx}}$
- Good approximation to the step function
- As $b \to \infty$, sigmoid $\to$ step
- We'll just take $b = 1$ for simplicity (a numerical check follows below)
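A quick numerical check of the limit in Python (the sample points are chosen for illustration):

    import math

    def sigmoid(x, b=1.0):
        return 1.0 / (1.0 + math.exp(-b * x))

    # As b grows, sigmoid(x, b) approaches the 0/1 step function
    for b in (1, 10, 100):
        print(b, round(sigmoid(-0.5, b), 4), round(sigmoid(0.5, b), 4))
    # b=1:   0.3775 0.6225
    # b=10:  0.0067 0.9933
    # b=100: ~0.0   ~1.0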
Computing weights: backpropagation
- Think of it as a gradient descent method, where the weights are the variables and we are trying to minimize the error
Minimize squared errors
- For all training points $p$, let
  - $T_p$ = correct output
  - $O_p$ = actual output
- Want to minimize the error $E = \frac{1}{2}\sum_p (T_p - O_p)^2$
- Work with one point at a time, and move the weights in the direction that reduces the error the most
Expand (drop the p for simplicity)
- $E = \frac{1}{2}(T - O)^2$, where $O = \sigma(\text{in})$ and $\text{in} = \sum_j w_j V_j$
- The direction of most rapid positive rate of change (the gradient) is given by the partial derivatives; to reduce the error, move the weights against it:
  $\dfrac{\partial E}{\partial w_j} = -(T - O)\,\sigma'(\text{in})\,V_j$
- Update rule for the hidden-layer weights: $w_j \leftarrow w_j - \alpha \dfrac{\partial E}{\partial w_j} = w_j + \alpha\,(T - O)\,\sigma'(\text{in})\,V_j$
Simplify as
- Let $\Delta = (T - O)\,\sigma'(\text{in})$, so the update becomes $w_j \leftarrow w_j + \alpha\,\Delta\,V_j$
- Backprop for the input-layer weights is similar, but requires some more chain-rule partial derivatives (a full sketch follows below)
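A minimal backpropagation sketch in Python under the definitions above: sigmoid units (so $\sigma'(\text{in}) = O(1 - O)$), squared error, and one hidden layer. The 2-2-1 architecture, learning rate, epoch count, and XOR data are illustrative assumptions, not from the slides:

    import math, random

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    random.seed(0)
    # 2 inputs -> 2 hidden units -> 1 output; each unit has a bias weight (input -1)
    Wh = [[random.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(2)]
    Wo = [random.uniform(-0.5, 0.5) for _ in range(3)]
    alpha = 0.5
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]   # XOR

    for epoch in range(20000):
        for x, T in data:
            xa = [-1.0] + x                                        # bias input
            # Forward pass
            V = [sigmoid(sum(w * xi for w, xi in zip(Wh[j], xa))) for j in range(2)]
            Va = [-1.0] + V
            O = sigmoid(sum(w * v for w, v in zip(Wo, Va)))
            # Output unit: Delta = (T - O) * sigmoid'(in) = (T - O) * O * (1 - O)
            delta_o = (T - O) * O * (1 - O)
            # Hidden units: chain rule pushes delta_o back through Wo (skip bias weight)
            delta_h = [V[j] * (1 - V[j]) * Wo[j + 1] * delta_o for j in range(2)]
            # Update rules: w <- w + alpha * Delta * input
            Wo = [w + alpha * delta_o * v for w, v in zip(Wo, Va)]
            for j in range(2):
                Wh[j] = [w + alpha * delta_h[j] * xi for w, xi in zip(Wh[j], xa)]

    # Outputs should be near 0, 1, 1, 0 (an unlucky initialization can stall;
    # reseed or add hidden units if so)
    for x, T in data:
        xa = [-1.0] + x
        V = [sigmoid(sum(w * xi for w, xi in zip(Wh[j], xa))) for j in range(2)]
        O = sigmoid(sum(w * v for w, v in zip(Wo, [-1.0] + V)))
        print(x, T, round(O, 2))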
Neural Networks and machine learning issues
- Neural networks can represent any training set, if enough hidden units are used
- How long do they take to train?
- How much memory?
- Does backprop find the best set of weights?
- How to deal with overfitting?
- How to interpret results?