Title: Non-Bayes classifiers.
1. Non-Bayes classifiers
- Linear discriminants
- Neural networks
2. Discriminant functions (1)
Bayes classification rule: decide $\omega_1$ if $P(\omega_1|x) > P(\omega_2|x)$, otherwise decide $\omega_2$.
Instead we might try to find a function $g(x)$ such that we decide $\omega_1$ if $g(x) > 0$ and $\omega_2$ if $g(x) < 0$; such a $g$ is called a discriminant function.
- decision surface: $g(x) = 0$
3. Discriminant functions (2)
Linear discriminant function: $g(x) = w^T x + w_0$
The decision surface $g(x) = 0$ is a hyperplane.
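As a concrete illustration, a minimal NumPy sketch of classifying a point by the sign of $g(x) = w^T x + w_0$; the weight and bias values here are made up for illustration:

```python
import numpy as np

w = np.array([2.0, -1.0])  # hypothetical weight vector
w0 = 0.5                   # hypothetical bias term

def classify(x):
    g = w @ x + w0             # g(x) = w^T x + w_0
    return 1 if g > 0 else 2   # omega_1 on one side of the hyperplane, omega_2 on the other

print(classify(np.array([1.0, 1.0])))  # g = 1.5 > 0 -> class 1
```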
4. Linear discriminant: perceptron cost function
Replace $x \leftarrow [x^T, 1]^T$ and $w \leftarrow [w^T, w_0]^T$.
Thus now the decision function is $g(x) = w^T x$ and the decision
surface is $w^T x = 0$.
Perceptron cost function:
$J(w) = \sum_{x \in Y} \delta_x w^T x$,
where $Y$ is the set of samples misclassified by $w$, and $\delta_x = -1$ if $x \in \omega_1$, $\delta_x = +1$ if $x \in \omega_2$.
5. Linear discriminant: perceptron cost function
Perceptron cost function: $J(w) = \sum_{x \in Y} \delta_x w^T x$
The value of $J(w)$ is proportional to the sum of the
distances of all misclassified samples to the
decision surface.
If the discriminant function separates the classes
perfectly, then $J(w) = 0$. Otherwise $J(w) > 0$,
and we want to minimize it.
$J(w)$ is continuous and piecewise linear, so
we might try to use a gradient descent algorithm.
6. Linear discriminant: perceptron algorithm
Gradient descent: $w(t+1) = w(t) - \rho_t \frac{\partial J(w)}{\partial w}$
At points where $J$ is differentiable, $\frac{\partial J(w)}{\partial w} = \sum_{x \in Y} \delta_x x$.
Thus $w(t+1) = w(t) - \rho_t \sum_{x \in Y} \delta_x x$.
The perceptron algorithm converges when the classes are
linearly separable, under some conditions on the step sizes $\rho_t$.
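A short sketch of the batch update above, assuming NumPy, labels coded as +1/-1 (so that $\delta_x = -y$), samples already extended by the bias trick of slide 4, and an illustrative fixed step size:

```python
import numpy as np

def perceptron_train(X, y, rho=0.1, max_epochs=1000):
    """X: (n, d) samples as rows, each with a trailing 1 appended
    (bias trick); y: (n,) labels in {+1, -1}."""
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        mis = y * (X @ w) <= 0   # the set Y of misclassified samples
        if not mis.any():        # J(w) = 0: classes perfectly separated
            return w
        # w(t+1) = w(t) - rho * sum_{x in Y} delta_x x, with delta_x = -y
        w += rho * (y[mis][:, None] * X[mis]).sum(axis=0)
    return w  # epoch cap hit: classes may not be linearly separable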
7. Sum of error squares estimation
Let $y(x)$ denote the desired
output: $+1$ for one class and $-1$ for the
other.
We want to find a discriminant function whose output
is similar to $y(x)$.
Use the sum of error squares as the similarity criterion:
$J(w) = \sum_i (y_i - w^T x_i)^2$
8. Sum of error squares estimation
Minimize the mean square error: $\frac{\partial J(w)}{\partial w} = -2 \sum_i (y_i - w^T x_i)\, x_i = 0$
Thus $w = \left( \sum_i x_i x_i^T \right)^{-1} \sum_i x_i y_i$.
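This closed-form solution can be computed directly from the normal equations; a small sketch assuming NumPy, with the samples stacked as the rows of a matrix $X$ so that $\sum_i x_i x_i^T = X^T X$:

```python
import numpy as np

def least_squares_weights(X, y):
    """X: (n, d) samples as rows (bias trick applied); y: (n,) desired
    outputs in {+1, -1}. Solves the normal equations X^T X w = X^T y."""
    return np.linalg.solve(X.T @ X, X.T @ y)
```

Using np.linalg.solve avoids forming the matrix inverse explicitly, which is cheaper and numerically more stable.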
9. Neurons
10. Artificial neuron
The figure above represents an artificial neuron
calculating $y = f\left( \sum_k w_k x_k + w_0 \right)$.
11. Artificial neuron
Threshold functions f:
- Step function: $f(v) = 1$ if $v > 0$, $f(v) = 0$ otherwise
- Logistic function: $f(v) = \frac{1}{1 + e^{-av}}$
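Both threshold functions in a short NumPy sketch; the slope parameter a of the logistic is often simply fixed to 1:

```python
import numpy as np

def step(v):
    """Step threshold: 1 if v > 0, else 0."""
    return np.where(v > 0, 1.0, 0.0)

def logistic(v, a=1.0):
    """Logistic threshold: a smooth, differentiable squashing of v."""
    return 1.0 / (1.0 + np.exp(-a * v))
```

The differentiability of the logistic is what makes gradient-based training (backpropagation, below) possible.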
12. Combining artificial neurons
Multilayer perceptron with 3 layers.
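As a worked example of what combining neurons buys, a sketch of a small multilayer perceptron of step units whose weights are hand-picked here purely for illustration: it computes XOR, a function no single linear neuron can represent:

```python
import numpy as np

def step(v):
    return np.where(v > 0, 1.0, 0.0)

# Hidden layer computes OR and AND of the inputs (last column is the bias);
# the output neuron fires for "OR and not AND", i.e. XOR.
W1 = np.array([[1.0, 1.0, -0.5],    # x1 + x2 - 0.5 > 0  <=>  OR
               [1.0, 1.0, -1.5]])   # x1 + x2 - 1.5 > 0  <=>  AND
W2 = np.array([[1.0, -2.0, -0.5]])  # OR - 2*AND - 0.5 > 0  <=>  XOR

def mlp(x):
    h = step(W1 @ np.append(x, 1.0))        # first layer
    return step(W2 @ np.append(h, 1.0))[0]  # second layer

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, mlp(np.array(x, dtype=float)))  # prints the XOR truth table
```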
14. Discriminating ability of multilayer perceptron
Since a 3-layer perceptron can approximate any
smooth function, it can approximate
$g(x) = P(\omega_1|x) - P(\omega_2|x)$, the
optimal discriminant function of two classes.
15. Training of multilayer perceptron
(figure: neurons with threshold function f arranged in consecutive layers r-1 and r)
16. Training and cost function
Desired network output: $y(i)$
Trained network output: $\hat{y}(i)$
Cost function for one training sample: $E(i) = \frac{1}{2} \sum_j \left( \hat{y}_j(i) - y_j(i) \right)^2$
Total cost function: $J = \sum_i E(i)$
Goal of the training: find the values of the weights $w$
which minimize the cost function $J$.
17. Gradient descent
Denote by $w_j^r$ the weight vector of neuron $j$ in layer $r$.
Gradient descent: $w_j^r(\text{new}) = w_j^r(\text{old}) - \rho \frac{\partial J}{\partial w_j^r}$
Since $J = \sum_i E(i)$, we might want to
update the weights after processing each training
sample separately.
18. Gradient descent
Chain rule for differentiating composite
functions:
$\frac{\partial E(i)}{\partial w_j^r} = \frac{\partial E(i)}{\partial v_j^r} \frac{\partial v_j^r}{\partial w_j^r} = \frac{\partial E(i)}{\partial v_j^r} \, y^{r-1}(i)$,
where $v_j^r(i) = (w_j^r)^T y^{r-1}(i)$ is the weighted input of neuron $j$ in layer $r$.
Denote $\delta_j^r(i) = \frac{\partial E(i)}{\partial v_j^r}$.
19. Backpropagation
If $r = L$, then $\delta_j^L(i) = e_j(i) \, f'(v_j^L(i))$, where $e_j(i) = \hat{y}_j(i) - y_j(i)$.
If $r < L$, then $\delta_j^{r-1}(i) = \left( \sum_k \delta_k^r(i) \, w_{kj}^r \right) f'(v_j^{r-1}(i))$.
20. Backpropagation algorithm
- Initialization: initialize all weights with
random values.
- Forward computations: for each training vector
$x(i)$ compute all $v_j^r(i)$ and $y_j^r(i)$.
- Backward computations: for each $i$, $j$ and $r = L,
L-1, \ldots, 2$ compute $\delta_j^r(i)$.
- Update weights: $w_j^r(\text{new}) = w_j^r(\text{old}) - \rho \sum_i \delta_j^r(i) \, y^{r-1}(i)$.
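Putting the four steps together, a compact sketch of the algorithm for a network with one hidden layer, assuming NumPy, logistic units (for which $f'(v) = f(v)(1 - f(v))$), the squared-error cost of slide 16, and per-sample updates as suggested on slide 17; layer sizes, step size, and epoch count are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def logistic(v):
    return 1.0 / (1.0 + np.exp(-v))

def train_mlp(X, T, hidden=4, rho=0.5, epochs=2000):
    """X: (n, d) inputs; T: (n, m) desired outputs in (0, 1)."""
    d, m = X.shape[1], T.shape[1]
    # Initialization: random weights, one bias column per layer.
    W1 = rng.normal(scale=0.5, size=(hidden, d + 1))
    W2 = rng.normal(scale=0.5, size=(m, hidden + 1))
    for _ in range(epochs):
        for x, t in zip(X, T):  # update after each training sample
            # Forward computations: v_j^r and y_j^r for each layer.
            x1 = np.append(x, 1.0)
            y1 = logistic(W1 @ x1)
            y1b = np.append(y1, 1.0)
            y2 = logistic(W2 @ y1b)
            # Backward computations.
            # r = L: delta^L = e * f'(v^L), with e = y_hat - y_desired.
            d2 = (y2 - t) * y2 * (1.0 - y2)
            # r < L: delta^{r-1} = (W^T delta^r) * f'(v^{r-1}); bias column dropped.
            d1 = (W2[:, :-1].T @ d2) * y1 * (1.0 - y1)
            # Update weights: w <- w - rho * delta * y^{r-1}.
            W2 -= rho * np.outer(d2, y1b)
            W1 -= rho * np.outer(d1, x1)
    return W1, W2

# Usage: learn XOR; after training the network outputs should approach the targets.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)
W1, W2 = train_mlp(X, T)
```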
21. MLP issues
- What is the best network configuration?
- How to choose a proper learning parameter $\rho$?
- When should training be stopped?
- Should we choose another threshold function $f$ or cost
function $J$?