Classification and Regression (Presentation Transcript)

1
Classification and Regression
  • What is classification? What is prediction?
  • Issues regarding classification and prediction
  • Classification by decision tree induction
  • Classification by Neural Networks
  • Classification by Support Vector Machines (SVM)
  • Bayesian Classification
  • Instance Based methods
  • Regression
  • Classification accuracy
  • Summary

2
Classification
  • Classification
  • predicts categorical class labels
  • Typical Applications
  • {credit history, salary} -> credit approval
    (Yes/No)
  • {Temp, Humidity} -> Rain (Yes/No)

3
Linear Classification
  • Binary classification problem
  • Earlier known as the linear discriminant
  • The data above the green line belongs to class x
  • The data below the green line belongs to class o
  • Examples: SVM, Perceptron, probabilistic
    classifiers

[Figure: scatter plot of points labeled x and o, separated by a
green line]
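
A minimal sketch of the decision rule these classifiers share:
classify by which side of a hyperplane w · x + b = 0 a point falls
on. The weights and points below are illustrative assumptions, not
from the slides:

    import numpy as np

    def linear_classify(x, w, b):
        # Points on the positive side of the line get class 'x',
        # points on the negative side get class 'o'.
        return 'x' if np.dot(w, x) + b > 0 else 'o'

    # Illustrative separating line x2 = x1, i.e. w = (-1, 1), b = 0
    w, b = np.array([-1.0, 1.0]), 0.0
    print(linear_classify(np.array([0.0, 2.0]), w, b))  # above -> 'x'
    print(linear_classify(np.array([2.0, 0.0]), w, b))  # below -> 'o'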
4
Fisher's Linear Discriminant
  • From statistics.
  • Try to maximize
  • J = (m+ - m-)^2 / (s+^2 + s-^2)
  • where m+ and s+ (m- and s-) are the mean and
    standard deviation of the positive (negative)
    partition.
  • It tries to cut the data into two partitions such
    that their means are as far apart as possible and,
    within each partition, the variance is as small as
    possible.
  • Skip details
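
A sketch of what the criterion measures, evaluating J on 1-D
projected values; the data here is an illustrative assumption:

    import numpy as np

    def fisher_criterion(pos, neg):
        # pos, neg: 1-D arrays of projected values for each class.
        # J is large when the class means are far apart and each
        # class has small spread.
        m_pos, m_neg = pos.mean(), neg.mean()
        s_pos, s_neg = pos.std(), neg.std()
        return (m_pos - m_neg) ** 2 / (s_pos ** 2 + s_neg ** 2)

    pos = np.array([2.9, 3.1, 3.0])
    neg = np.array([-1.1, -0.9, -1.0])
    print(fisher_criterion(pos, neg))  # large J: well separated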

5
Neural Networks
  • Analogy to biological systems (indeed, a great
    example of a good learning system)
  • Massive parallelism allows for computational
    efficiency
  • One of the first learning algorithms was the
    perceptron (Rosenblatt, 1958): if a target output
    value is provided for a single neuron with fixed
    inputs, one can incrementally change the weights
    to learn to produce that output using the
    perceptron learning rule

6
A Neuron
  • The n-dimensional input vector x is mapped to the
    variable y by means of a scalar product and a
    nonlinear function mapping: y = f(w · x - θ),
    where w is the weight vector and θ the bias
    threshold
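
A minimal sketch of this computation, assuming a step activation;
the weights and threshold are illustrative:

    import numpy as np

    def neuron(x, w, theta):
        # Scalar product of weights and inputs, minus the bias
        # threshold, passed through a step nonlinearity.
        return 1 if np.dot(w, x) - theta > 0 else 0

    print(neuron(np.array([1.0, 0.5]), np.array([0.6, 0.8]), 0.5))  # 1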

7
A Neuron
[Figure: diagram of a neuron with weighted inputs, a summation,
and a nonlinear activation producing the output y]
8
A Neuron
  • Need to learn the weight vector and the bias θ
    (drawn in the figure as a constant input 1 entering
    with weight -θ)
9
Perceptron Update Rule
  • How to get the weight vector?
  • w ← w + η (y - ŷ) x
  • where ŷ is the current output, y is the actual
    class, and η is the learning rate (typically 0.1)
  • One can show that this converges when the training
    data is linearly separable and the learning rate is
    small
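
A runnable sketch of the update rule; the AND data below is an
illustrative assumption:

    import numpy as np

    def train_perceptron(X, y, eta=0.1, epochs=100):
        # Append a constant 1 input so the bias is learned as a weight.
        Xa = np.hstack([X, np.ones((len(X), 1))])
        w = np.zeros(Xa.shape[1])
        for _ in range(epochs):
            for xi, yi in zip(Xa, y):
                y_hat = 1 if np.dot(w, xi) > 0 else 0
                w += eta * (yi - y_hat) * xi  # the perceptron update rule
        return w

    # AND is linearly separable, so the rule converges
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 0, 0, 1])
    print(train_perceptron(X, y))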
10
[Figure: scatter plot of x and o training points for the
perceptron]
11
Sigmoid Activation Function
  • Instead of a step function, use the sigmoid
    function σ(z) = 1 / (1 + e^-z) as the activation
    function
  • differentiable 8-)
  • Use it to construct more complex neural networks
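
A quick sketch of the sigmoid and its derivative; the closed form
of the derivative is what makes gradient-based training convenient:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sigmoid_prime(z):
        # The derivative is expressible in terms of the output
        # itself, which back propagation exploits.
        s = sigmoid(z)
        return s * (1.0 - s)

    print(sigmoid(0.0), sigmoid_prime(0.0))  # 0.5 0.25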

12
Multi-Layer Perceptron
[Figure: feed-forward network: input vector xi feeding input
nodes, a hidden layer, and output nodes producing the output
vector, with weights wij on the connections]
13
Back Propagation Rule
  • Forward phase: feed the input through the network,
    layer by layer, computing each unit's output
  • Backward phase: propagate the error from the output
    back through the layers, updating the weights by
    gradient descent
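
A compact sketch of both phases for a two-layer sigmoid network;
the XOR data, layer sizes, learning rate, and seed are
illustrative assumptions:

    import numpy as np

    rng = np.random.default_rng(0)

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # XOR is not linearly separable, so a hidden layer is needed
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0], [1], [1], [0]], dtype=float)

    W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)   # input -> hidden
    W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)   # hidden -> output
    eta = 0.5

    for _ in range(10000):
        # Forward phase: compute activations layer by layer
        h = sigmoid(X @ W1 + b1)
        out = sigmoid(h @ W2 + b2)
        # Backward phase: propagate the error back, scaled at each
        # layer by the sigmoid derivative s * (1 - s)
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= eta * h.T @ d_out
        b2 -= eta * d_out.sum(axis=0)
        W1 -= eta * X.T @ d_h
        b1 -= eta * d_h.sum(axis=0)

    # Should approach [[0], [1], [1], [0]]; gradient descent can get
    # stuck in a local minimum, so another seed may be needed
    print(out.round(2))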
14
Points to Be Aware Of
  • Can further generalize to more layers.
  • But more layers can be bad. Typically two layers
    are good enough.
  • The idea of back propagation is based on gradient
    descent (covered in greater detail in a machine
    learning course, I believe).
  • Most of the time, we only get to a local minimum

[Figure: training error plotted over weight space, showing a
local minimum]
15
Network Training
  • The ultimate objective of training
  • obtain a set of weights that classifies almost all
    the tuples in the training data correctly
  • Steps (see the sketch after this list)
  • Initialize the weights with random values
  • Feed the input tuples into the network one by one
  • For each unit
  • Compute the net input to the unit as a linear
    combination of all the inputs to the unit
  • Compute the output value using the activation
    function
  • Compute the error
  • Update the weights and the bias
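
The same steps packaged in a library call, a sketch assuming
scikit-learn is available; the XOR data and parameters are
illustrative:

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
    y = np.array([0, 1, 1, 0])

    # Weights start random and are updated until (almost) all
    # training tuples are classified correctly; a different
    # random_state may be needed if training stalls.
    clf = MLPClassifier(hidden_layer_sizes=(8,), activation='logistic',
                        solver='lbfgs', max_iter=1000, random_state=0)
    clf.fit(X, y)
    print(clf.predict(X))  # ideally [0 1 1 0]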

16
Network Pruning and Rule Extraction
  • Network pruning
  • A fully connected network is hard to articulate
  • N input nodes, h hidden nodes, and m output nodes
    lead to h(m+N) weights
  • Pruning: remove some of the links without
    affecting the classification accuracy of the
    network
  • Extracting rules from a trained network
  • Discretize activation values: replace each
    individual activation value by its cluster average,
    maintaining the network accuracy
  • Enumerate the outputs from the discretized
    activation values to find rules between activation
    values and outputs
  • Find the relationship between the inputs and
    activation values
  • Combine the above two to obtain rules relating the
    output to the input
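
A minimal sketch of the pruning step; magnitude thresholding is an
assumption, since the slide does not name a criterion:

    import numpy as np

    def prune(W, threshold=0.1):
        # Zero out links whose weights are near zero; such links
        # contribute little, so accuracy should be largely unaffected
        # (verify on held-out data before keeping the pruned network).
        return np.where(np.abs(W) < threshold, 0.0, W)

    W = np.array([[0.8, -0.05], [0.02, -1.2]])
    print(prune(W))  # [[ 0.8  0. ] [ 0.  -1.2]]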

17
Discriminative Classifiers
  • Advantages
  • prediction accuracy is generally high
  • robust: works even when training examples contain
    errors
  • fast evaluation of the learned target function
  • Criticism
  • long training time
  • difficult to understand the learned function
    (weights), whereas decision trees can be converted
    to a set of rules
  • not easy to incorporate domain knowledge
