Title: Study of Nearest Point Algorithm for SVM Classifier Design
1. Study of Nearest Point Algorithm for SVM Classifier Design
- EE645 Final Project
- Hui Ou
2. Introduction
- The classical method solves the SVM optimization problem as a quadratic programming (QP) problem, which requires enormous matrix storage and intensive matrix operations.
- Fast iterative algorithms were therefore introduced to solve the SVM problem.
- Some solve the QP problem analytically in the dual space:
- Chunking
- Decomposition methods
- Sequential Minimal Optimization (SMO)
- Others solve the Nearest Point Problem directly, based on the geometric interpretation:
- The goal of classification is to find the best decision rule to separate two classes (U and V) of points.
- The best decision boundary can be constructed by finding the two closest points of the two convex hulls generated respectively by the two classes.
3. Motivation
- The SVM has been formulated as a special sort of optimization problem, and the NPA is a method for solving it in a geometric way.
- The SVM classification problem is converted into the problem of computing the nearest points between two convex polytopes. (The two-category classification problem is considered.)
- The performance of the NPA is competitive with SMO and SVM-light.
4. Contents
- Introduction to the Nearest Point Problem (NPP).
- General idea of the NPP.
- Reformulation of the SVM as a Nearest Point Problem.
- Hard convex hull, based on the L-2 norm SVM.
- Soft convex hull, based on the L-1 norm SVM.
- Optimality criteria for the NPP.
- Iterative algorithms based on the L-2 norm for the NPP.
- Gilbert's Algorithm
- Mitchell-Demyanov-Malozemov Algorithm (MDM)
- A fast iterative algorithm combining the ideas of the two algorithms.
- Discussion of the L-1 norm.
- Simulation results.
- Conclusion.
5. General Idea of the Nearest Point Algorithm
- Let U and V denote the two classes.
- Among all pairs of parallel hyperplanes that separate the two classes, the pair with the largest margin is the one which has (u - v) as the normal direction, where (u, v) is a pair of closest points of U and V.
- The solution (u, v) of the NPP therefore gives the maximum-margin hyperplane: its normal is w = u - v, and it passes through the midpoint (u + v)/2.
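As a small illustration of this geometry (not part of the original slides), the following Python sketch builds the separating hyperplane from a hypothetical pair of nearest points u and v:

    import numpy as np

    def hyperplane_from_nearest_points(u, v):
        # Normal direction is u - v; the decision boundary bisects the segment [u, v].
        w = u - v
        b = -w.dot((u + v) / 2.0)  # chosen so that w.x + b = 0 at the midpoint
        return w, b

    # Hypothetical nearest points of the two convex hulls.
    u = np.array([2.0, 1.0])
    v = np.array([0.0, 0.0])
    w, b = hyperplane_from_nearest_points(u, v)
    print(np.sign(w.dot(np.array([1.5, 1.0])) + b))  # +1: the U side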
6. Reformulation of SVM as a Nearest Point Problem
- Notation:
- x: input vector of the support vector machine.
- z: feature space vector, z = f(x).
- As in all SVM designs, we do not assume f to be known; all computations are done using only the kernel function.
- I, J: the index sets for class 1 and class 2, respectively.
- The SVM problem:
- Without violations: minimize the norm of w subject to the margin constraints (see below).
- With violations, depending on how the slack variables enter the cost function, there are two cases:
- L-1 norm, as in the v-SVM.
- L-2 norm, which uses a sum of squared violations in the cost function.
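The formulations referred to above, reconstructed here in their standard form (assuming labels y_k = +1 for k in I and y_k = -1 for k in J):

    \min_{w,b}\ \tfrac{1}{2}\|w\|^2 \quad \text{s.t.}\quad y_k (w \cdot z_k + b) \ge 1 \ \ \forall k

and, when violations are allowed through slack variables \xi_k \ge 0 with constraints y_k (w \cdot z_k + b) \ge 1 - \xi_k:

    \min_{w,b,\xi}\ \tfrac{1}{2}\|w\|^2 + C \sum_k \xi_k \quad \text{(L-1 norm)}
    \qquad
    \min_{w,b,\xi}\ \tfrac{1}{2}\|w\|^2 + C \sum_k \xi_k^2 \quad \text{(L-2 norm)}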
7. Reformulation of SVM as a Nearest Point Problem
- Based on the L-2 norm: given a set S, we use co S to denote the (hard) convex hull of S.
- Since I and J are finite sets, U and V are convex polytopes.
- Based on the L-1 norm, we can instead define a soft convex hull, which has one more constraint due to the definition of the v-SVM.
- Nearest Point Problem:
- We can rewrite the constraints of the NPP as shown below.
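A reconstruction of the definitions and the problem, in the standard forms consistent with reference [1]:

    \text{co}\, S = \Big\{ \sum_k \lambda_k s_k \ :\ s_k \in S,\ \lambda_k \ge 0,\ \sum_k \lambda_k = 1 \Big\},
    \qquad U = \text{co}\{ z_i : i \in I \},\quad V = \text{co}\{ z_j : j \in J \}

    \text{NPP:}\quad \min_{u \in U,\ v \in V} \ \|u - v\|^2

That is, minimize \|\sum_{i \in I} \lambda_i z_i - \sum_{j \in J} \lambda_j z_j\|^2 over the convex coefficients. In the L-1 norm (v-SVM) case, the soft convex hull adds an upper bound \lambda_k \le \mu on each coefficient.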
8. Optimality Criteria for NPP
- It is well known that the maximum of a linear function over a convex polytope is attained at an extreme point.
- We can therefore search along the direction z = u - v: in U we find the point u(i) that maximizes -z.u, and in V we find the point v(j) that maximizes z.v.
- After that, we search the line segments co{u, u(i)} and co{v, v(j)} for the pair of points that minimizes the distance between the two convex hulls.
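A minimal sketch of the extreme-point search this slide relies on (illustrative Python, assuming each polytope is given by its generating points as the rows of a matrix):

    import numpy as np

    def extreme_point(points, direction):
        # The maximum of a linear function over a convex polytope is attained
        # at a vertex, so it suffices to scan the generating points.
        return points[np.argmax(points @ direction)]

    # Example: search U along -z and V along z, where z = u - v.
    U_pts = np.array([[0.0, 2.0], [1.0, 3.0], [2.0, 2.0]])
    V_pts = np.array([[0.0, 0.0], [1.0, -1.0], [2.0, 0.0]])
    u, v = U_pts[0], V_pts[0]
    z = u - v
    u_i = extreme_point(U_pts, -z)
    v_j = extreme_point(V_pts, z)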
9. Algorithms for NPP
- The best general-purpose algorithms for NPP, such as Wolfe's algorithm, terminate within a finite number of steps.
- However, they require expensive matrix storage and matrix operations in each step, which makes them unsuitable for large SVM designs.
- Iterative algorithms:
- The memory required is linear in the number of training vectors.
- They reach the solution asymptotically as the number of iterations goes to infinity.
- They are better suited for SVM design.
- Some popular iterative algorithms for NPP:
- Gilbert's Algorithm
- Mitchell-Demyanov-Malozemov Algorithm (MDM)
10. Gilbert's Algorithm
- Gilbert's algorithm was one of the first algorithms suggested for solving the NPP.
- The NPP is equivalent to the following minimum norm problem over the difference set Z = {u - v : u in U, v in V}: minimize ||z||^2 subject to z in Z.
- Gilbert step:
- Choose z in Z.
- If this point minimizes ||z||^2 over Z, then stop with z* = z; else set z_bar = u - v, where u and v maximize -z.u over U and z.v over V, respectively.
- Compute z', the point on the line segment joining z and z_bar which has least norm. Set z = z', and go back to step 2.
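A minimal, self-contained sketch of this loop (illustrative Python working with explicit coordinates; an actual SVM implementation would carry out the same inner products through the kernel function):

    import numpy as np

    def min_norm_on_segment(z, z_bar):
        # Least-norm point on the segment joining z and z_bar:
        # minimize ||z + t (z_bar - z)||^2 over t in [0, 1] (closed form).
        d = z_bar - z
        dd = d.dot(d)
        if dd == 0.0:
            return z
        t = np.clip(-z.dot(d) / dd, 0.0, 1.0)
        return z + t * d

    def gilbert(P, Q, iters=1000, eps=1e-6):
        # P, Q: rows are the points generating U and V; Z = U - V.
        z = P[0].astype(float) - Q[0].astype(float)
        for _ in range(iters):
            u = P[np.argmax(P @ (-z))]   # maximizes -z.u over U
            v = Q[np.argmax(Q @ z)]      # maximizes  z.v over V
            z_bar = u - v                # minimizes z.w over w in Z
            if z.dot(z) - z.dot(z_bar) < eps:
                break                    # z (approximately) has least norm in Z
            z = min_norm_on_segment(z, z_bar)
        return z                         # approximates u* - v*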
11. A Few Iterations of Gilbert's Algorithm on a Two-Dimensional Example
12. Mitchell-Demyanov-Malozemov Algorithm (MDM)
- Unlike Gilbert's algorithm, the MDM algorithm fundamentally uses the representation z = sum_t alpha_t (z_i(t) - z_j(t)) in its basic operations.
- In this case, only the coefficients alpha_t and the index pairs (i(t), j(t)) need to be stored and maintained in order to represent z.
- In each iteration, the MDM algorithm attempts to decrease the gap Delta(z) = max{ z.(z_i(t) - z_j(t)) : alpha_t > 0 } - min{ z.z_bar : z_bar in Z }.
- MDM looks for an improvement of z along the direction joining the two points that attain this gap.
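One MDM-style iteration, sketched in Python under the simplifying assumption that the difference vectors w_t = z_i(t) - z_j(t) are available as the rows of a matrix W (a kernel implementation would keep only alpha and the index pairs):

    import numpy as np

    def mdm_step(W, alpha):
        # z is represented through convex coefficients: z = alpha @ W.
        z = alpha @ W
        scores = W @ z
        t_max = np.argmax(np.where(alpha > 0, scores, -np.inf))  # worst supporting point
        t_min = np.argmin(scores)                                # point of Z most opposed to z
        d = W[t_min] - W[t_max]
        dd = d.dot(d)
        if dd == 0.0:
            return alpha                 # gap already closed
        # Transfer weight delta from t_max to t_min, minimizing ||z + delta*d||^2.
        delta = np.clip(-z.dot(d) / dd, 0.0, alpha[t_max])
        alpha = alpha.copy()
        alpha[t_max] -= delta
        alpha[t_min] += delta
        return alpha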
13. The Idea of the MDM Iteration
- (Figure) A = z_i(t_min) - z_j(t_min); B = z_bar, the point which minimizes z.z_bar.
- The MDM algorithm tries to crush the total slab toward zero, while Gilbert's algorithm only attempts to push the lower slab to zero.
14. Comments on the Two Algorithms
- Gilbert's algorithm makes rapid movement toward the solution during its initial iterations; however, it becomes very slow as it approaches the final solution. This is because, once the correction steps get small, the algorithm is slow in driving the norm of z down any further.
- The MDM algorithm works faster than Gilbert's algorithm, especially in the end stages, when z approaches z*.
- Algorithms which are much faster than the MDM algorithm can be designed using the following two observations:
- It is easy to combine the ideas of Gilbert's algorithm and the MDM algorithm into a hybrid algorithm which is faster.
- One can work directly in the space where U and V are located.
15. Combination of Gilbert's Algorithm and MDM Algorithm
16. A Fast Iterative Algorithm for NPP
- In this section we discuss an algorithm for the NPP that works directly in the space in which U and V are located.
- The key idea is to combine Gilbert's algorithm and the MDM algorithm.
- The cost function is in the L-2 norm.
- The stopping criterion for this algorithm:
- Let us define the stop condition first.
- We say that an index k satisfies the stop condition at (u, v) if the corresponding training point can still improve the pair; with z = u - v and a small tolerance eps > 0, this amounts to z.(u - z_k) > eps for k in I (and z.(z_k - v) > eps for k in J).
- If we can find such an index k (let us suppose it is in I), then on the line segment joining u and z_k there must be a point u' which is closer to v than u is.
- If we cannot find an index k that satisfies the above condition, then it is a good time to stop the algorithm.
17. A Fast Iterative Algorithm for NPP
- Steps:
- Choose u in U, v in V, and set z = u - v.
- Find an index k satisfying the stop condition. If such an index cannot be found, stop with the conclusion that the approximate optimality criterion is satisfied. Else go to step 3 with the k found.
- Choose two convex polytopes U' and V' (sub-polytopes of U and V built from the current pair and z_k).
- Compute (u', v'), a pair of closest points minimizing the distance between U' and V'.
- Set u = u', v = v', and go back to step 2.
- In step 3, suppose we choose U' to be the line segment joining u and z_k, and V' = {v} (supposing k is in I). Then the algorithm is closer to Gilbert's algorithm; a sketch of this instance follows.
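A minimal sketch of that Gilbert-like instance of the loop (illustrative Python with explicit coordinates; the actual algorithm in [1] also uses MDM-type sub-polytopes and kernel caching):

    import numpy as np

    def closest_point_on_segment(p, a, b):
        # Projection of p onto the segment [a, b].
        d = b - a
        dd = d.dot(d)
        t = np.clip((p - a).dot(d) / dd, 0.0, 1.0) if dd > 0 else 0.0
        return a + t * d

    def fast_npp(P, Q, eps=1e-6, iters=1000):
        # P, Q: rows are the training points of class 1 (I) and class 2 (J).
        u, v = P[0].astype(float), Q[0].astype(float)
        for _ in range(iters):
            z = u - v
            k_u = np.argmax((u - P) @ z)   # most violating index in I
            k_v = np.argmax((Q - v) @ z)   # most violating index in J
            if (u - P[k_u]).dot(z) <= eps and (Q[k_v] - v).dot(z) <= eps:
                break                      # approximate optimality criterion met
            if (u - P[k_u]).dot(z) >= (Q[k_v] - v).dot(z):
                # U' = co{u, z_k}, V' = {v}: project v onto the segment.
                u = closest_point_on_segment(v, u, P[k_u])
            else:
                v = closest_point_on_segment(u, v, Q[k_v])
        return u, v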
18. Final Solution of NPP
- Support vectors:
- The z_k whose coefficient alpha_k in the final representation of (u, v) is nonzero serve as the support vectors.
- Use the sign of the following decision function to decide whether a given x belongs to class 1 or class 2.
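The decision function implied by the nearest-point pair, written here in the standard kernel form (with y_k = +1 for k in I and y_k = -1 for k in J):

    h(x) = \operatorname{sign}\Big( \sum_k \alpha_k y_k K(x_k, x) + b \Big),
    \qquad b = \tfrac{1}{2}\big( \|v\|^2 - \|u\|^2 \big)

since w = u - v and the decision boundary bisects the segment joining u and v; both b and the sum are computable from kernel evaluations alone.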
19. Simulations
- The fast iterative nearest point algorithm in the L-2 norm is implemented in MATLAB and tested on the following data sets:
- WDBC data
- Problem Set 1, Problem 5 data set
- Iris data
- The results are compared with SVM-light.
20. Simulations
- WDBC data
- Randomly choose 304 examples as training examples; test on the other 202 examples.
- Percentage misclassification on the test set:

                Gaussian Kernel   Polynomial Kernel
    NPA         10.2              (optimality criteria could not be satisfied)
    SVM-light   33.9              6.6
21. NPA Performance for WDBC Data, Gaussian Kernel
22. Simulations
- PS1, Problem 5 data
- 100 training examples, 1000 test examples.
- Percentage misclassification on the test set:

                Gaussian Kernel   Polynomial Kernel
    NPA         7.4               5.4
    SVM-light   7                 5.3
23. NPA Performance for PS1 Problem 5 Data, Gaussian Kernel
24. NPA Performance for PS1 Problem 5 Data, Polynomial Kernel
25. Simulations
- Iris data
- Case 1: separate the second 50 examples from the other 100. Randomly pick 50 examples as training data; use the other 100 as test data.
- Case 2: separate the third 50 examples from the other 100. Randomly pick 50 examples as training data; use the other 100 as test data.
- Percentage misclassification on the test set, Case 1:

                Gaussian Kernel   Polynomial Kernel
    NPA         2                 5
    SVM-light   4                 5

- Percentage misclassification on the test set, Case 2:

                Gaussian Kernel   Polynomial Kernel
    NPA         3                 4
    SVM-light   5                 4
26. NPA Performance for Iris Data 1, Gaussian Kernel
27. NPA Performance for Iris Data 1, Polynomial Kernel
28. NPA Performance for Iris Data 2, Gaussian Kernel
29. NPA Performance for Iris Data 2, Polynomial Kernel
30. Conclusion for Simulation Results
- Advantages:
- The misclassification percentage is competitive with SVM-light.
- The algorithm is quite straightforward to implement.
- Disadvantages:
- The CPU time needed to train an SVM classifier using the Nearest Point Algorithm is greater than that of SVM-light.
- For some data sets, the optimality criteria for the Nearest Point Algorithm cannot be satisfied, and the algorithm cannot improve the situation any further.
31. Discussion of NPP Based on the L-1 Norm Cost Function
- Gilbert's algorithm and the MDM algorithm based on the L-2 norm can be modified to solve L-1 norm v-SVM classification problems, using the soft convex hulls.
- The simulation results provided by Qing Tao and Gao-wei Wu [2] show that this approach is not as good as the L-2 norm algorithms:
- The classification error is larger than that of the MDM algorithm.
- The computational cost is also higher than that of the MDM algorithm.
32. Conclusion for NPA
- The Nearest Point Algorithm provides a method for solving the SVM classification problem based on its geometric interpretation.
- For two-class classification problems, the NPA is competitive with other fast algorithms, such as SVM-light.
- Unfortunately, this algorithm is not as popular as the other fast iterative algorithms, for the following reasons:
- For the two-class classification problems examined above, the computational cost of the NPA is higher than that of SVM-light.
- For some data sets, the optimality criteria for the NPA cannot be satisfied.
- For classification problems with three or more classes, the algorithm has to find the convex hulls and margins for every pair of classes, which takes much more CPU time to converge.
- The algorithm is not suitable for SVM regression problems.
33. References
- [1] S.S. Keerthi, S.K. Shevade, C. Bhattacharyya, and K.R.K. Murthy. A Fast Iterative Nearest Point Algorithm for Support Vector Machine Classifier Design. IEEE Transactions on Neural Networks, 2000, 11(1):124-136. Online: http://guppy.mpe.nus.edu.sg/~mpessk.
- [2] Qing Tao and Gao-wei Wu. A General Soft Method for Learning SVM Classifiers with L1 Norm.