1
Study of Nearest Point Algorithm for SVM
Classifier Design
  • EE645 Final Project
  • Hui Ou

2
Introduction
  • The classical way to solve the SVM optimization problem is as a
    quadratic programming (QP) problem, which requires enormous matrix
    storage and intensive matrix operations.
  • Fast iterative algorithms were therefore introduced to solve the SVM
    problem. One family solves the QP problem analytically in the dual
    space:
  • Chunking
  • Decomposition methods
  • Sequential Minimal Optimization (SMO)
  • Another family solves the Nearest Point Problem directly, based on the
    geometric interpretation:
  • The goal of classification is to find the best decision rule to
    separate two classes (U and V) of points.
  • The best decision boundary can be constructed by finding the two
    closest points of the two convex hulls generated by the two classes.

3
Motivation
  • SVM has been formulated as a special sort of optimization problem, and
    NPA is a method to solve it geometrically.
  • The SVM classification problem is converted into the problem of
    computing the nearest points between two convex polytopes. (Only the
    two-category classification problem is considered here.)
  • The performance of NPA is competitive with SMO and SVM-light.

4
Contents
  • Introduction to the Nearest Point Problem (NPP).
  • General idea of the NPP.
  • Reformulation of SVM as a Nearest Point Problem:
  • Hard convex hull, based on the L-2 norm SVM.
  • Soft convex hull, based on the L-1 norm SVM.
  • Optimality criteria for the NPP.
  • Iterative algorithms based on the L-2 norm for the NPP:
  • Gilbert's Algorithm.
  • Mitchell-Demyanov-Malozemov (MDM) Algorithm.
  • A fast iterative algorithm combining the ideas of the two algorithms.
  • Discussion of the L-1 norm.
  • Simulation results.
  • Conclusion.

5
General Idea of Nearest Point Algorithm
  • Let U and V denote the convex hulls of the two classes.
  • Among all pairs of parallel hyperplanes that separate the two classes,
    the pair with the largest margin is the one which has (u* − v*) as its
    normal direction, where (u*, v*) is a pair of closest points of U
    and V.
  • The solution of the NPP is therefore the closest pair (u*, v*), which
    determines the maximum-margin hyperplane (see the sketch below).
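In symbols (a reconstruction of the slide's missing formula from the
statement above; u*, v* denote the optimal pair):

```latex
(u^*, v^*) = \arg\min_{u \in U,\ v \in V} \|u - v\|, \qquad
w = u^* - v^*, \qquad b = \tfrac{1}{2}\left(\|v^*\|^2 - \|u^*\|^2\right)
```

The decision boundary w · x + b = 0 then bisects the segment joining u*
and v*, and the separation between the two supporting hyperplanes is
||u* − v*||.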

6
Reformulation of SVM as A Nearest Point Problem
  • Notation:
  • x: input vector of the support vector machine.
  • z: feature-space vector, z = f(x).
  • As in all SVM designs, we do not assume f to be known; all
    computations are done using only the kernel function
    k(x, x') = f(x) · f(x').
  • I, J: the index sets for class 1 and class 2, respectively.
  • The SVM problem:
  • Without violations: minimize ||w||²/2 subject to the hard-margin
    constraints.
  • With violations, depending on how the slack variables enter the cost,
    there are two cases:
  • L-1 norm (such as the ν-SVM), which penalizes a sum of violations;
  • L-2 norm, which uses a sum of squared violations in the cost
    function. (Both are written out below.)
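In standard form (a reconstruction; the penalty constant C and slacks
ξ_k are the usual ones and are our notation — the ν-SVM mentioned above
reparametrizes the L-1 penalty):

```latex
\text{hard margin:}\quad \min_{w,b}\ \tfrac{1}{2}\|w\|^2
\quad \text{s.t.}\quad w \cdot z_i + b \ge 1\ (i \in I),\quad
w \cdot z_j + b \le -1\ (j \in J);

\text{L-1 norm:}\quad \min_{w,b,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_k \xi_k;
\qquad
\text{L-2 norm:}\quad \min_{w,b,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_k \xi_k^2
```

Both soft versions are subject to the same margin constraints, relaxed by
ξ_k ≥ 0.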




7
Reformulation of SVM as A Nearest Point Problem
  • Based on the L-2 norm: given a set S, we use co S to denote the hard
    convex hull of S. Since I and J are finite sets, U = co{z_i : i ∈ I}
    and V = co{z_j : j ∈ J} are convex polytopes.
  • Based on the L-1 norm, we can define a soft convex hull, which has one
    more constraint on the coefficients, due to the definition of the
    ν-SVM.
  • Nearest Point Problem: minimize ||u − v|| over u ∈ U, v ∈ V.
  • The constraints of the NPP can be rewritten in terms of the convex
    coefficients λ, as written out below.
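Written out (a reconstruction from the definitions above; λ denotes the
convex-combination coefficients):

```latex
U = \mathrm{co}\{z_i : i \in I\}
  = \Big\{ \textstyle\sum_{i \in I} \lambda_i z_i \ :\ \lambda_i \ge 0,\
    \sum_{i \in I} \lambda_i = 1 \Big\},
\qquad V = \mathrm{co}\{z_j : j \in J\},

\text{NPP:}\quad
\min_{\lambda}\ \Big\| \textstyle\sum_{i \in I} \lambda_i z_i
  - \sum_{j \in J} \lambda_j z_j \Big\|
\quad \text{s.t.}\quad
\lambda_k \ge 0,\ \ \textstyle\sum_{i \in I} \lambda_i
  = \sum_{j \in J} \lambda_j = 1
```

For the soft convex hull of the L-1 norm case, the extra constraint is an
upper bound λ_k ≤ μ on each coefficient (μ here stands for the bound
induced by the ν-SVM parameter; the symbol is ours).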

8
Optimality Criteria For NPP
  • It is well known that the maximum of a linear function over a convex
    polytope is attained at an extreme point.
  • We can therefore search along the direction z = u − v: among the
    points of U we find the u(i) that maximizes −z · u, and among the
    points of V we find the v(j) that maximizes z · v.
  • After that, we search the line segments co{u, u(i)} and co{v, v(j)}
    for the pair of points that minimizes the distance between the two
    convex hulls. (The underlying optimality condition is stated below.)
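These searches are justified by the first-order optimality condition of
the NPP, the standard variational condition for projections onto convex
sets:

```latex
(u^*, v^*)\ \text{optimal},\ z^* = u^* - v^* \neq 0
\quad\Longleftrightarrow\quad
z^* \cdot u^* = \min_{u \in U} z^* \cdot u
\ \ \text{and}\ \
z^* \cdot v^* = \max_{v \in V} z^* \cdot v
```

Since both extrema over a polytope are attained at extreme points, it
suffices to check the generators z_i and z_j.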

9
Algorithms For NPP
  • The best general-purpose algorithms for the NPP, such as Wolfe's
    algorithm, terminate within a finite number of steps.
  • However, they require expensive matrix storage and matrix operations
    in each step, making them unsuitable for large SVM designs.
  • Iterative algorithms:
  • The memory needed is linear in the number of training vectors.
  • They reach the solution asymptotically as the number of iterations
    goes to infinity.
  • They are better suited for SVM design.
  • Two popular iterative algorithms for the NPP:
  • Gilbert's Algorithm
  • Mitchell-Demyanov-Malozemov (MDM) Algorithm

10
Gilbert's Algorithm
  • Gilbert's Algorithm was one of the first algorithms suggested for
    solving the NPP.
  • The NPP is equivalent to the following minimum-norm problem:
    minimize ||z|| over z ∈ Z = co{u − v : u ∈ U, v ∈ V}.
  • Gilbert step (a code sketch follows):
  • 1. Choose z' ∈ Z.
  • 2. If z' minimizes ||z||² over Z, stop with z* = z'; else set
    z_bar = u − v, where u and v maximize −z' · u and z' · v,
    respectively.
  • 3. Compute z'', the point of least norm on the line segment joining
    z' and z_bar. Set z' = z'' and go back to step 2.
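A minimal NumPy sketch of the steps above, for the minimum-norm problem
over the convex hull of a finite generator set P (for the NPP, the rows
of P would be the differences z_i − z_j); the function name and
tolerances are ours, not the paper's:

```python
import numpy as np

def gilbert_min_norm(P, max_iter=10000, tol=1e-8):
    """Gilbert's algorithm (sketch): approximate the minimum-norm point
    of co(P), where the rows of P generate the polytope Z."""
    z = P[0].astype(float).copy()       # step 1: any starting point of Z
    for _ in range(max_iter):
        z_bar = P[np.argmin(P @ z)]     # support point maximizing -z . p
        if z @ z - z @ z_bar <= tol:    # optimal when z . z <= z . p for all p
            break                       # step 2: stop with z* = z
        d = z_bar - z
        t = np.clip(-(z @ d) / (d @ d), 0.0, 1.0)
        z = z + t * d                   # step 3: least-norm point on [z, z_bar]
    return z
```

The clipped exact line search in the last step is what "the point of
least norm on the line segment" amounts to. Note that enumerating all
differences z_i − z_j is expensive, which is one motivation for working
directly with U and V later.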

11
A Few Iterations of Gilbert's Algorithm on a Two-Dimensional Example
12
Mitchell-Demyanov-Malozemov Algorithm (MDM)
  • Unlike Gilbert's algorithm, the MDM algorithm fundamentally uses the
    representation z = Σ_t λ_t (z_i(t) − z_j(t)) in its basic operations.
  • In this case, only the λ_t and the index pairs (i(t), j(t)) need to
    be stored and maintained in order to represent z.
  • In each iteration, the MDM algorithm attempts to decrease
    Δ(z) = max{ z · (z_i(t) − z_j(t)) : λ_t > 0 } − min{ z · z' : z' ∈ Z }.
  • MDM looks for an improvement of z along the direction joining the two
    points that attain this maximum and minimum. (A code sketch follows.)
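A sketch of the MDM iteration under the same setup as the Gilbert sketch
above, explicitly maintaining the coefficients λ so that z = Σ_k λ_k p_k
(our naming; the paper's index-pair bookkeeping is simplified here to a
dense weight vector):

```python
import numpy as np

def mdm_min_norm(P, max_iter=10000, tol=1e-8):
    """MDM iteration (sketch): minimum-norm point of co(P), maintaining
    convex weights lam with z = P.T @ lam throughout."""
    n = len(P)
    lam = np.zeros(n)
    lam[0] = 1.0
    z = P[0].astype(float).copy()
    for _ in range(max_iter):
        g = P @ z                         # z . p_k for every generator
        k_min = int(np.argmin(g))         # point of Z minimizing z . p
        active = np.flatnonzero(lam > 0)
        k_max = active[np.argmax(g[active])]  # worst active generator
        if g[k_max] - g[k_min] <= tol:    # Delta(z) ~ 0 -> stop
            break
        d = P[k_min] - P[k_max]
        t = min(lam[k_max], -(z @ d) / (d @ d))  # line search, kept feasible
        lam[k_max] -= t
        lam[k_min] += t
        z = z + t * d
    return z, lam
```

Moving weight from the worst active generator to the best one is exactly
what drives both terms of Δ(z), not just the lower one, toward z · z.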

13
The Idea of MDM Iteration
  • A = z_i(t_min) − z_j(t_min), the active point attaining the maximum
    in Δ(z); B = z_bar, the point of Z which minimizes z · z'. The MDM
    step moves z along the direction from A toward B.
  • The MDM algorithm tries to crush the total slab toward zero, while
    Gilbert's Algorithm only attempts to push the lower slab to zero.

14
Comments on The Two Algorithms
  • Gilbert's algorithm makes rapid progress toward the solution during
    its initial iterations; however, it becomes very slow as it
    approaches the final solution. This is because, when ||z|| gets
    small, the algorithm is slow in driving the norm of z smaller still.
  • The MDM algorithm works faster than Gilbert's algorithm, especially
    in the end stages, when z approaches z*.
  • Algorithms which are much faster than the MDM algorithm can be
    designed using the following two observations:
  • It is easy to combine the ideas of Gilbert's algorithm and the MDM
    algorithm into a hybrid algorithm which is faster.
  • One can work directly in the space where U and V are located.

15
Combination of Gilbert's Algorithm and MDM Algorithm
16
A Fast Iterative Algorithm For NPP
  • In this section we discuss an algorithm for the NPP that works
    directly in the space in which U and V are located.
  • The key idea is to combine Gilbert's Algorithm and the MDM Algorithm.
  • The cost function is in the L-2 norm.
  • The stopping criterion for this algorithm:
  • Let us define the stop condition first. We say that an index k
    satisfies the stop condition at (u, v), with z = u − v, if
    z · (u − z_k) > ε for k ∈ I, or z · (z_k − v) > ε for k ∈ J.
  • If we can find such an index k (suppose it is in I), then on the line
    segment joining u and z_k there must be a point u' which is closer to
    v than u is.
  • If we cannot find an index k that satisfies the above condition, then
    it is a good time to stop the algorithm. (A sketch of this check
    follows.)
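A sketch of the check. It rests on the derivative of the squared
distance: moving u toward z_k (k ∈ I) reduces ||u − v|| iff
z · (u − z_k) > 0, since d/dt ||u + t(z_k − u) − v||² at t = 0 equals
−2 z · (u − z_k). The names are ours, and eps is a generic tolerance;
the exact scaling of the paper's ε-criterion may differ:

```python
import numpy as np

def find_violating_index(u, v, Z1, Z2, eps=1e-6):
    """Return an index that still allows improvement of (u, v), or None.
    Z1, Z2: (n1, d) and (n2, d) arrays of class-1 and class-2 points."""
    z = u - v
    s1 = z @ u - Z1 @ z              # z . (u - z_k) for all k in I
    k = int(np.argmax(s1))
    if s1[k] > eps:
        return ("I", k)
    s2 = Z2 @ z - z @ v              # z . (z_k - v) for all k in J
    k = int(np.argmax(s2))
    if s2[k] > eps:
        return ("J", k)
    return None                      # approximate optimality reached
```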

17
A Fast Iterative Algorithm for NPP
  • Steps:
  • 1. Choose u ∈ U, v ∈ V, and set z = u − v.
  • 2. Find an index k satisfying the stop condition. If such an index
    cannot be found, stop with the conclusion that the approximate
    optimality criterion is satisfied. Else go to step 3 with the k
    found.
  • 3. Choose two small convex polytopes U' ⊆ U and V' ⊆ V built from the
    current points and z_k.
  • 4. Compute (u', v'), a pair of closest points minimizing the distance
    between U' and V'.
  • 5. Set u = u', v = v', and go back to step 2.
  • In step 3, suppose we choose U' as the line segment joining u and
    z_k, and V' = {v}. (Suppose k ∈ I.) Then the algorithm is closer to
    Gilbert's algorithm; a sketch of this step follows.
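For that Gilbert-like choice, step 4 reduces to projecting v onto the
segment [u, z_k]; a minimal sketch (names are ours):

```python
import numpy as np

def segment_step(u, v, z_k):
    """Step 4 for U' = [u, z_k], V' = {v} (k in I): the point of the
    segment closest to v, found by an exact, clipped line search."""
    d = z_k - u
    t = np.clip(((v - u) @ d) / (d @ d), 0.0, 1.0)
    return u + t * d, v              # new (u', v'); v is unchanged
```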

18
Final Solution of NPP
  • Support vectors:
  • The z_k whose final coefficients λ_k are positive serve as the
    support vectors.
  • Use the sign of the following function to decide whether a given x
    belongs to Class 1 or Class 2 (a kernel sketch follows):
    f(x) = Σ_{i ∈ I} λ_i k(x_i, x) − Σ_{j ∈ J} λ_j k(x_j, x) + b.
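A sketch of the resulting classifier, assuming the optimal coefficients
lam1, lam2 from the NPP are available and a Gaussian kernel is used (all
names are ours; b is chosen so the hyperplane bisects the segment
joining u* and v*, matching the geometric picture of slide 5):

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=0.5):
    """k(x, y) = exp(-gamma * ||x - y||^2), as a matrix over rows."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def classify(x, X1, lam1, X2, lam2, kernel=gaussian_kernel):
    """Sign of the decision value (u* - v*) . f(x) + b, where f is the
    implicit feature map; everything is computed via kernels only."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    wu = lam1 @ kernel(X1, x)         # u* . f(x)
    wv = lam2 @ kernel(X2, x)         # v* . f(x)
    uu = lam1 @ kernel(X1, X1) @ lam1 # ||u*||^2
    vv = lam2 @ kernel(X2, X2) @ lam2 # ||v*||^2
    b = 0.5 * (vv - uu)               # boundary bisects [u*, v*]
    return np.sign(wu - wv + b)       # +1 -> Class 1, -1 -> Class 2
```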

19
Simulations
  • The fast iterative nearest point algorithm in the L-2 norm was
    implemented in MATLAB and tested on the following data sets:
  • WDBC data
  • Problem Set 1, Problem 5 data set
  • Iris data
  • The results are compared with SVM-light.

20
Simulations
  • WDBC data
  • Randomly choose 304 examples for training; test on the remaining 202
    examples.
  • Percentage misclassification on the test set:

             Gaussian Kernel   Polynomial Kernel
  NPA        10.2              (optimality criteria could not be satisfied)
  SVM-light  33.9              6.6
21
NPA Performance for WDBC Data, Gaussian Kernel
22
Simulations
  • PS1, Problem 5 data
  • 100 training examples, 1000 test examples.
  • Percentage misclassification on the test set:

             Gaussian Kernel   Polynomial Kernel
  NPA        7.4               5.4
  SVM-light  7.0               5.3
23
NPA Performance for PS1 Problem 5 Data, Gaussian Kernel
24
NPA Performance for PS1 Problem 5 Data, Polynomial Kernel
25
Simulations
  • Iris data
  • Data set 1: separate the second class of 50 samples from the other
    100. Randomly pick 50 samples as training data, the other 100 as
    test data.
  • Data set 2: separate the third class of 50 samples from the other
    100. Randomly pick 50 samples as training data, the other 100 as
    test data.
  • Percentage misclassification on the test sets:

  Data set 1   Gaussian Kernel   Polynomial Kernel
  NPA          2                 5
  SVM-light    4                 5

  Data set 2   Gaussian Kernel   Polynomial Kernel
  NPA          3                 4
  SVM-light    5                 4
26
NPA Performance for Iris Data 1, Gaussian Kernel
27
NPA Performance for Iris Data 1, Polynomial Kernel
28
NPA Performance for Iris Data 2, Gaussian Kernel
29
NPA Performance for Iris Data 2, Polynomial Kernel
30
Conclusion for Simulation Results
  • Advantages:
  • The misclassification percentage is competitive with SVM-light.
  • The algorithm is quite straightforward to implement.
  • Disadvantages:
  • The CPU time needed to train an SVM classifier with the Nearest Point
    Algorithm is greater than that of SVM-light.
  • For some data sets, the optimality criteria for the Nearest Point
    Algorithm cannot be satisfied, and the algorithm cannot improve the
    solution any further.

31
Discussion of NPP based on L-1 Norm Cost Function
  • The Gilbert and MDM algorithms based on the L-2 norm can be modified
    to solve L-1 norm ν-SVM classification problems, using the soft
    convex hulls.
  • The simulation results provided by Qing Tao and Gao-wei Wu [2] show
    that this is not as good as the L-2 norm algorithms:
  • The classification error is larger than that of the MDM algorithm.
  • The computational cost is also higher than that of the MDM algorithm.

32
Conclusion for NPA
  • The Nearest Point Algorithm provides a method for solving the SVM
    classification problem through its geometric interpretation.
  • For two-class classification problems, NPA is competitive with other
    fast iterative algorithms, such as SVM-light.
  • Unfortunately, this algorithm is not as popular as the other fast
    iterative algorithms, for the following reasons:
  • For the two-class classification problems examined above, the
    computational cost of NPA is higher than that of SVM-light.
  • For some data sets, the optimality criteria for NPA cannot be
    satisfied.
  • For classification problems with three or more classes, the algorithm
    has to find the convex hulls and margins for every pair of classes,
    which needs much more CPU time to converge.
  • The algorithm is not suitable for SVM regression problems.

33
References
  • [1] S.S. Keerthi, S.K. Shevade, C. Bhattacharyya, and K.R.K. Murthy.
    "A Fast Iterative Nearest Point Algorithm for Support Vector Machine
    Classifier Design." IEEE Transactions on Neural Networks,
    11(1):124-136, 2000. Online: http://guppy.mpe.nus.edu.sg/mpessk.
  • [2] Qing Tao and Gao-wei Wu. "A General Soft Method for Learning SVM
    Classifiers with L1 Norm."