Minimal Kernel Classifiers - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Minimal Kernel Classifiers


1
Minimal Kernel Classifiers
INFORMS 2002, San Jose, California, Nov 17-20, 2002
  • Glenn Fung
  • Olvi Mangasarian
  • Alexander Smola

Data Mining Institute, University of Wisconsin - Madison
2
Outline of Talk
  • Linear Support Vector Machines (SVM)
  • Linear separating surface
  • Quadratic programming (QP) formulation
  • Linear programming (LP) formulation
  • Nonlinear Support Vector Machines
  • Nonlinear kernel separating surface
  • LP formulation
  • The Minimal Kernel Classifier (MKC)
  • The pound loss function (#)
  • MKC Algorithm
  • Numerical experiments
  • Conclusion

3
What is a Support Vector Machine?
  • An optimally defined surface
  • Linear or nonlinear in the input space
  • Linear in a higher dimensional feature space
  • Implicitly defined by a kernel function

4
What are Support Vector Machines Used For?
  • Classification
  • Regression / Data Fitting
  • Supervised & Unsupervised Learning

5
Generalized Support Vector Machines: 2-Category Linearly Separable Case
[Figure: point sets A+ and A- separated by a plane]
6
Support Vector Machines: Maximizing the Margin between Bounding Planes
[Figure: bounding planes around A+ and A-, with the margin between them]
7
Support Vector Machine Formulation: Algebra of the 2-Category Linearly Separable Case
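The algebra on this slide is missing from the transcript; a plausible reconstruction using the notation standard in Mangasarian's SVM papers (A the m x n matrix of data points, D the diagonal matrix of +1/-1 labels, e a vector of ones — all assumed here, not taken from the transcript):

% Bounding planes:
x^\top w = \gamma + 1, \qquad x^\top w = \gamma - 1
% Points of A+ satisfy Aw \ge e\gamma + e; points of A- satisfy Aw \le e\gamma - e.
% The label matrix D combines both conditions into:
D(Aw - e\gamma) \ge e
% The margin between the bounding planes is 2 / \|w\|_2.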
8
QP Support Vector Machine Formulation
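The formulation itself is missing from the transcript; a sketch of the standard soft-margin QP in the notation above (the slack weight \nu is an assumed name):

% Trade slack e'y against margin: minimizing w'w maximizes 2/\|w\|_2.
\min_{w,\gamma,y} \; \nu\, e^\top y + \tfrac{1}{2}\, w^\top w
\quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \;\; y \ge 0
% Separating plane: x^\top w = \gamma.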
9
Support Vector Machines: Linear Programming Formulation
  • Use the 1-norm instead of the 2-norm
  • This is equivalent to the following linear
    program
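Both formulas referenced by these bullets are missing from the transcript; a plausible reconstruction, assuming the standard 1-norm SVM and its LP rewriting with an auxiliary bounding vector s:

% 1-norm SVM (replaces the 2-norm regularizer of the QP above):
\min_{w,\gamma,y} \; \nu\, e^\top y + \|w\|_1
\quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \;\; y \ge 0
% Equivalent LP, with s bounding w componentwise:
\min_{w,\gamma,y,s} \; \nu\, e^\top y + e^\top s
\quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e, \;\; -s \le w \le s, \;\; y \ge 0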

10
Nonlinear Kernel LP Formulation
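The LP is missing from the transcript; a plausible sketch, obtained from the linear LP above by the standard kernel substitution w = A^\top D u:

\min_{u,\gamma,y,s} \; \nu\, e^\top y + e^\top s
\quad \text{s.t.} \quad D\big(K(A, A^\top)Du - e\gamma\big) + y \ge e, \;\; -s \le u \le s, \;\; y \ge 0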
11
The Nonlinear Classifier
  • where K is a nonlinear kernel, e.g. the Gaussian kernel sketched below
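The classifier equation and the example kernel are missing from the transcript; a plausible reconstruction in the notation above (\mu is an assumed kernel-width parameter):

% Nonlinear separating surface and classifier:
K(x^\top, A^\top)\,Du = \gamma, \qquad f(x) = \operatorname{sign}\big(K(x^\top, A^\top)Du - \gamma\big)
% Example Gaussian kernel, elementwise for A \in R^{m \times n}, B \in R^{n \times l}:
\big(K(A, B)\big)_{ij} = \exp\big(-\mu\, \|A_i^\top - B_{\cdot j}\|_2^2\big)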

12
Nonlinear PSVM: Spiral Dataset, 94 Red Dots & 94 White Dots
[Figure: spiral dataset with the nonlinear separating surface]
13
Model Simplification
  • Goal 1: Minimize the number of kernel functions used.
  • Why? Simplifies the separating surface.
  • Goal 2: Minimize the number of active constraints.
  • Why? Reduces data dependence.
  • Useful for massive incremental classification.

14
Model Simplification Goal 1: Simplifying the Separating Surface
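The slide body is missing; plausibly it expands the surface K(x^\top, A^\top)Du = \gamma to show that zero components of u delete kernel functions (an assumption consistent with Goal 1):

% Only data points A_i with u_i \ne 0 contribute a kernel function:
\sum_{i:\, u_i \ne 0} u_i D_{ii}\, K(x^\top, A_i^\top) = \gamma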
15
Model Simplification Goal 2: Minimize Data Dependence
  • By the KKT conditions (sketched below), multipliers of inactive
    constraints vanish; hence the solution depends only on data points
    whose constraints are active.
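The formulas are missing from the transcript; a sketch of the standard complementarity argument (v is an assumed name for the multipliers of the inequality constraints):

% Complementarity for the kernel LP:
v_i \,\Big(D\big(K(A,A^\top)Du - e\gamma\big) + y - e\Big)_i = 0, \qquad v \ge 0
% Hence: if constraint i is inactive (holds strictly), then v_i = 0 and point
% A_i can be discarded without changing the classifier.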
16
Achieving Model Simplification: Minimal Kernel Classifier Formulation
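The formulation is missing from the transcript. A loosely hedged reconstruction, assuming (from the outline and the next two slides) that it applies the counting "pound" function # componentwise to both the slacks y (Goal 2) and the kernel coefficients u (Goal 1):

\min_{u,\gamma,y \ge 0} \; \nu\, e^\top \#(y) + e^\top \#(|u|)
\quad \text{s.t.} \quad D\big(K(A,A^\top)Du - e\gamma\big) + y \ge e
% e'#(y) counts nonzero slacks (data dependence); e'#(|u|) counts kernel functions.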
17
The # (Pound) Loss Function
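The plot is missing; an assumed step-function definition consistent with counting active constraints:

\#(t) = \begin{cases} 1, & t \ge 0,\\ 0, & t < 0. \end{cases}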
18
Approximating the Pound Loss Function
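The approximation is missing from the transcript; Mangasarian's concave-minimization papers of this period approximate such step functions by a concave exponential, so plausibly:

% Concave approximation for t \ge 0, with smoothing parameter \alpha > 0:
\#(t) \;\approx\; 1 - \varepsilon^{-\alpha t}
% The approximation tightens as \alpha \to \infty and keeps the objective concave,
% which is what makes the SLA of the following slides applicable.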
19
Minimal Kernel Classifier as a Concave Minimization Problem
  • This problem can be solved effectively using the finite
    Successive Linearization Algorithm (SLA)
    (Mangasarian 1996).

20
Minimal Kernel Algorithm (SLA)
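The algorithm statement on this slide is missing from the transcript. Below is a hypothetical Python sketch of a generic successive linearization loop for minimizing a differentiable concave function over a polyhedron, the scheme the next slide describes; the function names, toy problem, and use of scipy's linprog are illustrative assumptions, not from the talk.

# Hypothetical sketch of the Successive Linearization Algorithm (SLA):
# minimize a differentiable concave f over {x : A_ub x <= b_ub, x >= 0}
# by solving a sequence of linear programs.
import numpy as np
from scipy.optimize import linprog

def sla(grad_f, A_ub, b_ub, x0, tol=1e-8, max_iter=50):
    """At iterate x_k, solve the LP  min grad_f(x_k) . x  over the polyhedron,
    stopping when the linearized objective no longer decreases."""
    x = x0
    for _ in range(max_iter):
        g = grad_f(x)
        res = linprog(c=g, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None)] * len(x))
        x_new = res.x
        # Minimum-principle stopping test: no descent along the linearization.
        if g @ (x_new - x) > -tol:
            return x_new
        x = x_new
    return x

# Toy example: minimize the concave f(x) = -sum(x**2) over the simplex
# {x >= 0, x1 + x2 <= 1}; a vertex of the polyhedron is optimal.
grad = lambda x: -2.0 * x
x_star = sla(grad, A_ub=np.array([[1.0, 1.0]]), b_ub=np.array([1.0]),
             x0=np.array([0.4, 0.4]))
print(x_star)  # converges to a vertex such as [1, 0]

Because the objective is concave, each LP solution can be taken at a vertex, and the minimum-principle stopping test is what yields the finite termination claimed on the next slide.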
21
Minimal Kernel Algorithm (SLA)
  • Each iteration of the algorithm solves a
    linear program.
  • The algorithm terminates in a finite number of
    iterations (typically 5 to 7 iterations).
  • Solution obtained satisfies the Minimum
    Principle necessary optimality condition.

22
(No Transcript)
23
Checkerboard Separating Surface: # of Kernel Functions: 27, # of Active Constraints: 30
[Figure: checkerboard dataset with the MKC separating surface]
24
Numerical Experiments: Results for Six Public Datasets
25
Conclusion
  • A finite algorithm generating a classifier that
    depends on only a fraction of the input data.
  • Important for fast online testing of unseen data,
    e.g. fraud or intrusion detection.
  • Useful for incremental training on massive data.
  • The overall algorithm consists of solving 5 to 7
    LPs.
  • Kernel data dependence reduced by up to 98.8% of
    the data used by a standard SVM.
  • Testing time reduced by up to 98.2%.
  • MKC testing set correctness comparable to that of
    a more complex standard SVM.