Multiclass SVM - PowerPoint PPT Presentation

Title: Multiclass SVM


1
Multi-class SVM
  • Jieping Ye
  • Department of Computer Science and Engineering
  • Arizona State University
  • http://www.public.asu.edu/jye02

2
Outline
  • Single machine approaches
  • Error correcting code approaches
  • Tree-structured approaches
  • Experiments.

3
Weston & Watkins (1998)
  • Binary-class: Learn one function. Penalize each
    machine separately based on its margin
    violations
  • Multi-class: Pay a penalty based on the relative
    values output by the machines.

4
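Weston & Watkins (1998)
  • Learn N functions. If a point x is in class i,
    make f_i(x) exceed f_j(x) by a margin for every
    other class j, with one slack variable per
    constraint
  • This gives (k-1)n slack variables
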
5
Weston & Watkins (1998)
  • Too many constraints and slack variables: (k-1)n
  • Not easy to decompose (not scalable)
  • Experimental setup is problematic.

6
Crammer & Singer (2001)
  • Weston & Watkins: Pay a penalty for each class
    for which the margin is violated
  • Crammer & Singer: Penalize only for the largest
    violation
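
The difference between the two penalties can be sketched in a few lines of Python. This is a minimal illustration, not the formulation shown on the slide: the hinge form, the shared `margin=1.0` default, and the function names are assumptions made for the example. `scores` holds the outputs f_j(x) of the N machines for one point.

```python
def weston_watkins_loss(scores, true_class, margin=1.0):
    """Sum a hinge penalty over every wrong class that violates the margin."""
    f_true = scores[true_class]
    return sum(
        max(0.0, margin - (f_true - f_j))
        for j, f_j in enumerate(scores)
        if j != true_class
    )


def crammer_singer_loss(scores, true_class, margin=1.0):
    """Pay a single hinge penalty, set by the highest-scoring wrong class."""
    f_true = scores[true_class]
    worst = max(f_j for j, f_j in enumerate(scores) if j != true_class)
    return max(0.0, margin - (f_true - worst))


# Hypothetical machine outputs for one point whose true class is 0.
scores = [2.0, 1.5, 1.8, -1.0]
ww = weston_watkins_loss(scores, true_class=0)   # classes 1 and 2 both pay
cs = crammer_singer_loss(scores, true_class=0)   # only class 2 pays
```

With these made-up scores, two wrong classes fall inside the margin, so the first loss adds two penalties while the second keeps only the larger one.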

7
Crammer & Singer (2001)
  • Weston & Watkins: n(k-1) slack variables
  • Crammer & Singer: n slack variables

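To make the slack counts concrete, here is a small sketch that evaluates both expressions for n training points and k classes. The function name and the example sizes are hypothetical.

```python
def slack_counts(n_points, k_classes):
    """Number of slack variables in each single-machine formulation."""
    return {
        "weston_watkins": n_points * (k_classes - 1),  # one per point and wrong class
        "crammer_singer": n_points,                    # one per point
    }


# Hypothetical problem size: 1000 points, 10 classes.
counts = slack_counts(1000, 10)
```

For this size the first formulation carries nine times as many slack variables as the second.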
8
Crammer & Singer (2001)
  • Fewer slacks (compared to Weston & Watkins)
  • Can be decomposed (more scalable)
  • Many tricks are developed and implemented for
    efficient training
  • C and R source code available
  • http://www.cis.upenn.edu/crammer/code/MCSVM/MCSVM_1_0.tar.gz
  • R (http://www.r-project.org/): kernlab package

9
Outline
  • Single machine approaches
  • Error correcting code approaches
  • Tree-structured approaches
  • Experiments

10
Error-Correcting Code (ECC): Dietterich & Bakiri (1995)
Source: Dietterich and Bakiri (1995)
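
As a rough illustration of the error-correcting idea: each class is assigned a binary codeword, one binary classifier is trained per bit, and a test point is assigned to the class whose codeword is nearest in Hamming distance to the predicted bits. The 4-class, 7-bit code matrix below is a made-up example for this sketch, not the table from the slide, and the function names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def ecoc_decode(code_matrix, predicted_bits):
    """Return the index of the class whose codeword is closest to the prediction."""
    distances = [hamming_distance(row, predicted_bits) for row in code_matrix]
    return min(range(len(code_matrix)), key=distances.__getitem__)


# Illustrative code matrix: one row (codeword) per class, one column per
# binary classifier. Any two rows differ in 4 positions, so a single wrong
# bit can still be decoded correctly.
CODE_MATRIX = [
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 1],
    [0, 1, 0, 1, 0, 1, 0],
]

# Codeword of class 2 with its first bit flipped by a (hypothetical) classifier error.
noisy_bits = [1, 0, 1, 1, 0, 0, 1]
decoded = ecoc_decode(CODE_MATRIX, noisy_bits)
```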
13
One-against-rest
14
One-against-one
15
Special cases of ECC
Source: http://www-cse.ucsd.edu/users/elkan/254spring01/aldebaro1.pdf
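
One way to see one-against-rest and one-against-one as special cases is to write each as a code matrix with one row per class and one column per binary classifier. The sketch below assumes a common convention that the slide does not state: +1 and -1 mark the two sides of a binary problem, and 0 marks a class that the classifier ignores. The function names are hypothetical.

```python
from itertools import combinations


def one_vs_rest_matrix(k):
    """k columns: classifier c separates class c (+1) from all others (-1)."""
    return [[1 if cls == col else -1 for col in range(k)] for cls in range(k)]


def one_vs_one_matrix(k):
    """k(k-1)/2 columns: classifier (i, j) separates class i (+1) from class j (-1)."""
    pairs = list(combinations(range(k), 2))
    return [
        [1 if cls == i else -1 if cls == j else 0 for (i, j) in pairs]
        for cls in range(k)
    ]


ovr = one_vs_rest_matrix(3)
ovo = one_vs_one_matrix(3)   # columns correspond to pairs (0,1), (0,2), (1,2)
```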
16
Outline
  • Single machine approaches
  • Error correcting code approaches
  • Tree-structured approaches
  • Experiments

17
Large Margin Directed Acyclic Graph (DAG)
  • Identical to one-against-one at training time
  • At test time, DAG is used to determine which
    classifiers to test on a given point
  • Classes i and j are compared; whichever class
    achieves the lower score is removed from further
    consideration
  • Repeat N-1 times, until only one class remains.
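
The test-time elimination can be sketched as follows. This is a minimal illustration under stated assumptions: `pairwise_winner` stands in for the trained one-against-one classifiers and is hypothetical, as is the choice to always compare the first and last remaining candidates.

```python
def dag_predict(x, n_classes, pairwise_winner):
    """Eliminate one class per comparison until a single class remains."""
    candidates = list(range(n_classes))
    while len(candidates) > 1:
        i, j = candidates[0], candidates[-1]
        if pairwise_winner(x, i, j) == i:
            candidates.pop()       # class j lost, drop it
        else:
            candidates.pop(0)      # class i lost, drop it
    return candidates[0]


# Toy stand-in for the pairwise SVMs: the class with the nearer 1-D center wins.
CENTERS = [0.0, 1.0, 2.0, 3.0]
calls = []


def nearest_center_winner(x, i, j):
    calls.append((i, j))
    return i if abs(x - CENTERS[i]) <= abs(x - CENTERS[j]) else j


predicted = dag_predict(2.2, n_classes=4, pairwise_winner=nearest_center_winner)
```

Only N-1 of the N(N-1)/2 pairwise classifiers are evaluated for a given point; here that is 3 of 6.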

18
Large Margin DAGs for Multiclass Classification
Source: Platt et al. (2000)
19
Margin tree (Tibshirani and Hastie 2006)
  • An SVM is constructed for each pair of classes to
    compute pair-wise margins
  • Agglomerative clustering uses the pair-wise
    margins as distances to construct the
    hierarchical structure bottom up.
  • Three approaches: greedy, single linkage, and
    complete linkage.
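
The bottom-up construction can be sketched as plain agglomerative clustering over a table of pairwise margins. This is an illustrative sketch, not the marginTree implementation: the margins are made-up numbers, the function name is hypothetical, and only the single and complete linkage variants are shown.

```python
def agglomerate(margins, linkage="complete"):
    """Merge classes bottom-up, treating pairwise margins as distances.

    margins: dict mapping (i, j) with i < j to the margin between classes i and j.
    Returns the list of merges as (cluster_a, cluster_b, distance).
    """
    classes = sorted({c for pair in margins for c in pair})
    clusters = [(c,) for c in classes]
    combine = max if linkage == "complete" else min  # "single" uses min

    def dist(a, b):
        return combine(margins[tuple(sorted((i, j)))] for i in a for j in b)

    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of clusters, i.e. the hardest to separate.
        a, b = min(
            ((p, q) for idx, p in enumerate(clusters) for q in clusters[idx + 1:]),
            key=lambda pair: dist(*pair),
        )
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical pairwise margins for four classes.
MARGINS = {(0, 1): 0.5, (0, 2): 2.0, (1, 2): 1.8,
           (0, 3): 2.5, (1, 3): 2.2, (2, 3): 0.7}
complete_merges = agglomerate(MARGINS, linkage="complete")
single_merges = agglomerate(MARGINS, linkage="single")
```

Classes with small pairwise margins are merged first, so the split at the top of the resulting tree separates the groups that are easiest to tell apart.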

20
Margin tree
Source: Tibshirani and Hastie (2006)
21
Outline
  • Single machine approaches
  • Error correcting code approaches
  • Tree-structured approaches
  • Experiments (Compare five ECC approaches).

26
Observations
  • In nearly all cases, the results of the compared
    methods are very close
  • In the majority of experiments, 0 is in the
    confidence interval for the difference, meaning
    the classifiers are not statistically different
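
The "0 is in the confidence interval" check can be illustrated with a short sketch. The slides do not say how the intervals were computed, so everything here is an assumption: a normal-approximation interval for the mean paired difference, made-up per-fold accuracies, and a hypothetical function name.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def diff_confidence_interval(acc_a, acc_b, level=0.95):
    """Normal-approximation interval for the mean paired accuracy difference."""
    diffs = [a - b for a, b in zip(acc_a, acc_b)]
    z = NormalDist().inv_cdf(0.5 + level / 2)          # about 1.96 for 95%
    half_width = z * stdev(diffs) / sqrt(len(diffs))
    center = mean(diffs)
    return center - half_width, center + half_width


# Hypothetical per-fold accuracies of two multiclass methods (not from the slides).
method_a = [0.91, 0.89, 0.93, 0.90, 0.92]
method_b = [0.90, 0.91, 0.92, 0.91, 0.91]

low, high = diff_confidence_interval(method_a, method_b)
contains_zero = low <= 0.0 <= high
```

When the interval contains 0, as it does for these toy numbers, the data give no evidence that one method is better than the other. With so few folds a t-based interval would be wider than this approximation.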

27
Implementations in R
  • e1071: one-against-one (LIBSVM)
  • kernlab: one-against-one, Crammer & Singer,
    Weston & Watkins
  • klaR: one-against-rest (SVMlight)
  • marginTree
  • http://www-stat.stanford.edu/tibs/marginTree_1.00.zip