The Disputed Federalist Papers: SVM Feature Selection via Concave Minimization


1
The Disputed Federalist Papers: SVM Feature
Selection via Concave Minimization
  • Glenn Fung and Olvi L. Mangasarian

CSNA 2002, June 13-16, 2002, Madison, Wisconsin
2
Outline of Talk
  • Support Vector Machines (SVM) Introduction
      • Standard Quadratic Programming Formulation
      • 1-norm Linear SVMs
  • SVM Feature Selection
      • Successive Linearization Algorithm (SLA)
  • The Disputed Federalist Papers
      • Description of the Classification Problem
      • Previous Work
  • Results
      • Separating Hyperplane in Three Dimensions Only
      • Classification Agrees with Previous Results

3
What is a Support Vector Machine?
  • An optimally defined surface
  • Typically nonlinear in the input space
  • Linear in a higher dimensional space
  • Implicitly defined by a kernel function

4
What are Support Vector Machines Used For?
  • Classification
  • Regression & Data Fitting
  • Supervised & Unsupervised Learning

(Will concentrate on classification)
5
Geometry of the Classification Problem: 2-Category
Linearly Separable Case
[Figure: point sets A+ and A-]
6
Algebra of the Classification Problem: 2-Category
Linearly Separable Case
  • Given m points in n dimensional space
  • Represented by an m-by-n matrix A
  • More succinctly
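
The separability condition for the two classes can be checked numerically. The sketch below assumes the usual notation from the 1-norm SVM literature, which the transcript does not spell out: labels are stored as a diagonal matrix D of +1/-1 entries (here the list `d`), `e` is a vector of ones, the separating plane is x'w = gamma, and the succinct condition is D(Aw - e*gamma) >= e. The data points are hypothetical.

```python
# Toy check of the assumed separability condition D(Aw - e*gamma) >= e.
# A is an m-by-n list of points, d holds the +1/-1 class labels (diagonal of D).

def margins(A, d, w, gamma):
    """Return d_i * (A_i . w - gamma) for every row A_i of A."""
    return [d_i * (sum(a * w_j for a, w_j in zip(row, w)) - gamma)
            for row, d_i in zip(A, d)]

def is_separated(A, d, w, gamma):
    """True when every point lies on or beyond its own bounding plane."""
    return all(value >= 1.0 for value in margins(A, d, w, gamma))

# m = 4 points in n = 2 dimensions (hypothetical data).
A = [[3.0, 3.0], [4.0, 2.0], [0.0, 0.0], [1.0, -1.0]]
d = [+1, +1, -1, -1]
w, gamma = [0.5, 0.5], 1.5

print(margins(A, d, w, gamma))
print(is_separated(A, d, w, gamma))
```

Each entry of `margins` is the signed distance-like quantity that the bounding planes x'w = gamma + 1 and x'w = gamma - 1 require to be at least 1.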

7
Support Vector Machines: Maximizing the Margin
between Bounding Planes
[Figure: point sets A+ and A-]
8
Support Vector Machines: Quadratic Programming
Formulation
  • Solve the following quadratic program

9
Support Vector Machines: Linear Programming
Formulation
  • Use the 1-norm instead of the 2-norm
  • This is equivalent to the following linear
    program
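
A 1-norm SVM can be posed as a linear program and handed to a general LP solver. The sketch below is one common way to write it, not necessarily the exact program on the slide: it assumes slack variables y, a trade-off weight `nu`, and auxiliary variables s with -s <= w <= s so that the sum of s equals the 1-norm of w. It uses `scipy.optimize.linprog`; the function name `one_norm_svm` and the toy data are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def one_norm_svm(A, d, nu=1.0):
    """Solve min nu*sum(y) + sum(s)  s.t.  D(Aw - e*gamma) + y >= e,
    -s <= w <= s, y >= 0.  Returns (w, gamma)."""
    A = np.asarray(A, dtype=float)
    d = np.asarray(d, dtype=float)
    m, n = A.shape
    # Variable order: x = [w (n), gamma (1), y (m), s (n)].
    c = np.concatenate([np.zeros(n + 1), nu * np.ones(m), np.ones(n)])
    DA = d[:, None] * A
    # Margin rows rewritten as: -D A w + d*gamma - y <= -e
    margin_rows = np.hstack([-DA, d[:, None], -np.eye(m), np.zeros((m, n))])
    upper_rows = np.hstack([np.eye(n), np.zeros((n, 1 + m)), -np.eye(n)])   # w - s <= 0
    lower_rows = np.hstack([-np.eye(n), np.zeros((n, 1 + m)), -np.eye(n)])  # -w - s <= 0
    A_ub = np.vstack([margin_rows, upper_rows, lower_rows])
    b_ub = np.concatenate([-np.ones(m), np.zeros(2 * n)])
    # w and gamma are free; y and s are nonnegative.
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], res.x[n]

# Hypothetical, linearly separable toy data.
A = np.array([[3.0, 3.0], [4.0, 2.0], [0.0, 0.0], [1.0, -1.0]])
d = np.array([1.0, 1.0, -1.0, -1.0])
w, gamma = one_norm_svm(A, d)
print(w, gamma)
```

Because the objective and constraints are all linear, the 1-norm version needs only an LP solver, whereas the 2-norm version requires a quadratic programming solver.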

10
Feature Selection and SVMs
  • Use the step function to suppress components of
    the normal to the separating hyperplane

11
Smooth Approximation of the Step Function
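
A small numerical comparison shows how a concave exponential can stand in for the step function on nonnegative arguments. The specific form 1 - exp(-alpha*v) and the value alpha = 5 are assumptions taken from the concave-minimization feature selection literature; the transcript does not give the formula.

```python
import math

def step(v):
    """Step function: 1 for v > 0, otherwise 0."""
    return 1.0 if v > 0 else 0.0

def smooth_step(v, alpha=5.0):
    """Concave exponential approximation of the step for v >= 0 (assumed form)."""
    return 1.0 - math.exp(-alpha * v)

for v in (0.0, 0.1, 0.5, 1.0, 2.0):
    print(f"v={v:4.1f}  step={step(v):.0f}  smooth={smooth_step(v):.4f}")
```

Larger values of `alpha` push the smooth curve closer to the step, at the price of a steeper slope near zero.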
12
SVM Formulation with Feature Selection
13
Successive Linearization Algorithm (SLA) for
Feature Selection
  • Proposition: The algorithm terminates in a finite
    number of steps (typically 5 to 7) at a stationary
    point.
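
The idea of successive linearization can be sketched as a loop of linear programs: linearize the concave term at the current point, solve the resulting LP, and stop when the linearized objective no longer decreases. Everything specific below is an assumption rather than the authors' exact formulation: the objective (1 - lam) * mean(y) + lam * sum(1 - exp(-alpha * v)) with -v <= w <= v, the parameter values, the starting point (the plain 1-norm LP), and the name `sla_feature_selection`.

```python
import numpy as np
from scipy.optimize import linprog

def sla_feature_selection(A, d, lam=0.1, alpha=5.0, max_iter=20, tol=1e-8):
    """Successive linearization sketch; returns (w, gamma, number of LPs solved)."""
    A = np.asarray(A, dtype=float)
    d = np.asarray(d, dtype=float)
    m, n = A.shape
    DA = d[:, None] * A
    # Variable order: x = [w (n), gamma (1), y (m), v (n)].
    A_ub = np.vstack([
        np.hstack([-DA, d[:, None], -np.eye(m), np.zeros((m, n))]),  # D(Aw - e*gamma) + y >= e
        np.hstack([np.eye(n), np.zeros((n, 1 + m)), -np.eye(n)]),    # w <= v
        np.hstack([-np.eye(n), np.zeros((n, 1 + m)), -np.eye(n)]),   # -w <= v
    ])
    b_ub = np.concatenate([-np.ones(m), np.zeros(2 * n)])
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + n)

    def cost(v_weights):
        # Linear cost: averaged slack plus a weighted sum of the v components.
        return np.concatenate([np.zeros(n + 1),
                               (1 - lam) * np.ones(m) / m,
                               lam * v_weights])

    def solve(c):
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
        if not res.success:
            raise RuntimeError(res.message)
        return res.x

    x = solve(cost(np.ones(n)))  # assumed starting point: 1-norm LP solution
    lps = 1
    for _ in range(max_iter):
        v = x[n + 1 + m:]
        # Gradient of sum(1 - exp(-alpha * v)) at the current v.
        c = cost(alpha * np.exp(-alpha * v))
        x_new = solve(c)
        lps += 1
        decrease = c @ (x - x_new)  # drop in the linearized objective
        x = x_new
        if decrease <= tol:
            break
    return x[:n], x[n], lps

# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
A = np.array([[2.0, 0.3], [3.0, -0.2], [2.5, 0.1],
              [-2.0, 0.2], [-3.0, -0.1], [-2.5, -0.3]])
d = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
w, gamma, lps = sla_feature_selection(A, d)
print("selected features:", np.flatnonzero(np.abs(w) > 1e-6), "LPs solved:", lps)
```

Each pass costs one LP, which is why the number of linear programs solved equals the number of iterations plus the one used for the starting point in this sketch.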

14
The Federalist Papers
  • Written in 1787-1788 by Alexander Hamilton, John
    Jay and James Madison to persuade the citizens of
    New York to ratify the Constitution.
  • The papers are short essays, 900 to 3500
    words in length.
  • Authorship of 12 of the papers has been in
    dispute (Madison or Hamilton). These papers are
    referred to as the disputed Federalist papers.

15
Previous Work
  • Mosteller and Wallace (1964)
      • Using statistical inference, determined the
        authorship of the 12 disputed papers.
  • Bosch and Smith (1998)
      • Using linear programming techniques and the
        evaluation of every possible combination of one,
        two and three features, obtained a separating
        hyperplane using only three words.

16
Description of the data
  • For every paper
      • Machine-readable text was created using a
        scanner.
      • Relative frequencies were computed for 70 words
        that Mosteller-Wallace identified as good
        candidates for author-attribution studies.
      • Each document is represented as a vector
        containing the 70 real numbers corresponding to
        the 70 word frequencies.
  • The dataset consists of 118 papers
      • 50 Madison papers
      • 56 Hamilton papers
      • 12 disputed papers
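
Turning a document into a vector of word frequencies can be sketched in a few lines. The tokenizer, the normalization (occurrences divided by total token count) and the three-word vocabulary are assumptions made for illustration; the slides do not state how the frequencies were scaled, and the full list of 70 words is not reproduced here.

```python
import re
from collections import Counter

def word_frequency_vector(text, vocabulary):
    """Relative frequency of each vocabulary word in `text` (count / total tokens)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [counts[word] / total for word in vocabulary]

# Three function words only, for brevity; the study used 70.
vocabulary = ["to", "upon", "would"]
sample = "It would be wise to rely upon reason, and to act upon it."
print(word_frequency_vector(sample, vocabulary))
```

Stacking one such vector per paper gives the m-by-n data matrix used for training, with one row per document and one column per word.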

17
Function Words Based on Relative Frequencies
18
SLA Feature Selection for Classifying the
Disputed Federalist Papers
  • Apply the successive linearization algorithm to:
      • Train on the 106 Federalist papers with known
        authors
      • Find a classification hyperplane that uses as few
        words as possible
      • Use the hyperplane to classify the 12 disputed
        papers

19
Hyperplane Classifier Using 3 Words
  • A hyperplane depending on three words was found
  • 0.5368 to + 24.6634 upon + 2.9532 would = 66.6159
  • All disputed papers ended up on the Madison side
    of the plane
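
Evaluating a three-word plane on a new document amounts to a weighted sum and a comparison. The sketch below reads the slide's numbers as the plane 0.5368*to + 24.6634*upon + 2.9532*would = 66.6159, where the "+" and "=" signs are an assumption. The slides do not state the frequency units or which side of the plane belongs to which author, so the function reports only "above" or "below", and the example frequencies are hypothetical.

```python
# Coefficients as read from the slide; operator placement is assumed.
WEIGHTS = {"to": 0.5368, "upon": 24.6634, "would": 2.9532}
THRESHOLD = 66.6159

def plane_value(freqs):
    """Weighted sum of the three word frequencies minus the threshold."""
    return sum(WEIGHTS[word] * freqs[word] for word in WEIGHTS) - THRESHOLD

def side(freqs):
    """Report which side of the plane a document falls on."""
    return "above" if plane_value(freqs) > 0 else "below"

# Hypothetical frequencies for two documents.
doc_a = {"to": 35.0, "upon": 0.2, "would": 8.0}
doc_b = {"to": 35.0, "upon": 3.0, "would": 8.0}
print(side(doc_a), side(doc_b))
```

With every other frequency held fixed, the large coefficient on "upon" means that word dominates which side a document lands on.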

20
Results: 3D plot of resulting hyperplane
21
Comparison with Previous Work & Conclusion
  • Bosch and Smith (1998) calculated all the
    possible sets of one, two and three words to find
    a separating hyperplane. They solved 118,895
    linear programs.
  • Our SLA algorithm for feature selection required
    the solution of only 6 linear programs.
  • Our classification of the disputed Federalist
    papers agrees with that of Mosteller-Wallace and
    Bosch-Smith.

22
More on SVMs
  • My web page
  • www.cs.wisc.edu/gfung
  • Olvi Mangasarian web page
  • www.cs.wisc.edu/olvi