Title: The Disputed Federalist Papers: SVM Feature Selection via Concave Minimization
1 The Disputed Federalist Papers: SVM Feature
Selection via Concave Minimization
- Glenn Fung and Olvi L. Mangasarian
- CSNA 2002, June 13-16, 2002, Madison, Wisconsin
2 Outline of Talk
- Support Vector Machines (SVM) Introduction
- Standard Quadratic Programming Formulation
- Successive Linearization Algorithm (SLA)
- The Disputed Federalist Papers
- Description of the Classification Problem
- Previous Work
- Separating Hyperplane in Three Dimensions Only
- Classification Agrees with Previous Results
3 What is a Support Vector Machine?
- An optimally defined surface
- Typically nonlinear in the input space
- Linear in a higher dimensional space
- Implicitly defined by a kernel function
4 What are Support Vector Machines Used For?
- Classification
- Regression and Data Fitting
- Supervised and Unsupervised Learning
(Will concentrate on classification)
5 Geometry of the Classification Problem: 2-Category
Linearly Separable Case
(Figure: the two point sets, labeled A+ and A-)
6 Algebra of the Classification Problem: 2-Category
Linearly Separable Case
- Given m points in n dimensional space
- Represented by an m-by-n matrix A
7 Support Vector Machines: Maximizing the Margin
between Bounding Planes
(Figure: bounding planes between the point sets A+ and A-)
8 Support Vector Machines: Quadratic Programming
Formulation
- Solve the following quadratic program
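The program itself is not preserved in this text. As a hedged reference, a standard soft-margin form can be written as below. The notation is an assumption: D is the m-by-m diagonal matrix of class labels (+1 or -1), e is a vector of ones, y is a nonnegative slack vector, gamma locates the separating plane x'w = gamma, and nu > 0 weights the classification error.

```latex
\min_{w,\,\gamma,\,y}\;\; \nu\, e^{\top} y \;+\; \tfrac{1}{2}\, w^{\top} w
\qquad \text{s.t.} \qquad
D\,(A w - e\gamma) + y \;\ge\; e, \qquad y \;\ge\; 0
```

In this form the bounding planes are x'w = gamma + 1 and x'w = gamma - 1, and the distance between them is 2 divided by the 2-norm of w, so minimizing w'w maximizes the margin.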
9 Support Vector Machines: Linear Programming
Formulation
- Use the 1-norm instead of the 2-norm
- This is equivalent to the following linear
program
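The linear program is not preserved in this text. A sketch of the usual 1-norm formulation follows, under assumed notation: D is the diagonal matrix of class labels (+1 or -1), e is a vector of ones, y is a nonnegative slack vector, gamma locates the plane x'w = gamma, nu > 0 weights the error, and v is an auxiliary vector bounding the absolute values of w.

```latex
\min_{w,\,\gamma,\,y,\,v}\;\; \nu\, e^{\top} y \;+\; e^{\top} v
\qquad \text{s.t.} \qquad
D\,(A w - e\gamma) + y \;\ge\; e, \qquad
-v \;\le\; w \;\le\; v, \qquad
y \;\ge\; 0
```

At a solution each component of v equals the absolute value of the matching component of w, so e'v is the 1-norm of w and the objective and constraints are all linear.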
10 Feature Selection and SVMs
- Use the step function to suppress components of
the normal to the separating hyperplane
11 Smooth Approximation of the Step Function
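The slide's formula is not preserved in this text. One common choice in the concave-minimization literature is the exponential 1 - exp(-alpha * t) for t >= 0; whether the talk used exactly this form is an assumption. The sketch below compares it with the step function. The names `step` and `smooth_step` and the default `alpha = 5.0` are illustrative only.

```python
import math


def step(t: float) -> float:
    """Step function: 1 for t > 0, otherwise 0."""
    return 1.0 if t > 0 else 0.0


def smooth_step(t: float, alpha: float = 5.0) -> float:
    """Concave exponential approximation 1 - exp(-alpha * t) for t >= 0.

    Assumed form; alpha controls how sharply the curve rises toward 1.
    """
    if t < 0:
        raise ValueError("approximation is intended for nonnegative arguments")
    return 1.0 - math.exp(-alpha * t)


if __name__ == "__main__":
    for t in (0.0, 0.1, 0.5, 1.0, 2.0):
        print(f"t={t:4.1f}  step={step(t):.0f}  smooth={smooth_step(t):.4f}")
```

Larger values of alpha bring the approximation closer to the step function, at the cost of a steeper, harder-to-optimize objective near zero.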
12 SVM Formulation with Feature Selection
13 Successive Linearization Algorithm (SLA) for
Feature Selection
- Proposition: The algorithm terminates in a finite
number of steps (typically 5 to 7) at a stationary point.
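The formulas for the feature-selection problem and the algorithm are not preserved in this text. The sketch below shows one way such a scheme is usually written, assuming the exponential approximation 1 - exp(-alpha * v) of the step function and the same assumed notation as a 1-norm SVM (D the diagonal label matrix, e a vector of ones, y the slack vector, v a bound on the absolute values of w, nu > 0 an error weight). Here exp is applied componentwise.

```latex
% Concave feature-selection problem (assumed form)
\min_{w,\,\gamma,\,y,\,v}\;\; \nu\, e^{\top} y \;+\; e^{\top}\bigl(e - \exp(-\alpha v)\bigr)
\qquad \text{s.t.} \qquad
D\,(A w - e\gamma) + y \ge e, \quad -v \le w \le v, \quad y \ge 0

% Iteration i: linearize the concave term at v^{i} and solve the resulting LP
(w^{i+1}, \gamma^{i+1}, y^{i+1}, v^{i+1}) \in
\arg\min_{w,\,\gamma,\,y,\,v}\;\; \nu\, e^{\top} y
\;+\; \alpha\, \exp(-\alpha v^{i})^{\top} (v - v^{i})
\qquad \text{subject to the same constraints}

% Stop when the linearized objective no longer decreases
\nu\, e^{\top}\bigl(y^{i+1} - y^{i}\bigr)
\;+\; \alpha\, \exp(-\alpha v^{i})^{\top} \bigl(v^{i+1} - v^{i}\bigr) \;=\; 0
```

Each iteration is a linear program, which is consistent with the small number of linear programs the talk reports solving.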
14 The Federalist Papers
- Written in 1787-1788 by Alexander Hamilton, John
Jay and James Madison to persuade the citizens of
New York to ratify the Constitution.
- The papers are short essays, 900 to 3500
words in length.
- The authorship of 12 of those papers has been in
dispute (Madison or Hamilton). These papers are
referred to as the disputed Federalist papers.
15 Previous Work
- Mosteller and Wallace (1964)
  - Using statistical inference, determined the
    authorship of the 12 disputed papers.
- Bosch and Smith (1998)
  - Using linear programming techniques and the
    evaluation of every possible combination of one,
    two and three features, obtained a separating
    hyperplane using only three words.
16 Description of the Data
- For every paper
  - Machine-readable text was created using a
    scanner.
  - Relative frequencies were computed for 70 words that
    Mosteller-Wallace identified as good candidates
    for author-attribution studies.
  - Each document is represented as a vector
    containing the 70 real numbers corresponding to
    the 70 word frequencies.
- The dataset consists of 118 papers
  - 50 Madison papers
  - 56 Hamilton papers
  - 12 disputed papers
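The mapping from a document to a vector of word frequencies can be sketched as follows. This is a minimal illustration, not the authors' preprocessing: the three-word list stands in for the 70 function words, the tokenizer is a simple regular expression, and the per-1000-words scaling is an assumption.

```python
import re
from collections import Counter

# Hypothetical three-word subset; the talk uses 70 function words.
FUNCTION_WORDS = ["to", "upon", "would"]


def relative_frequencies(text, words=FUNCTION_WORDS, per=1000):
    """Return occurrences of each word per `per` tokens of `text`."""
    # Lowercase and keep alphabetic runs only (contractions get split).
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        raise ValueError("empty document")
    counts = Counter(tokens)
    return [per * counts[w] / len(tokens) for w in words]


if __name__ == "__main__":
    sample = "It would depend upon the people to decide, and to decide wisely."
    print(relative_frequencies(sample))
```

Stacking one such vector per paper gives the rows of the data matrix used for training.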
17 Function Words Based on Relative Frequencies
18 SLA Feature Selection for Classifying the
Disputed Federalist Papers
- Apply the successive linearization algorithm to:
  - Train on the 106 Federalist papers with known
    authors
  - Find a classification hyperplane that uses as few
    words as possible
  - Use the hyperplane to classify the 12 disputed
    papers
19 Hyperplane Classifier Using 3 Words
- A hyperplane depending on three words was found
- Coefficients 0.5368 (to), 24.6634 (upon), 2.9532 (would); constant 66.6159
- All disputed papers ended up on the Madison side
of the plane
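Classifying a paper with such a hyperplane amounts to checking which side of the plane its frequency vector falls on. The sketch below shows the general rule only. The weights, threshold, and sample vectors are toy values, not the talk's coefficients, and which sign corresponds to which author is left open because the signs are not recoverable from this text.

```python
def classify(x, w, gamma):
    """Return +1 if x lies on the positive side of the plane x'w = gamma, else -1."""
    if len(x) != len(w):
        raise ValueError("feature vector and weight vector differ in length")
    score = sum(xi * wi for xi, wi in zip(x, w)) - gamma
    return 1 if score > 0 else -1


if __name__ == "__main__":
    # Toy weights and threshold for three features (hypothetical values).
    w = [1.0, 2.0, -1.0]
    gamma = 3.0
    print(classify([1.0, 2.0, 0.5], w, gamma))
    print(classify([1.0, 0.0, 1.0], w, gamma))
```

With only three nonzero weights, each paper can be classified from three word frequencies alone.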
20 Results: 3D Plot of Resulting Hyperplane
21 Comparison with Previous Work and Conclusion
- Bosch and Smith (1998) calculated all the
possible sets of one, two and three words to find
a separating hyperplane. They solved 118,895
linear programs.
- Our SLA for feature selection required
the solution of only 6 linear programs.
- Our classification of the disputed Federalist
papers agrees with that of Mosteller-Wallace and
Bosch-Smith.
22 More on SVMs
- My web page
  - www.cs.wisc.edu/gfung
- Olvi Mangasarian's web page
  - www.cs.wisc.edu/olvi