Title: Machine Learning and Data Mining: A Math Programming-Based Approach
1 Machine Learning and Data Mining: A Math Programming-Based Approach
CS412, April 10, 2003, Madison, Wisconsin
2 What is a Support Vector Machine?
- An optimally defined surface
- Typically nonlinear in the input space
- Linear in a higher dimensional space
- Implicitly defined by a kernel function
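As an illustration of a surface that is nonlinear in the input space but defined implicitly by a kernel, here is a minimal sketch. The Gaussian kernel is one common choice; the width parameter `mu`, the names `support_points`, `coeffs` and `gamma`, and the function names are assumptions made for this example and do not come from the slides.

```python
import math

def gaussian_kernel(x, z, mu=1.0):
    """K(x, z) = exp(-mu * ||x - z||^2); mu is a hypothetical width parameter."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-mu * sq_dist)

def kernel_classify(x, support_points, coeffs, gamma, kernel=gaussian_kernel):
    """Return +1 or -1 from the sign of sum_i coeffs[i] * K(x, p_i) - gamma.

    The surface {x : score(x) = 0} is linear in the kernel-induced space
    but generally nonlinear in the original input space.
    """
    score = sum(c * kernel(x, p) for c, p in zip(coeffs, support_points)) - gamma
    return 1 if score >= 0 else -1

# Two hypothetical points with opposite-sign coefficients.
points = [[0.0], [3.0]]
coeffs = [1.0, -1.0]
print(kernel_classify([0.5], points, coeffs, gamma=0.0))
print(kernel_classify([2.5], points, coeffs, gamma=0.0))
```

Swapping in a different `kernel` function changes the shape of the surface without changing the classification rule itself.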
3 What are Support Vector Machines Used For?
- Classification
- Regression and Data Fitting
- Supervised and Unsupervised Learning
(Will concentrate on classification)
4 Geometry of the Classification Problem: 2-Category
Linearly Separable Case
(Figure: point sets A+ and A-)
5 Support Vector Machines: Maximizing the Margin
between Bounding Planes
(Figure: point sets A+ and A-)
6 Algebra of the Classification Problem: 2-Category
Linearly Separable Case
- Given m points in n-dimensional space
- Represented by an m-by-n matrix A
7 Support Vector Machines: Quadratic Programming
Formulation
- Solve the following quadratic program
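One standard soft-margin form of such a program is sketched below. It is given as an assumption for orientation, not as the formula shown on the slide. Here A is the m-by-n matrix of points; D (an m-by-m diagonal matrix of +1/-1 class labels), e (a vector of ones), the slack vector y, and the weight ν > 0 are notation introduced for this sketch.

```latex
\min_{w,\,\gamma,\,y}\quad \nu\, e^{\top} y + \tfrac{1}{2}\, w^{\top} w
\qquad \text{subject to} \qquad D\,(A w - e\gamma) + y \ge e, \quad y \ge 0
```

In this form the bounding planes are x'w = γ + 1 and x'w = γ - 1, the separating plane is x'w = γ, and minimizing w'w maximizes the distance 2/||w|| between the bounding planes.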
9 Checkerboard Polynomial Kernel Classifier: Best
Previous Result (Kaufman 1998)
11 Gaussian Kernel PSVM Classifier, Spiral Dataset:
94 Red Dots and 94 White Dots
12 The Federalist Papers
- Written in 1787-1788 by Alexander Hamilton, John
Jay and James Madison to persuade the citizens of
New York to ratify the Constitution.
- Papers consisted of short essays, 900 to 3500
words in length.
- Authorship of 12 of those papers has been in
dispute (Madison or Hamilton). These papers are
referred to as the disputed Federalist papers.
13 Previous Work
- Mosteller and Wallace (1964)
- Using statistical inference, determined the
authorship of the 12 disputed papers.
- Bosch and Smith (1998)
- Using linear programming techniques and the
evaluation of every possible combination of one,
two and three features, obtained a separating
hyperplane using only three words.
14 Description of the Data
- For every paper
- Machine-readable text was created using a
scanner.
- Computed relative frequencies of 70 words that
Mosteller and Wallace identified as good candidates
for author-attribution studies.
- Each document is represented as a vector
containing the 70 real numbers corresponding to
the 70 word frequencies.
- The dataset consists of 118 papers
- 50 Madison papers
- 56 Hamilton papers
- 12 disputed papers
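The representation described above, one vector of word frequencies per paper, can be sketched as follows. The tokenization rule, the scaling by `per` (occurrences per 1000 words), and the three-word vocabulary are assumptions for illustration; the slides do not say how the frequencies were normalized, and the actual study used 70 words.

```python
import re
from collections import Counter

def relative_frequencies(text, vocabulary, per=1000):
    """Return one frequency per vocabulary word, scaled to occurrences per `per` words."""
    # Hypothetical tokenization: lowercase alphabetic runs only.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    total = len(words)
    if total == 0:
        return [0.0] * len(vocabulary)
    return [per * counts[w] / total for w in vocabulary]

# Hypothetical three-word vocabulary and a toy text.
vocab = ["to", "upon", "would"]
vector = relative_frequencies("It would be wise to rely upon reason, to be sure.", vocab)
print(vector)
```

Stacking one such vector per paper gives a matrix with one row per document and one column per word.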
15 Function Words Based on Relative Frequencies
16 SLA Feature Selection for Classifying the
Disputed Federalist Papers
- Apply the successive linearization algorithm (SLA) to:
- Train on the 106 Federalist papers with known
authors
- Find a classification hyperplane that uses as few
words as possible
- Use the hyperplane to classify the 12 disputed
papers
17 Hyperplane Classifier Using 3 Words
- A hyperplane depending on three words was found
- 0.5368 to + 24.6634 upon + 2.9532 would = 66.6159
- All disputed papers ended up on the Madison side
of the plane
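Classifying a paper with a three-word plane amounts to evaluating one weighted sum. The sketch below reads the plane as 0.5368·to + 24.6634·upon + 2.9532·would = 66.6159; the plus and equals signs are an assumption about the operators between the numbers. The slides do not state which side of the plane corresponds to which author, so the code reports only the sign, and the frequency values in the usage line are hypothetical.

```python
# Coefficients from the slide; the operators between them are assumed.
WEIGHTS = {"to": 0.5368, "upon": 24.6634, "would": 2.9532}
THRESHOLD = 66.6159

def plane_score(freqs):
    """Signed value w.x - gamma for a dict of word frequencies."""
    return sum(WEIGHTS[w] * freqs[w] for w in WEIGHTS) - THRESHOLD

def side(freqs):
    """Report which side of the plane a paper falls on."""
    s = plane_score(freqs)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "on plane"

# Hypothetical frequencies, not taken from any Federalist paper.
print(side({"to": 40.0, "upon": 3.0, "would": 8.0}))
```

Because only three of the 70 frequencies enter the sum, every other word can be ignored at classification time.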
18 Results: 3D plot of resulting hyperplane
19 Comparison with Previous Work and Conclusion
- Bosch and Smith (1998) calculated all the
possible sets of one, two and three words to find
a separating hyperplane. They solved 118,895
linear programs.
- Our SLA algorithm for feature selection required
the solution of only 6 linear programs.
- Our classification of the disputed Federalist
papers agrees with that of Mosteller-Wallace and
Bosch-Smith.
20 Breast Cancer Diagnosis Application: 97% Tenfold
Cross Validation Correctness, 780 Samples: 494
Benign, 286 Malignant
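Tenfold cross validation, the measure quoted above, can be sketched as follows. The `train` and `predict` callables are hypothetical placeholders for whatever classifier is being evaluated; the shuffling seed and the way samples are assigned to folds are assumptions, not details from the slides.

```python
import random

def kfold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]

def cross_validation_correctness(samples, labels, train, predict, k=10, seed=0):
    """Fraction of samples classified correctly when each is held out once."""
    correct = 0
    for train_idx, test_idx in kfold_splits(len(samples), k, seed):
        model = train([samples[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct += sum(predict(model, samples[i]) == labels[i] for i in test_idx)
    return correct / len(samples)

# With 780 samples, each of the 10 folds holds out 78 and trains on 702.
sizes = [(len(tr), len(te)) for tr, te in kfold_splits(780)]
print(sizes[0])
```

Each sample is used for testing exactly once, so the reported correctness is an average over all 780 samples rather than over a single held-out set.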
21 Detection of Alternative RNA Isoforms via
DATAS (Levels of mRNA that Correlate with
Sensitivity to Chemotherapy)
22 Breast Cancer Treatment Response: Joint with
ExonHit (French BioTech)
http://www.exonhit.com/html/company/index.htm
- 35 patients treated by a drug cocktail
- 9 partial responders, 26 nonresponders
- 25 gene expression measurements made on each
patient
- 1-Norm SVM classifier selected 12 out of 25
genes
- Combinatorially selected 6 genes out of 12
- Separating plane obtained
- 2.7915 T11 + 0.13436 S24 - 1.0269 U23 - 2.8108 Z23
- 1.8668 A19 - 1.5177 X05 + 2899.1 = 0
- Leave-one-out error: 1 out of 35 (97.1%
correctness)
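The two counts on this slide can be checked with a short calculation. The sketch assumes the combinatorial step considered every 6-gene subset of the 12 selected genes; the slides do not say how subsets were scored, and the gene names used here are hypothetical placeholders.

```python
import math
from itertools import combinations

# Hypothetical placeholder names for the 12 genes kept by the 1-norm SVM.
genes = [f"g{i}" for i in range(1, 13)]

# Every 6-gene subset, assuming an exhaustive search.
subsets = list(combinations(genes, 6))
n_subsets = len(subsets)
assert n_subsets == math.comb(12, 6)

# One leave-one-out error among 35 patients.
loo_correctness = (35 - 1) / 35

print(f"{n_subsets} candidate subsets; leave-one-out correctness {loo_correctness:.1%}")
# prints "924 candidate subsets; leave-one-out correctness 97.1%"
```

With only 924 subsets, an exhaustive search of this kind is cheap even if each subset requires training 35 leave-one-out classifiers.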
23 More on SVMs
- My future job
- Siemens Medical Solutions
- My web page
- www.cs.wisc.edu/gfung
- Olvi Mangasarian web page
- www.cs.wisc.edu/olvi