Machine Learning and Data Mining: A Math Programming-Based Approach

Slides: 24
Provided by: olvilman9
1
Machine Learning and Data Mining: A Math Programming-Based Approach
  • Glenn Fung

CS412, April 10, 2003, Madison, Wisconsin
2
What is a Support Vector Machine?
  • An optimally defined surface
  • Typically nonlinear in the input space
  • Linear in a higher dimensional space
  • Implicitly defined by a kernel function
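
The last two points can be illustrated with a small sketch. The degree-2 polynomial kernel and the explicit feature map below are standard textbook choices picked for illustration; they are not taken from the slides. The kernel value computed in the 2-D input space equals an ordinary inner product in a 3-D feature space, which is why a surface that is linear in the higher dimensional space can be nonlinear in the input space.

```python
import math

def poly2_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel: (x . y)^2."""
    return sum(a * b for a, b in zip(x, y)) ** 2

def poly2_features(x):
    """Explicit 3-D feature map for 2-D input matching poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

# Two illustrative points (made-up values).
x, y = (1.0, 2.0), (3.0, -1.0)

# Kernel evaluated in the input space ...
kernel_value = poly2_kernel(x, y)
# ... agrees with the inner product in the higher dimensional space.
explicit_value = dot(poly2_features(x), poly2_features(y))
```

The kernel never builds the feature vectors, so the higher dimensional space stays implicit.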

3
What are Support Vector Machines Used For?
  • Classification
  • Regression and Data Fitting
  • Supervised and Unsupervised Learning

(Will concentrate on classification)
4
Geometry of the Classification Problem: 2-Category
Linearly Separable Case
[Figure: the two point sets A+ and A-]
5
Support Vector Machines: Maximizing the Margin
between Bounding Planes
[Figure: the two point sets A+ and A- with bounding planes]
6
Algebra of the Classification Problem: 2-Category
Linearly Separable Case
  • Given m points in n-dimensional space
  • Represented by an m-by-n matrix A
  • More succinctly
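
The formula that followed this bullet did not survive in the transcript. As an assumption based on the usual notation for this problem, let D be an m-by-m diagonal matrix with entries +1 or -1 giving the class of each row A_i of A, let e be a vector of ones, and let w and γ be the normal and offset of the separating plane. Under those assumed symbols, the two bounding-plane conditions collapse into one inequality:

```latex
A_i w \ge \gamma + 1 \ \text{ for } D_{ii} = +1,
\qquad
A_i w \le \gamma - 1 \ \text{ for } D_{ii} = -1
\quad\Longleftrightarrow\quad
D(Aw - e\gamma) \ge e .
```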

7
Support Vector Machines: Quadratic Programming
Formulation
  • Solve the following quadratic program
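
The quadratic program itself is missing from the transcript. A standard soft-margin form is sketched here as an assumption about what the slide showed, not a recovery of it. The symbols are assumed: A is the m-by-n data matrix, D a diagonal matrix of +1 or -1 class labels, e a vector of ones, y a nonnegative slack vector, and ν > 0 a weight on the slack.

```latex
\min_{w,\,\gamma,\,y} \;\; \nu\, e^{\top} y + \tfrac{1}{2}\, w^{\top} w
\quad \text{s.t.} \quad
D(Aw - e\gamma) + y \ge e, \qquad y \ge 0 .
```

Minimizing w'w maximizes the distance 2/||w|| between the bounding planes x'w = γ + 1 and x'w = γ - 1, while the slack term penalizes points on the wrong side of their bounding plane.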

8
(No Transcript)
9
Checkerboard Polynomial Kernel Classifier: Best
Previous Result (Kaufman 1998)
10
(No Transcript)
11
Gaussian Kernel PSVM Classifier: Spiral Dataset,
94 Red Dots and 94 White Dots
12
The Federalist Papers
  • Written in 1787-1788 by Alexander Hamilton, John
    Jay and James Madison to persuade the citizens of
    New York to ratify the Constitution.
  • The papers are short essays, 900 to 3500
    words in length.
  • The authorship of 12 of those papers has been in
    dispute (Madison or Hamilton). These papers are
    referred to as the disputed Federalist papers.

13
Previous Work
  • Mosteller and Wallace (1964): using statistical
    inference, determined the authorship of the 12
    disputed papers.
  • Bosch and Smith (1998): using linear programming
    techniques and the evaluation of every possible
    combination of one, two and three features,
    obtained a separating hyperplane using only three
    words.

14
Description of the data
  • For every paper:
  • Machine-readable text was created using a
    scanner.
  • Relative frequencies were computed for 70 words
    that Mosteller and Wallace identified as good
    candidates for author-attribution studies.
  • Each document is represented as a vector
    containing the 70 real numbers corresponding to
    the 70 word frequencies.
  • The dataset consists of 118 papers: 50 Madison
    papers, 56 Hamilton papers and 12 disputed
    papers.
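
The mapping from a paper to its frequency vector can be sketched as follows. This is a minimal illustration under stated assumptions: the tokenizer, the scaling to occurrences per 1000 words, and the helper name `relative_frequencies` are hypothetical, and a three-word list stands in for the 70 Mosteller-Wallace words, which the slides do not enumerate.

```python
import re

def relative_frequencies(text, function_words, per=1000.0):
    """Return one number per function word: occurrences per `per` tokens.

    The per-1000 scaling is an assumption, not stated on the slide.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens)
    if total == 0:
        return [0.0] * len(function_words)
    return [per * tokens.count(word) / total for word in function_words]

# Stand-in word list (the real study used 70 words).
words = ["to", "upon", "would"]

# Made-up sample sentence, 13 tokens long.
sample = "It would be wise to rely upon reason and to act upon it."

# One entry per word, in the same order as `words`.
vector = relative_frequencies(sample, words)
```

With the full word list, each paper would become a 70-entry vector and the whole dataset a 118-by-70 matrix.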

15
Function Words Based on Relative Frequencies
16
SLA Feature Selection for Classifying the
Disputed Federalist Papers
  • Apply the successive linearization algorithm (SLA) to:
  • Train on the 106 Federalist papers with known
    authors
  • Find a classification hyperplane that uses as few
    words as possible
  • Use the hyperplane to classify the 12 disputed
    papers
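
The slides do not spell out the optimization problem behind the SLA. The sketch below shows one common way such a step is set up for feature selection, given as an assumption and not as the exact formulation used in the talk. The count of nonzero weights is replaced by a smooth concave surrogate with parameter α > 0, v bounds |w| componentwise, ε is the base of the natural logarithm, λ in (0, 1) trades error against sparsity, and D, A, e, y are the label matrix, data matrix, vector of ones and slack vector.

```latex
% Concave surrogate problem (assumed form)
\min_{w,\gamma,y,v} \; (1-\lambda)\, e^{\top} y
  + \lambda\, e^{\top}\!\left(e - \varepsilon^{-\alpha v}\right)
\quad \text{s.t.} \quad
D(Aw - e\gamma) + y \ge e, \;\; y \ge 0, \;\; -v \le w \le v .

% One SLA step: linearize the concave term at the current iterate v^{i}
% and solve the resulting linear program over the same constraints.
\min_{w,\gamma,y,v} \; (1-\lambda)\, e^{\top} y
  + \lambda\, \alpha \left(\varepsilon^{-\alpha v^{i}}\right)^{\!\top} (v - v^{i}) .
```

Each step is a linear program, which fits the later remark that only a handful of linear programs had to be solved.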

17
Hyperplane Classifier Using 3 Words
  • A hyperplane depending on three words was found
  • 0.5368 to + 24.6634 upon + 2.9532 would = 66.6159
  • All disputed papers ended up on the Madison side
    of the plane
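
Evaluating the three-word plane on a paper is a single weighted sum. The sketch below assumes the plus and equals signs that the transcript dropped, and it uses made-up frequencies for a hypothetical paper. The slide does not say which side of the plane belongs to which author, so the code reports only the sign of the decision value.

```python
# Coefficients as read from the slide; the "+" and "=" signs are assumed.
WEIGHTS = {"to": 0.5368, "upon": 24.6634, "would": 2.9532}
THRESHOLD = 66.6159

def decision_value(freqs):
    """Signed offset of a paper from the plane; zero means on the plane."""
    return sum(WEIGHTS[word] * freqs[word] for word in WEIGHTS) - THRESHOLD

def side(freqs):
    """Name the side of the plane without attributing it to an author."""
    return "below" if decision_value(freqs) < 0 else "on or above"

# Hypothetical word frequencies, not taken from any real paper.
hypothetical_paper = {"to": 35.0, "upon": 0.2, "would": 8.0}

value = decision_value(hypothetical_paper)
which_side = side(hypothetical_paper)
```

Because "upon" carries by far the largest weight, small changes in its frequency move a paper across the plane much faster than changes in the other two words.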

18
Results: 3D plot of resulting hyperplane
19
Comparison with Previous Work and Conclusion
  • Bosch and Smith (1998) calculated all the
    possible sets of one, two and three words to find
    a separating hyperplane. They solved 118,895
    linear programs.
  • Our SLA algorithm for feature selection required
    the solution of only 6 linear programs.
  • Our classification of the disputed Federalist
    papers agrees with that of Mosteller-Wallace and
    Bosch-Smith.

20
Breast Cancer Diagnosis Application: 97% Tenfold
Cross Validation Correctness
780 Samples: 494 Benign, 286 Malignant
21
Detection of Alternative RNA Isoforms via
DATAS (Levels of mRNA that Correlate with
Sensitivity to Chemotherapy)
22
Breast Cancer Treatment Response: Joint with
ExonHit (French BioTech)
http://www.exonhit.com/html/company/index.htm
  • 35 patients treated by a drug cocktail
  • 9 partial responders, 26 nonresponders
  • 25 gene expression measurements made on each
    patient
  • 1-Norm SVM classifier selected 12 out of 25
    genes
  • Combinatorially selected 6 genes out of 12
  • Separating plane obtained:
  • 2.7915 T11 + 0.13436 S24 - 1.0269 U23 - 2.8108 Z23
    - 1.8668 A19 - 1.5177 X05 + 2899.1 = 0
  • Leave-one-out error: 1 out of 35 (97.1%
    correctness)
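
The leave-one-out figure can be illustrated with a short sketch. Each sample is held out in turn, a classifier is fit on the rest, and the held-out sample is scored. A nearest-class-mean rule stands in for the 1-norm SVM here purely to keep the example self-contained; the toy data and function names are hypothetical and unrelated to the patient measurements.

```python
def nearest_mean_predict(train_x, train_y, x):
    """Stand-in classifier: assign x to the class with the closest mean."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        mean = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, mean))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def leave_one_out_errors(xs, ys, predict):
    """Count held-out samples that are misclassified."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) != ys[i]:
            errors += 1
    return errors

# Toy one-dimensional data with two well-separated classes.
xs = [[0.0], [0.2], [0.4], [2.0], [2.2], [2.4]]
ys = [-1, -1, -1, 1, 1, 1]

errors = leave_one_out_errors(xs, ys, nearest_mean_predict)
correctness = 100.0 * (len(xs) - errors) / len(xs)

# The slide's figure: 1 error in 35 held-out patients.
slide_correctness = 100.0 * (35 - 1) / 35
```

The last line reproduces the arithmetic behind the slide: 34 correct out of 35 is about 97.1%.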

23
More on SVMs
  • My future job: Siemens Medical Solutions
  • My web page: www.cs.wisc.edu/gfung
  • Olvi Mangasarian's web page: www.cs.wisc.edu/olvi