A Comparative Study of Kernel Methods for Classification Applications

1
A Comparative Study of Kernel Methods for
Classification Applications
  • Yan Liu
  • Sep 23, 2003

2
Introduction
  • Support Vector Machines
    • Text classification
    • Protein classification
  • Various kernels
    • Standard kernels
      • Linear kernels, polynomial kernels, RBF kernels
    • Other application-oriented kernels
      • Fisher kernels, string kernels, etc.
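
As a reminder of what the three standard kernels compute, here is a minimal sketch in plain Python. The function names and the default values of `degree`, `coef0`, and `gamma` are illustrative assumptions, not settings taken from the slides.

```python
import math

def linear_kernel(x, y):
    """k(x, y) = <x, y>"""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """k(x, y) = (<x, y> + coef0) ** degree (defaults are assumptions)"""
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, gamma=0.5):
    """k(x, y) = exp(-gamma * ||x - y||^2) (gamma default is an assumption)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Toy vectors, chosen only for illustration.
x, y = [1.0, 2.0], [3.0, 0.5]
print(linear_kernel(x, y))      # 4.0
print(polynomial_kernel(x, y))  # 25.0
print(rbf_kernel(x, x))         # 1.0
```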

3
Problem Definition
  • There has been little study of how different kernels
    behave on:
    • Rare-class problems (unbalanced data)
    • Noisy data
    • Multi-label problems
  • These problems are common in real applications:
    • Text classification
    • Protein family classification

4
Text Classification
  • Kernel selection
    • Linear kernels
    • String kernels
  • Problem focus
    • Rare-class problem
    • Multi-class problem
  • Dataset
    • Reuters-21578 dataset

5
Protein Family Classification
  • Kernel selection
    • Linear kernels
    • String kernels
    • Fisher kernels
  • Problem focus
    • Rare-class problem
    • Noisy data problem
  • Dataset
    • GPCR classification dataset
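
The slides do not say which string kernel is meant. As one simple member of that family, a k-spectrum kernel counts the length-k substrings two sequences share. The sketch below is an assumption-laden illustration: the function names, the choice of k, and the toy sequences are hypothetical and are not drawn from the GPCR data.

```python
from collections import Counter

def spectrum_features(seq, k=3):
    """Count every contiguous substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def spectrum_kernel(s, t, k=3):
    """Inner product of the two k-mer count vectors."""
    fs, ft = spectrum_features(s, k), spectrum_features(t, k)
    # Counter returns 0 for k-mers that never occur in t.
    return sum(count * ft[kmer] for kmer, count in fs.items())

# Toy sequences, for illustration only.
print(spectrum_kernel("ABAB", "BABA", k=2))  # 4
```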

6
Methodology and Schedule
  • Propose conjectures about the kernels' likely behaviors,
    based on analysis
    • Sep 12th to Sep 28th
  • Work on synthetic datasets to test the hypotheses
    • Sep 28th to Oct 20th
  • Map from synthetic data to real application data
    • Oct 20th to Sep 18th
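
The slides do not describe how the synthetic datasets are built. One possible setup, sketched here under stated assumptions, draws two 2-D Gaussian classes, makes the positive class rare, and flips a fraction of the labels to simulate noise. The function name, the parameters `rare_fraction` and `noise_rate`, and the class centers are all hypothetical.

```python
import random

def make_synthetic(n=1000, rare_fraction=0.05, noise_rate=0.1, seed=0):
    """Return a shuffled list of (point, true_label, observed_label).

    The positive class (+1) is the rare class; observed labels are
    flipped with probability noise_rate to simulate label noise.
    """
    rng = random.Random(seed)
    n_rare = int(round(n * rare_fraction))
    data = []
    for i in range(n):
        true_label = 1 if i < n_rare else -1
        center = 2.0 if true_label == 1 else -2.0  # assumed class centers
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        observed = -true_label if rng.random() < noise_rate else true_label
        data.append((point, true_label, observed))
    rng.shuffle(data)
    return data

data = make_synthetic()
n_rare = sum(1 for _, true, _ in data if true == 1)
n_flipped = sum(1 for _, true, obs in data if true != obs)
print(len(data), n_rare, n_flipped)
```

Varying `rare_fraction` and `noise_rate` independently would let each conjecture be checked against one factor at a time before moving to the real datasets.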

7
Mid-course Deliverables
  • Analysis of the dataset
    • Class distribution (rare-class and multi-class)
    • Noise level
  • Conjectures for possible behaviors
  • Results on synthetic datasets
  • Explanation and interesting observations from the
    results

8
Multi-label Problem for Text Classification
  • Related work
  • Binary classification (one-vs-all) (by Yang; Joachims)
  • Mixture model trained by EM (by McCallum)
  • Rank-based approach
  • Boosting (by Schapire & Singer)
  • Rank-based kernels (by Elisseeff & Weston)
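
The one-vs-all approach listed above reduces a multi-label problem to one binary problem per category. A minimal sketch of that decomposition follows. The function name is hypothetical, and the three documents are toy examples that merely borrow Reuters-style category names.

```python
def one_vs_all_targets(label_sets, categories=None):
    """Turn per-document label sets into one +1/-1 target list per category."""
    if categories is None:
        categories = sorted(set().union(*label_sets))
    return {
        c: [1 if c in labels else -1 for labels in label_sets]
        for c in categories
    }

# Toy multi-label data: each document may carry several categories.
label_sets = [{"grain", "wheat"}, {"earn"}, {"grain"}]
targets = one_vs_all_targets(label_sets)
print(targets["grain"])  # [1, -1, 1]
print(targets["wheat"])  # [1, -1, -1]
```

Each target list would then train its own binary classifier, and a document receives every category whose classifier says +1.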

9
Multi-label Problem for Text Classification
  • Possible Solutions
  • Combine the mixture model and the kernel-based approach
    using Fisher kernels
  • Similar in idea to using an HMM and an SVM together for
    protein classification
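
For reference, the Fisher kernel is usually defined from a generative model as shown below. In the proposal above, the mixture model (or the HMM in the protein case) would presumably play the role of P(x | θ). The symbols U_x, I, and θ are the standard notation, not notation from the slides.

```latex
U_x = \nabla_\theta \log P(x \mid \theta), \qquad
I = \mathbb{E}_x\!\left[ U_x U_x^{\top} \right], \qquad
K(x, x') = U_x^{\top} I^{-1} U_{x'}
```

Here U_x is the Fisher score of example x and I is the Fisher information matrix. In practice I is often approximated by the identity, which reduces the kernel to an inner product of Fisher scores.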