Title: Classification and Feature Selection Algorithms for Multi-class CGH data
1. Classification and Feature Selection Algorithms for Multi-class CGH data
- Jun Liu, Sanjay Ranka, Tamer Kahveci
- http://www.cise.ufl.edu
2. Gene copy number
- The number of copies of a gene can vary from person to person.
- About 0.4% of gene copy numbers differ between pairs of people.
- Variations in copy number can alter resistance to disease.
- EGFR copy number can be higher than normal in non-small cell lung cancer.
[Figure: lung images (ALA), cancer vs. healthy]
3. Comparative Genomic Hybridization (CGH)
4. Raw and smoothed CGH data
5. Example CGH dataset
- 862 genomic intervals in the Progenetix database
6. Problem description
- Given a new sample, which class does it belong to?
- Which features should we use to make this decision?
7. Outline
- Support Vector Machine (SVM)
- SVM for CGH data
- Maximum Influence Feature Selection algorithm
- Results
8. SVM in a nutshell
9. Classification with SVM
- Consider a two-class, linearly separable classification problem.
- Many decision boundaries are possible!
- The decision boundary should be as far away from the data of both classes as possible.
- We should maximize the margin, m.
[Figure: two classes separated by a hyperplane with margin m]
10. SVM Formulation
- Let $x_1, \ldots, x_n$ be our data set and let $y_i \in \{1, -1\}$ be the class label of $x_i$.
- Maximize $J$ over the $\alpha_i$:
  $J(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, x_i^T x_j$
  subject to $\alpha_i \ge 0$ and $\sum_{i=1}^{n} \alpha_i y_i = 0$.
- The decision boundary can be constructed as
  $f(x) = \mathrm{sign}\!\left( \sum_{i=1}^{n} \alpha_i y_i \, x_i^T x + b \right)$.
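For concreteness, a minimal sketch of this formulation in practice, using scikit-learn's SVC solver on toy data (an illustrative assumption; the slides do not name an implementation):

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable 2-D data.
X = np.array([[0.0, 0.0], [0.0, 1.0], [2.0, 2.0], [2.0, 3.0]])
y = np.array([-1, -1, 1, 1])

clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.support_)               # indices of support vectors (alpha_i > 0)
print(clf.dual_coef_)             # alpha_i * y_i for the support vectors
print(clf.coef_, clf.intercept_)  # w and b of the decision boundary
print(clf.predict([[2.0, 2.5]]))  # -> [1]
```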
11. SVM for CGH data
12. Pairwise similarity measures
- Raw measure: count the number of genomic intervals at which both samples have a gain (or both have a loss).
[Example figure: two samples with Raw(X, Y) = 3]
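A minimal sketch of the Raw measure, assuming each sample is a vector over {1, 0, -1} for gain/no-change/loss; the example vectors are borrowed from the proof slide later in the deck:

```python
def raw_similarity(x, y):
    """Count intervals where both samples show a gain (1) or both a loss (-1)."""
    return sum(1 for a, b in zip(x, y) if a == b and a != 0)

X = [0, 1, 1, 0, 1, -1]   # example vectors from the proof slide
Y = [0, 1, 0, -1, -1, -1]
print(raw_similarity(X, Y))  # -> 2 (one shared gain, one shared loss)
```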
13. SVM based on Raw kernel
- Using SVM with the Raw kernel amounts to solving the following quadratic program. Maximize $J$ over the $\alpha_i$:
  $J(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, \mathrm{Raw}(x_i, x_j)$
  (the Raw kernel replaces the inner product $x_i^T x_j$).
- The resulting decision function is
  $f(x) = \mathrm{sign}\!\left( \sum_{i=1}^{n} \alpha_i y_i \, \mathrm{Raw}(x_i, x) + b \right)$
  (again, the Raw kernel replaces $x_i^T x$).
- Is this cool?
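Because Raw enters only through the kernel, it can be plugged into an off-the-shelf SVM as a precomputed Gram matrix. A sketch using scikit-learn's SVC(kernel="precomputed") (an illustrative choice, not the authors' implementation):

```python
import numpy as np
from sklearn.svm import SVC

def raw_kernel(A, B):
    """Gram matrix K[i, j] = Raw(A[i], B[j]) for samples over {1, 0, -1}."""
    gains = (A == 1).astype(float) @ (B == 1).astype(float).T
    losses = (A == -1).astype(float) @ (B == -1).astype(float).T
    return gains + losses

X_train = np.array([[1, 1, 0, -1], [0, 1, 1, 0], [-1, -1, 0, 1], [-1, 0, -1, 1]])
y_train = np.array([0, 0, 1, 1])

clf = SVC(kernel="precomputed").fit(raw_kernel(X_train, X_train), y_train)

# At prediction time, the kernel is evaluated against the training samples.
X_test = np.array([[1, 1, 0, 0]])
print(clf.predict(raw_kernel(X_test, X_train)))  # -> [0]
```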
14. Is Raw kernel valid?
- Not every similarity function can serve as a kernel. A valid kernel requires the underlying kernel matrix M to be positive semi-definite.
- M is positive semi-definite if $v^T M v \ge 0$ for all vectors v.
15. Is Raw kernel valid?
- Proof: define a mapping $F: a \in \{1, 0, -1\}^m \to b \in \{0, 1\}^{2m}$ that encodes each interval's status with two bits:
  - F(gain) = F(1) = 01
  - F(no-change) = F(0) = 00
  - F(loss) = F(-1) = 10
- Then $\mathrm{Raw}(X, Y) = F(X)^T F(Y)$.
- Example: X = (0, 1, 1, 0, 1, -1) and Y = (0, 1, 0, -1, -1, -1) give
  Raw(X, Y) = 2 and $F(X)^T F(Y) = 2$.
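A quick sketch that checks the identity $\mathrm{Raw}(X, Y) = F(X)^T F(Y)$ on the example above:

```python
ENCODE = {1: (0, 1), 0: (0, 0), -1: (1, 0)}  # gain -> 01, no-change -> 00, loss -> 10

def F(x):
    """Expand a {1, 0, -1} vector of length m into a {0, 1} vector of length 2m."""
    return [bit for status in x for bit in ENCODE[status]]

X = [0, 1, 1, 0, 1, -1]
Y = [0, 1, 0, -1, -1, -1]
print(sum(a * b for a, b in zip(F(X), F(Y))))  # -> 2, matching Raw(X, Y)
```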
16. Raw kernel is valid!
- The Raw kernel can be written as $\mathrm{Raw}(X, Y) = F(X)^T F(Y)$.
- Define the 2m-by-n matrix $P = [F(x_1), \ldots, F(x_n)]$ whose columns are the mapped samples.
- Let M denote the kernel matrix of Raw. Therefore $M = P^T P$, and for any vector v, $v^T M v = v^T P^T P v = \|Pv\|^2 \ge 0$, so M is positive semi-definite.
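A small numerical sanity check of this argument: build P for a few random CGH vectors and confirm that $M = P^T P$ is positive semi-definite:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.integers(-1, 2, size=(5, 8))  # 5 samples over 8 intervals, values in {-1, 0, 1}

# Columns of P are the F(x_i). Stacking all loss bits above all gain bits
# only reorders the coordinates of F, which leaves inner products unchanged.
P = np.vstack([(X == -1).T, (X == 1).T]).astype(float)  # shape (2m, n)
M = P.T @ P                                             # Raw kernel matrix

print(np.allclose(M, M.T))                   # True: M is symmetric
print(np.linalg.eigvalsh(M).min() >= -1e-9)  # True: all eigenvalues >= 0
```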
17. MIFS algorithm
18. MIFS for multi-class data
- One-versus-all SVM (see the sketch below).
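A minimal sketch of the one-versus-all decomposition that MIFS builds on (not the authors' MIFS implementation); MIFS then ranks features by their influence on each of these binary classifiers and combines the per-class rankings, per the selection rule given in the paper:

```python
import numpy as np
from sklearn.svm import SVC

def one_versus_all_svms(X, y):
    """Train one binary SVM per class, separating that class from the rest."""
    models = {}
    for c in np.unique(y):
        y_bin = np.where(y == c, 1, -1)  # current class vs. all others
        models[c] = SVC(kernel="linear").fit(X, y_bin)
    return models
```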
19. Results
20. Dataset details
- Data taken from the Progenetix database.
21. Datasets
Dataset size (number of samples) at each similarity level:

Cancers | Best | Good | Fair | Poor
2       | 478  | 466  | 351  | 373
4       | 1160 | 790  | 800  | 800
6       | 1100 | 850  | 880  | 810
8       | 1000 | 830  | 750  | 760
22. Experimental results
- Comparison of linear and Raw kernels: on average, the Raw kernel improves predictive accuracy by 6.4% over sixteen datasets compared to the linear kernel.
23. Experimental results
[Figure: classification accuracy vs. number of features, including comparisons to Fu and Fu-Liu (2005) and Ding and Peng (2005)]
- Using 80 features results in accuracy that is comparable to or better than using all features.
- Using 40 features results in accuracy that is comparable to using all features.
24. Using MIFS for feature selection
- Results testing the hypothesis that 40 features are enough and 80 features are better.
25. A Web Server for Mining CGH Data
- http://cghmine.cise.ufl.edu:8007/CGH/Default.html
26. Thank you
27. Appendix
28. Minimum Redundancy and Maximum Relevance (MRMR)
- Relevance V is defined as the average mutual information between features and class labels.
- Redundancy W is defined as the average mutual information between all pairs of features.
- Incrementally select features by maximizing (V / W) or (V - W), as formalized below.
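In symbols, following Ding and Peng (2005), for a selected feature set S and class variable c:

```latex
V = \frac{1}{|S|} \sum_{i \in S} I(x_i;\, c)
\qquad
W = \frac{1}{|S|^2} \sum_{i,\, j \in S} I(x_i;\, x_j)
```

Features are then added one at a time, choosing the candidate that maximizes V - W (difference form) or V / W (quotient form).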
29. Support Vector Machine Recursive Feature Elimination (SVM-RFE)
1. Train a linear SVM based on the current feature set.
2. Compute the weight vector w.
3. Compute the ranking coefficient $w_i^2$ for the i-th feature.
4. Remove the feature with the smallest ranking coefficient.
5. If the feature set is not empty, return to step 1.
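A compact sketch of this loop for a two-class problem, assuming scikit-learn's linear SVC as the underlying classifier (any linear SVM trainer would do):

```python
import numpy as np
from sklearn.svm import SVC

def svm_rfe(X, y):
    """Return feature indices in elimination order (least important first)."""
    remaining = list(range(X.shape[1]))
    eliminated = []
    while remaining:
        svm = SVC(kernel="linear").fit(X[:, remaining], y)
        w = svm.coef_[0]                # weight vector of the linear SVM
        worst = int(np.argmin(w ** 2))  # smallest ranking coefficient w_i^2
        eliminated.append(remaining.pop(worst))
    return eliminated
```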
30. Pairwise similarity measures
- Sim measure: a segment is a contiguous block of aberrations of the same type. Count the number of overlapping segment pairs (see the sketch below).
[Example figure: two samples with Sim(X, Y) = 2]
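A minimal sketch of the Sim measure under the same {1, 0, -1} encoding; on the example vectors from the proof slide it returns 2, matching the slide:

```python
def segments(x):
    """Return (start, end, type) for maximal runs of the same aberration type."""
    runs, start = [], None
    for i, v in enumerate(x + [0]):  # sentinel 0 closes the last run
        if start is not None and v != x[start]:
            runs.append((start, i - 1, x[start]))
            start = None
        if start is None and v != 0:
            start = i
    return runs

def sim(x, y):
    """Count pairs of segments (one per sample) of the same type that overlap."""
    return sum(1 for (a, b, t) in segments(x)
                 for (c, d, u) in segments(y)
                 if t == u and a <= d and c <= b)

print(sim([0, 1, 1, 0, 1, -1], [0, 1, 0, -1, -1, -1]))  # -> 2
```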
31. Non-linear decision boundary
- How do we generalize SVM when the two-class classification problem is not linearly separable?
- Key idea: transform $x_i$ to a higher-dimensional space to make life easier.
  - Input space: the space where the points $x_i$ are located.
  - Feature space: the space of $\phi(x_i)$ after transformation.
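A short illustration of why the transformation helps, using the classic XOR pattern (toy data chosen for illustration): a linear SVM cannot separate it in input space, while an RBF kernel, which implicitly maps to a higher-dimensional feature space, can:

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, 1, 1, -1])  # XOR labels: not linearly separable

print(SVC(kernel="linear").fit(X, y).score(X, y))          # < 1.0: no linear split exists
print(SVC(kernel="rbf", gamma=2.0).fit(X, y).score(X, y))  # 1.0: separable in feature space
```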