CISC 667 Intro to Bioinformatics Fall 2005 Support Vector Machines II

Slides: 24
Provided by: lil3

1
CISC 667 Intro to Bioinformatics (Fall 2005): Support Vector Machines (II)
  • Bioinformatics Applications

2
(No Transcript)
3
(No Transcript)
4
(No Transcript)
5
Combining pairwise similarity with SVMs for protein homology detection
6
Experiment: known protein families
(Jaakkola, Diekhans and Haussler, 1999)
7
Vectorization
8
(No Transcript)
9
A measure of sensitivity and specificity
[Figure: examples with ROC = 1, ROC = 0.67, and ROC = 0]
ROC: the receiver operating characteristic score is the normalized area under a curve that plots true positives as a function of false positives.
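
The ROC definition above can be sketched in a few lines. This is a minimal illustration, assuming the classifier output has already been turned into a ranked list of labels (best score first) with no ties; the function name `roc_score` and the example rankings are made up for this note and are not taken from the slides.

```python
def roc_score(ranked_labels):
    """Normalized area under the ROC curve for a ranked list of labels.

    ranked_labels: 1 (positive) or 0 (negative), best-scoring example first.
    Ties between scores are assumed to be already broken by the ranking.
    """
    positives = sum(ranked_labels)
    negatives = len(ranked_labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative")

    true_positives = 0  # positives seen so far while walking down the ranking
    area = 0            # unnormalized area: one column per false positive
    for label in ranked_labels:
        if label == 1:
            true_positives += 1
        else:
            # Each false positive adds a column whose height is the
            # number of true positives ranked above it.
            area += true_positives
    return area / (positives * negatives)


perfect = roc_score([1, 1, 1, 0, 0, 0])  # every positive above every negative
mixed = roc_score([1, 1, 0, 1])          # one positive ranked below a negative
worst = roc_score([0, 0, 0, 1, 1, 1])    # every negative above every positive
```

Dividing by `positives * negatives` is what makes the area "normalized": it is the area of the full unit square in count units, so the score always lies between 0 and 1.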
10
Performance Comparison (1)
11
(No Transcript)
12
Using Phylogenetic Profiles and SVMs
YAL001C
E-value   Phylogenetic profile
0.122     1
1.064     0
3.589     0
0.008     1
0.692     1
8.49      0
14.79     0
0.584     1
1.567     0
0.324     1
0.002     1
3.456     0
2.135     0
0.142     1
0.001     1
0.112     1
1.274     0
0.234     1
4.562     0
3.934     0
0.489     1
0.002     1
2.421     0
0.112     1
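
A short sketch of how the binary profile on this slide could be derived from the E-values. The cutoff of 1.0 is an assumption inferred from the numbers shown (every E-value below 1 is paired with a 1, every value above with a 0); the slide itself does not state the threshold, and `phylogenetic_profile` is a hypothetical helper name.

```python
def phylogenetic_profile(e_values, threshold=1.0):
    """Turn per-genome E-values into a 0/1 phylogenetic profile.

    threshold is an assumed cutoff: an E-value below it is read as
    "a homolog is present in that genome" (bit 1), otherwise bit 0.
    """
    return [1 if e < threshold else 0 for e in e_values]


# E-values listed for YAL001C on the slide, in order.
e_values = [
    0.122, 1.064, 3.589, 0.008, 0.692, 8.49, 14.79, 0.584,
    1.567, 0.324, 0.002, 3.456, 2.135, 0.142, 0.001, 0.112,
    1.274, 0.234, 4.562, 3.934, 0.489, 0.002, 2.421, 0.112,
]
profile = phylogenetic_profile(e_values)
```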
13
Phylogenetic profiles and evolution patterns
[Figure: tree with node labels 1 1 1 1 0 1 0 and profile x = 1 1 0 1 0 0 0 1 1 0]
It is impossible to know for sure whether the gene followed exactly this evolution pattern.
14
Tree Kernel (Vert, 2002)
  • For a phylogenetic profile x and an evolution
    pattern e:
  • $p(e)$ quantifies how natural the pattern is
  • $p(x|e)$ quantifies how likely the pattern e is the
    true history of the profile x
  • Tree kernel:
    $K_{tree}(x, y) = \sum_e p(e)\, p(x|e)\, p(y|e)$
  • Can be proved to be a kernel
  • Intuition: two profiles get closer in the feature
    space when they share common evolution
    patterns with high probability.
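
To make the shape of the sum over patterns concrete, here is a toy sketch. It is not Vert's tree model: it assumes, purely for illustration, that a pattern e is any binary vector of the same length as the profile, that p(e) is uniform, and that each profile bit agrees with the pattern independently with probability 1 - eps. The names `p_profile_given_pattern`, `toy_pattern_kernel`, and `eps` are hypothetical.

```python
from itertools import product


def p_profile_given_pattern(x, e, eps=0.1):
    """Toy p(x|e): each bit of x matches e with probability 1 - eps."""
    prob = 1.0
    for xi, ei in zip(x, e):
        prob *= (1.0 - eps) if xi == ei else eps
    return prob


def toy_pattern_kernel(x, y, eps=0.1):
    """K(x, y) = sum over e of p(e) * p(x|e) * p(y|e), by brute force.

    Assumes a uniform p(e) over all 2**n binary patterns; a real tree
    kernel would use a tree-structured model and dynamic programming
    instead of enumerating patterns.
    """
    n = len(x)
    p_e = 1.0 / 2 ** n
    total = 0.0
    for e in product((0, 1), repeat=n):
        total += (p_e
                  * p_profile_given_pattern(x, e, eps)
                  * p_profile_given_pattern(y, e, eps))
    return total


x = (1, 1, 0)
y = (1, 0, 0)
k_xx = toy_pattern_kernel(x, x)
k_xy = toy_pattern_kernel(x, y)
k_yx = toy_pattern_kernel(y, x)
```

In this toy setting the kernel value drops as the profiles disagree in more positions, which mirrors the intuition stated on the slide: profiles that are well explained by the same patterns end up closer in feature space.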

15
Tree-Encoded Profile (Narra and Liao, 2004)
[Figure: phylogenetic tree over the profile, with internal-node scores 0.55, 0.34, 0.75, 0.67, 1, 0.33, 0.5]
Post-order traversal: 1 0.33 0.67 0.34 0.5 0.75 0.55
Profile: 1 1 0 1 0 0 0 1 1
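
The numbers on this slide are consistent with scoring each internal node as the mean of its children and reading the scores off in post-order. The sketch below works under that assumption; the tree shape is inferred from the values and is not given in the transcript, and `encode` is a hypothetical helper. The slide appears to round at every node, so exact arithmetic agrees with it only to within about 0.01.

```python
def encode(node, scores):
    """Post-order walk: return the node's score, appending internal scores.

    A leaf is a number (a profile bit); an internal node is a list of
    children. Assumption: an internal node's score is the mean of its
    children's scores.
    """
    if isinstance(node, (int, float)):
        return float(node)
    child_scores = [encode(child, scores) for child in node]
    score = sum(child_scores) / len(child_scores)
    scores.append(score)  # appended after all children: post-order
    return score


# Tree shape inferred from the slide's numbers (an assumption).
# Leaves, left to right, spell out the profile 1 1 0 1 0 0 0 1 1.
tree = [[[[1, 1], [0, 1, 0]], 0], [[0, 1], 1]]

internal_scores = []
encode(tree, internal_scores)
rounded = [round(s, 2) for s in internal_scores]
```

Narra and Liao combine the internal-node scores with the original profile to form a longer feature vector for the SVM; how the two parts are ordered in that vector is not recoverable from the transcript.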
16
(No Transcript)
17
Using Support Vector Machines
18
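Kernel function (formula not in transcript), where r = 0.10
Soft margin regularization: C = 1.50
$L(\alpha) = \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \left( K(x_i, x_j) + \delta_{ij}/C \right)$
Coding scheme: BIN21
Evaluation:
  • $Q_3 = (P_1 + P_2 + P_3)/N$
  • $C = (TP \times TN - FP \times FN) / \sqrt{PP \times PN \times AP \times AN}$
  • SOV: segment overlap accuracy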
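
A small sketch of the two closed-form evaluation measures on this slide. It assumes the usual reading of the abbreviations, which the transcript does not spell out: P1, P2, P3 are the numbers of correctly predicted residues in each of the three classes, N is the total number of residues, and PP, PN, AP, AN are the predicted positives, predicted negatives, actual positives, and actual negatives. The function names and the example counts are made up.

```python
from math import sqrt


def q3(correct_per_class, total_residues):
    """Three-state accuracy: correctly predicted residues over all residues."""
    return sum(correct_per_class) / total_residues


def matthews_c(tp, tn, fp, fn):
    """Correlation coefficient C for one class, from confusion counts."""
    pp = tp + fp  # predicted positives (assumed meaning of PP)
    pn = tn + fn  # predicted negatives (assumed meaning of PN)
    ap = tp + fn  # actual positives (assumed meaning of AP)
    an = tn + fp  # actual negatives (assumed meaning of AN)
    denom = sqrt(pp * pn * ap * an)
    if denom == 0:
        return 0.0  # convention chosen here for the degenerate case
    return (tp * tn - fp * fn) / denom


accuracy = q3([30, 20, 40], 100)         # 90 of 100 residues correct
c_perfect = matthews_c(50, 50, 0, 0)     # all predictions right
c_inverted = matthews_c(0, 0, 50, 50)    # all predictions wrong
```

SOV is left out because it depends on segment boundaries rather than per-residue counts, and its definition is not given in the transcript.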
19
Design tertiary classifiers
20
(No Transcript)
21
Nguyen and Rajapakse, Genome Informatics 14: 218-227 (2003)
22
A two-stage SVM
23
(No Transcript)