Title: CISC 667 Intro to Bioinformatics Fall 2005 Support Vector Machines II
1CISC 667 Intro to Bioinformatics (Fall 2005) Support Vector Machines (II)
- Bioinformatics Applications
5Combining pairwise similarity with SVMs for
protein homology detection
6Experiment: known protein families
Jaakkola, Diekhans and Haussler 1999
7Vectorization
9A measure of sensitivity and specificity
[Figure: examples with ROC = 1, ROC = 0.67 and ROC = 0]
The ROC (receiver operating characteristic) score is the normalized area under a curve that plots true positives as a function of false positives.
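As a rough illustration of the definition above, the sketch below computes a ROC score from a list of labels sorted by decreasing classifier score (1 = true positive, 0 = false positive). The function name `roc_score` and the example lists are hypothetical; they are not the rankings drawn on the slide.

```python
def roc_score(ranked_labels):
    """Normalized area under the true-positives vs. false-positives curve.

    ranked_labels: 0/1 labels ordered from highest to lowest classifier
    score (ties are assumed to be already broken by the ordering).
    """
    n_pos = sum(ranked_labels)
    n_neg = len(ranked_labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative")

    area = 0
    true_positives = 0
    for label in ranked_labels:
        if label == 1:
            true_positives += 1      # curve moves up
        else:
            area += true_positives   # curve moves right: add a column
    return area / (n_pos * n_neg)    # normalize to [0, 1]


print(roc_score([1, 1, 0, 0]))  # 1.0 (all positives ranked first)
print(roc_score([0, 0, 1, 1]))  # 0.0 (all positives ranked last)
print(round(roc_score([1, 1, 0, 1]), 2))  # 0.67
```

A score of 1 means every positive is ranked above every negative, and a score of 0 means the reverse.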
10Performance Comparison (1)
12Using Phylogenetic Profiles and SVMs
YAL001C
E-value   Phylogenetic profile
0.122     1
1.064     0
3.589     0
0.008     1
0.692     1
8.49      0
14.79     0
0.584     1
1.567     0
0.324     1
0.002     1
3.456     0
2.135     0
0.142     1
0.001     1
0.112     1
1.274     0
0.234     1
4.562     0
3.934     0
0.489     1
0.002     1
2.421     0
0.112     1
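The mapping from E-values to profile bits can be sketched as a simple threshold. The cutoff of 1.0 used here is an assumption: the slide does not state it, but it is consistent with every E-value/bit pair listed for YAL001C. The names `phylogenetic_profile`, `E_VALUES` and `SLIDE_BITS` are hypothetical.

```python
# E-values and bits for YAL001C as listed on the slide.
E_VALUES = [0.122, 1.064, 3.589, 0.008, 0.692, 8.49, 14.79, 0.584,
            1.567, 0.324, 0.002, 3.456, 2.135, 0.142, 0.001, 0.112,
            1.274, 0.234, 4.562, 3.934, 0.489, 0.002, 2.421, 0.112]
SLIDE_BITS = [1, 0, 0, 1, 1, 0, 0, 1,
              0, 1, 1, 0, 0, 1, 1, 1,
              0, 1, 0, 0, 1, 1, 0, 1]


def phylogenetic_profile(e_values, threshold=1.0):
    """Return 1 where the E-value is below the (assumed) cutoff, else 0."""
    return [1 if e < threshold else 0 for e in e_values]


profile = phylogenetic_profile(E_VALUES)
print(profile == SLIDE_BITS)  # True
```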
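13Phylogenetic profiles and evolution patterns
[Figure: phylogenetic tree annotated with a candidate evolution pattern (node labels 1 1 1 1 0 1 0) above the profile x (1 1 0 1 0 0 0 1 1 0)]
Impossible to know for sure if the gene followed exactly this evolution pattern.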
14Tree Kernel (Vert, 2002)
- For a phylogenetic profile x and an evolution pattern e:
  - P(e) quantifies how natural the pattern is
  - P(x|e) quantifies how likely the pattern e is the true history of the profile x
- Tree Kernel: $K_{\text{tree}}(x, y) = \sum_e P(e)\, P(x \mid e)\, P(y \mid e)$
- Can be proved to be a kernel
- Intuition: two profiles get closer in the feature space when they share common evolution patterns with high probability.
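The sum over evolution patterns can be illustrated by brute force on a toy tree. Everything specific below is an assumption made for the example and is not Vert's actual model: the tree has a root with two internal children, each with two leaves; a pattern e assigns 0/1 to the three internal nodes; and a single hypothetical parameter `STAY` gives the probability that a child keeps its parent's state.

```python
from itertools import product

STAY = 0.9  # assumed probability that a child has the same state as its parent


def transition(child, parent):
    return STAY if child == parent else 1.0 - STAY


def pattern_prior(e):
    """P(e): uniform root state, then one transition per internal child."""
    root, left, right = e
    return 0.5 * transition(left, root) * transition(right, root)


def profile_likelihood(x, e):
    """P(x|e): leaves 0,1 hang under `left`, leaves 2,3 under `right`."""
    _, left, right = e
    return (transition(x[0], left) * transition(x[1], left)
            * transition(x[2], right) * transition(x[3], right))


def tree_kernel(x, y):
    """K_tree(x, y) = sum over e of P(e) P(x|e) P(y|e), by enumeration."""
    return sum(pattern_prior(e)
               * profile_likelihood(x, e)
               * profile_likelihood(y, e)
               for e in product((0, 1), repeat=3))


x = (1, 1, 0, 0)  # consistent with one gain or loss high in the tree
z = (1, 0, 1, 0)  # needs a change in both subtrees
print(tree_kernel(x, x) > tree_kernel(x, z))  # True
```

Enumeration is exponential in the number of internal nodes, so this form is only practical for tiny trees; it is meant to show the structure of the sum, not an efficient algorithm.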
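15Tree-Encoded Profile (Narra and Liao, 2004)
[Figure: phylogenetic tree over the profile 1 1 0 1 0 0 0 1 1 with internal-node scores 0.55, 0.34, 0.75, 0.67, 1, 0.33, 0.5]
Post-order traversal: 1 0.33 0.67 0.34 0.5 0.75 0.55
Profile: 1 1 0 1 0 0 0 1 1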
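One way to read the numbers on this slide is that each internal node is scored with the average of its children and the scores are then collected by post-order traversal. The sketch below follows that reading. The tree topology `TREE` is an assumption, chosen because it comes close to the slide's scores; the tree itself is not in the transcript. With exact arithmetic two scores come out as 0.33 and 0.54 rather than the slide's 0.34 and 0.55, presumably because the slide rounded at each node.

```python
# Hypothetical topology: nested tuples are internal nodes, ints are leaves.
TREE = ((((1, 1), (0, 1, 0)), 0), ((0, 1), 1))


def encode(node, leaves, internal):
    """Return the node's score; record leaves and internal scores in order."""
    if not isinstance(node, tuple):
        leaves.append(float(node))
        return float(node)
    child_scores = [encode(child, leaves, internal) for child in node]
    score = sum(child_scores) / len(child_scores)  # average of the children
    internal.append(score)                         # post-order: after children
    return score


def tree_encoded_profile(tree):
    """Leaf profile followed by internal-node scores in post-order."""
    leaves, internal = [], []
    encode(tree, leaves, internal)
    return leaves + internal


vector = tree_encoded_profile(TREE)
print([round(v, 2) for v in vector[9:]])
# [1.0, 0.33, 0.67, 0.33, 0.5, 0.75, 0.54]
```

The resulting vector has 9 leaf entries plus 7 internal-node entries, so a classifier trained on it sees the tree structure as extra features.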
17Using Support Vector Machines
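18Kernel function: [formula not in transcript], where r = 0.10
Soft margin regularization: C = 1.50
$L(\alpha) = \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \left( K(x_i, x_j) + \delta_{ij}/C \right)$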
Coding scheme: BIN21
Evaluation:
- $Q_3 = (P_1 + P_2 + P_3)/N$
- $C = (TP \times TN - FP \times FN) / \sqrt{PP \times PN \times AP \times AN}$
- SOV: segment overlap accuracy
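The two closed-form measures can be sketched as follows. The slide does not define its symbols, so the readings here are assumptions: P1, P2, P3 are taken as the numbers of correctly predicted residues in each of the three states and N as the total number of residues; PP and PN as predicted positives and negatives; AP and AN as actual positives and negatives. Returning 0.0 when the denominator is zero is a convention chosen for this example.

```python
import math


def q3(correct_per_state, n_residues):
    """Q3 = (P1 + P2 + P3) / N, the fraction of residues predicted correctly."""
    return sum(correct_per_state) / n_residues


def correlation_coefficient(tp, tn, fp, fn):
    """C = (TP*TN - FP*FN) / sqrt(PP * PN * AP * AN)."""
    pp = tp + fp  # predicted positives (assumed meaning of PP)
    pn = tn + fn  # predicted negatives (assumed meaning of PN)
    ap = tp + fn  # actual positives    (assumed meaning of AP)
    an = tn + fp  # actual negatives    (assumed meaning of AN)
    denominator = math.sqrt(pp * pn * ap * an)
    if denominator == 0:
        return 0.0  # convention for this sketch when a margin is empty
    return (tp * tn - fp * fn) / denominator


print(q3([30, 20, 40], 100))                 # 0.9
print(correlation_coefficient(5, 5, 0, 0))   # 1.0
print(correlation_coefficient(0, 0, 5, 5))   # -1.0
```

SOV is omitted here because it depends on segment boundaries rather than per-residue counts.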
19Design tertiary classifiers
21Nguyen and Rajapakse, Genome Informatics 14: 218-227 (2003)
22A two-stage SVM