Title: TM PRO
1 TM PRO Comparison of Algorithms for Protein
Stability Prediction Upon Mutations
- Madhavi Ganapathiraju
- Graduate student
- Carnegie Mellon University
2 Overview
- TMpro evaluations on PDBTM, TMPDB and MPtopo are complete
- Additional inputs to TMpro are being studied
  - Yule values (not successful)
  - Evolutionary profile (promising)
- TMpro website has been completed
- Evaluation of algorithms to predict protein stability changes upon mutations
3 Part 1: TMpro
4 TMpro Evaluations
Method        Qok  Segment F score  Segment recall  Segment precision  Q2 (residue level)  Misclassified as soluble
MPtopo (101 TM proteins)
2a TMHMM      66   91               89              94                 84                  5
2b TMpro NN   60   93               92              94                 79                  0
PDBTM (191 TM proteins)
3a TMHMM      68   90               89              90                 84                  13
3b TMpro NN   57   93               93              93                 81                  2
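As a rough consistency check on the table, the segment F score can be recomputed from segment recall and precision. This is a minimal sketch that assumes the F score is the usual harmonic mean of the two (the slides do not define it); the function name `f_score` is illustrative.

```python
def f_score(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision (both given in percent)."""
    if recall + precision == 0:
        return 0.0
    return 2.0 * recall * precision / (recall + precision)


# Recall and precision values copied from the MPtopo rows of the table.
rows = {"TMHMM": (89, 94), "TMpro NN": (92, 94)}
for method, (recall, precision) in rows.items():
    print(f"{method}: F = {f_score(recall, precision):.1f}")
```

Small differences from the tabulated F scores are to be expected, since the table entries are rounded.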
5 TMpro web server is fully functional!
Competition for TMpro logo. Prize: see your logo on the web!
6 Attempts to overcome confusion with globular soluble helices (1)
- Yule value features to be added
  - Yule value features that discriminate amino acid neighbor propensities between TM and non-TM helices were computed earlier
  - Tried to add these features as input to the NN predictor, but could not achieve quantitative improvement
  - I will discuss this in the future when I have results to present
7 Attempts to overcome confusion with globular soluble helices (2)
- Evolutionary profile information
  - It is known that knowledge of the evolutionary profile of a protein can improve prediction accuracy to a great extent
  - TMpro is capable of predicting TM segments without requiring knowledge of the profile
  - Useful when you cannot extract sequence alignments from known proteins
  - But where the profile is known, we would like to use that additional information
8 Profile generation
Those of you who have worked with evolutionary analysis before, please give feedback
- Get multiple sequence alignments
- Compute position specific scoring matrix for each protein
  - 21 rows (20 amino acids, and 1 row for gaps)
- Profile is generated for each protein in the training and test sets
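The profile step can be sketched as follows. This is a simplified illustration, not the procedure used for TMpro: it assumes the alignment is a list of equal-length strings with `-` for gaps, and it computes plain per-column frequencies rather than a log-odds scoring matrix. The names `profile_from_alignment` and `toy_alignment` are hypothetical.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
SYMBOLS = AMINO_ACIDS + "-"          # 20 amino acids plus one gap symbol
INDEX = {s: i for i, s in enumerate(SYMBOLS)}


def profile_from_alignment(alignment):
    """Return a 21 x L matrix of per-column symbol frequencies."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    counts = np.zeros((len(SYMBOLS), length))
    for seq in alignment:
        for pos, symbol in enumerate(seq.upper()):
            # Assumption: anything that is not a standard residue
            # (including missing segments) is counted as a gap.
            counts[INDEX.get(symbol, INDEX["-"]), pos] += 1
    return counts / counts.sum(axis=0, keepdims=True)


toy_alignment = ["MKLV-A", "MKIVGA", "MRLV-A"]   # made-up example
profile = profile_from_alignment(toy_alignment)
print(profile.shape)
```

Each column of the result sums to 1, and the last row holds the gap frequency at that position.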
9 Doubts
What labels to assign to gaps?
- We have labels for training sequences
- But when the original sequence has gaps when aligned, how should the labels of the gaps be interpreted?
Even TM regions have gaps, as shown above
10 Doubts
What to do with missing segment information for some sequences?
- When nothing is shown (gap/alignment) for some sequences, I am counting those as gaps
11 Using profile for prediction
Studied independently of TMpro. Neural network with
21 input, 21 hidden and 1 output neurons
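A forward pass through a network of this size might look like the sketch below. Only the layer sizes (21, 21, 1) come from the slide. The sigmoid activations, the random placeholder weights, and the choice of feeding one profile column per residue are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes taken from the slide; weights are random placeholders,
# not trained values.
N_INPUT, N_HIDDEN, N_OUTPUT = 21, 21, 1
W1 = rng.normal(scale=0.1, size=(N_HIDDEN, N_INPUT))
b1 = np.zeros(N_HIDDEN)
W2 = rng.normal(scale=0.1, size=(N_OUTPUT, N_HIDDEN))
b2 = np.zeros(N_OUTPUT)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def forward(profile):
    """Map a 21 x L profile to L per-residue scores in (0, 1)."""
    hidden = sigmoid(W1 @ profile + b1[:, None])   # shape 21 x L
    output = sigmoid(W2 @ hidden + b2[:, None])    # shape 1 x L
    return output[0]


# Made-up profile: 10 positions, each column a probability vector.
profile = rng.dirichlet(np.ones(N_INPUT), size=10).T
scores = forward(profile)
print(scores.shape)
```

With trained weights, each score would be read as the network's per-residue output for TM versus non-TM.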
12 Another output
13 NN architecture needs to be modified
But instead I did post-processing of the neural network output
14 Some more wavelet outputs
Note that these are from the training data itself. Yet to check how it performs overall
15 Part 2: Stability upon Mutations
16 Evaluation of predictions of protein stability changes upon mutations
- Effects of mutations on 2 TM proteins are available in our group
  - The two proteins are rhodopsin and bacteriorhodopsin
  - Data available for how much misfolding occurs
  - How stability of the protein is affected
- There are algorithms that can also predict these changes
- We compared how accurate or reliable the prediction methods are, by comparing their results with our experimental data
17 Three prediction algorithms
- I-Mutant 2.0
  - Support vector machine
  - Features: amino acid neighbors in a 9 Å (0.9 nm) sphere, temperature, pH, relative solvent accessible surface area
  - http://gpcr2.biocomp.unibo.it/cgi/predictors/I-Mutant2.0/I-Mutant2.0.cgi
- DFIRE
  - Knowledge-based statistical potentials
  - http://phyyz4.med.buffalo.edu/hzhou/mutation.html
- FOLDX
  - Statistical mechanics; accounts for various energy terms
  - http://fold-x.embl-heidelberg.de:1100/
18 Authors' claims in 3 papers
19 Our results
Rhodopsin (PDB 1U19)
Bacteriorhodopsin (PDB 1QM8)
20 Bias in mutations that increase/decrease stability
Database bias affects the apparent accuracies of the
algorithms. I-Mutant, for example, predicts a
decrease in stability for a majority of the
mutations. Whether the mutations studied through
experiments preserve the natural bias toward
stability-decreasing mutations therefore affects the
apparent accuracy of the prediction algorithms.
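The effect of this bias can be illustrated with a toy calculation. The label sets below are made up for illustration and are not data from the slides; the helper names are hypothetical. A trivial predictor that calls every mutation destabilizing scores well on a dataset dominated by destabilizing mutations and no better than chance on a balanced one.

```python
def accuracy(predicted, observed):
    """Fraction of mutations whose predicted sign matches the observed one."""
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def always_decrease(n):
    """Trivial predictor: every mutation is called destabilizing."""
    return ["decrease"] * n


# Made-up label sets, not data from the slides.
biased = ["decrease"] * 80 + ["increase"] * 20     # 80% destabilizing
balanced = ["decrease"] * 50 + ["increase"] * 50   # 50% destabilizing

for name, observed in [("biased", biased), ("balanced", balanced)]:
    acc = accuracy(always_decrease(len(observed)), observed)
    print(f"{name}: accuracy = {acc:.2f}")
```

The same predictor reaches 0.80 on the biased set and 0.50 on the balanced set, so accuracy figures are only comparable across datasets with similar class proportions.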
21 Correlation with known data
Reported correlations for these methods are quite
large (> 0.7). On the data compared here, the
correlations are quite low.
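A comparison of this kind could be computed as sketched below, assuming the reported figures are Pearson correlation coefficients between predicted and experimentally measured stability changes (the slides do not say which coefficient was used). The numbers are made up for illustration, and `pearson_r` is a hypothetical helper.

```python
import numpy as np


def pearson_r(predicted, experimental):
    """Pearson correlation between predicted and measured values."""
    predicted = np.asarray(predicted, dtype=float)
    experimental = np.asarray(experimental, dtype=float)
    return float(np.corrcoef(predicted, experimental)[0, 1])


# Made-up stability changes, for illustration only.
experimental = [-1.2, -0.4, 0.3, -2.1, 0.8]
predicted = [-0.9, -1.1, -0.2, -1.5, -0.6]
print(f"r = {pearson_r(predicted, experimental):.2f}")
```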
22 Notes
- Local installations of blast and netblast are on cologne
  - /usr1/blast-2.2.13/
  - /usr1/netblast-2.2.13/
- Java SDK on Cologne
  - /usr1/j2sdk1.4.2_11/
23 Acknowledgements
- Judith Klein-Seetharaman
- Christopher Jon Jursa
- Pitt Information Sciences
- (for developing web interface)