TMpro: Transmembrane Helix Prediction using Amino Acid Properties and Latent Semantic Analysis



1
TMpro: Transmembrane Helix Prediction using
Amino Acid Properties and Latent Semantic
Analysis
  • Madhavi Ganapathiraju, N. Balakrishnan, Raj
    Reddy and Judith Klein-Seetharaman
  • Carnegie Mellon University

6th International Conference on Bioinformatics,
Hong Kong, PR China, August 29th, 2007
2
Outline
  • Introduction
  • Membrane proteins
  • Transmembrane helix prediction
  • Previous methods
  • Drawbacks
  • Amino acid properties
  • Approach
  • Algorithm
  • Features and models
  • Evaluations
  • Web server

3
Membrane Proteins
Embedded in the cell / organelle membrane
[Figure: a membrane protein embedded in the cell membrane, next to a soluble protein]
  • Important class of proteins
  • Many important functions carried out by them
  • Provide access to cell for drug targeting

4
Transmembrane Segment Characteristics
[Figure, side view: membrane separating the cytoplasm (aqueous medium) from the extracellular space (aqueous medium)]
  • Transmembrane region
  • 30 Å hydrophobic core
  • A helix has to be 19 residues long to go from one
    side to the other

  • Questions to be addressed by prediction algorithm
  • How many transmembrane segments are there?
  • Where are the transmembrane locations in primary
    sequence?
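
The residue count quoted for the 30 Å hydrophobic core can be checked with simple arithmetic. The sketch below assumes an ideal α-helix rise of about 1.5 Å per residue (a textbook value, not stated on the slide); the function name is hypothetical. Under that assumption the result is close to the 19 residues on the slide, and the small difference depends on the exact thickness and rise assumed.

```python
import math

# Assumption (not from the slides): an ideal alpha-helix rises ~1.5 Å per residue.
HELIX_RISE_PER_RESIDUE = 1.5  # Å

def residues_to_span(thickness_angstrom, rise=HELIX_RISE_PER_RESIDUE):
    """Smallest number of residues whose helix length covers the given thickness."""
    return math.ceil(thickness_angstrom / rise)

# 30 Å hydrophobic core, as on the slide
print(residues_to_span(30.0))
```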

5
Transmembrane Helix Prediction
  • Important
  • protein family
  • structure and function
  • regions accessible from extracellular side
  • Challenges
  • Little available training data
  • Overtraining
  • Difficulty in discovery of novel architectures

6
Hydrophobicity scale
  • Kyte-Doolittle hydrophobicity profile
  • KD scale, GES scale, WW scale
  • 9-residue window average hydrophobicity
  • Limitations: segment boundaries unclear, low accuracy
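
A 9-residue window average over the Kyte-Doolittle (KD) scale can be sketched as follows. The KD values are the standard published ones; the function name and the choice to index each average by its window start are assumptions made for illustration. The input is the example sequence from the "Text Domain Equivalent" slide.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def window_hydrophobicity(seq, window=9):
    """Mean KD value of every full window; entry i covers seq[i:i+window]."""
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]

seq = "VQLAHHFSEPEITLIIFGVMAGVIGTILLISYGIRRLIKK"
profile = window_hydrophobicity(seq)

# Report the most hydrophobic window; a real predictor would threshold the
# whole profile, which is where the unclear segment boundaries come from.
peak = max(range(len(profile)), key=profile.__getitem__)
print(peak, seq[peak:peak + 9], round(profile[peak], 2))
```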
7
Current best methods use HMMs
Hidden Markov Model Methods (TMHMM)
[Figure: potassium channel, actual vs. predicted transmembrane segments]
Limitations: too many parameters, restrictive
topology
8
TMpro: property-based algorithm for
transmembrane helix prediction
9
Opportunities for Improvement
Amino acid properties
Nonpolar residues
Charged Residues
Aromatic Residues
  • Previous methods
  • Do not employ all possible property distributions
  • Find average occurrences of amino acids

10
Properties We Studied
11
Modified Representation of Primary Sequence
Amino Acid Property Sequences
Charge
Polarity
Aromaticity
Size
Electronic properties
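
Rewriting a primary sequence as property sequences might look like the sketch below. The residue groupings and the one-letter symbols are illustrative assumptions, not the tables used by TMpro, and only three of the five listed properties are shown because the slides do not give residue assignments for size or electronic properties.

```python
# Illustrative groupings only (assumed, not taken from the slides).
POSITIVE = set("KR")
NEGATIVE = set("DE")
NONPOLAR = set("AVLIMFWPGC")
AROMATIC = set("FWY")
ALIPHATIC = set("AVLI")

def charge(aa):
    # '+' positive, '-' negative, '.' uncharged
    return "+" if aa in POSITIVE else "-" if aa in NEGATIVE else "."

def polarity(aa):
    # 'n' nonpolar, 'p' polar
    return "n" if aa in NONPOLAR else "p"

def aromaticity(aa):
    # 'R' aromatic, 'a' aliphatic, '.' neither
    return "R" if aa in AROMATIC else "a" if aa in ALIPHATIC else "."

def property_sequences(seq):
    """Return one encoded string per property, each the same length as seq."""
    encoders = {"charge": charge, "polarity": polarity, "aromaticity": aromaticity}
    return {name: "".join(f(aa) for aa in seq) for name, f in encoders.items()}

for name, encoded in property_sequences("VQLAHHFSEPEITLI").items():
    print(f"{name:12s} {encoded}")
```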
12
Predictive Capability of Each Property
  • Adjust parameters of TMHMM (v 1.0) to make it emit one of the
    property values
  • Properties considered
  • Polarity: polar, non-polar
  • Aromaticity: aromatic, aliphatic, neutral
  • Electronic properties: strong donor, weak donor,
    neutral, weak acceptor, strong acceptor

3-valued property observations achieve 91% of the
accuracy of 20-valued amino acid
observations
13
Approach
Biology-Language Analogy
Ganapathiraju et al. (2004), LNCS 3345
14
Text Domain Equivalent
Documents and Words
  • Documents: 15-residue windows
  • Words: property values

Example sequence: VQLAHHFSEPEITLIIFGVMAGVIGTILLISYGIRRLIKK
[Figure: the example sequence re-encoded as one property-value string per property and cut into 15-residue windows]

  • W1: positively charged
  • W2: polar
  • W3: nonpolar
  • W4: aromatic
  • W5: aliphatic
  • W6: strong electron acceptor
  • W7: strong electron donor
  • W8: weak electron acceptor
  • W9: weak electron donor
  • W10: medium sized
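
Treating each 15-residue window as a document and each property value as a word, the counts can be collected into a word-document matrix. The sketch below is a minimal illustration: it covers only W1 to W5, the residue sets are assumed groupings rather than TMpro's own, and the sliding step of one residue is also an assumption. W6 to W10 are left out because the slides do not say which residues carry them.

```python
import numpy as np

# Assumed residue sets for words W1..W5 (illustrative, not from the slides).
WORDS = {
    "positively charged": set("KR"),
    "polar": set("DEHKNQRSTY"),
    "nonpolar": set("AVLIMFWPGC"),
    "aromatic": set("FWY"),
    "aliphatic": set("AVLI"),
}

def word_document_matrix(seq, window=15, step=1):
    """Rows are words, columns are windows; each entry is a word count."""
    starts = range(0, len(seq) - window + 1, step)
    names = list(WORDS)
    W = np.zeros((len(names), len(starts)))
    for j, start in enumerate(starts):
        doc = seq[start:start + window]
        for i, name in enumerate(names):
            W[i, j] = sum(aa in WORDS[name] for aa in doc)
    return names, W

seq = "VQLAHHFSEPEITLIIFGVMAGVIGTILLISYGIRRLIKK"
names, W = word_document_matrix(seq)
print(W.shape)
print(dict(zip(names, W[:, 0])))  # word counts in the first window
```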
15
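Latent Semantic Analysis
  • Build word-document matrix $W$ (words × documents)
  • Singular value decomposition: $W = U S V^{T}$
  • For classification, the feature vectors $S V^{T}$ can be
    used
  • Reduced dimensions: 4
  • Distinct features of TM and non-TM achieved
[Figure: documents plotted along Dimension 1 and Dimension 2]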
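
The decomposition and the reduced feature vectors can be sketched with NumPy's SVD. The matrix here is random toy data standing in for a real word-document matrix, and the function names are hypothetical; only the factorisation and the choice of 4 retained dimensions come from the slide.

```python
import numpy as np

def lsa_features(W, k=4):
    """Return (U_k, S_k V_k^T): the word basis and one k-dim vector per document."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Scaling row i of Vt by s[i] is the same as diag(s) @ Vt.
    return U[:, :k], s[:k, None] * Vt[:k, :]

def fold_in(U_k, doc_counts):
    """Project a new document's word counts into the same reduced space."""
    return U_k.T @ doc_counts

# Toy data (assumption): 10 words x 8 documents of integer counts.
rng = np.random.default_rng(0)
W = rng.integers(0, 16, size=(10, 8)).astype(float)

U_k, features = lsa_features(W, k=4)
print(features.shape)            # one 4-dimensional column per document
print(fold_in(U_k, W[:, 0]))     # matches features[:, 0] for a training document
```

Because the columns of $U$ are orthonormal, projecting a training document with `fold_in` reproduces its column of $S V^{T}$, so unseen windows can be mapped into the same space before being passed to a classifier.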
16
Different Classifiers/Models
  • Support vector machines
  • Neural networks
  • Linear classifier
  • Hidden Markov modeling
  • Decision trees

Neural network with LSA features is called TMpro
17
Evaluations
Uses evolutionary information and many more model
parameters
Benchmark server results: http://cubic.bioc.columbia.edu/services/tmh_benchmark/
Evaluation on larger datasets
18
TMpro Web Interface
http://linzer.blm.cs.cmu.edu/tmpro/
Novel features for manual annotation
19
Acknowledgements
  • Co-authors
  • Judith Klein-Seetharaman
  • Raj Reddy
  • N. Balakrishnan
  • Web-site Development
  • Christopher Jon Jursa
  • Hassan A. Karimi

20
  • Thank you!

21
Larger training data does not improve TMHMM
STMHMM is TMHMM trained with 145 recent TM
proteins
22
Performance on Recent Large Dataset