Title: Computational Biology - Bioinformatik
1V9 orientation of TM helices
- Modelling 3D structures of helical TM bundles
- Park, Staritzbichler, Elsner Helms, Proteins
(2004), Park Helms, Proteins (2006) - Beuming Weinstein (2004)
- T. Beming H. Weinstein (2004) Bioinformatics
20, 1822 - Adamian Liang (2006)
- L. Adamian J. Liang (2006) BMC Struct. Biol.
6, 13 - TMX predict lipid-accessible sides of TM helices
from sequence - Park Helms, Bioinformatics (2007), Park, Hayat
Helms, BMC Bioinformatics (2007),
2Structure modelling for helical membrane proteins
gtP52202 RHO -- Rhodopsin. MNGTEGPDFYIPFSNKTGVVRSPF
EYPQYYLAEPWKYSALAAYMFMLIILGFPINFLTLYVTVQHKKLRSPLNY
ILLNLAVADLFMVLGGFTTTLYTSMNGYFVFGVTGCYFEGFFATLGGEVA
LWCLVVLAIERYIVVCKPMSNFRFGENHAIMGVVFTWIMALTCAAPPLVG
WSRYIPEGMQCSCGVDYYTLKPEVNNESFVIYMFVVHFAIPLAVIFFCYG
RLVCTVKEAAAQQQESATTQKAEKEVTRMVIIMVVSFLICWVPYASVAFY
IFSNQGSDFGPVFMTIPAFFAKSSAIYNPVIYIVMNKQFRNCMITT
LCCGKNPLGDDETATGSKTETSSVSTSQVSPA
1D
2D
www.gpcr.org
3D
EMBO Reports (2002)
3Design helical bundles using effective energy
functions
Aim assemble TM bundles Glycophorin A dimer,
Erb/Neu dimer, phospholamban pentamer Method
scan 6-D conformational space of dimers of ideal
helices
4docking of helix-dimers energy scoring
Example for parametrised energy function between
2 residues
search 5 degrees of freedom systematically. score
conformations by residue-residue energy
function.
Park et al. Proteins (2004)
5docking of helix-dimers
Test for Glycophorin A, dimer of two identical
helices, NMR structure available
- Energy landscape
- around the minimum
- Minimum is truly
- global minimum.
RMSD between best model and NMR structure only
0.8 Å.
Park et al. Proteins (2004)
However, this is not the case for dimers in
larger TMH proteins.
6Need more/other information to orient helices
- Early suggestion TM proteins are inside-out
proteins. - That means that are hydrophobic outside and
hydrophilic inside. - compute hydrophobic moment the direction of
largest hydrophobicity
here, rproj(i) is the projection of the
side-chain onto the helical axis, i.e. the vector
difference describes the shortest distance
between residue i and the helix axis. H(i) is the
hydrophobicity of residue i. This method was
introduced by David Eisenberg (1982, Nature)
6
7role of hydrophobic moment
According to the concept of Eisenberg, all
helices would orient their most hydrophobic side
towards the bilayer. However, this measure is
quite unprecise (Park Helms, Biopolymers 2006).
Hydrophobicity scales ww Wimley-White scale eis
Eisenberg scale ges Goldman/Engelman/Steitz
scale kd Kyte-Doolittle scale Specialized
scales kP kProt bw Beuming Weinstein
scale tmlip1/2 Adamian Liang
7
8Beuming Weinstein (2004) amino acid
propensities
- Hydrophobic residues (A, I, L, V) make up 48.7
of all residues in TM proteins - Charged residues (D, E, H, K, R) constitute only
5.5 - Glycine (G) is relatively abundant
- Small residues (A, C, S, T) form 30.6
- Aromatic residues (F,W,Y) represent 15.8
- ?-branched residues (T, I, V) form 24.9.
- Proline is a helix-breaker and is
underrepresented - Also, Cys, Gln, and Asn are rarely found.
9amino acid propensities conclusions
The overall amino acid composition deviates
significantly from that of the whole
genome. Hydrophobic residues (A, F, G, I, L, M,
V, W) occur more frequently in MPs than in the
whole genomes. Conversely, residues C, D, E, K,
N, P, Q, R are underrepresented in MPs. H, S, T,
and Y have equal distributions in MPs and whole
genomes.
9
10Beuming Weinstein (2004) inside vs. outside
- Most of the exposed (lipid facing) charged
residues (D, E, K, H, R) that are found in TMs
are located in the terminal regions (4.4) rather
than in the central region (2.7). - The exposed terminal parts are very rich in
aromatic residues (21.3) compard to the central
part (16.1).
11Beuming Weinstein (2004) surface propensity
scale
Table shows fraction SF of exposed residue i. Trp
has highest value of SF, His has smallest
value. Normalize SP values with respect to His
(SP0) and Trp (SP1).
12correlation of SP scale with other scales
Compute correlation coefficient. SP propensity
scale has high correlation with hydrophobicity or
volume scales. Combine SP scale with
conservation index
pa a priori distribution of residues
13Beuming Weinstein (2004)
Add propensity score and conservation
score total score(i) SPi CIi Accuracy to
detect the buried resides is ca. 70.
14Beuming Weinstein (2004)
(top) correct SASA in X-ray structure (middle)
prediction based on amino-acid propensity
conservation BEST! (bottom) prediction based
only on conservation
15Adamian Liang (2006) interacting helices
Example for two interacting TM helices in
succinate dehydrogenase. Interacting residues
follow heptad motiv. Note the periodicity of 3.6
residues per turn in an ideal ?-helix.
16Adamian Liang (2006)
Heptad motifs are generally preferred for
interacting helix pairs. For left-handed
helices, about 94.7 and 92.4 of interacting
residues can be mapped to heptad repeats for
parallel and anti-parallel helices. For
right-handed pairs the number are slightly
less. Assume that the residues of
lipid-accessible helices follows a similar
pattern.
17Adamian Liang (2006)
Each TM helix has 7 faces. A the anchoring
residues are 0, 7, 14, and 21 contacts are also
formed by residues 3, 4, 10, 11, 17, 18
18Adamian Liang (2006)
Combine lipophilicity score Lf and positional
entropy Ef of a helical face by simply
multiplying them.
19Adamian Liang (2006) Test fo TRP channel
20Adamian Liang (2006) discuss failures
Sometimes, binding sites for individual lipids
(e.g. cardiolipin) are formed on the surfaces of
TM proteins. Those residues will also be highly
conserved, and the method will therefore fail.
21What is needed for true de novo design of helical
bundles?
Aim explore new TM protein topologies. distance-
dependent residue-residue force field Generate
energetically favorable geometries of helix
dimers. Overlap helix dimers ? full protein
structure.
22Derivation of position scores
- For each test protein, ? 1000 similar sequences
- from non-redundant database using BLAST URLAPI.
- (2) generate initial multiple sequence alignment
(MSA) with ClustalW. - Delete fragments lt 80 of length of query
sequence. - From these refined MSA, apply 6 different
identity criteria, - 6 final MSAs for each test protein.
- Pei Grishin need to align 20 sequences to
accurately estimate conservation indices from
MSAs.
23predicting the TM-helix-orientation from sequences
Assumption lipid-exposed positions are less
conserved.
CI conservation index in MSA SASA Solvent
accessible surface area, relative to a single,
free helix
fj(i) frequency of amino acid j in position
i. fj frequency of amino acid j in full
alignment. C average conservation index
(CI) ? Standard deviation Positive values
conserved positions Negative values variable
positions
Test correct orientation (0,0) has lowest score.
24Ab initio structure prediction of TM bundles
- Aim construct structural model for a bundle of
ideal transmembrane helices. - Construct 12 good geometries for every helix pair
AB, BC, CD, DE, EF, FG - overlay ABCDEFG
- thin out solution space containing ca. 126
models - (a) remove solutions where helices collide
with eachother - (b) delete non-compact solutions
- score remaining 106 solutions by sequence
conservation - cluster 500 best solutions in 8 models
- rigid-body refinement, select 5 models with best
sequence conservation.
25Rigid-body refinement
26Compare best models with X-ray structures
dark Model light X-ray structure Additional
input known connectivity of the helices
A-B-C-D-E-F-G. Otherwise, the search space
would have been too large.
Halorhodopsin
Bacteriorhodopsin
Sensory Rhodopsin
Rhodopsin
NtpK
27Comparing the best models with X-ray structures
28Can one select the best model?
- These are our 4 best
- non-native models of bR.
- Because contact between A and E was not imposed,
very different topologies were obtained. - In 2006, our methods could not distinguish
between these models. - but they could serve as input for further
experiments.
29Success case True de novo model of 4-helix
bundle
30Predicting lipid-exposure
31Predicting lipid-exposure
Aim derive optimal scale to predict exposure of
residues to hydrophobic part of lipid
bilayer. Scale should optimally correlate with
SASA ? minimize quadratical error.
Solution for minimization task
?
Y SASA values of the training set (N 2901
residue positions) X profile of residue
frequencies from multiple sequence alignment ( N
? 21 matrix) ? wanted propensity scale for 20
amino acids 1 intercept value (21)
32What does MO scale capture?
33Improved prediction of exposure by statistical
learning
Beuming Weinstein (2004) method
Prediction method Prediction accuracy
Beuming Weinstein 68.7
TMX 78.7
Yuan ... Teasdale 71.1
34Improved method by statistical learning
The theory of Support Vector Classifiers evolves
from a simpler case of optimal separating
hyperplanes that, while separating two separable
classes, maximize the distance between a
separating hyperplane and the closest point from
either class.
A The two classes can be fully separable by a
hyperplane, and the optimal separating hyperplane
can be obtained by solving Eq. 9. B It is not
possible to separate the two classes with a
hyperplane, and the optimal hyperplane can be
obtained by solving Eq. 17.
35Improved method by statistical learning
Membrane Bioinformatics SS09
36Improved method by statistical learning
37Improved method by statistical learning
38Improved method by statistical learning
39http//service.bioinformatik.uni-saarland.de/tmx/
input Putative TM helices TopoView draws Snake
plot Master thesis Nadine Schneider
40http//service.bioinformatik.uni-saarland.de/tmx/
Top TMD11, Bottom TMD 12
41http//service.bioinformatik.uni-saarland.de/tmx/
Top TMD5, Bottom TMD 12
42Summary TMX and related methods
- Sequences of TM proteins reveal many powerful
features to allow prediction of 2D- and 3D
structural features, function, and
oligomerization status. - TMX server can predict lipid exposure with ca.
78 accuracy. - http//service.bioinformatik.uni-saarland.de/tmx/
- Possible applications
- predict transporter pores
- (2) predict lipid-exposed surface of TM proteins
- correlate with different membrane composition
- collaborate with us ? do you have lots of
solubility data? - (3) Conserved surface residues may indicate
interaction sites