Title: Multi-Scale Hierarchical Structure Prediction of Helical Transmembrane Proteins
1Multi-Scale Hierarchical Structure Prediction of
Helical Transmembrane Proteins
Zhong Chen and Ying Xu
Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, University of Georgia
2Outline
- Background information
- Statistical analysis of known membrane protein structures
- Structure prediction at residue level
- Helix packing at atomistic level
- Linking predictions at residue and atomistic levels
3Membrane Proteins
- Roles in biological processes
- Receptors
- Channels, gates and pumps
- Electric/chemical potential
- Energy transduction
- > 50% of new drug targets are membrane proteins (MPs).
4Membrane Proteins
- 20-30% of the genes in a genome encode MPs.
- < 1% of the structures in the Protein Data Bank (PDB) are MPs, owing to difficulties in experimental structure determination.
5Membrane Proteins
- Prediction of transmembrane (TM) segments (α-helix or β-sheet) from sequence alone is very accurate (up to 95%)
- Prediction of the tertiary structure of the TM segments: how do these α-helices/β-sheets arrange themselves within the constraints of bi-lipid layers?
Helical structures are relatively easier to solve computationally
6Membrane Protein Structures
- Difficult to solve experimentally
- Computational techniques could possibly play a
significant role in solving MP structures,
particularly helical structures
7High Level Plan
- Statistical analysis of known structures
- Unveil the underlying principles for MP structure and stability
- Develop knowledge-based propensity scale and energy functions.
- Structure prediction at residue level
- Structure prediction at atomistic level: MC, MD
- Multi-scale, hierarchical computational framework
8Part I Statistical Analysis of Known Structures
9Database for Known MP Structures Helical Bundles
- Redundant database
- 50 pdb files
- 135 protein chains
- Non-redundant database (identity < 30%)
- 39 pdb files
- 95 protein chains (avg. length 220 AA)
10Bi-lipid Layer Chemistry
Polar head group (glycerol, phosphate)
Hydrophobic tail (fatty acid)
11Statistics-based energy functions
- Length of bi-lipid layer: 60 Å
- Central regions
- Terminal regions
- Three energy terms
- Lipid-facing potential
- Residue-depth potential
- Inter-helical interaction potential
[Figure: bi-lipid layer schematic, 60 Å in total, with a 30 Å central region between two terminal regions]
12Lipid-facing Propensity Scale
Residue Termini Central
ILE 0.84 1.33
VAL 0.71 1.30
LEU 0.89 1.30
PHE 1.03 1.38
CYS 0.37 0.67
MET 0.57 0.80
ALA 0.69 0.79
GLY 0.84 0.44
THR 0.79 0.61
SER 1.04 0.51
TRP 1.11 1.89
TYR 0.73 1.04
PRO 1.01 0.60
HIS 1.27 1.61
ASP 1.56 1.08
GLU 2.10 0.93
ASN 1.02 0.71
GLN 1.44 0.71
LYS 2.59 1.97
ARG 1.42 1.16
$\mathrm{LF\_scale}(AA) = \dfrac{\text{fraction of AA that are lipid-facing}}{\text{fraction of AA that are in the interior}}$
- The most hydrophobic residues (ILE, VAL, LEU) prefer the surface of MPs in the central region, but prefer interior positions in the terminal regions
- Small residues (GLY, ALA, CYS, THR) tend to be buried in the helix bundle
- Bulky residues (LYS, ARG, TRP, HIS) are likely to be found on the surface.
This propensity scale reflects both hydrophobic interactions and helix packing
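The LF_scale ratio can be computed directly from a list of classified residue positions. The sketch below is a minimal illustration, not the authors' procedure: it assumes every position has already been labeled as either lipid-facing or interior, reads "fraction of AA" as the fraction of that residue type's occurrences, and uses a made-up toy data set (`toy`). The function name `lf_scale` is hypothetical.

```python
from collections import defaultdict

def lf_scale(observations):
    """Return {residue: lipid-facing fraction / interior fraction}.

    observations: iterable of (residue_name, is_lipid_facing) pairs.
    Assumption: each position is either lipid-facing or interior.
    """
    counts = defaultdict(lambda: [0, 0])  # residue -> [lipid-facing, interior]
    for residue, lipid_facing in observations:
        counts[residue][0 if lipid_facing else 1] += 1

    scale = {}
    for residue, (n_lf, n_in) in counts.items():
        if n_in == 0:
            continue  # ratio is undefined without interior observations
        total = n_lf + n_in
        scale[residue] = (n_lf / total) / (n_in / total)
    return scale

# Toy data (invented for illustration): LEU mostly lipid-facing, GLY mostly buried.
toy = ([("LEU", True)] * 3 + [("LEU", False)]
       + [("GLY", True)] + [("GLY", False)] * 2)
scale = lf_scale(toy)
```

Values above 1 indicate a preference for the lipid-facing surface and values below 1 a preference for the interior, matching how the table is read.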
13Helical Wheel and Moment Analysis
The magnitude of each thin vector is proportional to the LF propensity, and the overall lipid-facing vector is the sum of all thin vectors.
Average prediction error: 41°
Lipid-facing vector prediction, state of the art:
- kPROT: avg. error 41°
- Samatey scale: 61°
- Hydrophobicity scales: 65-68°
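The vector sum on the helical wheel can be sketched as follows. This is an illustration under stated assumptions: residues are placed 100° apart around the helix axis (the ideal α-helix value, not stated on the slide), the scale values are a subset of the central-region column of the table, and the names `lipid_facing_vector` and `angular_error` are hypothetical.

```python
import math

# Central-region propensities copied from the table (subset only).
LF_CENTRAL = {"ILE": 1.33, "VAL": 1.30, "LEU": 1.30, "PHE": 1.38,
              "ALA": 0.79, "GLY": 0.44, "THR": 0.61, "SER": 0.51}

def lipid_facing_vector(residues, scale, step_deg=100.0):
    """Sum one vector per residue; return (direction in degrees, magnitude).

    Residue i points at angle i * step_deg on the helical wheel, with
    length equal to its lipid-facing propensity.
    """
    x = y = 0.0
    for i, residue in enumerate(residues):
        angle = math.radians(i * step_deg)
        x += scale[residue] * math.cos(angle)
        y += scale[residue] * math.sin(angle)
    return math.degrees(math.atan2(y, x)) % 360.0, math.hypot(x, y)

def angular_error(predicted_deg, reference_deg):
    """Smallest absolute difference between two directions, in [0, 180]."""
    diff = abs(predicted_deg - reference_deg) % 360.0
    return min(diff, 360.0 - diff)

direction, magnitude = lipid_facing_vector(
    ["LEU", "GLY", "ALA", "ILE", "VAL", "GLY", "PHE"], LF_CENTRAL)
```

A helix whose residues all have the same propensity gives a vector of nearly zero length over 18 residues (five full turns), so the predicted direction is only meaningful when the magnitude is clearly non-zero.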
14Residue-Depth Potential
- Hydrophobic residues tend to be located in the hydrocarbon core
- Hydrophilic residues tend to be closer to the terminal regions
- Aromatic residues prefer the interface region.
15TM Helix Tilt Angle Prediction
major pVIII coat protein of the filamentous fd
bacteriophage (1MZT)
16Inter-Helical Pair-wise Potential
[Figure: inter-helical pair-wise potential, axis in Å]
17Statistical energy potentials (summary)
- Three residue-based statistical potentials were derived from the database: (a) lipid-facing propensity, (b) residue-depth potential, (c) inter-helical pair-wise potential
- The lipid-facing scale predicted the lipid-facing direction for a single helix with an uncertainty of about 40°
- The residue-depth potential was able to predict the tilt angle for a single helix with high accuracy.
- More data are needed to make the inter-helical pair-wise potential more reliable
18Part II Structure Prediction at Residue Level
19Key Prediction Steps
- Structure prediction through optimizing our statistical potential (weighted sum)
- Idealized and rigid helical backbone configurations
- Monte Carlo moves: translations, rotations, rotation about the helix axis
- Wang-Landau sampling technique for MC simulation
- Principal component analysis.
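Two of the ingredients listed above can be sketched in a few lines: the weighted sum of energy terms and the rigid-body rotation of a helix about its own axis. This is a generic illustration, not the authors' code. The rotation uses the standard Rodrigues formula, and the weights, function names, and coordinates are hypothetical.

```python
import math

def total_energy(terms, weights):
    """Weighted sum of energy terms, e.g. (lipid-facing, depth, pair-wise)."""
    return sum(w * e for w, e in zip(weights, terms))

def rotate_about_axis(points, origin, axis, angle_rad):
    """Rigidly rotate 3D points about a line through `origin` along `axis`."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    kx, ky, kz = ax / norm, ay / norm, az / norm  # unit axis vector
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    rotated = []
    for px, py, pz in points:
        vx, vy, vz = px - origin[0], py - origin[1], pz - origin[2]
        # Cross product k x v and dot product k . v
        cx, cy, cz = ky * vz - kz * vy, kz * vx - kx * vz, kx * vy - ky * vx
        dot = kx * vx + ky * vy + kz * vz
        # Rodrigues: v*cos + (k x v)*sin + k*(k . v)*(1 - cos)
        rotated.append((
            vx * c + cx * s + kx * dot * (1.0 - c) + origin[0],
            vy * c + cy * s + ky * dot * (1.0 - c) + origin[1],
            vz * c + cz * s + kz * dot * (1.0 - c) + origin[2],
        ))
    return rotated

# Hypothetical example: spin a two-point "helix" by 30 degrees about the z axis.
helix = [(1.0, 0.0, 0.0), (0.0, 0.0, 5.0)]
moved = rotate_about_axis(helix, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.radians(30.0))
energy = total_energy((1.0, 2.0, 3.0), (0.5, 0.25, 1.0))
```

Because the backbone is treated as rigid, a move of this kind changes only the helix orientation, so each helix contributes a small number of degrees of freedom to the search.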
20Wang-Landau Method for MC
Observation: if a random walk in energy space is performed with probability proportional to the reciprocal of the density of states, 1/g(E), then a flat energy histogram is obtained.
The density of states is not known a priori.
In Wang-Landau, g(E) is initially set to 1 and modified on the fly. Monte Carlo moves from energy E_1 to energy E_2 are accepted with probability
$p(E_1 \to E_2) = \min\left(\dfrac{g(E_1)}{g(E_2)},\ 1\right)$
Each time an energy level E is visited, its density of states is updated by a modification factor f > 1, i.e.,
$g(E) \to g(E) \cdot f$
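A minimal sketch of Wang-Landau sampling on a toy system may help make the update rule concrete. The system here is hypothetical and unrelated to the helix model: N independent two-state spins whose energy is the number of "up" spins, so the exact density of states is the binomial coefficient C(N, E). The code uses the standard Wang-Landau rule, accepting a move with probability min(g(E_old)/g(E_new), 1) and multiplying g at the current energy by f after every step, and it works with ln g to avoid overflow. The flatness criterion (0.8), the halving of ln f, and all names are illustrative choices.

```python
import math
import random

def wang_landau(n_spins=8, ln_f_final=1e-3, flatness=0.8, seed=0):
    """Estimate ln g(E) - ln g(0) for n_spins independent two-state spins."""
    rng = random.Random(seed)
    spins = [0] * n_spins
    energy = 0                      # energy = number of spins equal to 1
    ln_g = [0.0] * (n_spins + 1)    # ln g(E) = 0 corresponds to g(E) = 1
    ln_f = 1.0                      # ln of the modification factor f
    while ln_f > ln_f_final:
        hist = [0] * (n_spins + 1)
        for _ in range(200):        # safety cap on flatness checks per stage
            for _ in range(5000):
                i = rng.randrange(n_spins)
                new_energy = energy + (1 - 2 * spins[i])
                delta = ln_g[energy] - ln_g[new_energy]
                # Accept with probability min(1, g(E_old) / g(E_new)).
                if delta >= 0.0 or rng.random() < math.exp(delta):
                    spins[i] ^= 1
                    energy = new_energy
                ln_g[energy] += ln_f   # g(E) -> g(E) * f at the current level
                hist[energy] += 1
            if min(hist) > flatness * sum(hist) / len(hist):
                break               # histogram is flat enough
        ln_f /= 2.0                 # reduce f and start a new stage
    return [value - ln_g[0] for value in ln_g]

estimate = wang_landau()
exact = [math.log(math.comb(8, k)) for k in range(9)]
```

Comparing `estimate` with `exact` shows how closely the random walk recovers the known density of states. A tighter `ln_f_final` gives a more accurate estimate at the cost of more steps.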
21Wang-Landau Method for MC
- Advantages
- Simple formulation and general applicability
- Entropy and free energy information derivable from g(E)
- Each energy state is visited with equal probability, so energy barriers are overcome with relative ease.
22Principal Component Analysis
- Purpose
- analyze the conformational variations during a simulation, and
- identify the most important conformational degrees of freedom.
- Covariance matrix
A large part of the system's fluctuations can be described in terms of only a few PCA eigenvectors.
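The PCA step can be illustrated with a small pure-Python sketch. It builds the covariance matrix C_ij = <(x_i - <x_i>)(x_j - <x_j>)> over simulation frames and extracts the leading eigenvector by power iteration. The data are invented for illustration, the function names are hypothetical, and power iteration is a stand-in for whatever eigen-solver was actually used.

```python
import math

def covariance_matrix(frames):
    """Covariance of coordinates over frames (each frame is a list of floats)."""
    n, d = len(frames), len(frames[0])
    mean = [sum(frame[j] for frame in frames) / n for j in range(d)]
    return [[sum((f[i] - mean[i]) * (f[j] - mean[j]) for f in frames) / n
             for j in range(d)] for i in range(d)]

def leading_component(cov, iterations=200):
    """Largest eigenvalue and its unit eigenvector, by power iteration.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm  # for a unit eigenvector, |C v| equals the eigenvalue
    return eigenvalue, v

# Toy "trajectory": two coordinates that always move together along (1, 2).
frames = [[-2.0, -4.0], [-1.0, -2.0], [0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
cov = covariance_matrix(frames)
eigenvalue, pc1 = leading_component(cov)
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
```

In this toy case a single eigenvector accounts for all of the variance, which is the extreme form of the observation that a few PCA eigenvectors describe most of a system's fluctuations.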
23A Model System Glycophorin (GpA) Dimer
- GxxxG motif
- Ridges-into-grooves
22 residues, 189 atoms: EITLIIFGVMAGVIGTILLISY
24Glycophorin (GpA) Dimer (1AFO)
A: GEM (global energy minimum), RMSD = 3.6 Å, E = -114.6 kcal/mol
B: LEM (local energy minimum), RMSD = 0.8 Å, E = -93.9 kcal/mol
Red: experiment; grey: simulation
25Helices A and B of Bacteriorhodopsin (1QHJ)
A: GEM, RMSD = 2.7 Å, E = -94 kcal/mol
B: LEM, RMSD = 0.9 Å, E = -86 kcal/mol
Red: experiment; grey: simulation
26Bacteriorhodopsin (1QHJ)
RMSD = 5.0 Å
[Figure: computational prediction and experimental structure, helices labeled A-G]
27Residue-level structure prediction (Summary)
- A computational scheme was established for TM helix structure prediction at residue level
- For two-helix systems, LEM structures very close to native structures (RMSD < 1.0 Å) were consistently predicted
- For a seven-helix bundle, a packing topology within 5.0 Å of the crystal structure was identified as one of the LEMs.
28Part III Structure Prediction at Atomistic Level
29Key Prediction Steps
- Structure prediction through optimizing atom-level energy potential
- CHARMM19 force field for helix-helix interaction
- Knowledge-based energy function for lipid-helix interaction
- Idealized and rigid helix structure for the backbone, with flexible side chains
- Apply helix orientation constraint (i.e., N-terminus inside/outside the cell)
- MC moves: translations, rotations, rotation about the helix axis, and side-chain torsional rotation
- Wang-Landau algorithm for MC simulation
30CHARMM19 Polar Hydrogen Force Field
- Nonpolar hydrogen atoms are combined with the heavy atoms they are bound to
- Polar hydrogen atoms are modeled explicitly.
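The polar-hydrogen idea can be illustrated with a short sketch that reduces an all-atom fragment to this representation. It is a simplified illustration, not CHARMM code: it assumes that a hydrogen counts as polar when it is bonded to N or O, and that every hydrogen appears in exactly one bond. The atom names, the `POLAR_HEAVY` set, and the function name are hypothetical.

```python
POLAR_HEAVY = {"N", "O"}  # assumption: hydrogens on these elements stay explicit

def to_polar_hydrogen_model(atoms, bonds):
    """Drop nonpolar hydrogens and count how many each heavy atom absorbs.

    atoms: {atom_name: element}; bonds: list of (atom_name, atom_name).
    Returns (kept_atoms, absorbed_hydrogen_counts).
    """
    partner = {}  # hydrogen name -> name of the atom it is bonded to
    for a, b in bonds:
        if atoms[a] == "H":
            partner[a] = b
        if atoms[b] == "H":
            partner[b] = a

    kept = {}
    absorbed = {name: 0 for name, element in atoms.items() if element != "H"}
    for name, element in atoms.items():
        if element != "H":
            kept[name] = element                 # heavy atoms are always kept
        elif atoms[partner[name]] in POLAR_HEAVY:
            kept[name] = element                 # polar hydrogen stays explicit
        else:
            absorbed[partner[name]] += 1         # merged into its heavy atom
    return kept, absorbed

# Serine-like side-chain fragment (illustrative naming): CB-H2 and OG-H.
atoms = {"CB": "C", "HB1": "H", "HB2": "H", "OG": "O", "HG": "H"}
bonds = [("CB", "HB1"), ("CB", "HB2"), ("CB", "OG"), ("OG", "HG")]
kept, absorbed = to_polar_hydrogen_model(atoms, bonds)
```

Merging nonpolar hydrogens into their heavy atoms reduces the atom count, and with it the number of pair interactions evaluated at each Monte Carlo step.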
312D Wang-Landau Sampling in PC1 and E Spaces
[Figure: minima labeled LEM1 and LEM2]
32Effect of Helix-Lipid Interactions Helices AB
of Bacteriorhodopsin
[Figure panels: helix-helix interactions only; helix-helix + helix-lipid interactions]
Helix-lipid interactions play a critical role in
the correct packing of helices
33Effect of Helix-Lipid Interactions Helix AB of
Bacteriorhodopsin (BR)
Hydrocarbon core region: 30 Å
All four LEM structures share essentially the same contact surfaces. In the native structure, the polar N-termini of both helices are located outside the hydrocarbon core region, resulting in low helix-lipid energy.
34Docking of a Seven-helix Bundle
Bacteriorhodopsin (1QHJ)
7 helices, 174 residues, 1619 atoms
- CHARMM19 + lipid-helix potential
- One month CPU time on one PC
[Figure panels: crystal structure; initial configuration]
35Potential Energy Landscape
36Global Energy Minimum Structure (RMSD = 3.0 Å)
Red: experiment; grey: simulation
37Atom-level Structure Prediction (Summary)
- The Wang-Landau algorithm proved to be effective for the energetics study of TM helix packing
- Prediction results for two-helix and seven-helix structures are highly promising
- Practical application of the Wang-Landau method to large systems requires further work.
38Part IV Linking Predictions at Residue- and
Atomistic levels
39Correspondence between simulations at two levels
- A multi-scale hierarchical modeling approach is feasible and practical
- LEMs identified at residue level can be used as candidates for atomistic simulation
- PC vectors from residue-level simulation can be used to improve search speed in atomistic simulation.
40Future Works
- Further improvement of the residue-based folding potentials
- Speed-up and parallelization of Wang-Landau sampling
- Construct a hierarchical computational framework, and develop the corresponding software package.
41Acknowledgements
- Funding from NSF/DBI, NSF/ITR, NIH, and Georgia Cancer Coalition
- Dr. David Landau (Wang-Landau algorithm) and Dr. Jim Prestegard (NMR data generation) of UGA
- Thanks to DIMACS for the invitation to speak here