Conformation Networks: an Application to Protein Folding

Provided by: vwInd
Learn more at: http://vw.indiana.edu
1
Conformation Networks: an Application to Protein Folding
Zoltán Toroczkai
Erzsébet Ravasz
Center for Nonlinear Studies
Gnana Gnanakaran (T-10)
Theoretical Biology and Biophysics
Los Alamos National Laboratory
2
Proteins
  • the most complex molecules in nature
  • globular or fibrous
  • basic functional units of a cell
  • chains of amino acids (50 to 10^3)
  • peptide bonds link the backbone

Native state
  • unique 3D structure (native physiological
    conditions)
  • biological function
  • fold in nanoseconds to minutes
  • about 1000 known 3D structures: X-ray
    crystallography, NMR

3
Myoglobin
153 residues, mol. weight 17181 Da, 1260 atoms
Main function: primary oxygen storage and carrier
in muscle tissue
It contains a heme (iron-containing porphyrin)
group in the center. C34H32N4O4FeHO
4
Protein conformations
  • defined by dihedral angles
  • 2 angles with 2-3 local minima of the torsion
    energy
  • N monomers → about 10^N different conformations

5
Levinthal's paradox
  • Anfinsen: thermodynamic hypothesis
  • native state is at the global minimum of the free
    energy

Epstein, Goldberger & Anfinsen, Cold Spring Harbor
Symp. Quant. Biol. 28, 439 (1963)
  • Levinthal's paradox, 1968
  • finding the native state by random sampling is
    not possible
  • 40-monomer polypeptide (about 10^40 conformations),
    sampled at 10^13 conf/s
  • → 3×10^19 years to sample all
  • universe ≈ 2×10^10 years old
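The arithmetic behind this estimate can be checked in a few lines. This is a minimal sketch that assumes roughly 10 conformations per monomer (the 10^N count from the earlier slide) and a Julian year of 365.25 days; the function name and its defaults are illustrative.

```python
# Back-of-the-envelope Levinthal estimate using the numbers quoted on the slide.
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year (an assumption)

def levinthal_years(n_monomers, conf_per_monomer=10, rate_per_s=1e13):
    """Years needed to visit every conformation once by random sampling."""
    n_conformations = conf_per_monomer ** n_monomers
    return n_conformations / rate_per_s / SECONDS_PER_YEAR

years = levinthal_years(40)
universe_age_years = 2e10  # value quoted on the slide
print(f"{years:.1e} years = {years / universe_age_years:.1e} universe ages")
```

For 40 monomers this gives about 3×10^19 years, in line with the figure above.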

Levinthal, J. Chim. Phys. 65, 44-45 (1968)
Wetlaufer, P.N.A.S. 70, 691 (1973)
  • nucleation
  • folding pathways

6
Free energy landscapes
  • Bryngelson & Wolynes, 1987
  • free energy landscape

Bryngelson & Wolynes, P.N.A.S. 84, 7524 (1987)
  • a random hetero-polymer typically does NOT fold
  • Experiment:
  • random sequences
  • GLU, ARG, LEU
  • 80-100 amino acids
  • 95% did not fold in a stable manner

Davidson & Sauer, P.N.A.S. 91, 2146 (1994)
7
Funnels
  • Leopold, Montal & Onuchic, 1992

Leopold, Montal & Onuchic, P.N.A.S. 89, 8721 (1992)
Energy funnels
Difficult and slow
  • many folding pathways

8
Molecular dynamics
Sanbonmatsu, Joseph & Tung, P.N.A.S. 102, 15854 (2005)
  • State of the art
  • supercomputer (LANL)
  • Ribosome in explicit solvent
  • targeted MD
  • 2.64×10^6 atoms (2.5×10^5 water)
  • Q machine, 768 processors
  • 260 days of simulation (event: 2 ns)

10^16 times slower
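The slowdown factor follows directly from the quoted run time and event duration. The sketch below redoes that division and, for comparison, the same ratio for the 10,000 CPU days per 1 μs folding event quoted for distributed computing; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 3600

# Ribosome run: 260 days of wall-clock time for a 2 ns event
ribosome_slowdown = 260 * SECONDS_PER_DAY / 2e-9

# Distributed BBA5 run: 10,000 CPU days for a 1 microsecond folding event
bba5_cpu_seconds_per_second = 10_000 * SECONDS_PER_DAY / 1e-6

print(f"{ribosome_slowdown:.1e}")
print(f"{bba5_cpu_seconds_per_second:.1e}")
```

The first ratio comes out at about 1.1×10^16, the order of magnitude stated on the slide.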
  • distributed computing (Stanford, Folding@home)
  • more than 100,000 CPUs
  • simulation of complete folding event
  • BBA5, 23-residue, implicit water
  • 10,000 CPU days/folding event (1 μs)

Shirts & Pande, Science 290, 1903 (2000)
Snow, Nguyen, Pande & Gruebele, Nature 420, 102 (2002)
9
Configuration networks
  • Configuration networks
  • Protein conformations
  • dihedral angles have few preferred values

Ramachandran & Sasisekharan, J. Mol. Biol. 7, 95 (1963)
  • Helix
  • Sheet
  • other

NODE -- configuration
LINK -- change of one degree of freedom (angle)
  • refinement of angle values → continuous case
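The node and link definition can be made concrete for a toy chain. This is a minimal sketch, assuming each angle takes one of three discrete values (standing in for helix, sheet, other) and that a link joins any two configurations differing in exactly one angle. Steric constraints are ignored, and the function name is illustrative.

```python
from itertools import product

def configuration_network(n_angles, n_states=3):
    """Adjacency dict: nodes are tuples of discrete angle states,
    links join configurations that differ in exactly one angle."""
    nodes = list(product(range(n_states), repeat=n_angles))
    adj = {node: [] for node in nodes}
    for node in nodes:
        for pos in range(n_angles):
            for s in range(n_states):
                if s != node[pos]:
                    # Change a single degree of freedom, keep the rest fixed
                    adj[node].append(node[:pos] + (s,) + node[pos + 1:])
    return adj

net = configuration_network(n_angles=4)
degrees = {len(nbrs) for nbrs in net.values()}
print(len(net), degrees)
```

With 4 angles there are 3^4 = 81 nodes, each of degree 4 × 2 = 8. The node count grows exponentially with chain length while the degree grows only linearly.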

10
Why networks?
  • VERY LARGE: 100 monomers → 10^100 nodes. However:

Generic features of folding are determined by
STATISTICAL properties of the configuration
network
  • degree distribution
  • average distance
  • clustering
  • degree correlations
  • toolkit from network research
  • captures the high dimensionality

Albert & Barabási, Rev. Mod. Phys. 74, 47 (2002)
Newman, SIAM Rev. 45, 167 (2003)
  • faster algorithms to simulate folding events
  • pre-screening synthetic proteins
  • insights into misfolding

11
A real example
  • The Protein Folding Network: F. Rao, A.
    Caflisch, J. Mol. Biol. 342, 299 (2004)
  • beta3s: 20 monomers, antiparallel beta sheets
  • MD simulation, implicit water
  • 330 K, equilibrium: folded ↔ random coil

NODE -- 8 letters / AA (local secondary struct)
LINK -- 2 ps transition
12
Its native conformation has been studied by NMR
experiments:
De Alba et al., Prot. Sci. 8, 854 (1999).
Beta3s in aqueous solution forms a monomeric
triple-stranded antiparallel beta sheet in
equilibrium with the denatured state.
  • Simulations at 330 K
  • The average folding time from the denatured state:
    83 ns
  • The average unfolding time: 83 ns
  • Simulation time: 12.6 μs
  • Coordinates saved every 20 ps (5×10^5 snapshots
    in 10 μs)
  • Secondary structures: H, G, I, E, B, T, S, - (α-helix,
    3_10 helix, π-helix, extended, isolated β-bridge,
    hydrogen-bonded turn, bend and unstructured).
  • The native state: -EEEESSEEEEEESSEEEE-
  • There are approx. 8^18 ≈ 10^16 conformations.
  • Nodes: conformations; links: transitions.
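A network of this kind can be assembled from a list of saved snapshots. This is a minimal sketch, assuming each snapshot has already been reduced to its secondary-structure string. The toy trajectory is invented for illustration and is not simulation data, and the function name is hypothetical.

```python
from collections import defaultdict

def network_from_trajectory(snapshots):
    """Nodes are the distinct strings; a link joins two different strings
    seen in consecutive snapshots. Link weights count those transitions."""
    nodes = set(snapshots)
    links = defaultdict(int)
    for a, b in zip(snapshots, snapshots[1:]):
        if a != b:
            links[frozenset((a, b))] += 1
    degree = defaultdict(int)
    for pair in links:
        for node in pair:
            degree[node] += 1
    return nodes, dict(links), dict(degree)

# Toy trajectory (made up for illustration)
native = "-EEEESSEEEEEESSEEEE-"
traj = [native, "-EEEESSEEEEEESSEEES-", native,
        "-EEESSSEEEEEESSEEEE-", native, native]
nodes, links, degree = network_from_trajectory(traj)
print(len(nodes), degree[native])

# Size of the string space: 8 letters on 18 positions (end residues fixed as '-')
print(8 ** 18)
```

The last line evaluates 8^18 = 2^54, about 1.8×10^16, which is the order of magnitude quoted above.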

13
Scale-free network
Barabási & Albert, Science 286, 509 (1999)
Many reasons behind SF topology
  • Why is the protein network scale free?
  • Why does the randomized chain have a
    similar degree distribution?
  • Why is γ = -2?

14
Robot arm networks
  • Steric constraints?
  • missing nodes
  • missing links
  • n-dimensional hypercube
  • binomial degree distribution
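The binomial degree distribution of a hypercube with missing nodes can be illustrated numerically. This is a minimal sketch in which nodes are removed independently at random. That is a simplifying assumption, since real steric constraints remove nodes in a correlated way, and the function name and parameters are illustrative.

```python
import random
from collections import Counter
from itertools import product

def diluted_hypercube_degrees(n, keep_prob, seed=0):
    """Degree histogram of an n-dimensional hypercube after keeping each
    node independently with probability keep_prob."""
    rng = random.Random(seed)
    kept = {v for v in product((0, 1), repeat=n) if rng.random() < keep_prob}
    hist = Counter()
    for v in kept:
        # Neighbors differ from v in exactly one coordinate
        k = sum((v[:i] + (1 - v[i],) + v[i + 1:]) in kept for i in range(n))
        hist[k] += 1
    return hist, len(kept)

hist, n_kept = diluted_hypercube_degrees(n=12, keep_prob=0.7)
mean_k = sum(k * c for k, c in hist.items()) / n_kept
for k in sorted(hist):
    print(k, hist[k])
```

Under this assumption each of the n potential neighbors survives with probability keep_prob, so the degrees of surviving nodes follow Binomial(n, keep_prob) with mean n × keep_prob. The histogram is narrow and single-peaked, which is what "homogeneous" refers to.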

Homogeneous
Swiss cheese
15
A bead-chain model
  • Beads on a chain in 3D: robot arm model
  • similar to Cα protein models
  • rod-rod angle θ
  • 3 positions around axis

Honeycutt & Thirumalai, Biopolymers 32, 695 (1992)
N = 6, θ = 90°
N = 18, θ = 120°, 2212112212111122
  • Homogeneous network

16
Another example
L = 7, θ = 75°, r = 0.25
[Figure: network of states, e.g. state 00100; legend: allowed state, forbidden state]
17
Adding monomers not only increases the number of
nodes in the network but also its
dimensionality! The combined effect is
small-world.
18
Shortcuts in Folding Space
20
The dilemma
  • HOMOGENEOUS
  • from studies of conformation networks
  • bead chain
  • robot arm

?
21
Gradient Networks
Gradients of a scalar (temperature,
concentration, potential, etc.) induce flows
(heat, particles, currents, etc.).
Naturally, gradients will induce flows on
networks as well.
Ex.: load balancing in parallel computation and packet
routing on the internet
Y. Rabani, A. Sinclair and R. Wanka, "Local Divergence of
Markov Chains and the Analysis of Iterative Load-balancing
Schemes", Proc. 39th Symp. on Foundations of Computer
Science (FOCS), 1998
References
Z. T. and K.E. Bassler, "Jamming is Limited in
Scale-free Systems", Nature 428, 716 (2004)
Z. T., B. Kozma, K.E. Bassler, N.W. Hengartner
and G. Korniss, "Gradient Networks",
http://www.arxiv.org/cond-mat/0408262
22
Setup
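Let $G = G(V, E)$ be an undirected graph, which we
call the substrate network.
The vertex set: $V = \{0, 1, \dots, N-1\}$
The edge set: $E \subset V \times V$
A simple representation of $E$ is via the $N \times N$
adjacency (or incidence) matrix $A = \{a_{ij}\}$:
$a_{ij} = 1$ if $(i, j) \in E$, and $a_{ij} = 0$ otherwise. (1)
Let us consider a scalar field $h: V \to \mathbb{R}$.
Set of nearest neighbor nodes on $G$ of $i$: $S_i^{(1)} = \{ j \in V : a_{ij} = 1 \}$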
23
Definition 1
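The gradient $\nabla h(i)$ of the field $h$ in node $i$ is
a directed edge
$\nabla h(i) = (i, \mu(i))$ (2)
which points from $i$ to that nearest neighbor
$\mu(i) \in S_i^{(1)} \cup \{i\}$ on $G$ (with $S_i^{(1)}$ the set of
nearest neighbors of $i$) for which the increase in the
scalar is the largest, i.e.,
$\mu(i) = \arg\max_{j \in S_i^{(1)} \cup \{i\}} h(j)$ (3)
The weight associated with edge $(i, \mu)$ is given by
$|\nabla h(i)| = h(\mu) - h(i)$.
The self-loop $\nabla h(i) = (i, i)$ is a loop through $i$
with zero weight.
Definition 2
The set $F$ of directed gradient edges on $G$
together with the vertex set $V$ forms the gradient
network $\nabla G = \nabla G(V, F)$.
If (3) admits more than one solution, then the
gradient in $i$ is degenerate.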
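The definition translates almost directly into code: each node points to whichever of itself and its substrate neighbors carries the largest scalar. This is a minimal sketch on a hypothetical 5-node substrate with made-up scalar values. The function name is illustrative, and the scalars are assumed distinct so that the gradient is non-degenerate.

```python
def gradient_network(adj, h):
    """Map each node i to mu(i): the node with the largest h among i and its
    neighbors. mu(i) == i is the zero-weight self-loop (i is a local maximum)."""
    mu = {}
    for i, nbrs in adj.items():
        mu[i] = max([i, *nbrs], key=lambda j: h[j])
    return mu

# Hypothetical substrate: the path 0-1-2-3-4 plus the extra edge 1-3
adj = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
h = {0: 0.2, 1: 0.9, 2: 0.5, 3: 0.7, 4: 0.8}  # made-up, all distinct

mu = gradient_network(adj, h)
weights = {i: h[mu[i]] - h[i] for i in adj}  # weight of each gradient edge
print(mu)
```

Here nodes 0, 2 and 3 all point to node 1, while nodes 1 and 4 point to themselves. Substrate edges that are not gradient edges, such as 3-4, simply drop out of the gradient network.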
24
In the following we will only consider scalar
fields with non-degenerate gradients. This means
Theorem 1
Non-degenerate gradient networks form forests.
Proof
25
Theorem 2
The number of trees in this forest = number of
local maxima of h on G.
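Theorem 2 can be checked numerically on a small example. This is a minimal sketch, assuming an Erdős-Rényi substrate with i.i.d. uniform scalars; the graph size, link probability and seed are arbitrary choices. Each node is followed uphill along gradient edges to its root, and the number of distinct roots is compared with the number of local maxima.

```python
import random

def er_graph(n, p, rng):
    """Erdos-Renyi graph G(n, p) as a dict of neighbor sets."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def root_of(i, mu):
    # Follow gradient edges uphill until a self-loop is reached.
    # This terminates because h strictly increases along each edge.
    while mu[i] != i:
        i = mu[i]
    return i

rng = random.Random(1)
n = 200
adj = er_graph(n, 0.05, rng)
h = [rng.random() for _ in range(n)]  # i.i.d. scalars, distinct in practice

mu = {i: max(adj[i] | {i}, key=h.__getitem__) for i in adj}
n_trees = len({root_of(i, mu) for i in adj})
n_maxima = sum(all(h[i] > h[j] for j in adj[i]) for i in adj)
print(n_trees, n_maxima)
```

The two counts agree because the root of every tree is a node whose gradient is a self-loop, which is exactly a local maximum of h.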
26
For Erdős-Rényi random graph substrates with
i.i.d. random numbers as scalars, the in-degree
distribution is
28
The Configuration model
A. Clauset, C. Moore, Z.T., E. Lopez, to be
published.
29
Generating functions
K-th Power of a Ring
31
Power law with exponent -3
2Kl
33
The energy landscape
  • Energy associated with each node (configuration)
  • the gradient network
  • most favorable transitions
  • T = 0: backbone of the flow
  • MD simulation
  • tracks the flow network
  • biased walk close to the gradient network
  • trees
  • basins of local minima

What generates γ = -2?
The REM generates an exponent of -1.
34
Model ingredients
  • A network model of configuration spaces
  • network topology
  • homogeneous
  • degree correlations
  • how to associate energies

35
Random geometric graph
  • random geometric graph

Dall & Christensen, Phys. Rev. E 66, 026121 (2002)
  • in higher D similar to hypercube with holes
  • degree correlations
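A random geometric graph is straightforward to generate. This is a minimal sketch, assuming points placed uniformly in the unit square with open (non-periodic) boundaries and a much smaller size than a production run would use. The function name, parameters and target degree are illustrative.

```python
import math
import random

def random_geometric_graph(n, radius, dim=2, seed=0):
    """n points uniform in [0, 1]^dim; link every pair closer than radius."""
    rng = random.Random(seed)
    pts = [tuple(rng.random() for _ in range(dim)) for _ in range(n)]
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pts[i], pts[j]) < radius:
                adj[i].add(j)
                adj[j].add(i)
    return pts, adj

n, target_k = 1000, 20
# In d = 2 the mean degree is about n * pi * r^2 (boundary effects ignored)
radius = math.sqrt(target_k / (math.pi * n))
pts, adj = random_geometric_graph(n, radius)
mean_k = sum(len(nbrs) for nbrs in adj.values()) / n
print(round(mean_k, 1))
```

The measured mean degree comes out somewhat below the target because points near the edge of the square have fewer neighbors. Neighboring nodes share much of their neighborhood, which is one source of degree correlations in such graphs.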

36
N = 30000, <k> = 1000, d = 2.
37
Exponent is -2
2 essential ingredients:
  • k1-k2 correlations
  • <E> monotonic with k

38
Bead-chain model
  • more realistic model: bead-chain
  • configuration network
  • excluded volume
  • energy: Lennard-Jones
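Assigning an energy to each node of a bead-chain configuration network could look like the following. This is a minimal sketch, assuming the standard 12-6 Lennard-Jones form in reduced units (epsilon = sigma = 1) and that directly bonded neighbors along the chain are excluded from the sum. The slides do not state either choice, and the function name is illustrative.

```python
import math
from itertools import combinations

def lennard_jones_energy(beads, epsilon=1.0, sigma=1.0):
    """Sum of 4*eps*((sigma/r)^12 - (sigma/r)^6) over non-bonded bead pairs
    (pairs at least two positions apart along the chain)."""
    energy = 0.0
    for (i, a), (j, b) in combinations(enumerate(beads), 2):
        if j - i < 2:
            continue  # skip bonded neighbors
        r = math.dist(a, b)
        sr6 = (sigma / r) ** 6
        energy += 4.0 * epsilon * (sr6 * sr6 - sr6)
    return energy

# Hypothetical 3-bead chain bent so that beads 0 and 2 sit at the LJ minimum
r_min = 2 ** (1 / 6)
beads = [(0.0, 0.0, 0.0), (r_min / 2, 0.8, 0.0), (r_min, 0.0, 0.0)]
energy = lennard_jones_energy(beads)
print(energy)
```

For this geometry the only non-bonded pair sits at r = 2^(1/6) sigma, the minimum of the pair potential, so the energy is -epsilon up to rounding.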

39
L = 30, θ = 75°
41
The case of the α-helix
AKA peptide
  • ALA: orange
  • LYS: blue
  • TYR: green

MD simulations, no water.
42
The MD traced network
T = 400
More than one simulation
3 different runs: yellow, red and green
The role of temperature
44
Conclusions
  • A network approach was introduced to study
    sterically constrained conformations of
    ball-chain-like objects.
  • This network approach is based on the
    statistical dogma stating that generic features
    must be the result of statistical properties of
    the networks and should not depend on details.
  • Protein conformation dynamics happens in
    high-dimensional spaces that are not adequately
    described by simplistic reaction coordinates.
  • The dynamics performs a locally biased sampling
    of the full conformational network. For low
    enough temperatures the sampled network is a
    gradient graph, which is typically a scale-free
    structure.
  • The -2 degree exponent appears at and below the
    temperature where the basins of the local energy
    minima become kinetically disconnected.
  • Understanding the protein folding network has
    the potential of leading to faster simulation
    algorithms, towards closing the gap between
    nature's speed and ours.

Coming up: conditions on side chain distributions
for the existence of funneled energy landscapes.