Title: Optimization Energy Landscapes Protein Folding
1 Optimization, Energy Landscapes, Protein Folding
The course will introduce mathematical and theoretical
concepts and demonstrate their relevance to
practical biological problems. Prerequisite:
knowledge of the Computational Chemistry 1
lecture. The course tries to minimize overlap with
the Computational Chemistry 2 lecture.
2 Content
1 Introduction. Biomolecular systems: proteins, membranes, phenomena of protein folding, protein complexes.
2 Protein folding on lattices. Review of statistical thermodynamics (ΔG, ΔS). Exact enumeration of all states. Folding via Monte Carlo algorithm: which moves? Folding funnel. Roughness of the energy landscape.
3 Protein folding on lattices (II). HPCC algorithm à la Ken Dill, work by Rolf Backofen.
4 Calculation of energies in biomolecular systems (do we need this?). Molecular force fields, solvent effect. Replace by a lecture on membrane protein structure and folding?
5 Off-lattice protein folding simulations involving all-atom simulations. MD simulations; characterization of the free energy landscape for folding. Replica exchange simulations. Restraints to generate partially unfolded states.
3 Content (II)
6 Calculation of chemical rates. Transition state theory. Kramers theory. Folding@home.
7 Diffusion. Smoluchowski equation. Langevin equation → Ermak-McCammon algorithm for Brownian dynamics.
8 Application: association kinetics of protein A with protein B. Energy landscape for 6 degrees of freedom (3 translational, 3 rotational). Computation of k_on rates from Brownian dynamics simulations. Calculation of entropies from trajectory analysis. Compare Boltzmann-weighted energies for protein B on a lattice with protein A.
9 Protein assemblies.
10 Electron transfer (Marcus theory), proton transfer.
11 Photophysics of photoactive molecules. Conformational dynamics on electronic surfaces, conical intersections.
4 Literature
Lecture slides will be available 0-2 days prior to the lecture.
Suggested reading: links will be put up on the course website
http://gepard.bioinformatik.uni-saarland.de/teaching...
5 Schein: successful written exam
Successful participation in the lecture
course (Schein, the course certificate) will be certified upon
successful completion of an oral exam in
February/March 2006. The oral exam is open to
those students who have successfully completed
the 3-4 assignments.
6 Literature
7 My systems of interest
Proteins
- folding landscape
- membrane proteins: recent progress on folding of membrane proteins!
Protein assemblies
- molecular machines (stable complexes)
- transient complexes
Membranes
- formation
- dynamics
Protein-membrane association
Partitioning of proteins in membranes
8 The puzzle of protein folding
I What is the problem? Levinthal's paradox.
II Solution: the energy landscape has the shape of a folding funnel. Study of the energy landscape with lattice simulations.
III Current uncharted territory: unfolded protein segments, protein misfolding in the prion protein.
9 Levinthal paradox
For a protein with 100 amino acids and 2
conformations per amino acid, there are
$2^{100} \approx 1.27 \times 10^{30}$ possible conformations of the
protein. If the protein needed $10^{-13}$ s
to sample each single conformation,
it would take $10^{-13}\,\mathrm{s} \times 1.27 \times 10^{30} = 1.27 \times 10^{17}\,\mathrm{s} \approx 4 \times 10^{9}$ years
to search all of its conformations and
eventually find the energetically most favorable one.
This is obviously not possible.
Hence there must be folding aids or
special folding pathways, so that the
protein does not need to search all theoretically
possible states.
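The arithmetic behind this estimate can be checked in a few lines. This is only a sketch of the back-of-the-envelope calculation above; the variable names are illustrative, and the year length of 365.25 days is an assumption not stated on the slide.

```python
# Levinthal estimate: time needed to visit every conformation exactly once.
n_residues = 100            # chain length used on the slide
states_per_residue = 2      # conformations per amino acid
time_per_state_s = 1e-13    # assumed sampling time per conformation (seconds)

n_conformations = states_per_residue ** n_residues
total_seconds = n_conformations * time_per_state_s

seconds_per_year = 365.25 * 24 * 3600   # assumption: Julian year
total_years = total_seconds / seconds_per_year

print(f"conformations: {n_conformations:.3g}")
print(f"search time:   {total_seconds:.3g} s, about {total_years:.1g} years")
```

The result reproduces the order of magnitude quoted above (about 4 x 10^9 years), which is comparable to the age of the Earth.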
10 Folding pathways
- There are several hypotheses for the driving forces of protein folding.
- Hydrophobic collapse: the unfolded protein sequence collapses into a compact globule. Subsequently the secondary structure elements fold and the correct/optimal three-dimensional contacts form, so that one of the permitted folding patterns (folds) is adopted.
- OR
- Framework model: the secondary structure elements first fold independently and then pack together.
- There are experimental examples for both folding scenarios.
- Often the truth lies in the middle.
11 New view of protein folding
Folding along funnel-like energy landscapes.
Bryngelson, Wolynes, PNAS (1987)
[Figure: folding funnel. The gradient accelerates folding; roughness (frustration) slows folding.]
Brooks, Gruebele, Onuchic, Wolynes, PNAS 95,
11037 (1998)
12 Energy landscapes (H. Frauenfelder/UIUC)
A very simple energy landscape on the left and a very
complicated one on the right. Left: the
energy landscape of ammonia, NH3. The
conformational coordinate (x-axis) describes
the distance of the nitrogen atom from the plane of the
3 hydrogen atoms. Right: a strongly
simplified energy landscape of a protein. In
reality the energy landscape is a
function of 3N coordinates, where N (the number
of atoms in the protein) is very large.
Frauenfelder & Leeson, Nature Structural Biology
5, 757-759 (1998)
13 Molecular chaperones: proteins that help other
globular proteins adopt their correct
fold, a molecular Red Cross
- Molecular chaperones such as hsp60 or GroEL (shown on the right) are a class of proteins that help other proteins in the cell adopt their correct fold.
- To do so, molecular chaperones can bind very effectively to outward-facing hydrophobic regions of partially folded structures.
- Like helping someone into a jacket.
14 Fold optimization
- Simple lattice models (HP models)
- Two kinds of side chains: hydrophobic (H) and polar (P)
- 2-D or 3-D lattice
- Driving forces
- Hydrophobic collapse: it is favorable to form contacts between hydrophobic side chains
- Scoring: number of H-H contacts
15 HP lattice models
Ken Dill 1997
Advantage of such simple models: one can search
conformational space systematically.
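A systematic search of this kind can be sketched as a brute-force enumeration. The sketch below assumes a 2-D square lattice and scores a conformation by its number of H-H contacts between residues that are lattice neighbours but not chain neighbours. The function names are hypothetical, and conformations related by rotation or reflection are counted separately. It is meant as an illustration of exact enumeration, not as the algorithm used in the cited work.

```python
from itertools import product

# Unit steps on the 2-D square lattice.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}

def walk(directions):
    """Return the lattice coordinates of a chain, or None if it overlaps itself."""
    pos = (0, 0)
    coords = [pos]
    for d in directions:
        dx, dy = MOVES[d]
        pos = (pos[0] + dx, pos[1] + dy)
        if pos in coords:
            return None
        coords.append(pos)
    return coords

def hh_contacts(sequence, coords):
    """Count H-H pairs that are lattice neighbours but not bonded along the chain."""
    count = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):
            if sequence[i] == "H" and sequence[j] == "H":
                dist = abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1])
                if dist == 1:
                    count += 1
    return count

def enumerate_hp(sequence):
    """Enumerate all self-avoiding walks for `sequence`.

    Returns (best_score, n_walks, n_best): the maximal number of H-H contacts,
    the number of self-avoiding conformations, and how many reach the maximum.
    """
    best, n_walks, n_best = -1, 0, 0
    for directions in product(MOVES, repeat=len(sequence) - 1):
        coords = walk(directions)
        if coords is None:
            continue
        n_walks += 1
        score = hh_contacts(sequence, coords)
        if score > best:
            best, n_best = score, 1
        elif score == best:
            n_best += 1
    return best, n_walks, n_best

best, n_walks, n_best = enumerate_hp("HPPH")
print(best, n_walks, n_best)
```

The cost grows exponentially with chain length, so this brute-force approach is only practical for short chains; that limitation is one motivation for Monte Carlo moves and smarter enumeration schemes.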
16 The importance of being unfolded?
Apparently quite a few cellular proteins are
partially unfolded much of the time (P.E.
Wright, H.J. Dyson, J. Mol. Biol. 293, 321
(1999)). This sounds very unexpected. What could be
the biological advantages?
(1) Unfolded proteins can be degraded faster → may be required for regulating a fast cell cycle.
(2) Molecular recognition is faster when folding and binding are coupled.
(3) Loop structures can recognize many biological targets → important for communication and regulation, or for the formation of large complexes?
(4) Unfolded proteins can be transported quickly into other cell compartments.
17 NORS regions: no regular secondary structure
NORS regions are defined to have at least 70
consecutive residues with less than 12% regular
secondary structure (helix or strand). Rost and
co-workers found 4 types of proteins.
(A) Connecting loops: long loops that connect two domains or chains (shown: Formate Dehydrogenase H, 1AA6).
(B) Loopy ends: long N- or C-terminal regions that lack regular secondary structure (shown: Hexon from adenovirus type 2, 1DHX).
(C) Loopy wraps: long loopy regions wrapping around globular domains (shown: Class II chitinase, 2BAA).
(D) Loopy domains: entire structures that have almost no regular secondary structure (shown: extra-cellular domain of T beta RI, 1TBI).
Liu, Tan, Rost, J Mol Biol (2002) 332, 53-64
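The definition above can be turned into a simple scan over a per-residue secondary structure string. The sketch below is one plausible reading of the definition, not the procedure of Liu et al.: it assumes a string with `H` (helix), `E` (strand), and any other character for loop, slides a 70-residue window, and merges overlapping windows whose helix/strand fraction is below 12%. The function name and the input encoding are assumptions.

```python
def find_nors_regions(ss, window=70, max_fraction=0.12):
    """Return (start, end) index pairs (end exclusive) covering every window of
    `window` residues whose helix/strand content is below `max_fraction`."""
    structured = [1 if s in ("H", "E") else 0 for s in ss]
    regions = []
    if len(ss) < window:
        return regions
    count = sum(structured[:window])          # helix/strand residues in first window
    for start in range(len(ss) - window + 1):
        if start > 0:
            # Slide the window by one residue: add the new one, drop the old one.
            count += structured[start + window - 1] - structured[start - 1]
        if count / window < max_fraction:
            end = start + window
            if regions and start <= regions[-1][1]:
                regions[-1] = (regions[-1][0], end)   # merge with previous region
            else:
                regions.append((start, end))
    return regions

# Hypothetical example: 30 helix residues, an 80-residue loop, 30 strand residues.
example = "H" * 30 + "L" * 80 + "E" * 30
print(find_nors_regions(example))
```

Because a window may contain up to 8 structured residues out of 70, the reported region extends slightly into the flanking helix and strand.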
18 Many NORS regions predicted in proteomes
Liu et al. predicted many NORS regions in 31
entirely sequenced organisms. NORS proteins
appeared particularly abundant in eukaryotes.
(A) gives the percentage of proteins in the respective proteome for which at least one NORS region is predicted. High enrichment in eukaryotic proteomes!
(B) illustrates the percentage of all residues of the respective proteome for which a NORS region is predicted.
(C) gives the percentage of all predicted NORS regions that are between N and N+10 residues long (note that, by definition, NORS regions are longer than 70 residues). Surprisingly, almost 15% of all the predicted NORS regions extend over more than 200 residues (inset of C).
Liu, Tan, Rost, J Mol Biol (2002) 332, 53-64
19 NORS regions use particular amino acids
The height of the one-letter amino acid code is
proportional to the abundance of the respective
amino acid in each data set. The actual value is the
difference in occurrence with respect to the
frequency observed in a sequence-unique subset of
the PDB. Inverted letters indicate amino acids that
are less frequent than 'expected'. The amino
acids are sorted by 'flexibility', with the more
rigid ones on the left. Overall, NORS regions are
as rich in the more flexible residues as loop
regions in the PDB. However, we found considerably
more Serine (S), Glutamine (Q), and Glycine (G)
and considerably fewer Arginine (R), Aspartic
acid (D), Glutamic acid (E), Tryptophan (W), and
Phenylalanine (F) in NORS regions than in loop
regions in general.
Liu, Tan, Rost, J Mol Biol (2002) 332, 53-64
20 Prion: an unresolved example of misfolded
proteins
- The prion protein PrPc
- is a normal cellular glycoprotein
- is attached to the plasma membrane via a GPI anchor
- has 209 amino acids
- Its exact function is unknown. Cu2+ storage, memory?
- Structure known from NMR determinations
- The N-terminal region 23-120 is very flexible and mostly disordered.
- The C-terminal region contains 3 α-helices and 2 short β-strands.
- PrPc is rapidly degraded by proteinase K.
21 The disease-associated form PrPsc
PrPsc:
- oligomeric, β-rich structure
- partial resistance to digestion by proteinase K
- strong tendency to aggregate into insoluble plaques
- the 3D structure of PrPsc is not known!
Protein-only hypothesis (Prusiner, 1980s and 1990s): the refolding process PrPc → PrPsc is autocatalyzed by the PrP protein.
Stanley Prusiner, Nobel Prize in Physiology or
Medicine 1997
22 Models for the formation of PrP-res from PrPc
Caughey, Trends Biochem Sci 26, 235 (2001)
23 Models based on polymerization
Caughey, Trends Biochem Sci 26, 235 (2001)
24 Current understanding of prions
The molecular mechanisms for the rearrangement of
PrPc into PrPsc are still
unclear. Theoretical methods have (unfortunately)
not yet been able to contribute much. The transition PrPc →
PrPsc is a cooperative phenomenon. It therefore
probably cannot be understood by studying PrP
monomers. The seed model seems
plausible. The transition to PrPsc could proceed via
a folding intermediate I. This would
explain why mutants in which this folding
intermediate is more highly populated, or in which the
ground state (F) is less stable relative to I
than in healthy individuals, are susceptible to disease.
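How strongly the population of an intermediate depends on the stability gap can be illustrated with a two-state Boltzmann estimate. This is a deliberately simplified sketch: it treats only F and I in equilibrium, ignores PrPsc and any cooperativity, and the free energy gaps of 20 and 10 kJ/mol are hypothetical numbers chosen for illustration, not measured values.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol K)

def intermediate_population(dg_kj_mol, temperature=310.0):
    """Equilibrium fraction in state I for a two-state F <-> I system.

    dg_kj_mol is G(I) - G(F); a smaller value means F is less stable
    relative to I. Temperature is in kelvin.
    """
    return 1.0 / (1.0 + math.exp(dg_kj_mol / (R * temperature)))

healthy = intermediate_population(20.0)  # hypothetical gap for the wild type
mutant = intermediate_population(10.0)   # hypothetical gap for a destabilized mutant

print(f"population of I, healthy: {healthy:.2e}")
print(f"population of I, mutant:  {mutant:.2e}")
```

In this toy model, halving the gap raises the population of I by well over an order of magnitude, which is the qualitative point made above.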
25 Fluid mosaic model of the cell membrane
Like a mosaic, the cell membrane is a complex
structure made up of many different parts, such
as proteins, phospholipids and cholesterol. The
relative amounts of these components vary from
membrane to membrane, and the types of lipids in
membranes can also vary. The membrane structure
is highly dynamic. Its viscosity is only about
100 times larger than that of water.
http://www.nature.com/horizon/livingfrontier/background/membrane.html
26 Membrane bilayers
Edidin, Nature Reviews Cell Biol 4, 414 (2003)
27 Membrane bilayers
Membranes are not structureless. Domains or
lipid rafts rich in cholesterol and
sphingolipids may form transiently.
Edidin, Nature Reviews Cell Biol 4, 414 (2003)
28 How do helical membrane proteins fold?
White, FEBS Lett. 555, 116 (2003)
29 Hydrophobicity scales
White, FEBS Lett. 555, 116 (2003)
30 Translocon-assisted folding of TM proteins?
White & von Heijne, Curr Opin Struct Biol 14, 397
(2004)
31 Translocon
Crystal structure of the translocon in the closed state.
White & von Heijne, Curr Opin Struct Biol 14, 397
(2004)
32 Types of TM proteins
The orientation of the C- and N-terminus depends on
charge. The cytoplasmic side of the membrane contains
more negatively charged lipids. By mutating the charges one can
invert the topology.
White & von Heijne, Curr Opin Struct Biol 14, 397
(2004)
33 Folding paradigm
Back to the folding models of soluble
proteins (hydrophobic collapse vs. framework
model). Obviously, hydrophobic collapse does not
apply here. Using FRET labels (fluorescent
non-natural amino acids) it could be shown
that the newly synthesized peptide assumes a
compact, partially folded structure.
White & von Heijne, Curr Opin Struct Biol 14, 397
(2004)
34 Insertion of TM helices into bilayer
This is an ingenious experiment to identify the
code for TM helix partitioning into the bilayer.
Two glycosylation sites (G1 and G2) are engineered around the
segment H. If H is inserted into the membrane, only G1 is glycosylated;
otherwise both G1 and G2 are.
Hessa et al., Nature 433, 377 (2005)
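The readout of such an experiment, the fractions of singly and doubly glycosylated molecules, can be converted into an apparent free energy of insertion if one assumes that insertion behaves like a two-state equilibrium. The sketch below makes that assumption explicit. The function name, the temperature of 298 K, and the example fractions are hypothetical and are not taken from the slide.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)

def apparent_insertion_free_energy(f_single, f_double, temperature=298.0):
    """Apparent free energy (kcal/mol) for membrane insertion of segment H.

    f_single: fraction glycosylated at G1 only (H inserted)
    f_double: fraction glycosylated at G1 and G2 (H not inserted)
    Assumes a two-state equilibrium with K_app = f_single / f_double.
    """
    k_app = f_single / f_double
    return -R_KCAL * temperature * math.log(k_app)

# Hypothetical example: 80% of the molecules carry a single glycan.
dg_app = apparent_insertion_free_energy(0.8, 0.2)
print(f"apparent insertion free energy: {dg_app:.2f} kcal/mol")
```

A negative value means insertion is favored; equal fractions correspond to zero apparent free energy.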
35 Hydrophobicity scales
Results from this work correlate well with the
partitioning of peptides between water and
octanol (Fig. c) → partitioning of TM helices into the
membrane is determined by standard
physico-chemical principles.
Hessa et al., Nature 433, 377 (2005)
36 Open and closed complexes
We distinguish between two different types of
supramolecular complexes. Closed complexes are
relatively stable assemblies of different
molecules with a fixed stoichiometry, resulting
in large molecular machines like ribosomes,
polymerases and ATPases. Although these complexes
may be dynamic due to their respective function
(like capturing and releasing elongation factors
for ribosomes or transient phosphorylation for
allosteric proteins), they have a well-defined
structure and are degraded only as a whole
(typically by proteasomes after ubiquitylation).
In contrast, open complexes are in constant
exchange of their molecular components with the
environment. Both the total number of components
and their relative stoichiometry can vary within
a certain range. A typical example is the
cytoplasmic plaques of focal adhesions, which
have typical lifetimes of minutes to hours, while
the turnover time for the individual proteins
building up the plaque is on the order of
seconds. In contrast to closed complexes, open
complexes are not assembled and degraded as a
whole, but gradually.
37 Focal adhesion points
- Focal adhesions are the most prominent sites of adhesion when cell-matrix adhesion is studied on rigid surfaces (glass or plastic).
- In a physiological (soft) environment, similar sites of adhesion exist, although they tend to be smaller and of somewhat different molecular composition.
- Focal adhesions consist of four layers (see figure, from bottom to top):
  - an external layer of ECM ligand,
  - a layer of transmembrane receptors from the integrin family,
  - a cytoplasmic plaque consisting of more than 50 different proteins, and
  - a layer of actin connecting the focal adhesion to the cytoskeleton.
- Focal adhesions strongly signal to the cytoskeleton, mainly through the small GTPases from the Rho family. They also trigger other signalling pathways like the MAP kinase pathway, thus influencing gene expression and cell fate.
- Focal adhesions are also the main sites for force transmission between the extracellular environment and the cell. They seem to function as mechanosensors which convert both internal and external force into protein aggregation and signalling. In particular, cells might sense the mechanical properties of their environment by actively pulling on it through actomyosin contractility and focal adhesions.
38 Virus assembly: idealized examples of closed
complexes
(a) Schematic representation of a T = 3 quasi-equivalent
lattice, corresponding to a rhombic
triacontahedron, the geometrical architecture of
black beetle virus (BBV). Each of the trapezoids
represents a single subunit with the same amino
acid sequence. The T = 3 particle is formed of
180 subunits that lie in three structurally
unique positions (labeled A, B, and C). Subunits
labeled with the same letter are related by
icosahedral symmetry axes corresponding to
twofold, threefold, and fivefold rotations,
identified by white ovals, triangles, and
pentagons, respectively. Subunits marked with
different letters are related to one another by
quasisymmetry axes corresponding to twofold and
threefold local rotation axes, identified,
respectively, by yellow ovals and triangles. The
subunits labeled A, B, and C are related by
quasi-threefold symmetry; they form an
icosahedral asymmetrical unit (protomer) of the
T = 3 particle.
(b) Pseudo T = 3 surface lattice. In this lattice
there are three types of trapezoids (VP1, VP2,
and VP3) representing subunits with different
amino acid sequences. The subunits identified by
the same label are related by icosahedral
symmetry elements, twofold, threefold, and
fivefold, identified by white ovals, triangles,
and pentagons.
(c) Black beetle virus (BBV). Blue, red, and green: A,
B, and C subunits. The average diameter of the
particle is 312 Å. Icosahedral and quasisymmetry
elements are identified by white and yellow
labels.
(d) Icosahedral asymmetrical unit (protomer) of
BBV, made up of the A, B, and C subunits and a
strand of partially ordered RNA of 10 bases.
Reddy et al., Biophys J 74, 546 (1998)
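The subunit count quoted above follows from the standard Caspar-Klug relation for icosahedral capsids, which is brought in here as background and is not stated on the slide: the triangulation number is T = h^2 + hk + k^2 for non-negative integers h and k, and a capsid contains 60 T subunits. A minimal sketch with illustrative function names:

```python
def triangulation_number(h, k):
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    return h * h + h * k + k * k

def capsid_subunits(t_number):
    """Number of subunits in an icosahedral capsid: 60 per unit of T."""
    return 60 * t_number

t = triangulation_number(1, 1)   # h = k = 1 gives T = 3
print(t, capsid_subunits(t))     # T = 3 corresponds to 180 subunits
```

Each of the 60 icosahedral asymmetric units then contains T subunits, here the A, B, and C subunits of one protomer.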
39 Virus assembly: compute energies of intermediates
(a) A table showing the top three preferred
configurations for each association of subunits
in the computed assembly pathway for BBV, with
the monomer as the assembling unit. The first
column shows the number of associating monomers.
Columns 2, 4, and 6 show a schematic of the three
best structures for each association. ΔΔG12 and
ΔΔG23 refer to the negative differences of the
association energies of the first and second and
of the second and third configurations.
(b) The preferred structures, with the trimer as the
assembling unit. It is important to note that the
best configurations for both assembly pathways
are nearly always the same; in some cases even
the second best is the same, emphasizing that the
trimer is the likely assembling unit. An
exception is the best structure of the 15mer
association. In this case the most stable monomer
assembly is not made up of a multiple of
protomers, but its preference, compared to the
second and third most stable structures, which
are made of protomers, is marginal.
Reddy et al., Biophys J 74, 546 (1998)
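The ranking described in this caption can be sketched as follows. The sketch assumes that a more negative association energy means a more stable configuration and reads "negative difference" as -(E_i - E_j) for consecutive ranks i and j. The function name and the example energies are hypothetical and are not values from Reddy et al.

```python
def top_three_gaps(association_energies):
    """Rank configurations by association energy (most negative first) and
    return (ddG12, ddG23), the negative energy differences between
    ranks 1 and 2 and between ranks 2 and 3."""
    e1, e2, e3 = sorted(association_energies)[:3]
    return -(e1 - e2), -(e2 - e3)

# Hypothetical association energies for four candidate configurations.
energies = [-30.0, -50.0, -45.0, -10.0]
ddg12, ddg23 = top_three_gaps(energies)
print(ddg12, ddg23)
```

A small first gap, as for the 15mer mentioned above, indicates that the preference for the best configuration over the runner-up is marginal.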
40 Summary
- The protein folding problem is a well-isolated problem, almost classical
- some aspects are reasonably well understood
- interest currently widens towards studying multi-protein assemblies, superstructural units
- few concepts available; learn from the protein folding field?
- many interesting phenomena involve membranes