Issues in Macromolecular Simulation - PowerPoint PPT Presentation

Slides: 85
Provided by: chea3

Transcript and Presenter's Notes

Title: Issues in Macromolecular Simulation


1
(No Transcript)
2
Visualizing Protein Structures and Structural
Bioinformatics
  • Stephen Sontum
  • Middlebury College
  • sontum_at_middlebury.edu

Chapters 1 and 6, Introduction to Bioinformatics: Structural
Bioinformatics and Drug Discovery, Arthur M. Lesk
3
Visualizing Protein Structures: What we hope to
learn
  • Hierarchy of Protein structure
    • Primary Structure
    • Secondary structure
    • Motifs or supersecondary structure
    • Tertiary Structure (protein domains)
    • Quaternary Structure
  • Forces Driving Protein Folding
  • Finding out more about structures
  • How to visualize molecules with VMD (HW)
  • How to edit Protein Data Bank files (HW)

4
Bioinformatics Spectrum
  • Informatics Models
    • Classification
    • Patterns
    • Relationships
  • Physical Models
    • Structure
    • Function
    • Mechanism

5
Breadth: Evolution
Evolutionary relationships are essential for
making sense of biological data. The study of
evolutionary patterns must begin with the
assembly of a set of homologues.
Homology: descent from a common ancestor
Similarity: quantitative measure of difference
Human hand / dog's forepaw
Human hemoglobin / dog hemoglobin
Human eye / eye of an insect
Human ear bones / jaw of a fish
Protein structure changes more conservatively
than amino acid sequence. Groups of related
proteins are called families.
Sequence databases: InterPro, Pfam, PROSITE, and COG
Structure databases: SCOP, CATH, and CDD
6
Protein Folding Problem
H3N-A1-A2-A3-...-A98-A99-A100-COO-
Butane
Protein
  • 3 angles × 3 conformations each = 3^3 = 27 conformations per
    peptide bond
  • 27^100 ≈ 10^143 conformations !!!!
  • One conformation every femtosecond (10^-15 sec)
  • 10^120 years to fold a protein
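
The arithmetic on this slide can be checked with a short Python sketch. The counts, the chain length, and the femtosecond sampling time come from the slide; the year length (365.25 days) and all variable names are assumptions made for illustration.

```python
import math

ANGLES_PER_RESIDUE = 3       # rotatable angles per peptide unit (from the slide)
STATES_PER_ANGLE = 3         # conformations per angle (from the slide)
N_RESIDUES = 100             # chain length used on the slide
SECONDS_PER_STEP = 1e-15     # one conformation per femtosecond
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: Julian year

# 3 states for each of 3 angles gives 27 conformations per residue
per_residue = STATES_PER_ANGLE ** ANGLES_PER_RESIDUE

# Work in log10 so the huge numbers stay manageable
log10_conformations = N_RESIDUES * math.log10(per_residue)
log10_years = (log10_conformations
               + math.log10(SECONDS_PER_STEP)
               - math.log10(SECONDS_PER_YEAR))

print(f"conformations ~ 10^{log10_conformations:.1f}")
print(f"exhaustive search ~ 10^{log10_years:.1f} years")
```

The exponents land close to the 10^143 conformations and 10^120 years quoted above, which is the point of the estimate: a protein cannot fold by exhaustive search.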

7
Anfinsen's Hypothesis
H3N-A-A-A-...-A100-COO-
Christian Anfinsen, RNase A (1957), Nobel Prize
(1972)
[Figure labels: Entropy, 100% Active, G]
Thermodynamic Hypothesis: the native state is at ΔG min
ΔG = ΔH - TΔS
ΔH = ΔH(protein) + ΔH(water)
ΔS = ΔS(protein) + ΔS(water)
5 to 15 kcal/mol
8
(No Transcript)
9
VMD: 3bp1 Guanine Reductase
Extensions/Analysis/Sequence Viewer
File/New Molecule
http://kb.psi-structuralgenomics.org/ (Protein
Structure Initiative)
10
VMD: 3bp1 Guanine Reductase
Graphics/Representations
Display Window
11
VMD: 3bp1 Guanine Reductase
VMD Main Window
Display Window
Command Window
12
VMD: 3bp1 Guanine Reductase
Extensions/Volume Plot
Extensions/Ramachandran plot
13
(from Mount)
primary (1°), secondary (2°), tertiary (3°),
quaternary (4°)
From Branden & Tooze, Introduction to Protein
Structure
14
amino acids/proteins
Amino acid      3-letter  1-letter
Glycine         GLY       G
Alanine         ALA       A
Phenylalanine   PHE       F
Leucine         LEU       L
Isoleucine      ILE       I
Valine          VAL       V
Proline         PRO       P
Methionine      MET       M
Glutamic acid   GLU       E
Aspartic acid   ASP       D
Glutamine       GLN       Q
Asparagine      ASN       N
Lysine          LYS       K
Arginine        ARG       R
Serine          SER       S
Threonine       THR       T
Tyrosine        TYR       Y
Tryptophan      TRP       W
Histidine       HIS       H
Cysteine        CYS       C
From Branden & Tooze, Introduction to Protein
Structure
15
Hydrophobic amino acids
16
Hydrophobic amino acids
Alanine         ALA       A
Phenylalanine   PHE       F
Leucine         LEU       L
Isoleucine      ILE       I
Valine          VAL       V
Proline         PRO       P
Methionine      MET       M
[Figure: structures of phenylalanine, alanine, leucine, isoleucine, proline, methionine, and valine]
17
Hydrophobic amino acids
Alanine (ALA, A), Phenylalanine (PHE, F), Leucine (LEU, L),
Isoleucine (ILE, I), Valine (VAL, V), Proline (PRO, P),
Methionine (MET, M)
Proline: helix breaker (except at N-caps)
18
Charged amino acids
Charged residues tend to reside on the protein surface
19
Charged amino acids
Glutamic acid   GLU       E
Aspartic acid   ASP       D
Lysine          LYS       K
Arginine        ARG       R
Histidine       HIS       H
[Figure: structures of glutamic acid, aspartic acid, histidine, arginine, and lysine]
20
Charged amino acids
Glutamic acid (GLU, E), Aspartic acid (ASP, D), Lysine (LYS, K),
Arginine (ARG, R), Histidine (HIS, H)
Histidine: pKa 6 ± 1
21
Polar amino acids
22
Polar amino acids
Glutamine       GLN       Q
Asparagine      ASN       N
Serine          SER       S
Threonine       THR       T
Tyrosine        TYR       Y
Tryptophan      TRP       W
Histidine       HIS       H
Cysteine        CYS       C
[Figure: structures of glutamine, asparagine, serine, threonine, tyrosine, tryptophan, and cysteine]
23
(from Mount, Bioinformatics: Sequence and Genome
Analysis)
primary (1°), secondary (2°), tertiary (3°),
quaternary (4°)
From Branden & Tooze, Introduction to Protein
Structure
24
The peptide bond is relatively rigid and
planar (significant barrier to rotation, 20
kcal/mol).
The trans peptide bond is favored over cis by
10^3 (steric clash), except for PRO, where trans
is favored by only 80:20.
[Figure: peptide bond, with labels -, O, C, N, H]
The planar peptide bonds can rotate about the Cα
carbon. φ and ψ are 180° in the conformation
shown and increase in the clockwise direction
when viewed from the Cα carbon.
25
Ramachandran plot: shows sterically allowed
conformational angles phi (φ) and psi (ψ). Steric
interactions eliminate a large fraction of
possible conformations.
[Figure: Ramachandran plot with axes φ and ψ. Branden & Tooze, Introduction to Protein
Structure, Figure 1.7a]
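
As a minimal sketch of how a point on a Ramachandran plot is obtained, the function below computes a signed torsion angle from four atom positions. For φ the four atoms are C(i-1), N, Cα, C; for ψ they are N, Cα, C, N(i+1). The function name and the toy coordinates are assumptions for illustration; real coordinates would come from a PDB file.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    # Unit vector along the central bond
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project the outer bonds onto the plane perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Toy coordinates (hypothetical): a planar trans arrangement and a planar cis one
trans_angle = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
cis_angle = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
print(abs(trans_angle), cis_angle)
```

Applying this to every residue of a structure and plotting ψ against φ gives the Ramachandran plot.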
26
Branden & Tooze, Introduction to Protein
Structure, Figure 1.7b,c. From J. Richardson, Adv.
Prot. Chem. 34, 174-174 (1981)
All amino acids (except glycine), from high
resolution crystals
GLY: much more freedom
27
amino acids/proteins: ala, gly, leu, ile, val,
glu, asp, gln, asn, pro, lys, arg, ser, thr, tyr,
trp, phe, his, cys, met
Secondary structure
α-helix, β-sheet, loop or turn, random coil
From Branden & Tooze, Introduction to Protein
Structure
28
Can we predict 2° structure?
29
Can we predict 2° structure?
  -- helical propensities
  -- mining structural databases for known sequence-structure
     relationships
  -- sequence contexts?
  -- neural nets, HMM
Yes! but accuracy is still poor (PROF: 81%)
http://www.aber.ac.uk/phiwww/prof/
30
Can we predict properties of unknown Proteins?
31
Can we predict properties of unknown Proteins?
Yes!
Parametric Sequence Analysis
Sequence: LAKMVVKTAEAILKD
α helix: 3.6 amino acids per turn
10 Å (3 turns), 10 amino acids
N-H···O=C H-bond between every 4th residue
Transmembrane α helix: 19 hydrophobic AA long
32
Can we predict properties of unknown Proteins?
Helical Wheel Plots (rotation 100°)
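
A helical wheel places each successive residue 100° further around a circle, since 360° / 3.6 residues per turn = 100°. The sketch below lists the wheel angle for each residue of the example sequence from the previous slide; the function name is an assumption for illustration.

```python
SEQUENCE = "LAKMVVKTAEAILKD"   # example sequence from the slides
DEGREES_PER_RESIDUE = 100      # 360 degrees / 3.6 residues per turn

def helical_wheel(sequence, step=DEGREES_PER_RESIDUE):
    """Return (position, residue, wheel angle in degrees) for each residue."""
    return [(i + 1, aa, (i * step) % 360) for i, aa in enumerate(sequence)]

wheel = helical_wheel(SEQUENCE)
for position, residue, angle in wheel:
    print(f"{position:2d} {residue} {angle:3d}")
```

Residues with similar angles sit on the same face of the helix, so a cluster of hydrophobic residues within one angular range suggests an amphipathic helix.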
33
ExPASy
http://ca.expasy.org/
34
ProtScale Parameters
http://www.expasy.ch/tools/protscale.html
Kyte-Doolittle hydropathy values:
I  4.5   V  4.2   L  3.8   F  2.8   C  2.5
M  1.9   A  1.8   G -0.4   T -0.7   S -0.8
W -0.9   Y -1.3   P -1.6   H -3.2   E -3.5
D -3.5   N -3.5   Q -3.5   K -3.9   R -4.5
Values averaged over a sliding window
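
The sliding-window average can be sketched in a few lines of Python. The scale is the Kyte-Doolittle hydropathy scale; the window size of 5, the function name, and the use of the short example sequence from the earlier slide are assumptions for illustration, not ProtScale's own settings.

```python
# Kyte-Doolittle hydropathy scale
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "D": -3.5, "N": -3.5, "Q": -3.5, "K": -3.9, "R": -4.5,
}

def hydropathy_profile(sequence, window=5):
    """Average hydropathy over each window of consecutive residues."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

profile = hydropathy_profile("LAKMVVKTAEAILKD", window=5)
for start, score in enumerate(profile, start=1):
    print(f"window starting at residue {start}: {score:.2f}")
```

Stretches of consistently high average hydropathy are candidates for buried or membrane-spanning segments; a longer window is normally used when looking for transmembrane helices.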
35
Kyte-Doolittle of Leptin Receptor
1.7
36
CBS
37
CBS
38
SignalP 3.0 Server
http://www.cbs.dtu.dk/services/SignalP/
39
Visualizing Protein Structures and Structural
Bioinformatics: End of Tuesday Lecture
  • Forces Driving Protein Folding
  • Finding out more about structures
  • How to visualize molecules with VMD(HW)
  • How to edit Protein Data Bank files(HW)

40
Why does 2° structure form? Why do proteins
fold?
Secondary structure
α-helix, β-sheet, loop or turn, random coil
From Branden & Tooze, Introduction to Protein
Structure
41
Why do proteins fold?
42
Why do proteins fold?
Digression: Chemical forces, intra- and
intermolecular interactions
43
Why do proteins fold?
Digression: Chemical forces, intra- and
intermolecular interactions
  • intra: covalent bonds
44
Why do proteins fold?
Digression: Chemical forces, intra- and
intermolecular interactions
  • intra: covalent bonds
  • inter: H-bonds, ionic, van der Waals (VW)
    interactions
45
Why do proteins fold?
Digression: Chemical forces, intra- and
intermolecular interactions
  • intra: covalent bonds
  • inter: H-bonds, ionic, van der Waals (VW)
    interactions
  • Hydrophobic effect
46
Covalent interactions
Covalent interactions hold the peptide backbone
together.
47
Intermolecular interactions
N-H···O=C
< 3.0 Å
Note: VMD requires H to find H-bonds. You can
add H to a pdb file with babel.
H-bonds are among the strongest intermolecular
interactions. They are due to large
dipoles between hydrogen and lone pairs (on O and N).
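
A distance cutoff like the one above can be sketched as a simple pair search. This is a simplified illustration, not VMD's algorithm: it assumes the 3.0 Å cutoff applies to the distance between donor (N) and acceptor (O) atoms, it ignores the angle criterion that real H-bond detection also uses, and the coordinates are hypothetical.

```python
import math

CUTOFF = 3.0  # Å, distance cutoff from the slide

def find_hbond_candidates(donors, acceptors, cutoff=CUTOFF):
    """Return (donor index, acceptor index, distance) for pairs closer than cutoff."""
    pairs = []
    for i, donor in enumerate(donors):
        for j, acceptor in enumerate(acceptors):
            distance = math.dist(donor, acceptor)
            if distance < cutoff:
                pairs.append((i, j, distance))
    return pairs

# Hypothetical coordinates in Å: one backbone N and two carbonyl O atoms
nitrogen_atoms = [(0.0, 0.0, 0.0)]
oxygen_atoms = [(2.9, 0.0, 0.0), (3.5, 0.0, 0.0)]
candidates = find_hbond_candidates(nitrogen_atoms, oxygen_atoms)
print(candidates)
```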
48
(from Mount)
Why don't salt bridges dominate as the force for
protein stability?
Why is the difference in free energy between a
folded and unfolded state only a few kcal/mol?
49
Intermolecular interactions
hydrophobic
hydrophobic
Hydrophobic interactions dominate all other
interactions. They are due to solvation effects.
50
What about solvation?
[Figure: O- and Na+ ions]
To form the salt pair, you must desolvate the
ions.
Solvating ions is often favorable.
Desolvating ions is unfavorable.
51
What about solvation?
[Figure: O- and Na+ ions]
To form the salt pair, you must desolvate the
ions.
Solvating ions is often favorable.
Desolvating ions is unfavorable.
Why are oils (octane, lipids, ...) not very soluble in water?
Why do lipid bilayers and micelles form?
52
phosphatidylcholine
  • polar head group
  • hydrophobic, greasy part: saturated and
    unsaturated hydrocarbon chains
53
Schematic of a lipid bilayer
ΔG favors association due to freeing of water
ΔH > 0 (melting of ice)
ΔS > 0 (melting to liquid)
ΔG = ΔH - TΔS < 0
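
To see how a process with an unfavorable ΔH can still have a favorable ΔG, the sketch below evaluates ΔG = ΔH - TΔS at two temperatures. The numerical values are hypothetical, chosen only so that both ΔH and ΔS are positive; they are not measurements for lipid association.

```python
def gibbs_free_energy(delta_h, delta_s, temperature):
    """ΔG = ΔH - TΔS, with ΔH in kcal/mol, ΔS in kcal/(mol·K), T in K."""
    return delta_h - temperature * delta_s

# Hypothetical values for illustration only
DELTA_H = 2.0     # kcal/mol, unfavorable enthalpy
DELTA_S = 0.010   # kcal/(mol·K), favorable entropy from released water

for temperature in (150.0, 298.0):
    delta_g = gibbs_free_energy(DELTA_H, DELTA_S, temperature)
    print(f"T = {temperature:.0f} K: ΔG = {delta_g:+.2f} kcal/mol")
```

With these numbers the sign of ΔG changes at T = ΔH/ΔS = 200 K: above that temperature the entropy term dominates and association is favored.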
54
Hydrophobic and Hydrophilic Solutes
Isobutene
Urea
Polar and non-polar solutes have very different
effects on water structure. We show two solutes
that have the same Y-shaped geometry but
different partial charges. The polar solute, urea
(left), has partial charges on its atoms.
Consequently, it is able to hydrogen-bond to water
molecules and to fit right into the water
hydrogen-bond network. In contrast, the non-polar
solute, isobutene (right), does not have
(substantial) partial charges on any of its
atoms. It thus cannot hydrogen-bond to water.
Rather, the water molecules around it turn away
and interact strongly only with other water
molecules, forming a sort of hydrogen-bonded ice
cage around the isobutene.
55
Summary of Protein Folding Interactions
Electrostatic Interactions
  • Hydrogen bonds: -1.0 to -5.0 kcal/mol
    dipole-dipole interaction, 1/r^2
    short range, angle dependent
  • van der Waals: -0.2 to -1.0 kcal/mol
    induced dipole, 1/r^6
    very short range, temperature dependent
    overlapping atoms repel, which creates steric effects
  • Ionic: 5.0 kcal/mol
    ion-ion interaction, 1/r
    least affected by temperature and distance
Hydrophobic Interactions
  • Hydrophilic: ΔG = ΔH - TΔS
    solvation is favored by ΔH < 0
    reduces ion-ion interactions
  • Hydrophobic: ΔG = ΔH - TΔS
    desolvation is favored by ΔS > 0
    surface area and temperature dependent
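
The distance dependences listed above (1/r, 1/r^2, 1/r^6) can be compared directly. The sketch below asks how much each interaction weakens when the separation doubles; the function name is an assumption for illustration, and only the exponents are taken from the slide.

```python
def relative_strength(exponent, distance_ratio):
    """Factor by which a 1/r^n interaction changes when r is scaled by distance_ratio."""
    return distance_ratio ** (-exponent)

INTERACTIONS = (
    ("ionic, 1/r", 1),
    ("hydrogen bond, 1/r^2", 2),
    ("van der Waals, 1/r^6", 6),
)

for name, exponent in INTERACTIONS:
    factor = relative_strength(exponent, 2.0)
    print(f"{name}: {factor:.4f} of original strength at twice the distance")
```

Doubling the distance leaves half of an ionic interaction but less than 2% of a van der Waals interaction, which is why the ionic term is described as least affected by distance.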
56
Secondary structure
α-helix, β-sheet, loop or turn, random coil
From Branden & Tooze, Introduction to Protein
Structure
57
i to i+4 connectivity, 3.6 residues per turn, (phi,
psi) = (-60°, -50°)
α-helix
58
[Figure labels: CO --, NH]
i to i+4 connectivity, 3.6 residues per turn, (phi,
psi) = (-60°, -50°)
  • All NH and CO groups are connected,
    except for the first NH groups and last CO groups
  • ends are polar (hence usually at protein
    surface)
  • orientation of H-bonds leads to net dipole

α-helix
59
i to i+4 connectivity, 3.6 residues per turn, (phi,
psi) = (-60°, -50°)
  • All NH and CO groups are connected,
    except for the first NH groups and last CO groups
  • ends are polar (hence usually at protein
    surface)
  • orientation of H-bonds leads to net dipole

Different side chains have different helical
propensities:
ALA, GLU, LEU, MET: good α-helix formers
PRO, GLY, TYR, SER: poor α-helix formers
α-helix
PRO is OK as an N-cap. ASP is a good N-cap
(balances the + end of the helix dipole)
60
Left-handed helix is also possible, (phi, psi) =
(60°, 60°)
i to i+3: 3₁₀ helix
i to i+5: π helix
i to i+4 connectivity, 3.6 residues per turn, (phi,
psi) = (-60°, -50°)
  • All NH and CO groups are connected,
    except for the first NH groups and last CO groups
  • ends are polar (hence usually at protein
    surface)
  • orientation of H-bonds leads to net dipole

Different side chains have different helical
propensities:
ALA, GLU, LEU, MET: good α-helix formers
PRO, GLY, TYR, SER: poor α-helix formers
α-helix
61
β-sheet structure
  • built up from combinations of several regions of
    polypeptide chain
  • β-strands are usually 5-10 residues long
  • β-strands are usually almost fully extended
  • β-strands are usually aligned adjacent to one
    another to form C=O to N-H hydrogen bonds between the
    strands
  • when several β-strands are together, this is
    called a pleated sheet (with Cα atoms above and
    below the plane); sheets are antiparallel or parallel

Note the p-hydroxybenzylidene-imidazolidone
62
anti-parallel vs. parallel β-sheet
63
pleating in antiparallel vs. parallel β-sheet
64
2WUR Green Fluorescent Protein
65
loops and turns
α turn
β turn
Ω loops
Figure 2.8, Branden & Tooze
66
Loops and turns
β-turn: i → i+3
[Figure: turn residues numbered 1 to 4]
Figure 2.8, Branden & Tooze
67
protein structural motifs
Simple combinations of secondary structural
elements, called structural motifs or
supersecondary structure.
Examples: helix-turn-helix or helix-loop-helix
Figure 2.12, Branden & Tooze
68
protein structural motifs
Simple combinations of secondary structural
elements, called structural motifs or
supersecondary structure.
Examples: helix-turn-helix or helix-loop-helix,
β hairpin
Figure 2.14, Branden & Tooze
69
protein structural motifs
Simple combinations of secondary structural
elements, called structural motifs or
supersecondary structure.
Examples: helix-turn-helix or helix-loop-helix,
β hairpin, Greek key
Figure 2.15, Branden & Tooze
70
protein structural motifs
Simple combinations of secondary structural
elements, called structural motifs or
supersecondary structure.
Examples: helix-turn-helix or helix-loop-helix,
β hairpin, Greek key, beta-alpha-beta
The beta-alpha-beta motif is found in almost every
protein structure with a parallel β-sheet.
2 possible hands: the left-handed connection (shown
on the right) has only been found in 1 protein
(subtilisin)
Figure 2.18, Branden & Tooze
71
protein structural Domain
Compact subunit of protein structure built from
motifs.
Pyruvate Kinase (1pkn) has three domains
72
Domains are built from structural motifs.
4 main classes of protein structures:
  • α-domains
  • β-domains (antiparallel β)
  • α/β domains
  • α+β domains
38221 PDB Entries (23 Feb 2009). 110800 Domains.

Class                                folds  superfamilies  families
All alpha proteins                     284            507       871
All beta proteins                      174            354       742
Alpha and beta proteins (a/b)          147            244       803
Alpha and beta proteins (a+b)          376            552      1055
Multi-domain proteins                   66             66        89
Membrane and cell surface proteins      58            110       123
Small proteins                          90            129       219
Total                                 1195           1962      3902
73
Protein Data Bank
74
Protein Data Bank
75
Protein Data Bank
76
Protein Data Bank
77
Structural Classification Of Proteins (SCOP)
http://scop.mrc-lmb.cam.ac.uk/scop/
Hand-curated hierarchical taxonomy of proteins
based on their structural and evolutionary
relationships.
Levels: Class, Fold Level, Superfamily, Family
Chothia, Murzin (Cambridge)
78
SCOP
Class Fold / Architecture Superfamily
79
Structural Classification Of Proteins (SCOP)
http://scop.mrc-lmb.cam.ac.uk/scop/
Right-handed beta-alpha-beta
Flavodoxin (5NLL), Clostridium beijerinckii
Class: alpha and beta protein (a/b)
Fold Level: Flavodoxin-like (3 layer b/a/b, parallel
five-strand beta-sheet)
Superfamily: Flavoproteins
Family: Flavodoxin-related (binds flavin mononucleotide)
80
CATH Protein Structural Database
http://www.biochem.ucl.ac.uk/bsm/cath/cath.html
Semiautomatic domain classification of PDB
crystal structures. Thornton (London)
Levels: Class, Architecture, Topology, Homology,
Sequence
81
UniProt http://www.uniprot.org/
Search 2xgz
82
Pfam http://pfam.sanger.ac.uk/
83
PROSITE http://ca.expasy.org/prosite/
Protein families and domains based on Active Site
Motifs
  • Database of biologically significant sites,
    patterns and profiles that help to reliably
    identify to which known protein family (if any) a
    new sequence belongs.
  • Based on the observation that proteins can be
    grouped on the basis of similarities in their
    sequences (signature for a protein family or
    domain).
  • The protein signatures are provided in PROSITE
    format. This format can also be used to do
    similarity searching by using PHI-BLAST/NCBI.
  • Currently contains patterns and profiles specific
    for >1000 protein families or domains.
  • Background information for each of these protein
    signatures is provided.
  • Can be used to find biologically significant
    regions within a protein.
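
Pattern signatures of this kind can be searched with an ordinary regular expression once the PROSITE syntax is translated. The converter below is a simplified sketch covering the common elements: `x` for any residue, `[..]` for allowed residues, `{..}` for forbidden residues, `(n)` or `(n,m)` for repeats, and `<` / `>` for sequence ends. The pattern used in the example is hypothetical, written only to illustrate the syntax; it is not an actual PROSITE entry.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")   # forbidden residues
        element = element.replace("(", "{").replace(")", "}")     # repeat counts
        element = element.replace("x", ".")                       # any residue
        element = element.replace("<", "^").replace(">", "$")     # sequence ends
        parts.append(element)
    return "".join(parts)

# Hypothetical pattern for illustration only
EXAMPLE_PATTERN = "L-L-L-K-x-N-Q-I-G-[ST]-L-S-E-[SA]"
regex = prosite_to_regex(EXAMPLE_PATTERN)
match = re.search(regex, "LLLKVNQIGTLSES")
print(regex, match is not None)
```

A match object's `start()` and `end()` give the position of the signature within the sequence, which corresponds to the residue range a PROSITE scan reports.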

84
PROSITE: 2XGZ
342 - 355 LLLKvNQIGTLSES