Title: Coarse Graining Structures
1Coarse Graining Structures
- Extracting Interactions from Structures: Potentials
- Threading: Building/Predicting Structures Based on Similarities
- Protein-Protein Networks (Interactome)
- Control of Protein Motions by Structure
- Large Simulations
2Coarse-grained protein structures
- 1. Treat each residue as a sphere
- Based on
- 1 point per residue
- Orientation distribution
- Select conformations based on hydrophobicity
- 2. Consider more details
- Packing
- Directionality
- Size
- Select conformations based on charges and packing
- Separation into interactions and geometry
3Reducing the Combinations of Protein Conformations
- 1. Chain generation
- Spheres imply a face-centered cubic lattice is more efficient for conformation generation than non-lattice
- 2. Orientations
- Begin at surface
- Fewer choices if uncharged
- Close charge pairs favor directions
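A minimal sketch of lattice chain generation, assuming one lattice site per residue and a simple random self-avoiding growth rule. The helper name `random_fcc_chain` and the restart-on-dead-end strategy are illustrative assumptions, not the method used in the work summarized here.

```python
import itertools
import random

# The 12 nearest-neighbor steps of a face-centered cubic lattice:
# every distinct permutation of (+-1, +-1, 0).
FCC_STEPS = sorted({step
                    for a in (1, -1)
                    for b in (1, -1)
                    for step in itertools.permutations((a, b, 0))})


def random_fcc_chain(n_residues, rng, max_tries=1000):
    """Grow a self-avoiding chain, one lattice site per residue (hypothetical helper)."""
    for _ in range(max_tries):
        chain = [(0, 0, 0)]
        occupied = {chain[0]}
        while len(chain) < n_residues:
            x, y, z = chain[-1]
            free = [(x + dx, y + dy, z + dz)
                    for dx, dy, dz in FCC_STEPS
                    if (x + dx, y + dy, z + dz) not in occupied]
            if not free:  # dead end: discard this attempt and start over
                break
            site = rng.choice(free)
            chain.append(site)
            occupied.add(site)
        if len(chain) == n_residues:
            return chain
    raise RuntimeError("no self-avoiding chain found")


# Example: a 30-residue chain from a seeded generator.
chain = random_fcc_chain(30, random.Random(0))
```

Each site has only 12 discrete continuation choices, which is what makes enumeration or sampling on the lattice cheaper than in continuous space.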
4Data Mining Structures to Extract Interaction
Details
- Residue-residue interactions from frequencies of close pairs
- Angular distributions of packing geometries
- Re-evaluation of residue-residue interactions, weighted by sequence conservation
- Larger groups of interacting residues: triplets, etc.
- (Nucleic Acids and Nucleic Acids-Proteins)
5Empirical Potential Functions
Sources
- Structures
- Binding Constants
- combinatorial syntheses
- Sequences
6Residue-residue Potential Derivation
$$i\text{-}0 + j\text{-}0 \rightleftharpoons i\text{-}j + 0\text{-}0$$
- i, j are residue types; 0 is solvent
- Equilibrium constant formulation
- Underlying assumption of equilibration reflecting energetics of interactions
- Sphere centered on each residue's side chain
- Count types of residues as non-bonded neighbors
- Count for all proteins in the set of crystal structures
- Equilibrium constants and the implied effective interaction energies for residue pairs
7Residue-Residue Interaction Energies
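- Protein with 2 types of residues
- Interaction sphere centered on each amino acid
- Complete each sphere with solvent spheres
- Count residue neighbor types in each sphere
- Equilibrium constant from the in-sphere counts, by the Boltzmann law:
$$\exp(-e_{ij}) = \frac{n_{ij}\, n_{00}}{n_{i0}\, n_{j0}}$$
- Corresponds to the reaction $i\text{-}0 + j\text{-}0 \rightleftharpoons i\text{-}j + 0\text{-}0$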
8Residue-Residue Interactions
- From frequencies of contacts in a set of crystal structures:
$$\exp(-e_{ij}) = \frac{N_{ij}\, N_{00}}{N_{i0}\, N_{j0}}$$
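The relation between contact counts and energies can be sketched in a few lines. The counts below are made-up illustrative numbers, not data from any structure set, and the energies come out dimensionless (assumed to be in units of kT, since the slide writes the exponent without kT). Normalization details such as how like-pair contacts are counted are ignored here.

```python
import math


def contact_energy(n_ij, n_00, n_i0, n_j0):
    """Return e_ij from exp(-e_ij) = N_ij * N_00 / (N_i0 * N_j0)."""
    return -math.log((n_ij * n_00) / (n_i0 * n_j0))


# Hypothetical contact counts for a two-letter alphabet (H, P) plus solvent (0).
counts = {("H", "H"): 300, ("H", "P"): 200, ("P", "P"): 100,
          ("H", "0"): 100, ("P", "0"): 300, ("0", "0"): 400}

# Effective energies for the three residue-residue pair types.
energies = {
    (i, j): contact_energy(counts[(i, j)], counts[("0", "0")],
                           counts[(i, "0")], counts[(j, "0")])
    for (i, j) in [("H", "H"), ("H", "P"), ("P", "P")]
}
```

With these illustrative counts the ordering is HH < HP < PP: pairs seen in contact more often than their solvent exposure would suggest receive the more favorable (more negative) energy.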
9Major Features of Potentials
- 2 Types of Residues
- H Hydrophobic (green)
- P Polar (red)
- HH < HP < PP gives globule
- HH, PP < HP gives segregation
10Residue-residue Interactions
All negative because of condensed state
Relative Strengths, from Strong to Weak:
- Hydrophobic-Hydrophobic
- Hydrophobic-Hydrophilic
- Hydrophilic-Hydrophilic
11Empirical Potentials
- Globular Proteins - Long Range
- Overall
- More detailed, more specific
- Globular Proteins - Short Range
- Protein-Ion Interactions
- Protein - Peptide Binding
- Protein-DNA Binding
12Applications of Potentials
- Evaluation of protein folds, models and threading
- Location of peptide binding sites
- Amino Acid substitution matrix
- Stabilities of proteins with substitutions
- Aids for Protein Design
13Categories of Interactions
- Hydrophobic interactions - strong, not specific
- Hydrophilic interactions - specific, not strong
- Water interactions mostly exterior
- Ion-residue interactions - strong, specific
- advantage of nucleus with strongly directional interactions
- demonstrated to reduce variations
- utility in protein design
14How to Make Homology Models of New Structures
- Sequence Matching: Similar Sequence Implies Similar Structure
- What to Do with Gaps and Insertions?
- Threading with Interaction Potentials: Includes Evaluation of Residue Environments
15Threading to Solve Sequence-Structure Alignment
Use contact potentials to score threadings
A. Torda Proteomics Handbook
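Scoring a threading with contact potentials can be sketched as follows. This is a deliberately simplified, gapless version: the sequence slides along the template without insertions or deletions. The `ENERGY` table, the template contact list, and both function names are hypothetical and chosen only for illustration.

```python
# Hypothetical two-letter (H/P) contact energies; values are illustrative only.
ENERGY = {("H", "H"): -2.5, ("H", "P"): -1.0,
          ("P", "H"): -1.0, ("P", "P"): 0.8}


def threading_score(sequence, offset, contacts):
    """Sum contact energies for `sequence` placed on the template at `offset`.

    `contacts` lists pairs of template positions that touch in the known structure.
    """
    total = 0.0
    for a, b in contacts:
        i, j = a - offset, b - offset
        # Only contacts whose two positions are both covered by the sequence count.
        if 0 <= i < len(sequence) and 0 <= j < len(sequence):
            total += ENERGY[(sequence[i], sequence[j])]
    return total


def best_threading(sequence, template_length, contacts):
    """Return the gapless offset with the lowest (most favorable) score."""
    offsets = range(template_length - len(sequence) + 1)
    return min(offsets, key=lambda off: threading_score(sequence, off, contacts))


# Example: a 4-residue sequence threaded through an 8-position template.
template_contacts = [(0, 3), (2, 4), (2, 5), (3, 6)]
best_offset = best_threading("HPHH", 8, template_contacts)
best_score = threading_score("HPHH", best_offset, template_contacts)
```

Real threading must also handle gaps and insertions, which turns the search over alignments into a much harder optimization than this sliding-window scan.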
16Networks as a Unifying Model Connecting Atomic
to the Larger Scale
- Biomechanics
- Genomes
- Proteomes
- Metabolomes
- Structures
- All Combinations: Integrating for Systems Approaches
- Different Levels of Abstraction
- Spatial, Temporal
Huge Conceptual and Computational Challenge to
Connect All
17Protein Interactome in Yeast
- red - cellular role and subcellular location agree
- blue - locations agree
- green - cellular roles agree
Huge networks! Full networks may have 30,000
nodes
Schwikowski et al., Nature Biotech, 2000
18Protein-protein interaction networks
- Create a Kirchhoff Contact Matrix
- Cluster Analyses - Eigenmodes
- Functionally Related Proteins in Each Cluster
- Simplest Analysis: 0s and 1s
Sen et al., BMC Bioinformatics, 2006
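A minimal sketch of the idea, assuming NumPy is available: build the Kirchhoff matrix from a 0/1 interaction list and split the network using the eigenvector of the smallest nonzero eigenvalue. Spectral bisection by sign is used here as a simple stand-in; it is not necessarily the clustering procedure of the cited paper, and the six-node network is invented for illustration.

```python
import numpy as np


def kirchhoff_matrix(n_nodes, edges):
    """Kirchhoff matrix: -1 for each interacting pair, node degree on the diagonal."""
    gamma = np.zeros((n_nodes, n_nodes))
    for i, j in edges:
        gamma[i, j] = gamma[j, i] = -1.0
    # Diagonal is still zero here, so each row sum equals minus the node degree.
    np.fill_diagonal(gamma, -gamma.sum(axis=1))
    return gamma


def split_by_slowest_mode(gamma):
    """Assign nodes to two clusters by the sign of the slowest nonzero eigenmode."""
    eigenvalues, eigenvectors = np.linalg.eigh(gamma)  # ascending eigenvalues
    slowest = eigenvectors[:, 1]  # index 0 is the trivial zero mode
    return (slowest >= 0).tolist()


# Hypothetical network: two triangles of proteins joined by a single interaction.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
gamma = kirchhoff_matrix(6, edges)
labels = split_by_slowest_mode(gamma)
```

For this toy network the two triangles fall into separate clusters, which is the behavior one hopes for when densely interconnected proteins share a function.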
19Significant proteins in each cluster are
interconnected!
Sen et al., BMC Bioinformatics, 2006
20Functional Modules Used to Infer New
Interactions
- New interaction for TOP1 identified in newer version of GRID Database
- ARP1 is also likely to interact somewhere in this cluster
Suggests Followup Experiments
21Even Small Subnets May Have Missing Links
- URA10 - pyrimidine base biosynthesis with orotate phosphoribosyltransferase activity
- RPS20 - structural constituent of the ribosome and participates in protein biosynthesis
- GPI13 - phosphoethanolamine activity and participates in GPI anchor biosynthesis
Also Others in Cluster Not Connected. Should They Be?
22Cluster for DNA Metabolism, Cell Organization and
Biogenesis
- 4 central proteins have no interactions with others
- Duno et al. indicate SGS1 essential in absence of TOP1
- Implies TOP1 and SGS1 should have same interacting partners?
Implication of Other Interactions from Functional
Replacement
23Questions about Protein Structures
- Functional part can be a small part: what's the rest of the structure for?
- How does allostery work (effect at a distance)?
- Understanding protein control reactions and processing
- How cooperative are protein motions?
- Protein Structure to Control Functional Motions
Similar Questions throughout Systems Biology
24Coarse-Graining Structures: Simplifying Large Structures for Simulations
- Complexity requires use of simple approaches: a network view of molecular structure (hydrophobicity, cohesiveness)
- Identification of physical interactions (proximity)
- Usually no chemical identities of atoms or residues
- A physics/materials/polymers/engineering based approach
- Essential for the cohesiveness of structures
- Many Aspects of Protein Behavior Are Captured by Coarse-Grained Models
25Protein Structures as Networks: Densely Packed Cooperative Models - A Manifestation of Hydrophobicity
[Figure panels: Ribbon Diagram; Elastic Network]
Similar to Small World Social Networks
A High Packing Density Model
26Elastic Network Molecular Models
- Rubber elasticity (polymers - Flory)
- Intrinsic motions of structures (Tirion 1996)
- Simple elastic networks of uniform material
- Appropriate for largest, most important domain motions of proteins - independent of structural details
- High resolution structures not always needed
- Macromolecules as Rubbery Bodies
- Yields Well Defined, Highly Controlled Motions
27Interpretation of B-Factors (X-ray) and NMR
Ensembles
- Usually Observe Larger Motions on Surfaces
- Possible Interpretation of Displacements
- Structures move as fully rigid domains in hinge and other motions, so surface motions are larger but rigid
- Surfaces are highly flexible independently
- Surface motions are controlled by domain motions
- Control: interdependences of distant parts of structure, difficult to obtain with atomic models
- A model for allostery
28Applications of Elastic Network Models
- Motions of largest structural assemblages by coarse graining
- Functional mechanisms for processing proteins and enzymes
- Predicting pathways for transitions between two forms of a protein
- Interpreting single molecule pulling experiments
- Refinement of structures and models
- A Simplifying View of Protein Motions
29Elastic Network Model Calculating Protein
Position Fluctuations
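$$V_{tot}(t) = \frac{\gamma}{2}\,\mathrm{tr}\left[\Delta R(t)^{T}\,\Gamma\,\Delta R(t)\right]$$
$$\langle \Delta R_i \cdot \Delta R_j \rangle = \frac{1}{Z_N}\int (\Delta R_i \cdot \Delta R_j)\,\exp(-V_{tot}/kT)\,d\{\Delta R\} = \frac{3kT}{\gamma}\,[\Gamma^{-1}]_{ij}$$
$\Gamma$ = Kirchhoff matrix of contacts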
Normal Modes Tell about Fluctuations and Their
Correlations
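The fluctuation formula can be sketched numerically, assuming NumPy and one point per residue. The distance cutoff, the choice kT/γ = 1, and the toy five-point chain are assumptions for illustration. Because the Kirchhoff matrix has a zero eigenvalue (rigid-body mode), the inverse is taken over the nonzero modes only.

```python
import numpy as np


def contact_kirchhoff(coords, cutoff=7.0):
    """Kirchhoff matrix: -1 for pairs within `cutoff`, contact count on the diagonal."""
    coords = np.asarray(coords, dtype=float)
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    gamma = -(dists <= cutoff).astype(float)
    np.fill_diagonal(gamma, 0.0)                 # remove self-contacts
    np.fill_diagonal(gamma, -gamma.sum(axis=1))  # diagonal = number of contacts
    return gamma


def fluctuation_correlations(gamma, kT_over_gamma=1.0):
    """<dR_i . dR_j> = (3kT/gamma) * [Gamma^-1]_ij, summed over nonzero modes."""
    eigenvalues, modes = np.linalg.eigh(gamma)
    keep = eigenvalues > 1e-8  # drop the zero (rigid-body) mode
    inverse = (modes[:, keep] / eigenvalues[keep]) @ modes[:, keep].T
    return 3.0 * kT_over_gamma * inverse


# Toy example: 5 points on a line, 3.8 apart, so only sequence neighbors are in contact.
coords = [(3.8 * i, 0.0, 0.0) for i in range(5)]
gamma = contact_kirchhoff(coords, cutoff=4.0)
correlations = fluctuation_correlations(gamma)
mean_square_fluctuations = np.diag(correlations)
```

In this toy chain the ends fluctuate more than the middle, mirroring the larger motions usually seen for less-connected surface residues.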
30Supporting Evidence for Model
- Reproduce B-Factors Better than Atomic Molecular Dynamics
- Reproduce Motions Represented in NMR Ensembles Extremely Well
- Motions Closely Related to Observed Structure Changes, Including Ligand Binding
- Mutational Analysis Confirmatory
- Strong Support for Elastic Models
31Validation of ENM against Motions in Multiple
Structures - HIV Protease X-ray Structures
Principal Component 2 (164 structures) vs. Mode 3: Correlation 0.64
Intrinsic Calculable Range of Distortions for
Drug Binding
32NMR Structures Fit Elastic Networks Better than
X-Ray Structures
Overlaps between directions of motions
Results for 164 X-ray and 28 NMR HIV Protease
Structures
33Cumulative Overlaps with NMR Motions
Agreement Better than for X-ray
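The slides do not define the overlap measures, so the sketch below assumes the commonly used definitions: the overlap is the absolute cosine between an observed direction of motion (for example a principal component) and a computed mode, and the cumulative overlap is the root of the summed squared overlaps over the first k modes. NumPy is assumed, and the vectors are toy values.

```python
import numpy as np


def overlap(direction, mode):
    """Absolute cosine of the angle between two displacement vectors."""
    direction = np.asarray(direction, dtype=float)
    mode = np.asarray(mode, dtype=float)
    return abs(direction @ mode) / (np.linalg.norm(direction) * np.linalg.norm(mode))


def cumulative_overlap(direction, modes):
    """Cumulative overlap of `direction` with the first 1, 2, ... k modes."""
    squared = [overlap(direction, mode) ** 2 for mode in modes]
    return np.sqrt(np.cumsum(squared))


# Toy example: an observed direction compared with three orthonormal modes.
observed = [1.0, 1.0, 0.0]
modes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cumulative = cumulative_overlap(observed, modes)
```

When the modes are orthonormal the cumulative overlap cannot exceed 1, and a value near 1 after only a few modes means those modes capture the observed motion.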
34Validation from X-ray Temperature Factors
[Figure (b), 1omf: calculated and experimental temperature factors]
Debye-Waller factors: $B_k = 8\pi^2 \langle \Delta R_k \cdot \Delta R_k \rangle / 3$
- These Results Usually Better than with Atomic MD
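A small sketch of how computed fluctuations are turned into temperature factors and compared with experiment, using the Debye-Waller relation B_k = 8π²⟨ΔR_k · ΔR_k⟩/3. The fluctuation values and "experimental" B-factors below are invented placeholders, and Pearson correlation is assumed as the comparison measure.

```python
import math


def b_factors(mean_square_fluctuations):
    """Debye-Waller factors: B_k = 8 * pi^2 * <dR_k . dR_k> / 3."""
    return [8.0 * math.pi ** 2 * msf / 3.0 for msf in mean_square_fluctuations]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-residue values, for illustration only.
computed_msf = [0.9, 0.4, 0.3, 0.5, 1.1]
experimental_b = [70.0, 35.0, 25.0, 40.0, 90.0]

calculated_b = b_factors(computed_msf)
agreement = pearson(calculated_b, experimental_b)
```

Because the elastic network force constant only sets the overall scale, the correlation between profiles is the meaningful comparison, not the absolute magnitudes.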
35Effects of Gln Synthetase Binding tRNA
[Figure: temperature factor vs. nucleotide number]
Major Changes from Binding Are Reproduced
36tRNA Motions
37Reverse Transcriptase Mechanism from Modes of
Motion
2 Slowest Modes of Motion Relate to Mechanism of the Processing Motion, Base by Base:
1. Push-pull hinge
2. Open-close hinge
Construct Processing Mechanism by Combining These
Motions
38HIV Reverse Transcriptase Slowest Motion
Characteristic Hinge Motion
39Slowest Motions of Tubulin Dimer
- opposite rotation of monomers
- wobble between monomers
- stretching-compression along long axis
Most Flexible Subunit Interface Helps Motion
along Fibril
40Motions near GTP Binding Site: Extent of Cooperativity and Cohesiveness?
Segment 206-224 Moving (red/gray/green)
Loop motion permits binding and unbinding of GTP
Local Motions of Small Parts Depend on Whole
Structure
41Mode Contributions for Xanthine Dehydrogenase
Log-log plot
Important motions similar for different levels of coarse graining
Only a Small Number of Important Motions
42Superimposed Triosephosphate Isomerase Structures
TIM green, TPH white, large loop in red
Residues 130-248: large changes, treated here as atoms
43Agreement of Slow Mode Motions in Triosephosphate
Isomerase with Known Changes
- The Tip of Loop 6 (Thr172) Displaced More than 7 Å
- Catalytic Base Glu165 Moves 2 Å to Force the Ligand into a Planar Form
- Indole Ring of Trp168 Rotates about 50°
- Permits Choice between Different Proposed Enzyme Mechanisms
Computed Motions Support One Enzyme Mechanism
44First Mode Shape
Computed for Whole Structure
Computed for Fragment Only
Structural Context Matters - Requires Whole
Structure
45Ratchet Motions in the Ribosome
Wang, Y et al. J. Struct. Biol., 2004
46What's Happening Inside?
Efficient Conversion of Rotational Motion to
Translational Motion
47Correlations of Motions between Ribosome
Components
for 100 slowest modes
Correlations as Expected for Processing
48Local Motions of tRNA
Anti-codon rigid
Acceptor end rigid
Functional Parts Are Held Rigid No Deformations
49Much of tRNA Motion in Ribosome Is Rigid Body
Wang et al., Biophys J, 2005
Virtually No Internal Motion
mRNA most flexible yet most rigid inside
ribosome
50Conclusions
- Protein structure (shape) -> probable motions for sampling
- Various levels of coarse-graining models - OK
- Usually must have full assemblage, not partial structures!
- Large domain motions dominate: simpler, functional motions, not so many important ones
- Atoms and loops on surfaces can be controlled by domain motions: atoms pushed in directions of enzyme reactions
- Useful for large conformational transitions (Not Shown)
- Details of mechanisms and control - from modes of motion
51Regulation of Protein Function through
Protein-Protein Networks
- Identify and Characterize Binding Site Structures
- Assemble Structures: All Details Not Needed, Perhaps Binding to One Side or Another Is Sufficient
- Simulate to Learn Effects of Assembling Structures on Motions
- Identify Sites for Repressing or Enhancing Functional Motions
- Proteins as Drugs
Protein Regulation Efficiencies and Rates
Controlled by Binding Partners
52Other Applications
- Single Molecule Pulling Experiments: Use Elastic Network Models to Predict the Sequence of Breaking Interactions
- Molecular Mechanisms
- Simulations from Images (EMs)
- Protein-Protein Network Analyses Detect Errors
- Effects of Ligands
- Effects of Forces
- Effects of Disordered Parts of Proteins
- Simulations
- From images
- From diagrams
- Microarrays and High-Throughput Data Analyses
53Cell Simulations: How Many Molecules? Apparent Need for Coarse Graining
- 10^9 proteins (20% of total weight)
- 10^5 mRNA molecules (2%)
- 10^13 water molecules (70%)
- DNA (1%)
- Small molecules, lipids (7%)
A Long Range Goal