Macromolecular Modeling and Simulation: Problems, Approaches, Challenges - PowerPoint PPT Presentation

1
Macromolecular Modeling and Simulation: Problems, Approaches, Challenges
Zhijun Wu
Department of Mathematics
Graduate Program on Bioinformatics and Computational Biology
Iowa State University
May 8, 2001
Laboratory of Science and Engineering Computing
Chinese Academy of Sciences
2
Outline of the Talk
  • Introduction
      • DNA, RNA, Protein
      • Structure Prediction and Determination
      • Theoretical and Experimental Approaches
  • Molecular Dynamics Simulation
      • Physical and Mathematical Models
      • Numerical Simulation
      • Simulation of Folding / Misfolding
  • Potential Energy Minimization
      • Potential Energy Functions
      • Global Minimization
      • Global Smoothing and Continuation
  • NMR Distance-Based Modeling
      • Molecular Distance Geometry Problem
      • Geometric Properties
      • Least-Squares Formulation
  • X-ray Crystallography Computing
      • Phase Problem
      • Entropy Maximization for Phase Estimation
3
Biological Building Blocks: DNA, RNA, Protein
DNA
GAA GTT GAA AAT CAG GCG AAC CCA CGA CTG
RNA
GAA GUU GAA AAU CAG GCG AAC CCA CGA CUG
Protein
GLU VAL GLU ASN GLN ALA ASN PRO ARG LEU
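The mapping from the DNA line to the RNA and protein lines can be sketched in a few lines of Python. This is only an illustration: the codon table below is a hypothetical subset covering the codons on this slide, not the full genetic code, and the function names are made up for the example. Note that GTT/GUU codes for valine (VAL).

```python
# Codon table restricted to the codons that appear on the slide
# (an illustrative subset, not the complete genetic code).
CODON_TO_RESIDUE = {
    "GAA": "GLU", "GUU": "VAL", "AAU": "ASN", "CAG": "GLN", "GCG": "ALA",
    "AAC": "ASN", "CCA": "PRO", "CGA": "ARG", "CUG": "LEU",
}


def transcribe(dna_codons):
    """Write the DNA coding sequence as RNA by replacing T with U."""
    return [codon.replace("T", "U") for codon in dna_codons]


def translate(rna_codons):
    """Map each RNA codon to its three-letter amino acid code."""
    return [CODON_TO_RESIDUE[codon] for codon in rna_codons]


dna = "GAA GTT GAA AAT CAG GCG AAC CCA CGA CTG".split()
rna = transcribe(dna)
protein = translate(rna)
print(" ".join(rna))      # GAA GUU GAA AAU CAG GCG AAC CCA CGA CUG
print(" ".join(protein))  # GLU VAL GLU ASN GLN ALA ASN PRO ARG LEU
```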
4
Protein Folding
[Figure: residue labels (GLU, VAL, GLU, ASN, GLN, ALA, ASN, PRO, ARG, LEU, ...) placed along a folded amino acid chain.]
5
HIV Retrotranscriptase
554 amino acids
4200 atoms
6
HIV Retrotranscriptase
4200 atoms
554 amino acids
7
Structure Prediction and Determination
Molecular Dynamics Simulation
Potential Energy Minimization
Nuclear Magnetic Resonance
X-ray Crystallography
8

Molecular Dynamics Simulation
Folding can be simulated by following the movement of the atoms in the protein according to Newton's second law of motion.
  • The step size has to be small, on the order of femtoseconds, to achieve accuracy.
  • Current computing technology can cover only picoseconds to nanoseconds of simulated time, while protein folding may take seconds or even longer.
  • Molecular dynamics simulation has been used successfully to study other types of dynamical behavior of proteins.
9

Potential Energy Minimization
Hypothesis: The native structure of a protein has the lowest or nearly lowest potential energy. It can therefore be located at the global energy minimum of the protein.
  • A reasonably accurate potential energy function needs to be constructed.
  • Given such a function, a local minimizer is easy to find, but a global one is hard, especially if the function has many local minimizers. No completely satisfactory algorithm has yet been developed for minimizing protein energy functions.
  • Potential energy minimization has nevertheless been used successfully for structure refinement.
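The gap between local and global minimization shows up even in one dimension. The sketch below uses a hypothetical two-well function (not a protein energy function) and plain gradient descent with a hand-picked step size: the minimizer that is found depends entirely on the starting point.

```python
def energy(x):
    """Toy energy with two wells: a shallow one near 1.35, a deep one near -1.47."""
    return x**4 - 4.0 * x**2 + x


def gradient(x):
    """Derivative of the toy energy."""
    return 4.0 * x**3 - 8.0 * x + 1.0


def local_minimize(x, step=0.01, iterations=5000):
    """Plain gradient descent; it only ever finds the well it starts in."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


x_right = local_minimize(2.0)    # ends in the shallow (local) well
x_left = local_minimize(-2.0)    # ends in the deep (global) well
print(x_right, energy(x_right))
print(x_left, energy(x_left))
```

A function with many wells, as expected for a protein, would need a good starting point for each well, which is what makes the global problem hard.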

10

NMR Structure Determination
The NMR approach is based on the fact that nuclei spin and generate magnetic fields. When two nuclei are close, their spins interact, and the intensity of the interaction depends on the distance between the nuclei. The distances between certain pairs of atoms can therefore be estimated by measuring the intensities of the nuclear spin-spin couplings.
The distance data obtained from the NMR experiment can be used to deduce structural information about the molecule. One way of achieving this is based on molecular distance geometry.
  • 15% of the structures in the Protein Data Bank (PDB) were determined using NMR spectroscopy.
  • Not all distances between pairs of atoms can be detected, and in practice only lower and upper bounds on the distances can be obtained.
  • The structure can be determined by solving a distance geometry problem with the distance data from the NMR experiments.
11

X-ray Crystallography Computing
In X-ray crystallography, the protein first needs to be purified and crystallized, which may take months or years, if it succeeds at all. After that, the protein crystal is placed in X-ray equipment to produce an X-ray diffraction image. The diffraction image can be used to determine the three-dimensional structure of the protein.
  • 80% of the structures in the Protein Data Bank (PDB) were determined using X-ray crystallography.
  • The process is time consuming, and some proteins cannot be crystallized at all.
  • A mathematical problem, called the phase problem, needs to be solved before a crystal structure can be fully determined from the diffraction data.
12
Molecular Dynamics Simulation
Physical Model
Mathematical Model
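A standard way to write the mathematical model is Newton's second law applied to every atom. The symbols below ($m_i$ for the mass of atom $i$, $x_i$ for its position, $E$ for the potential energy) are notation assumed here, not taken from the slide.

```latex
m_i \, \frac{d^2 x_i}{dt^2} = -\nabla_{x_i} E(x_1, \ldots, x_n),
\qquad i = 1, \ldots, n,
\qquad x_i(0) = x_i^0, \quad \frac{dx_i}{dt}(0) = v_i^0 .
```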
13
Molecular Dynamics Simulation
Numerical Solution
Computer Simulation
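As a minimal sketch of what a numerical solution looks like, the code below integrates Newton's second law with the velocity Verlet scheme, a common choice in molecular dynamics. The integrator used in the talk is not stated in the transcript, so the scheme is an assumption, and the single particle on a unit spring stands in for a real force field.

```python
import math


def velocity_verlet(x, v, force, mass, dt, steps):
    """Advance position x and velocity v through `steps` time steps of size dt."""
    a = force(x) / mass
    for _ in range(steps):
        x += v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass           # force at the new position
        v += 0.5 * (a + a_new) * dt       # velocity update with averaged acceleration
        a = a_new
    return x, v


def spring_force(x):
    """Hypothetical test system: unit spring, exact solution x(t) = cos(t)."""
    return -x


x_end, v_end = velocity_verlet(1.0, 0.0, spring_force, mass=1.0, dt=0.01, steps=1000)
total_energy = 0.5 * v_end**2 + 0.5 * x_end**2
print(x_end, math.cos(10.0))   # numerical vs. exact position at t = 10
print(total_energy)            # stays close to the initial value 0.5
```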
14
Simulation of Folding -- An Initial Value Problem
Time step: femtoseconds
Folding: seconds or longer
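The size of this gap is easy to put in numbers. The helper below is a hypothetical one-liner that counts how many femtosecond steps are needed to cover a given stretch of simulated time.

```python
FEMTOSECOND = 1e-15  # seconds


def steps_needed(simulated_seconds, step_seconds=FEMTOSECOND):
    """Number of integration steps needed to cover the given simulated time."""
    return simulated_seconds / step_seconds


print(f"{steps_needed(1e-9):.0e}")  # one nanosecond: about 1e+06 steps
print(f"{steps_needed(1.0):.0e}")   # one second: about 1e+15 steps
```

Going from a nanosecond to a second of simulated time is therefore a factor of a billion in step count.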
15
Simulation of Misfolding -- A Boundary Value Problem
Ron Elber (1996)
Stochastic Path Integration / Parallel Multiple Shooting
16
Potential Energy Minimization
Minimization of potential energy in conformation space
Local / global optimization: nonlinear, unconstrained, continuous
Example: Lennard-Jones
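For the Lennard-Jones example, a minimal sketch with just two atoms already shows the unconstrained, continuous minimization. The 12-6 form with parameters `epsilon` and `sigma` is the usual textbook convention and is assumed here, since the formula on the slide did not survive in the transcript. The step size is hand-picked.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Pair energy 4*epsilon*((sigma/r)**12 - (sigma/r)**6) at separation r."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def lennard_jones_derivative(r, epsilon=1.0, sigma=1.0):
    """Derivative of the pair energy with respect to r."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * s6 * s6 + 6.0 * s6) / r


def minimize_separation(r, step=0.01, iterations=2000):
    """Gradient descent on the separation of a two-atom cluster."""
    for _ in range(iterations):
        r -= step * lennard_jones_derivative(r)
    return r


r_min = minimize_separation(1.5)
print(r_min, 2.0 ** (1.0 / 6.0))   # numerical vs. analytic minimizer 2**(1/6) * sigma
print(lennard_jones(r_min))        # close to -epsilon
```

With more atoms the energy is a sum over all pairs, and the number of local minimizers grows quickly with cluster size.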
17
Protein Energy Function
18
Global Smoothing and Continuation

19
Gaussian Transformation

Scheraga et al., 1989, 1992
Shalloway, 1992
Straub et al., 1996
Wu, 1996
Moré and Wu, 1997
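The idea of smoothing followed by continuation can be sketched on a one-dimensional toy problem. The sketch assumes the Gaussian transform is a convolution of the function with a kernel proportional to exp(-(x - y)^2 / lam^2). For the hypothetical test function f(x) = x^4 - 4x^2 + x the transform then has a closed form, x^4 + (3 lam^2 - 4) x^2 + x plus a constant that depends only on lam. The parameter schedule and step size are arbitrary choices for the illustration.

```python
def smoothed_gradient(x, lam):
    """Derivative of the Gaussian transform of f(x) = x**4 - 4*x**2 + x.

    For this polynomial the transform is x**4 + (3*lam**2 - 4)*x**2 + x
    plus a constant in x; lam = 0 gives back the original function.
    """
    return 4.0 * x**3 + 2.0 * (3.0 * lam**2 - 4.0) * x + 1.0


def local_minimize(x, lam, step=0.01, iterations=3000):
    """Gradient descent on the smoothed function for a fixed lam."""
    for _ in range(iterations):
        x -= step * smoothed_gradient(x, lam)
    return x


def continuation(x0, lam_start=1.5, stages=15):
    """Minimize a heavily smoothed function, then trace the minimizer as lam -> 0."""
    x = x0
    for k in range(stages + 1):
        lam = lam_start * (1.0 - k / stages)
        x = local_minimize(x, lam)
    return x


x_plain = local_minimize(2.0, lam=0.0)   # plain descent: local minimizer near 1.35
x_final = continuation(x0=2.0)           # continuation: global minimizer near -1.47
print(x_plain, x_final)
```

For lam at or above roughly 1.15 the smoothed toy function is convex, so the first stage has a single minimizer regardless of the starting point. On this example the traced minimizer ends at the global one, which is not guaranteed in general.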
20
Statistical Averaging

21
Geometric Smoothing

22
Some Simple Transformation
23
High-Dimensional Transformation
24
Transformation of Potential Functions
25
NMR Distance-Based Structural Modeling
Bond Lengths / Angles
NMR Distance Data
Structure
26
Molecular Distance Geometry Problem
Given distances between certain pairs of atoms in
the molecule, find the coordinates of the atoms.
27
Graph Embedding
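Given a weighted graph $G = (V, E, W)$, where $V = \{v_i : i = 1, \ldots, n\}$, $E = \{(v_i, v_j) : (i, j) \in S\}$, and $W = \{w_{i,j} = w(v_i, v_j) : (v_i, v_j) \in E\}$, find points $x_1, \ldots, x_n$ such that $\|x_i - x_j\| = w_{i,j}$ for all $(i, j) \in S$.
[Figure: a triangle graph on vertices v1, v2, v3 with edge weights 3, 4, 5, embedded as points x1, x2, x3 with the same distances.]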
28
Under-Determined System
$kn$ -- total number of coordinates
$k(k+1)/2$ -- $k$ translations, $k(k-1)/2$ rotations
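Written out, the count looks as follows. Here $n$ is the number of atoms and $k$ the dimension of the space, and the three-dimensional case is added only as an illustration.

```latex
\underbrace{kn}_{\text{coordinates}}
\;-\;
\underbrace{\left( k + \frac{k(k-1)}{2} \right)}_{\text{translations and rotations}}
\;=\; kn - \frac{k(k+1)}{2},
\qquad \text{e.g. } k = 3:\; 3n - 6 .
```

With fewer independent distance equations than this number of remaining degrees of freedom, the coordinates cannot be pinned down.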
29
Over-Determined System
30
Inconsistent Data
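[Figure: a triangle graph on vertices v1, v2, v3 with edge weights 3, 4, 8, which cannot be embedded as points x1, x2, x3.]
The triangle inequality, $c \le a + b$, may be violated!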
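A consistency check of this kind takes one line per triangle. The sketch below (function name hypothetical) tests the two sets of edge weights from the triangle figures, 3-4-5 and 3-4-8.

```python
def satisfies_triangle_inequality(a, b, c):
    """True if three edge lengths can be the sides of a (possibly flat) triangle."""
    return a <= b + c and b <= a + c and c <= a + b


print(satisfies_triangle_inequality(3, 4, 5))  # True: a valid triangle
print(satisfies_triangle_inequality(3, 4, 8))  # False: 8 > 3 + 4
```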
31
Flexible Structures
[Figure: a four-vertex graph on v1, v2, v3, v4 and several embeddings x1, x2, x3, x4 of it with the same edge lengths.]
The structure can be deformed continuously without violating any distance constraints.
32
Rigid Structures
[Figure: a four-vertex graph on v1, v2, v3, v4 with additional edges and its embeddings x1, x2, x3, x4.]
The structures cannot be deformed any further!
33
Reflections
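[Figure: a graph on v1, v2, v3, v4 with edge weights 3, 4, 5 and an undetermined distance d, together with embeddings x1, x2, x3, x4 that are related by a reflection.]
Is a rigid structure unique?
Hendrickson 1991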
34
Algorithms and Complexity
When all distances are available: solvable in polynomial time (in P)
When only a subset of distances is available: NP-complete
35
If all distances are given, the problem can be
solved in polynomial time.
36
[Figure: consecutive atoms $x_i$, $x_{i+1}$, $x_{i+2}$, $x_{i+3}$.]
If, for every $i$, all distances between atoms $i$, $i+1$, $i+2$, $i+3$ are given, a solution can be found in polynomial time.
37
[Figure: atoms $x_i$, $x_j$, $x_k$ and a further atom $x_l$.]
Let atoms $i$, $j$, $k$ be three atoms that are not collinear. Then $x_l$ can be determined for any $l \ne i, j, k$ in constant time, if all distances between atom $l$ and atoms $i$, $j$, $k$ are given.
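The constant-time step can be sketched in the plane, matching the two-dimensional figures in this part of the talk. Subtracting the distance equation for atom i from those for atoms j and k leaves a 2-by-2 linear system for the unknown position. In three dimensions three reference atoms leave a mirror-image ambiguity, so a fourth, non-coplanar atom would be needed. The function name and the test coordinates are hypothetical.

```python
import math


def locate(p_i, p_j, p_k, d_i, d_j, d_k):
    """Position in the plane at distances d_i, d_j, d_k from points p_i, p_j, p_k."""
    # Subtracting |x - p_i|^2 = d_i^2 from the other two equations gives
    # 2 (p_j - p_i) . x = d_i^2 - d_j^2 + |p_j|^2 - |p_i|^2, and likewise for p_k.
    a11, a12 = 2.0 * (p_j[0] - p_i[0]), 2.0 * (p_j[1] - p_i[1])
    a21, a22 = 2.0 * (p_k[0] - p_i[0]), 2.0 * (p_k[1] - p_i[1])
    norm_i = p_i[0] ** 2 + p_i[1] ** 2
    b1 = d_i**2 - d_j**2 + p_j[0] ** 2 + p_j[1] ** 2 - norm_i
    b2 = d_i**2 - d_k**2 + p_k[0] ** 2 + p_k[1] ** 2 - norm_i
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("reference points are collinear")
    # Cramer's rule for the 2-by-2 system.
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


# Reference atoms on a 3-4-5 triangle; the unknown atom sits at (1, 2).
x_l = locate((0.0, 0.0), (4.0, 0.0), (0.0, 3.0),
             math.sqrt(5.0), math.sqrt(13.0), math.sqrt(2.0))
print(x_l)
```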
38
[Figure: two arrangements of atoms x1, ..., x8 connected by a sparse set of given distances.]
In general, for an arbitrary $S$, the problem is NP-hard (Saxe 1979).
39
ε-Optimal Solutions
40
It is NP-hard to obtain an ε-approximate solution to the distance geometry problem when ε < 1 / 2n, where n is the number of atoms.
-- Moré and Wu 1996
41
Least-Squares Formulation
42
Inexact Distance Data
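The formulas on these two slides did not survive in the transcript. The sketch below shows one common way to write the error functions, assumed here for illustration: squared residuals of squared distances for exact data, and squared violations of lower and upper bounds for inexact data. All function names and test data are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def exact_error(points, distances):
    """Sum over given pairs of (|x_i - x_j|^2 - d_ij^2)^2; zero for an exact fit."""
    return sum((squared_distance(points[i], points[j]) - d * d) ** 2
               for (i, j), d in distances.items())


def bound_error(points, bounds):
    """Penalize only squared distances that fall outside [lower^2, upper^2]."""
    total = 0.0
    for (i, j), (lower, upper) in bounds.items():
        s = squared_distance(points[i], points[j])
        total += max(lower * lower - s, 0.0) ** 2 + max(s - upper * upper, 0.0) ** 2
    return total


distances = {(0, 1): 4, (0, 2): 3, (1, 2): 5}
bounds = {(0, 1): (3.5, 4.5), (0, 2): (2.5, 3.5), (1, 2): (4.5, 5.5)}
exact_fit = [(0, 0), (4, 0), (0, 3)]
perturbed = [(0, 0), (4, 0), (0, 4)]
print(exact_error(exact_fit, distances), exact_error(perturbed, distances))
print(bound_error(exact_fit, bounds), bound_error(perturbed, bounds))
```

Either function can be handed to a local or global minimizer; a structure is consistent with the data exactly when the error is zero.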
43
X-ray Crystallography

X-ray beam
Protein crystal
X-ray diffraction
44
Electron Density Distribution
45
Magnitudes, Phases, Diffraction Intensities
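In standard crystallographic notation (assumed here, since the slide's formulas are not in the transcript), the electron density $\rho$ is a Fourier series whose coefficients are the structure factors $F_H$. Each structure factor has a magnitude and a phase, and the measured diffraction intensities give only the magnitudes.

```latex
\rho(r) = \frac{1}{V} \sum_{H} F_H \, e^{-2\pi i \, H \cdot r},
\qquad
F_H = |F_H| \, e^{i \varphi_H},
\qquad
I_H \propto |F_H|^2 .
```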
46
The Phase Problem

Given the magnitudes of the structure
factors, find correct phases that define the
electron density distribution function of the
crystal system.
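A one-dimensional toy version shows why magnitudes alone are not enough. The sketch below uses a discrete Fourier transform of a hypothetical eight-point density as a stand-in for the structure factors. Rebuilding the density with the true phases recovers it, while rebuilding it from the same magnitudes with all phases set to zero does not.

```python
import cmath


def dft(values):
    """Discrete Fourier transform, standing in for the structure factors."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * cmath.pi * t * k / n) for t, v in enumerate(values))
            for k in range(n)]


def inverse_dft(coeffs):
    """Inverse transform; returns the real part of the reconstructed density."""
    n = len(coeffs)
    return [(sum(c * cmath.exp(2j * cmath.pi * t * k / n)
                 for k, c in enumerate(coeffs)) / n).real
            for t in range(n)]


density = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]
factors = dft(density)
magnitudes = [abs(f) for f in factors]      # what the experiment measures
phases = [cmath.phase(f) for f in factors]  # what the experiment loses

with_true_phases = inverse_dft([m * cmath.exp(1j * p) for m, p in zip(magnitudes, phases)])
with_zero_phases = inverse_dft(magnitudes)

print([round(v, 3) for v in with_true_phases])  # matches the original density
print([round(v, 3) for v in with_zero_phases])  # same magnitudes, wrong density
```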
47
Direct Methods
Nobel Prize 1985
  • Karle and Hauptman (1950s)
      • nonlinear least squares
      • joint probability distribution
      • successful for small molecules
  • Bricogne (1984, 1988, 1993, 1997)
      • Bayesian statistical approach
      • statistical mechanics / information theory
      • apply to macromolecules

48
Entropy Maximization for Statistical Phase Estimation

49
The Lagrangian

50
The Dual Problem
51
The Dual Problem

52
Newton's Method
53
A Fast Newton's Method (Wu, Phillips, Tapia, Zhang 2001)

Sherman-Morrison-Woodbury Formula
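How the formula enters the authors' algorithm is not shown in the transcript. As a general illustration, the sketch below applies its rank-one special case (the Sherman-Morrison formula) to solve a system whose matrix is a diagonal plus a rank-one term. The solve costs O(n) operations instead of a full factorization. The function name and test data are hypothetical.

```python
def solve_diagonal_plus_rank_one(d, u, v, b):
    """Solve (D + u v^T) x = b with D = diag(d), using the Sherman-Morrison formula.

    x = D^{-1} b - D^{-1} u * (v^T D^{-1} b) / (1 + v^T D^{-1} u)
    """
    dinv_b = [bi / di for bi, di in zip(b, d)]
    dinv_u = [ui / di for ui, di in zip(u, d)]
    denominator = 1.0 + sum(vi * zi for vi, zi in zip(v, dinv_u))
    scale = sum(vi * yi for vi, yi in zip(v, dinv_b)) / denominator
    return [yi - scale * zi for yi, zi in zip(dinv_b, dinv_u)]


d = [2.0, 3.0, 4.0]
u = [1.0, 1.0, 1.0]
v = [1.0, 2.0, 3.0]
b = [1.0, 2.0, 3.0]
x = solve_diagonal_plus_rank_one(d, u, v, b)
print(x)  # approximately [-0.2, 0.2, 0.4]
```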
54
Newton Step

55
  • The fast Newton's algorithm converges quadratically to the solution of the
    entropy maximization problem and requires only O(n log n) floating-point
    operations per iteration.
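Quadratic convergence means the error is roughly squared at every iteration. The sketch below shows this for plain Newton's method on a hypothetical scalar equation, x^2 - 2 = 0. It illustrates only the convergence rate, not the fast Newton algorithm itself.

```python
import math


def newton_errors(x, iterations=3):
    """Errors |x_k - sqrt(2)| of Newton's method applied to x**2 - 2 = 0."""
    root = math.sqrt(2.0)
    errors = [abs(x - root)]
    for _ in range(iterations):
        x -= (x * x - 2.0) / (2.0 * x)   # Newton step: x - f(x) / f'(x)
        errors.append(abs(x - root))
    return errors


errors = newton_errors(1.5)
for k, e in enumerate(errors):
    print(k, e)   # the number of correct digits roughly doubles each step
```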

56
Summary
  • Introduction
      • DNA, RNA, Protein
      • Structure Prediction and Determination
      • Theoretical and Experimental Approaches
  • Molecular Dynamics Simulation
      • Physical and Mathematical Models
      • Numerical Simulation
      • Simulation of Folding / Misfolding
  • Potential Energy Minimization
      • Potential Energy Functions
      • Global Minimization
      • Global Smoothing and Continuation
  • NMR Distance-Based Modeling
      • Molecular Distance Geometry Problem
      • Geometric Properties
      • Least-Squares Formulation
  • X-ray Crystallography Computing
      • Phase Problem
      • Entropy Maximization for Phase Estimation