Title: The University of Texas Health Science Center
1The University of Texas Health Science Center at San Antonio, Department of Biochemistry
Borries Demeler, Ph.D.
Computational Challenges in Biophysics: Two Applications in Hydrodynamics
2Hydrodynamic Studies of Biological Macromolecules
- Using hydrodynamic approaches, we can model transport processes: sedimentation and diffusion
- Analytical Ultracentrifugation
- Experimental Background
- Models and Parameterization
- Optimization
- Bead Modeling
- Examples
3Hydrodynamic Studies of Biological Macromolecules
- Why study the structure and function of macromolecules and macromolecular assemblies with analytical ultracentrifugation (AUC)?
- Molecules can be studied in a physiological environment (pH, concentration, ionic strength, oxidation state, ligands, etc.)
- Molecules are not fixed to a microscope grid
- Molecules are not distorted by crystal packing forces
- Very large size range (10^2 to 10^8 Dalton, complements cryo-EM and NMR)
- Several detectors available (Rayleigh Interference, UV/VIS, Fluorescence Emission, Schlieren, Turbidity, MWL, Raman, SALS)
- Dynamic processes can be studied
- Conformational changes, reversible self-association, binding strengths, slow kinetics
- Composition analysis
- Partial concentration, molecular weight, shape
- First Principles approach
5AUC Experimental Setup
7AUC Background: Experiment at Rest
At Rest: rotor speed 0
8AUC Background: Sedimentation Velocity
Sedimentation Velocity: duration is hours, high rotor speed
At Rest: rotor speed 0
9AUC Background: Sedimentation Equilibrium
Sedimentation Velocity: duration is hours, high rotor speed
Sedimentation Equilibrium: duration is > 1 day, low rotor speed
At Rest: rotor speed 0
10Sedimentation Velocity
Sedimentation velocity profile of a mixture of
macromolecules over time
15Sedimentation Velocity
Composition Analysis: We need to answer these questions
- How many components?
- What are their molecular weights?
- What are their shapes?
- What is the partial concentration of each component?
- What is the reliability of our measurement?
16Initialization of Genetic Algorithms
Diffusion coefficients are randomly assigned based on a reasonable range from the frictional ratio parameterization, k = f/f0
k = 1.0
k ranges from 1.0 to 4.0 for most proteins, higher for rod-shaped and unfolded proteins, DNA, fibrils and aggregates or linear molecules
k = 1.2 - 2.5
k > 3
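One way to turn a randomly drawn frictional ratio into a diffusion coefficient is sketched below. It is not taken from the slides: it assumes an equivalent-sphere model (Stokes friction combined with the Svedberg relation), cgs units, and illustrative defaults for water at 20 °C and a typical protein partial specific volume. The function names and default values are hypothetical.

```python
import math
import random

R = 8.314462e7        # gas constant in erg/(mol K), cgs units
N_A = 6.02214076e23   # Avogadro's number in 1/mol


def diffusion_from_s_and_k(s, k, vbar=0.72, rho=0.998, eta=0.01002, temp=293.15):
    """Return D in cm^2/s for a sedimentation coefficient s (seconds) and k = f/f0.

    Assumed defaults: vbar in cm^3/g, rho in g/cm^3, eta in poise, temp in kelvin.
    Derived from s = M(1 - vbar*rho)/(N_A*f), f = k*6*pi*eta*r0 and
    M*vbar/N_A = (4/3)*pi*r0^3 for the equivalent sphere of radius r0.
    """
    buoyancy = 1.0 - vbar * rho
    root = math.sqrt(s * vbar / (2.0 * buoyancy))
    return R * temp / (N_A * 18.0 * math.pi * (k * eta) ** 1.5 * root)


def random_solute(s_min, s_max, k_min=1.0, k_max=4.0, rng=random):
    """Draw s and k uniformly from their ranges and return the pair (s, D)."""
    s = rng.uniform(s_min, s_max)
    k = rng.uniform(k_min, k_max)
    return s, diffusion_from_s_and_k(s, k)
```

Under these assumptions D scales as k^(-3/2) at fixed s, so restricting k to a physically reasonable interval also restricts the diffusion coefficients the search can visit.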
18Optimization Approach 1
Solving the inverse problem of finding the parameters for the finite element solution of the Lamm equation that best reconstructs the experimental data
Solution: Use a stochastic optimization approach
Example: Genetic Algorithms
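For reference, the Lamm equation in its standard textbook form is given below; the slides do not write it out. Here c(r, t) is the solute concentration at radius r and time t, ω is the angular velocity of the rotor, and s and D are the sedimentation and diffusion coefficients that the fit has to recover.

```latex
\frac{\partial c}{\partial t}
  = \frac{1}{r}\,\frac{\partial}{\partial r}
    \left[\, r D \frac{\partial c}{\partial r} \;-\; s\,\omega^{2} r^{2} c \,\right]
```

The inverse problem is then to choose s and D for each component, together with its concentration, so that the sum of the simulated solutions matches the measured concentration profiles.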
19Genetic Algorithms (GA)
- Genetic algorithms (GA) provide a stochastic optimization method
- Evolutionary paradigm for adjusting parameters
- Mutation, recombination, deletion, insertion, crossover operators
- Random number generators are used to manipulate operators
- Generational Model: survival of the fittest (...fitting function)
- Generation → iterations, genes → parameter strings, bases → s, D
- J.H. Holland, Adaptation in Natural and Artificial Systems, 1975, U. of Michigan Press
- J.R. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection, 1992, MIT Press
20GA genes
Genes are strings of parameters; each gene consists of pairs of corresponding sedimentation and diffusion coefficients, one pair per component.
Gene:
S1 S2 S3 ... Sn
D1 D2 D3 ... Dn
Each column (Si, Di) is one component, from Component 1 to Component n.
22Mutation
Gene A
Generation 1: S1a S2a S3a ... Sna / D1a D2a D3a ... Dna
Generation 2: S1a S2a S3a ... Snc / D1a D2a D3a ... Dna (mutation: Sna becomes Snc)
Gene B
Generation 1: S1b S2b S3b ... Snb / D1b D2b D3b ... Dnb
Generation 2: S1b S2b S3b ... Snb / D1b D2c D3b ... Dnb (mutation: D2b becomes D2c)
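The gene layout and the mutation operator can be sketched in a few lines. This is an illustration only: the function names, the Gaussian perturbation, and the `rel_step` parameter are assumptions, and the slides do not specify how a mutated value is drawn.

```python
import random


def make_gene(n, s_range, d_range, rng):
    """A gene is a list of (s, D) pairs, one pair per component."""
    return [(rng.uniform(*s_range), rng.uniform(*d_range)) for _ in range(n)]


def mutate(gene, s_range, d_range, rng, rel_step=0.05):
    """Return a copy of the gene in which one s or one D value is perturbed.

    The perturbation is Gaussian with a width of rel_step times the allowed
    range (an assumed choice), and the result is clamped to that range.
    """
    child = list(gene)                 # the parent gene is left untouched
    i = rng.randrange(len(child))      # pick one component at random
    s, d = child[i]
    if rng.random() < 0.5:             # mutate s ...
        span = s_range[1] - s_range[0]
        s = min(max(s + rng.gauss(0.0, rel_step * span), s_range[0]), s_range[1])
    else:                              # ... or mutate D
        span = d_range[1] - d_range[0]
        d = min(max(d + rng.gauss(0.0, rel_step * span), d_range[0]), d_range[1])
    child[i] = (s, d)
    return child
```

For example, `mutate(gene, (1e-13, 1e-12), (1e-7, 1e-6), random.Random(0))` changes at most one coefficient of `gene`, mirroring the single changed entry in each gene of the diagram.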
25Genetic Algorithm Implementation
- Concentration values of components j in each gene G are determined with NNLS (a linear fitting approach)
- Mutation/Crossover/Recombination operators are applied
- Progeny is calculated and this process is iterated
- Deme migration and regularization rates are applied
- Typically 100 individuals/deme
- 30-50 generations lead to convergence
- Lawson, C. L. and Hanson, R. J. 1974. Solving Least Squares Problems. Prentice-Hall, Inc., Englewood Cliffs, New Jersey
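The generational loop can be illustrated with a deliberately small sketch. It is not the UltraScan implementation: genes are plain lists of real numbers, the fitness is any user-supplied function to minimise (standing in for the residual between Lamm-equation simulation and data), and the NNLS concentration step, deme migration, and regularization are left out. All names and default rates are assumptions.

```python
import random


def evolve(fitness, n_params, bounds, pop_size=100, generations=40,
           mutation_rate=0.2, seed=0):
    """Minimise fitness(gene) with a single-deme generational GA.

    Returns the best gene found and the best fitness per generation.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [[rng.uniform(lo, hi) for _ in range(n_params)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness)                    # rank by fit quality
        history.append(fitness(pop[0]))
        survivors = pop[: pop_size // 2]         # survival of the fittest
        children = [list(pop[0])]                # elitism: keep the best gene
        while len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_params) if n_params > 1 else 0
            child = a[:cut] + b[cut:]            # single-point crossover
            if rng.random() < mutation_rate:     # occasional mutation
                i = rng.randrange(n_params)
                step = rng.gauss(0.0, 0.1 * (hi - lo))
                child[i] = min(max(child[i] + step, lo), hi)
            children.append(child)
        pop = children
    pop.sort(key=fitness)
    history.append(fitness(pop[0]))
    return pop[0], history
```

Because the best gene is always copied into the next generation, the recorded best fitness never gets worse from one generation to the next. The defaults of 100 individuals and 40 generations follow the figures quoted on the slide.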
26Optimization Approach 2
Solving the inverse problem of finding the parameters for the finite element solution of the Lamm equation that best reconstructs the experimental data
Solution: Linearize the problem
Example: 2-dimensional Spectrum Analysis
282-D Spectrum Analysis
Blue: initial grid points. Perform NNLS and save non-zero coefficients.
292-D Spectrum Analysis
NNLS result. Blue: zero; Red: positive value
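The grid-then-NNLS step can be sketched as follows. Two caveats: the solver is a simple coordinate-descent routine swapped in for brevity, not the Lawson-Hanson algorithm cited in the slides, and in a real analysis each column would hold the simulated Lamm-equation solution for one grid point. The function names and the threshold are hypothetical.

```python
def make_grid(s_min, s_max, n_s, k_min, k_max, n_k):
    """Regular grid of (s, k) points; needs n_s >= 2 and n_k >= 2."""
    ds = (s_max - s_min) / (n_s - 1)
    dk = (k_max - k_min) / (n_k - 1)
    return [(s_min + i * ds, k_min + j * dk) for i in range(n_s) for j in range(n_k)]


def nnls_coordinate_descent(columns, b, sweeps=500, tol=1e-12):
    """Minimise ||A x - b||^2 subject to x >= 0, with A given as a list of columns."""
    x = [0.0] * len(columns)
    residual = list(b)                                   # b - A x for x = 0
    norms = [sum(v * v for v in col) for col in columns]
    for _ in range(sweeps):
        largest_change = 0.0
        for j, col in enumerate(columns):
            if norms[j] == 0.0:
                continue
            step = sum(c * r for c, r in zip(col, residual)) / norms[j]
            new_xj = max(0.0, x[j] + step)               # enforce non-negativity
            delta = new_xj - x[j]
            if delta != 0.0:
                residual = [r - delta * c for r, c in zip(residual, col)]
                x[j] = new_xj
                largest_change = max(largest_change, abs(delta))
        if largest_change < tol:
            break
    return x


def nonzero_solutes(grid, x, threshold=1e-9):
    """Keep only grid points whose fitted amplitude is positive (the red points)."""
    return [(point, amp) for point, amp in zip(grid, x) if amp > threshold]
```

The non-negativity constraint is what makes most grid points come out exactly zero, so the surviving points form a sparse spectrum.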
302-D Spectrum Analysis
1. Filter nonzero results
2. Refine grid
3. Perform Monte Carlo
4. Initialize GA parameter space
5. Use GA analysis to increase parsimony
6. Use Monte Carlo to reduce signal from noise
31Final Result is used to initialize GA
Idea: Build probability surfaces around each non-zero entry and use the surface to initialize the GA. Surfaces can be circular, elliptical, or rectangular. Probabilities from neighboring points add up.
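A minimal sketch of the rectangular case: each initial GA value is drawn by picking one non-zero (s, k) entry at random and sampling uniformly inside a rectangle centred on it. Where rectangles from neighboring entries overlap, the sampling density is the sum of the individual densities. The function name, the equal weighting of entries, and the clamp of k to a minimum of 1.0 are assumptions, not details from the slides.

```python
import random


def sample_initial_solutes(solutes, half_width_s, half_width_k, n, rng):
    """Draw n starting points (s, k) from rectangles around the given solutes."""
    points = []
    for _ in range(n):
        s0, k0 = rng.choice(solutes)                    # pick one non-zero entry
        s = rng.uniform(s0 - half_width_s, s0 + half_width_s)
        k = rng.uniform(k0 - half_width_k, k0 + half_width_k)
        points.append((s, max(1.0, k)))                 # k = f/f0 kept at or above 1
    return points
```

Weighting the choice of entry by its fitted amplitude, or using circular or elliptical regions instead, would be natural variations on the same idea.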
322-D Spectrum Analysis refinement - Example
- Simulate a 5-component system with heterogeneity in shape and mass
- Add stochastic noise equivalent to instrument
332-D Spectrum Analysis refinement - Example
- Final result is not parsimonious: doesn't satisfy Occam's razor
- Solution is over-determined
- Noise contributes to false positives
34Implementation of Monte Carlo Method
- The Monte Carlo method is a stochastic approach that can be used to identify the effect noise has on the reliability of determined parameters. With the Monte Carlo approach the statistical confidence limits of each measured parameter can be determined.
- Recipe for Monte Carlo
1. Obtain a best-fit solution from regularized GA fit and confirm that the residuals are random and without systematic deviation
2. Generate new synthetic Gaussian noise with the same quality as was observed in the original experiment and add it to the best-fit solution
3. Re-fit the solution
4. Repeat (2-3) at least 100 times and collect all parameter values
5. Calculate statistics from Monte Carlo distribution for each parameter
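The recipe can be made concrete with a toy model. In the sketch below a straight-line least-squares fit stands in for the regularized GA fit, which is an assumption made purely to keep the example short; the noise level is estimated from the residuals of the best fit. Function names are hypothetical.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least-squares slope and intercept (stand-in for the real fit)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def monte_carlo(x, y, iterations=100, seed=0):
    """Return {parameter: (mean, standard deviation)} over the Monte Carlo fits."""
    rng = random.Random(seed)
    slope, intercept = fit_line(x, y)                    # best-fit solution
    best = [slope * xi + intercept for xi in x]
    sigma = statistics.pstdev([yi - bi for yi, bi in zip(y, best)])
    slopes, intercepts = [], []
    for _ in range(iterations):
        # new synthetic Gaussian noise of the observed quality, added to the best fit
        noisy = [bi + rng.gauss(0.0, sigma) for bi in best]
        m, c = fit_line(x, noisy)                        # re-fit
        slopes.append(m)
        intercepts.append(c)
    return {
        "slope": (statistics.mean(slopes), statistics.stdev(slopes)),
        "intercept": (statistics.mean(intercepts), statistics.stdev(intercepts)),
    }
```

The standard deviations returned here play the role of the confidence limits mentioned above; percentiles of the collected values could be reported instead when the distributions are not symmetric.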
352-D Spectrum Analysis refinement - Example
- Perform 2DSA Monte Carlo analysis to amplify signal linearly
- Stochastic noise only amplifies with
362-D Spectrum Analysis - Refinement
- Stochastic noise signals disappear when frequency is plotted
- Sample signal is amplified
38Genetic Algorithm Analysis - Refinement
- Genetic Algorithm produces parsimonious solution
- Still affected by stochastic noise
40Global Genetic Algorithm Monte Carlo Analysis
- Add low-speed data to emphasize diffusion signal
- Perform global GA Monte Carlo analysis
41High-Performance Computing Implementation
The 2-dimensional Spectrum Analysis and the Genetic Algorithms present several parallelization targets:
- Calculation of finite element models for noninteracting solutes
- Calculation of individual genes
- Parallelization of NNLS
- Calculation of each subgrid
- Calculation of individual demes
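Several of these targets share one pattern: many independent evaluations that can be farmed out to workers and collected in order. The sketch below shows that pattern for gene evaluation with Python's standard thread pool. It only illustrates the structure and is not the slides' cluster or grid implementation; for CPU-bound work in CPython a process pool or a message-passing layer would be needed to obtain a real speed-up. The function name is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate_genes(genes, fitness, max_workers=4):
    """Evaluate fitness(gene) for every gene concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fitness, genes))
```

The same shape applies to subgrids and demes: each unit of work is independent, so results only need to be gathered once all workers have finished.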
42Bioinformatics Core Facility (BCF) Cluster Master
Node
UltraScan LIMS DB
Webserver
Local Supercomputer
TIGRE HPC Grid
45CuZn Superoxide Dismutase mutant - freshly purified hSOD mutant protein (Data kindly provided by P.J. Hart and Sai Venkatesh Seetharaman, UTHSCSA/Biochemistry)
46CuZn Superoxide Dismutase mutant - After 7 days showing clear signs of degradation (monomer species is visible) and aggregation and unfolding. Aggregation is proceeding in an end-to-end fashion forming a fibril-like conformation. Dimer peak is showing several conformations. (Data kindly provided by P.J. Hart and Sai Venkatesh Seetharaman, UTHSCSA/Biochemistry)
47Mo30Fe72 (Keplerate)
48Example: Clathrin baskets assembling from clathrin triskelia (A). The sample also displays several nonglobular species which represent the building block subunits required for assembly of intact baskets (B, D). Sample shows a large heterogeneity of different sized baskets that assume a mostly globular form with a unity frictional ratio (B, C). (Data kindly provided by E. Lafer, UTHSCSA, Dept. of Biochemistry)
51eNOS-NOSIP Binding Experiment: GA/MC analysis
ENOS-NOSIP 10
ENOS-NOSIP 11
ENOS-NOSIP 10
ENOS-NOSIP 15
ENOS-NOSIP 110
52UltraScan Software
- Open Projects
- Analysis Software
- Add interacting models to GA, 2DSA for hetero-associating systems and reversibly self-associating systems, develop new optimization methods
- Implement Monte Carlo methods for accurate confidence limits
- Enhance Web-based application interfaces
- Global analysis
- Additional parallelizations
- Software Developments for new Detectors
- Multiwavelength UV/Visible detectors (project is in collaboration with Dr. Helmut Cölfen, Max Planck Institute, Berlin, Germany)
53Project 2: Bead Modeling
Use Bead Modeling to represent atomic structures from NMR and X-ray Crystallography and calculate hydrodynamic parameters for the model from an assembly of beads (Project in collaboration with Dr. Mattia Rocco, Italian Cancer Research Institute, Genova, Italy)
56Program SOMO (SOlution MOdeller): generating medium-resolution bead models from atomic coordinates
Main features: 1 bead/side chain, 1 bead/peptide bond. Water of hydration included in each bead.
A→B: Exposed side-chain beads are placed.
B→C: Beads overlapping by more than a preset threshold can be fused together. Overlaps are then removed hierarchically, reducing the radii and outward translating the centers of exposed beads.
C→D: Exposed peptide bond beads are placed and overlaps removed.
D→E: Buried beads are placed and overlaps removed. They should be excluded from the computations of the hydrodynamic parameters.
57Program UltraScan SOMO (SOlution MOdeller)
Front-end GUI: used to enter new residues and to define beads
58Program UltraScan SOMO (SOlution MOdeller)
- Open Projects
- Introduce flexibility between beads consistent with bond constraints and model conformational heterogeneity and Brownian motion.
- This will provide translational diffusion coefficients, and rotational diffusion coefficients from trajectories.
- Software needs to be developed to implement parallel Monte Carlo approaches for simulation
59Acknowledgements
- Department of Applied Mathematics
- Dr. Weiming Cao
- Department of Computer Science
- Dr. Raj Boppana
- Emre Brookes
UTSA
Warren Smith, Ashok Adiga, and the rest of the good folks at TACC, the TIGRE Team from HIPCAT, and all the TIGRE sites that spent time to provide us with access and help to get the software working.
Max-Planck Institute for Colloids and Interfaces
- Department of Biochemistry
- Virgil Schirf
- Jeremy Mann
- Yu Ning
- Bruce Dubbs
- Dan Zollars
- Funding
- National Science Foundation
- NSF Teragrid
- San Antonio Life Science Institute
- Howard Hughes Medical Institute
- UT Permanent University Fund