Title: Solution NMR Structure Calculation and Automated NOESY Spectral Analysis using RADAR
1Solution NMR Structure Calculation and Automated
NOESY Spectral Analysis using RADAR
- Torsten Herrmann
- Eidgenössische Technische Hochschule Zürich, Switzerland
- The Scripps Research Institute, CA, USA
2Content
- Protein structure determination by NMR
- Conformational constraints
- Hybrid energy function
- Algorithms for 3D structure calculation
- Structure calculation using DYANA
- Molecular dynamics simulation
- Torsion angle dynamics (TAD)
- Distance information from NOEs
- Automated structure determination using RADAR
- NOE assignment problem
- NOE assignment with CANDID
- NOE identification with ATNOS
- Criteria to judge correctness of resulting 3D
structure
3Protein Structure Determination by NMR
Protein Sample
NMR spectroscopy
Processing of NMR data
Resonance assignment
Conformational constraints
3D protein structure
Structure analysis refinement
4Conformational constraints
- NMR provides indirect information about 3D structure
- chemical shifts → torsion angles
- coupling constants → torsion angles
- NOEs → interproton distances
- residual dipolar couplings → bond orientation
- NMR data describe the local conformation of the protein. The dense network of constraints yields the protein 3D structure.
5Hybrid energy function
- Structure calculation: minimization of a hybrid energy function (target function) which incorporates
- experimental NMR data
- a priori information (force field)

$$E_{hybrid} = \sum_i w_i E_i = w_{bond}E_{bond} + w_{angle}E_{angle} + w_{dihedral}E_{dihedral} + w_{improper}E_{improper} + w_{vdW}E_{vdW} + w_{NOE}E_{NOE} + w_{torsion}E_{torsion} + \dots$$
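The weighted sum above can be sketched in a few lines of Python. This is only an illustration of the formula: the term names, energies and weights below are hypothetical placeholders, not values used by any structure calculation program.

```python
def hybrid_energy(terms, weights):
    """Return sum_i w_i * E_i over the named energy terms."""
    return sum(weights[name] * energy for name, energy in terms.items())


# Hypothetical energies and weights, for illustration only.
terms = {"bond": 1.2, "angle": 3.4, "vdW": 0.5, "NOE": 10.0}
weights = {"bond": 1.0, "angle": 1.0, "vdW": 1.0, "NOE": 0.5}

print(hybrid_energy(terms, weights))
```

Changing the weight of the experimental terms relative to the force-field terms shifts the balance between fitting the NMR data and keeping ideal geometry.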
6Algorithms for 3D structure calculation
- Metric matrix distance geometry
- DISGEO
- Variable target function approach
- DISMAN, DIANA
- Simulated annealing using Cartesian coordinates
- XPLOR
- Simulated annealing using torsion angles
- XPLOR, CNS, DYANA
7Solution NMR structure calculation
- DYANA torsion angle dynamics algorithm
8Molecular dynamics simulation
- MD numerically solves Newton's equations of motion in order to obtain a trajectory for the molecular system.
- Standard MD tries to simulate the behaviour of a real physical system as closely as possible.
- MD used for NMR structure calculation searches the conformational space of the protein for the 3D structure that fulfills all the restraints
- → simulated annealing using the hybrid energy function
- An important difference between MD and gradient minimization of a target function is the presence of kinetic energy.
9Minimization by molecular dynamics
- MD solves Newton's equations of motion: a trajectory is obtained by numerical calculation of the coordinates and velocities using small time steps $\Delta t$.
- MD can overcome local energy barriers using the kinetic energy $E_{kin}$.
- Temperature control and variation define the protocol for minimization of the hybrid energy function by simulated annealing.
[Figure: energy landscape of the protein, energy E against coordinate x, shown at high temperature and at low temperature]
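To make the idea of minimization by molecular dynamics concrete, here is a minimal sketch on a toy one-dimensional energy landscape. Everything in it is an assumption for illustration: the potential, the unit mass, the Langevin-type thermostat and the linear cooling are not taken from DYANA or from the slides, which only state that temperature control drives the annealing.

```python
import math
import random


def potential(x):
    """Toy 1D energy landscape with two minima (purely illustrative)."""
    return x**4 - 4.0 * x**2 + x


def force(x):
    """Force = negative derivative of the toy potential."""
    return -(4.0 * x**3 - 8.0 * x + 1.0)


def anneal(x0, n_steps=20000, dt=0.01, t_high=8.0, gamma=1.0, seed=0):
    """Integrate the equation of motion while the temperature is lowered."""
    rng = random.Random(seed)
    x, v = x0, 0.0
    n_hot = n_steps // 5  # constant high temperature for 1/5 of the steps
    for step in range(n_steps):
        if step < n_hot:
            temp = t_high
        else:
            # Linear cooling to zero over the remaining steps (assumed form).
            temp = t_high * (n_steps - 1 - step) / (n_steps - 1 - n_hot)
        # Langevin thermostat (assumption): friction plus random kicks
        # keep the kinetic energy close to the target temperature.
        kick = math.sqrt(2.0 * gamma * temp * dt) * rng.gauss(0.0, 1.0)
        v += dt * (force(x) - gamma * v) + kick
        x += dt * v
    return x


x_final = anneal(x0=2.5)
print(x_final, potential(x_final))
```

At high temperature the kinetic energy lets the particle cross the barrier between the two wells; as the temperature falls it settles into one of them, which is the behaviour a plain gradient minimizer started at the same point cannot show.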
10Torsion angle dynamics (TAD)
- Newton's equations in generalized coordinates, $\theta_1, \dots, \theta_n$

Quantity | Cartesian coordinates | Torsion angle space
Degrees of freedom | 3N coordinates $x_1, \dots, x_N$ | n torsion angles $\theta_1, \dots, \theta_n$
Equation of motion | Newton's equations: $m_i \ddot{x}_i = -\nabla_i E_{pot}$ | Lagrange equations: $\frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}_k} - \frac{\partial L}{\partial \theta_k} = 0$, with $L = E_{kin} - E_{pot}$
Computational complexity | proportional to N | $\propto n^3$ (linear equations), $\propto n$ (tree structure)

Exploiting the tree structure of proteins, the computational cost for TAD is proportional to the system size.
11Simulated annealing protocol
- Structure calculation is started from a conformation with all torsion angles treated as independent, uniformly distributed random variables.
- Short minimization: 100 conjugate gradient steps at target level 3, then 100 conjugate gradient steps at target level ?
- TAD at constant high temperature: 1/5 of all steps
- TAD with slow cooling close to zero temperature: 4/5 of all steps
- Incorporation of all hydrogen atoms in the check for steric repulsion: 100 conjugate gradient steps, followed by 100 TAD steps at T = 0.
- Final minimization consisting of 1000 conjugate gradient steps.
[Figure: temperature T, time step $\Delta t$ and torsion angle changes $\Delta\theta$ over the course of the protocol]
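The stages of the protocol can be written down as a simple schedule. The sketch below only encodes the step counts listed on the slide; the function names are hypothetical, the second target level is left unnamed because its symbol is not legible in the source, and the linear form of the cooling is an assumption (the slide only says "slow cooling").

```python
def protocol_stages(n_tad_steps):
    """List (stage, number of steps) in the order given on the slide."""
    n_hot = n_tad_steps // 5
    return [
        ("conjugate gradient, target level 3", 100),
        ("conjugate gradient, second target level", 100),
        ("TAD, constant high temperature", n_hot),
        ("TAD, slow cooling", n_tad_steps - n_hot),
        ("conjugate gradient, all hydrogens included", 100),
        ("TAD at T = 0", 100),
        ("final conjugate gradient", 1000),
    ]


def tad_temperature(step, n_tad_steps, t_high):
    """Temperature at a given TAD step: constant, then cooled to zero."""
    n_hot = n_tad_steps // 5
    if step < n_hot:
        return t_high
    n_cool = n_tad_steps - n_hot
    # Linear cooling is an assumption made for this sketch.
    return t_high * (1.0 - (step - n_hot + 1) / n_cool)


for name, n in protocol_stages(4000):
    print(f"{n:5d}  {name}")
```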
12Distance information from NOEs
- Conversion of NOE into distance information
- Isolated spin approximation: $NOE_{AB} \approx C_{cal}\, d_{AB}^{-6}$
- The calibration constant $C_{cal}$ can be derived from reference distances: $C_{cal} = NOE_{ref} / d_{ref}^{-6} = NOE_{ref}\, d_{ref}^{6}$
- The reference distance can either be
- a covalently fixed distance or
- an average distance $d_{ref} = \left(\frac{1}{N}\sum_k d_k^{-6}\right)^{-1/6}$
- Treatment of NOE information during simulated annealing
- Use of an upper distance bound, b, instead of a fixed distance: $E_{NOE} = \Theta(d_{AB}^{struct} - b)\,(d_{AB}^{struct} - b)^2$, where $\Theta$ is the step function, so that only distances exceeding b are penalized
- Lower bound is given by vdW repulsion
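A short sketch of the calibration and of the upper-bound penalty, assuming the isolated spin approximation NOE = C_cal · d⁻⁶ stated above. The function names and the numerical values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def calibration_constant(noe_ref, d_ref):
    """C_cal from a reference peak, using NOE_ref = C_cal * d_ref**-6."""
    return noe_ref * d_ref**6


def noe_to_distance(noe, c_cal):
    """Invert NOE = C_cal * d**-6 to obtain the distance d."""
    return (c_cal / noe) ** (1.0 / 6.0)


def average_reference_distance(distances):
    """d_ref = ((1/N) * sum_k d_k**-6) ** (-1/6)."""
    mean_inv6 = sum(d**-6 for d in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)


def noe_penalty(d_struct, bound):
    """Zero if the distance respects the upper bound, else squared violation."""
    violation = d_struct - bound
    return violation**2 if violation > 0 else 0.0


# Hypothetical reference: peak volume 1000 at a fixed distance of 2.5 Å.
c_cal = calibration_constant(1000.0, 2.5)
print(noe_to_distance(1000.0 / 64.0, c_cal))  # a peak 64 times weaker
print(noe_penalty(5.5, 5.0))
```

Because of the sixth power, a peak 64 times weaker than the reference corresponds to only twice the reference distance.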
13Automated NMR structure determination
- Automated NOESY spectral analysis using RADAR
14Content
- Overview and motivation
- CANDID algorithm
- ATNOS algorithm
- Criteria to judge correctness of result
- Proof of principle
15Automated NOESY spectral analysis
Protein sequence, chemical shift list, NOESY spectra
- Automated methods
- more efficient
- more exhaustive data evaluation
- more objective
- Iterative process
- all but the first cycle use the intermediate structures from the preceding cycle
- Correctness of cycle 1 is crucial for the reliability of the automated approach
NOE identification
NOE assignment
Structure calculation
Assigned NOESY spectra, 3D protein structure
16RADAR incorporates the analysis of the raw NMR
data into the process of automated NMR structure
determination.
17RADAR: Raw data analysis in NMR
RADAR incorporates and tightly merges the
functionalities of the original algorithms ATNOS
and CANDID. So far, ATNOS and CANDID are
implemented as subroutines of the DYANA torsion
angle dynamics algorithm. An autonomous version,
named RADAR, is in preparation.
- ATNOS for automated NOESY peak picking and NOE signal identification
- CANDID for automated NOE assignment
18CANDID Combined Automated NOESY Assignment and
Structure Determination Module
- NOE assignment problem
- Ambiguous distance constraints
- Network-anchored assignment
- Constraint combination
19NOE assignment problem
- Experimental uncertainties in the determination of chemical shifts and peak positions require the use of chemical shift tolerance windows $\Delta\omega_{tol}$.
- → multiple initial assignment possibilities based on chemical shift agreement
- → only a minority of peaks can be unambiguously assigned solely based on chemical shift agreement
- Primary selection criteria for NOE assignments are chemical shift agreement and spatial proximity in the 3D structure.

$\omega_1 = \omega_A \pm \Delta\omega_{tol}$, $\omega_2 = \omega_{B1} \pm \Delta\omega_{tol}$, $\omega_2 = \omega_{B2} \pm \Delta\omega_{tol}$
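The tolerance-window matching described above can be sketched as follows. This is a minimal illustration, not the CANDID implementation: the atom names, shifts and the helper name `initial_assignments` are hypothetical.

```python
def initial_assignments(peak, shifts, tol):
    """Return all atom pairs whose shifts match the peak within +/- tol."""
    w1, w2 = peak
    first = [atom for atom, w in shifts.items() if abs(w - w1) <= tol]
    second = [atom for atom, w in shifts.items() if abs(w - w2) <= tol]
    return [(a, b) for a in first for b in second if a != b]


# Hypothetical 1H chemical shifts in ppm.
shifts = {"A": 8.20, "B1": 4.31, "B2": 4.32, "C": 1.05}

# A peak at (8.20, 4.315) ppm matches both B1 and B2 in the second dimension.
print(initial_assignments((8.20, 4.315), shifts, tol=0.02))
```

With two protons only 0.01 ppm apart, the peak has two initial assignments, which is the ambiguity the later selection criteria have to resolve.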
20Chemical shift-based assignment
- 2D NOESY spectrum
- Peaks with 1 assignment: $N^{(1)} = N(1-p)^{2n-2} \approx N\exp(-2np)$
- Peaks with 2 assignments: $N^{(2)} = N \cdot 2p(n-1)(1-p)^{2n-3} \approx 2np\,N^{(1)}$
- N: number of cross peaks
- n: number of protons
- $p = 2\Delta\omega_{tol}/\Delta\Omega$: probability of finding a 1H chemical shift within $[\omega - \Delta\omega_{tol}, \omega + \Delta\omega_{tol}]$, under the assumption that the shifts are equally distributed over the spectral width $\Delta\Omega$.
[Figure: $N^{(1)}/N$ and $N^{(2)}/N$ plotted against the chemical shift tolerance, with axis ticks at 0.01 and 0.02 ppm]
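The two formulas for peaks with one and with two assignments are easy to evaluate numerically. The sketch below assumes a hypothetical protein with 500 protons, a tolerance of 0.01 ppm and a spectral width of 10 ppm; these numbers are illustrative and do not come from the slides.

```python
import math


def shift_match_probability(tolerance_ppm, spectral_width_ppm):
    """p = 2 * tolerance / spectral width."""
    return 2.0 * tolerance_ppm / spectral_width_ppm


def fraction_single(n_protons, p):
    """N(1)/N = (1 - p)**(2n - 2)."""
    return (1.0 - p) ** (2 * n_protons - 2)


def fraction_double(n_protons, p):
    """N(2)/N = 2p(n - 1)(1 - p)**(2n - 3)."""
    return 2.0 * p * (n_protons - 1) * (1.0 - p) ** (2 * n_protons - 3)


n = 500  # hypothetical number of protons
p = shift_match_probability(0.01, 10.0)

print(fraction_single(n, p), math.exp(-2 * n * p))
print(fraction_double(n, p), 2 * n * p * fraction_single(n, p))
```

For these assumed values roughly one peak in seven has a unique chemical-shift-based assignment, and the exact expressions agree closely with the exponential approximations.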
21 3D structure-based assignment
- Assignment ambiguity can be resolved if all but one initial assignment possibility correspond to proton-proton distances larger than the maximal distance $d_{max}$ for which an NOE may be observed.
- Assuming that the protons are evenly distributed in a spherical protein with radius R, the probability that two randomly selected protons are closer than $d_{max}$ to each other is given by $p = (d_{max}/R)^3$
- Example: $d_{max}$ = 5 Å, R = 15 Å → p ≈ 4%
- 96% of peaks with 2 assignments can be resolved by the 3D structure
- Unique assignments: $N_{unique} = N^{(1)} + (1-p)N^{(2)} + (1-p)^2 N^{(3)} + \dots$
- → $N_{unique} < N$
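The worked example can be checked with a few lines of Python. The helper names are hypothetical; the peak counts in the second call are made up to show how the sum is formed.

```python
def close_contact_probability(d_max, radius):
    """p = (d_max / R)**3 for protons spread evenly in a sphere."""
    return (d_max / radius) ** 3


def unique_assignments(counts_by_multiplicity, p):
    """N_unique = N(1) + (1-p) N(2) + (1-p)**2 N(3) + ...

    counts_by_multiplicity[0] is N(1), [1] is N(2), and so on.
    """
    return sum((1.0 - p) ** k * n for k, n in enumerate(counts_by_multiplicity))


p = close_contact_probability(5.0, 15.0)
print(p)            # 1/27, i.e. about 4%
print(1.0 - p)      # fraction of two-assignment peaks the structure resolves

# Hypothetical counts: 100 peaks with one and 200 with two assignments.
print(unique_assignments([100, 200], p))
```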
22Ambiguous distance constraints
- A NOESY cross peak with a single initial assignment (n = 1) gives rise to a conventional upper distance constraint.
- A NOESY cross peak with several initial assignments (n > 1) gives rise to an ambiguous distance constraint.

$$d_{eff} = \left(\sum_k d_k^{-6}\right)^{-1/6} \le b$$

b: upper distance bound; $d_k$: distance for assignment possibility k. The sum runs over all assignment possibilities.
Nilges et al., 1997, J. Mol. Biol. 269, 408-422
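A minimal sketch of the effective distance of an ambiguous distance constraint, written directly from the r⁻⁶ sum above. The distances are hypothetical.

```python
def effective_distance(distances):
    """d_eff = (sum_k d_k**-6) ** (-1/6) over all assignment possibilities."""
    return sum(d**-6 for d in distances) ** (-1.0 / 6.0)


# One short (correct) and one long (wrong) assignment possibility, in Å.
print(effective_distance([4.0]))
print(effective_distance([4.0, 20.0]))
print(effective_distance([4.0, 4.0]))
```

The effective distance is never larger than the shortest individual distance, so a distant wrong assignment barely changes it, while the constraint stays fulfilled as long as one short, correct assignment is present.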
23Motivation for ambiguous distance constraints
24Characteristics of ambiguous distance constraints
- An ADC corresponds to the sum of individual contributions: $NOE = \sum_i NOE_i$
- An ADC will not distort the structure as long as the correct assignment is present among the initial assignments: $d_{eff} = \left(\sum_k d_k^{-6}\right)^{-1/6} \le b$
BUT
- An ADC has reduced information content compared to a conventional DC
- → reduce initial assignment possibilities
- An ADC cannot reduce the effect of an artifact DC
- → detect or at least reduce the impact of a retained wrong ADC
25Ranking of assignment possibilities
- A volume contribution, $C_i$, is attributed to each initial assignment of a peak: $NOE = \sum_i C_i\, NOE$, $0 \le C_i \le 1$
- An initial assignment is retained only if $C_i > C_{min}$.

$$C_i = c\, P_i^{cs}\, F_i^{cov}\, F_i^{trans}\, F_i^{network}\, P_i^{3D}$$

- $P^{cs}$: chemical shift agreement
- $F^{cov}$: compatibility with covalent polypeptide structure
- $F^{trans}$: presence of symmetry-related cross peaks in 3D NOESY
- $F^{network}$: network-anchored assignment
- $P^{3D}$: compatibility with intermediate 3D structure (cycle > 1)
- c: normalization constant, chosen such that $\sum_i C_i = 1$
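The ranking can be sketched as a product of factors followed by a normalization. This is only an illustration of the formula: how CANDID computes each individual factor is not shown here, and the factor values and the threshold below are hypothetical.

```python
from math import prod


def volume_contributions(factors_per_assignment):
    """C_i = c * product of the factors, with c chosen so that sum(C_i) = 1."""
    raw = [prod(factors) for factors in factors_per_assignment]
    total = sum(raw)
    if total == 0:
        return [0.0] * len(raw)
    return [r / total for r in raw]


def retained_assignments(contributions, c_min):
    """Indices of the initial assignments with C_i > C_min."""
    return [i for i, c in enumerate(contributions) if c > c_min]


# Hypothetical factors (P_cs, F_cov, F_trans, F_network, P_3D) for the
# three initial assignments of one peak.
factors = [
    (0.9, 1.0, 1.0, 0.8, 1.0),
    (0.8, 1.0, 0.5, 0.5, 1.0),
    (0.6, 1.0, 0.5, 0.1, 1.0),
]
contributions = volume_contributions(factors)
print(contributions)
print(retained_assignments(contributions, c_min=0.1))
```

In this example the third assignment is poorly supported by the network and falls below the threshold, so only the first two possibilities are kept.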
26Network-anchored assignment
- Network-anchoring exploits the fact that any network of correct NOE peak assignments forms a self-consistent set.
- Each initial assignment is weighted by the extent to which it can be embedded into the network formed by all other NOE peak assignments.
- Network-anchoring evaluates the self-consistency of NOE assignments independently of knowledge of the 3D structure, and thus compensates for the absence of 3D structural knowledge at the outset of a de novo structure calculation (cycle 1).
27Network-anchored assignment
- Calculation of the weighting factor, $F_i^{network}$, for an initial assignment i that connects atoms A and B

Triangular connections ABX: find NOE peaks with an initial assignment AX or BX, where atom X is at most one residue apart from A or B.

$F_{AB}^{network} = \sum_X C_{AX}\, C_{BX}$, where $C_{AX}$ and $C_{BX}$ are the volume contributions for the assignments AX and BX.
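The triangular sum over third atoms X can be sketched as below. The data layout (a dictionary keyed by unordered atom pairs) and the atom names are assumptions for illustration; the restriction of X to atoms at most one residue away from A or B is left to the caller, who supplies the candidate list.

```python
def network_factor(a, b, contributions, third_atoms):
    """Sum of C_AX * C_BX over the candidate third atoms X."""

    def c(u, v):
        # Contributions are stored for unordered atom pairs.
        return contributions.get(frozenset((u, v)), 0.0)

    return sum(c(a, x) * c(b, x) for x in third_atoms if x not in (a, b))


# Hypothetical volume contributions of other initial assignments.
contributions = {
    frozenset(("A", "X1")): 0.8,
    frozenset(("B", "X1")): 0.5,
    frozenset(("A", "X2")): 0.9,  # no matching B-X2 assignment
}

print(network_factor("A", "B", contributions, ["X1", "X2"]))
```

Only X1 closes a triangle with both A and B, so only it adds to the factor; X2 is connected to A alone and contributes nothing.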
28Filtering of assignment possibilities
- Elimination of initial assignments by retaining only assignment possibilities with a certain volume contribution, $C_i > C_{min}$. The minimal required volume contribution, $C_{min}$, is a function of the cycle (1, ..., 7).
- Network-anchored assignment dramatically reduces the assignment ambiguity at the outset of a structure calculation (cycle 1).
- Loss of conformational information, an intrinsic feature of ADCs, is reduced by network-anchoring. The majority of peaks have at most 2-3 retained initial assignments.
29Elimination of erroneous peaks
- Network-anchoring (cycle 1, 2, ..., 7)
- If none of the initial assignments can be reliably embedded into the network of NOEs formed by all other NOE peaks, then this peak is considered an artifact and discarded from further consideration.
- → Noise analysis can be started prior to any structure calculation.
- Compatibility with 3D structure (cycle 2, ..., 7)
- If an ambiguous distance constraint is violated by more than a user-defined threshold, $d_{cut}$, in more than a user-defined percentage of the conformers, then this peak is considered an artifact.
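The structure-based artifact test can be sketched as a simple count over the conformers of the intermediate bundle. The function name, the threshold values and the distances are hypothetical; the slides only state that both thresholds are user-defined.

```python
def is_artifact(d_eff_per_conformer, bound, d_cut, max_fraction):
    """True if the bound is violated by more than d_cut in too many conformers."""
    violated = sum(1 for d in d_eff_per_conformer if d - bound > d_cut)
    return violated / len(d_eff_per_conformer) > max_fraction


# Hypothetical effective distances (Å) of one constraint in four conformers.
print(is_artifact([5.0, 7.0, 7.5, 8.0], bound=5.0, d_cut=1.5, max_fraction=0.5))
print(is_artifact([5.0, 5.2, 7.0, 5.1], bound=5.0, d_cut=1.5, max_fraction=0.5))
```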
30Constraint combination
- CC reduces the impact of artifact NOE upper
distance constraints by combining the assignments
for two or several peaks into a single upper
distance constraint.
[Figure: positions of atoms A, B, C and D in the native protein, in the NMR structure using 2 single constraints, and in the NMR structure using CC]
31Constraint Combination
- 1 peak with assignments A1B1 and A2B2 → 1 ambiguous distance constraint (A1B1, A2B2)
- 2 (unrelated) peaks with assignments A1B1, A2B2 and C1D1, C2D2 → 1 combined ambiguous distance constraint (A1B1, A2B2, C1D1, C2D2)
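The effect of constraint combination can be illustrated with a small sketch. It assumes the r⁻⁶ effective distance of ambiguous distance constraints; taking the larger of the two upper bounds for the combined constraint is an assumption of this example, since the slides do not say how the combined bound is chosen. All distances are hypothetical.

```python
def effective_distance(distances):
    """d_eff = (sum_k d_k**-6) ** (-1/6)."""
    return sum(d**-6 for d in distances) ** (-1.0 / 6.0)


def combine(constraint_a, constraint_b):
    """Merge the assignment lists of two peaks into one constraint."""
    return {
        "pairs": constraint_a["pairs"] + constraint_b["pairs"],
        # Assumption: keep the larger of the two upper bounds.
        "bound": max(constraint_a["bound"], constraint_b["bound"]),
    }


def is_satisfied(constraint, distance_of):
    """Check d_eff <= b for the given atom-pair distances."""
    d_eff = effective_distance([distance_of[p] for p in constraint["pairs"]])
    return d_eff <= constraint["bound"]


# Hypothetical distances (Å) in the native structure.
native = {("A1", "B1"): 4.0, ("A2", "B2"): 12.0,
          ("C1", "D1"): 20.0, ("C2", "D2"): 25.0}
real_peak = {"pairs": [("A1", "B1"), ("A2", "B2")], "bound": 5.0}
artifact_peak = {"pairs": [("C1", "D1"), ("C2", "D2")], "bound": 5.0}

print(is_satisfied(real_peak, native))
print(is_satisfied(artifact_peak, native))
print(is_satisfied(combine(real_peak, artifact_peak), native))
```

Used on its own, the artifact peak would force two distant atoms together. Combined with a real peak, the constraint is already fulfilled by the native structure, so the artifact no longer distorts it.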
32Effect of network-anchoring and constraint
combination
[Figure: effect of constraint combination and network-anchoring on the structures obtained in cycle 1 and cycle 7]
33De novo structure determinations using CANDID
- Proof of principle for the CANDID approach was established by comparing the resulting protein structures with those obtained by interactive procedures.
- The potential of CANDID is further supported by the de novo structure determination of about 20 proteins since then.
34ATNOS: Automated signal recognition for NOESY
spectra
- Overview and motivation
- New concepts for NOESY peak picking
- Results
36New concepts for NOESY peak picking
- Covalent polypeptide structure is used to derive spectrum-specific threshold parameters, e.g., signal-to-noise ratio
- Multipass-filtering applied to different peak classes using chemical shift data and intermediate 3D structure
37Definition of covalent peaks
- Fixed bond lengths, bond angles and chiralities of the covalent polypeptide structure of the protein impose NOE-observable upper distance limits on certain intraresidual and sequential 1H-1H distances.
- for example the intraresidual Hα-HN distance: $d_{min} \le d_{H\alpha HN} \le d_{max} < 5$ Å for all possible conformations
- Covalent peak: local extremum with at least one initial assignment to an atom pair AB with a covalent structure-imposed maximal distance $d_{max}^{AB}$ smaller than 5 Å.
38Exploiting chemical shift knowledge
- A grid spanned by the frequencies of all assigned atoms is overlaid onto the NOESY spectrum.
- Each grid point is considered as the origin for a local search.
- Potential NOEs are assigned and classified
- covalent peaks and other peaks (cycle 1)
- structurally compatible peaks and all other peaks (cycle 2, ..., 7)
Multipass-filtering
- Filtering based on
- Peak separation
- Chemical shift agreement
- Network-anchoring
- Symmetry of NOESY
- Compatibility with 3D structure
39Criteria for NOE validation using chemical shift
data
- Compatibility with intermediate structure
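[Figure: NOESY peak at ($\omega_1$, $\omega_2$) in relation to the chemical shifts $\omega_A$ and $\omega_B$ of atom A and atom B]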
40ATNOS/CANDID cycles
[Figure: ATNOS/CANDID cycle 1, cycle 2 and cycle 7 shown alongside the reference]
41Criteria used to judge the correctness of the
resulting 3D protein structure
Derived from experience gained during software
development and applications of the algorithms
for de novo structure determinations.
- 2 Input requirements
- 3 Output criteria
42Input requirements for successful application of
ATNOS/CANDID
- Requirement 1
- Completeness of chemical shifts
- > 90% of non-labile and backbone amide 1H (and corresponding 13C/15N, in case of 3D NOESY)
- Requirement 2
- Correctness of calibration
- Chemical shifts and NOE signals must be self-consistent within the tolerance window $\Delta\omega_{tol}$.
- As reference NOE signals, all atom pairs with a covalent structure-imposed distance < 5 Å are considered.
- e.g. the intraresidual Hα-HN contact
- > 85% of covalent contacts must be found in the NOESY
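Both input requirements reduce to simple ratios, sketched below. The function name and the counts are hypothetical; how the expected protons and covalent contacts are enumerated is not specified here.

```python
def input_requirements_met(n_shifts_assigned, n_shifts_expected,
                           n_covalent_found, n_covalent_expected):
    """Check the two input requirements: > 90% shifts, > 85% covalent contacts."""
    completeness = n_shifts_assigned / n_shifts_expected
    covalent_found = n_covalent_found / n_covalent_expected
    return completeness > 0.90 and covalent_found > 0.85


# Hypothetical counts for a small protein.
print(input_requirements_met(560, 600, 450, 500))  # 93% and 90%
print(input_requirements_met(560, 600, 400, 500))  # 93% and 80%
```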
43Output criteria used to judge the correctness of
resulting 3D protein structure
- Residual DYANA target function value
- → $TF_{cycle1} < 200$ Å², $TF_{cycle7} < 2$ Å²
- Root mean square deviation (RMSD) value
- → $RMSD_{cycle1} < 3$ Å
- Evolution of the $RMSD_{drift}$ value
- → The RMSD value between the mean coordinates of the k-th and the last cycle should be of the order of the RMSD value of the k-th cycle.
44Output criteria
- Target function
- $TF_{cycle1}$ smaller than 200 Å²
- $TF_{cycle7}$ smaller than 2 Å²
- RMSD
- $RMSD_{cycle1}$ smaller than 3 Å
- $RMSD_{drift}$
- $RMSD_{drift}$ smaller than RMSD
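The output criteria can be collected into one check, sketched below. The function name and the example values are hypothetical, and comparing the drift with the RMSD cycle by cycle is this sketch's reading of "RMSDdrift smaller than RMSD".

```python
def output_criteria_met(tf_cycle1, tf_cycle7, rmsd_by_cycle, drift_by_cycle):
    """Check the target function, RMSD and RMSD drift criteria.

    rmsd_by_cycle[k] is the RMSD of the bundle of cycle k+1 (Å);
    drift_by_cycle[k] is the RMSD between the mean coordinates of
    cycle k+1 and those of the last cycle (Å).
    """
    target_ok = tf_cycle1 < 200.0 and tf_cycle7 < 2.0
    rmsd_ok = rmsd_by_cycle[0] < 3.0
    drift_ok = all(d < r for d, r in zip(drift_by_cycle, rmsd_by_cycle))
    return target_ok and rmsd_ok and drift_ok


# Hypothetical values for three of the cycles.
print(output_criteria_met(120.0, 1.5, [2.4, 1.6, 0.9], [2.0, 1.1, 0.5]))
print(output_criteria_met(120.0, 1.5, [2.4, 1.6, 0.9], [4.0, 1.1, 0.5]))
```

The second call fails because the cycle 1 structure has drifted further from the final structure than its own bundle spread, which suggests that cycle 1 did not converge to the final fold.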
45De novo structure determinations using
ATNOS/CANDID
- TM1816 (124 aa)
- TM1290 (116 aa)
- TM1492 (66 aa)
- En-2 (60 aa)
- En-1 (52 aa)
- GPYDs (159 aa)
- Mouse PrP (113 aa)
- Horse PrP (113 aa)
- Crotamine (42 aa)
46Conclusions
- ATNOS/CANDID enables direct feedback between the protein structure, the NOE assignments and the experimental NOESY spectra.
- ATNOS/CANDID achieves the correct fold of the protein already after the first cycle.
- ATNOS/CANDID results in greatly enhanced efficiency of de novo NMR structure determination.
- ATNOS/CANDID provides an objective tool for 3D protein NMR structure determination.
47Acknowledgment