Title: Solving NMR structures II: Calculation and evaluation
1. Solving NMR structures II: Calculation and evaluation
- The NMR ensemble
- Methods for calculating structures
  - distance geometry, restrained molecular dynamics, simulated annealing
- Evaluating the quality of NMR structures
  - resolution, stereochemical quality, restraint violations, etc.
2. Calculating NMR structures
- so far we've talked about getting qualitative structural information from NMR; for instance, certain secondary structures have characteristic nOes and J-couplings associated with them
- we've also talked about the concept of explicit distance, dihedral angle, or hydrogen bond restraints derived from nOe and J-coupling data
- how might we use such restraints to actually calculate a detailed, quantitative three-dimensional structure at a high level of accuracy and precision?
3. In NMR we don't get a single structure
- the very first thing to recognize is that our input restraints do not uniquely define a structure at infinitely high precision (resolution) and accuracy: we can never have enough restraints, determined at high enough accuracy and precision, to do that!
- rather, a set of many closely related structures will be compatible with these restraints; how closely related these compatible structures are will depend on how good and complete our data are!
- the goal of NMR structure determination is therefore to produce a group of possible structures which is a fair representation of this compatible set.
4. The NMR Ensemble
- repeat the structure calculation many times to generate an ensemble of structures consistent with the restraints
- ideally, the ensemble is representative of the permissible structures: the RMSD between ensemble members accurately reflects the extent of structural variation permitted by the restraints
Figure: ensemble of 25 structures for Syrian hamster prion protein. Liu et al. Biochemistry (1999) 38, 5362.
5. Random initial structures
- to get the most unbiased, representative
ensemble, it is wise to start the calculations
from a set of randomly generated starting
structures
6. Calculating the structures: methods
- distance geometry (DG)
- restrained molecular dynamics (rMD)
- simulated annealing (SA)
- hybrid methods
7. DG: Distance geometry
- In distance geometry, one uses the nOe-derived distance restraints to generate a distance matrix, from which one then calculates a structure
- Structures calculated by distance geometry usually have the correct overall fold but poor local geometry (e.g. improper bond angles and distances)
- hence distance geometry must be combined with some extensive energy minimization method to generate good structures
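
As a rough illustration of how a structure can be calculated from a distance matrix, here is a minimal sketch of the metric-matrix embedding step. It assumes that every pairwise distance is known exactly, which real nOe data never provide (in practice only sparse upper and lower bounds are available), and it uses a simple power iteration in place of a proper eigenvalue routine. The function name `embed_from_distances` and the four test points are hypothetical.

```python
import math
import random


def embed_from_distances(dist, dim=3, iters=500, seed=0):
    """Recover coordinates (up to rotation/reflection) from a full distance matrix.

    Hypothetical helper: assumes all pairwise distances are known exactly.
    """
    n = len(dist)
    d2 = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in d2]
    total_mean = sum(row_mean) / n
    # Metric (Gram) matrix of the centred coordinates, via double centring.
    g = [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dim for _ in range(n)]
    for k in range(dim):
        # Power iteration for the largest remaining eigenpair of g.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(g[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * g[i][j] * v[j] for i in range(n) for j in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][k] = scale * v[i]
        # Deflate so the next pass finds the next eigenpair.
        for i in range(n):
            for j in range(n):
                g[i][j] -= lam * v[i] * v[j]
    return coords


# Toy input: exact distances between four made-up points.
points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 1.0)]
dist = [[math.dist(p, q) for q in points] for p in points]
model = embed_from_distances(dist)
recovered = [[math.dist(p, q) for q in model] for p in model]
```

The embedded coordinates reproduce the input distances but say nothing about bond lengths or angles, which is one way to see why a subsequent energy minimization is needed.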
8. rMD: Restrained molecular dynamics
- Molecular dynamics involves computing the potential energy V as a function of the atomic coordinates. Usually this is defined as the sum of a number of terms:
  $$V_{total} = V_{bond} + V_{angle} + V_{dihedr} + V_{vdW} + V_{coulomb} + V_{NMR}$$
- the first five terms here are real energy terms corresponding to such forces as van der Waals and electrostatic repulsions and attractions, and the cost of deforming bond lengths and angles; these come from some standard molecular force field like CHARMM or AMBER
- the NMR restraints are incorporated into the $V_{NMR}$ term, which is a pseudoenergy or pseudopotential term included to represent the cost of violating the restraints
9. Pseudo-energy potentials for rMD
- Generate fake energy potentials representing the cost of violating the distance or angle restraints. Here's an example of a distance restraint potential:

$$
V_{NOE} =
\begin{cases}
K_{NOE}\,(r_{ij} - r_{ij}^{u})^{2} & \text{if } r_{ij} > r_{ij}^{u} \\
0 & \text{if } r_{ij}^{l} \le r_{ij} \le r_{ij}^{u} \\
K_{NOE}\,(r_{ij} - r_{ij}^{l})^{2} & \text{if } r_{ij} < r_{ij}^{l}
\end{cases}
$$

where $r_{ij}^{l}$ and $r_{ij}^{u}$ are the lower and upper bounds of our distance restraint, and $K_{NOE}$ is some chosen force constant, typically 250 kcal mol^-1 nm^-2. So it's somewhat permissible to violate restraints, but it raises V.
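
The flat-bottomed distance restraint potential described above can be sketched in a few lines. The function name `noe_restraint_energy` is hypothetical; distances are assumed to be in nm so that they match the units of the 250 kcal mol^-1 nm^-2 force constant quoted in the text.

```python
def noe_restraint_energy(r, lower, upper, k_noe=250.0):
    """Pseudo-energy (kcal/mol) for one distance restraint; distances in nm."""
    if r > upper:
        # Atoms are further apart than the upper bound allows.
        return k_noe * (r - upper) ** 2
    if r < lower:
        # Atoms are closer than the lower bound allows.
        return k_noe * (r - lower) ** 2
    # Inside the allowed range the restraint costs nothing.
    return 0.0


inside = noe_restraint_energy(0.30, lower=0.18, upper=0.50)
too_far = noe_restraint_energy(0.60, lower=0.18, upper=0.50)
too_close = noe_restraint_energy(0.08, lower=0.18, upper=0.50)
```

Because the penalty grows quadratically with the size of the violation, a small violation raises V only slightly, while a large one is strongly disfavored.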
10. SA: Simulated annealing
- SA is very similar to rMD and uses similar potentials, but raises the temperature of the system and then cools it slowly in order not to get trapped in local energy minima
- SA is very efficient at locating the global minimum of the target function
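
To make the heat-then-cool idea concrete, here is a toy sketch. Note the substitution: it uses Metropolis Monte Carlo moves on a one-dimensional function, not the molecular dynamics that NMR structure programs use, and the target function `double_well`, the cooling schedule, and all parameter values are made up for illustration. A single short run like this is not guaranteed to reach the global minimum.

```python
import math
import random


def anneal(energy, x0, t_start=5.0, t_end=0.01, cooling=0.995, step=0.5, seed=0):
    """Toy simulated annealing on a 1-D function; returns the best point seen."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    t = t_start
    while t > t_end:
        cand = x + rng.uniform(-step, step)
        e_cand = energy(cand)
        # Metropolis criterion: always accept downhill moves, and accept
        # uphill moves with a probability that shrinks as t falls.
        if e_cand <= e or rng.random() < math.exp(-(e_cand - e) / t):
            x, e = cand, e_cand
            if e < best_e:
                best_x, best_e = x, e
        t *= cooling  # slow geometric cooling
    return best_x, best_e


def double_well(x):
    # Toy target function: shallow minimum near x = +1, deeper one near x = -1.
    return (x * x - 1.0) ** 2 + 0.3 * x


start = 1.0  # begin in the shallow (local) minimum
best_x, best_e = anneal(double_well, start)
```

At high temperature almost every move is accepted, so the search can cross the barrier between the two wells; as the temperature drops, uphill moves become rare and the search settles into whichever well it currently occupies.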
11. Ambiguous restraints
- it is often not possible to tell which atoms are involved in a NOESY crosspeak, either because of a lack of stereospecific assignments or because multiple protons have the same chemical shift
- it is possible to resolve many of these ambiguities iteratively during the calculation process
- one can generate an initial ensemble with only unambiguous restraints, and then use this ensemble to resolve ambiguities; e.g., if two atoms are never closer than, say, 9 Å in any ensemble structure, one can rule out an nOe between them
- one can also make stereospecific assignments iteratively using what are called floating chirality methods
- there are now automatic routines for iterative assignment such as the program ARIA.
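
The distance-based filtering step can be sketched as follows. This is only an illustration of the idea, not how ARIA or any other program implements it; the atom labels, coordinates, and function names are hypothetical, and each model is assumed to be a mapping from atom label to (x, y, z) in Å.

```python
import math


def min_distance_in_ensemble(ensemble, atom_a, atom_b):
    """Shortest distance between two atoms over all models in the ensemble."""
    return min(math.dist(model[atom_a], model[atom_b]) for model in ensemble)


def filter_assignments(ensemble, candidates, cutoff=9.0):
    """Keep only candidate atom pairs that come within `cutoff` Å in some model."""
    return [(a, b) for a, b in candidates
            if min_distance_in_ensemble(ensemble, a, b) < cutoff]


# Made-up initial ensemble calculated from unambiguous restraints only.
ensemble = [
    {"HA_10": (0.0, 0.0, 0.0), "HB_42": (4.0, 0.0, 0.0), "HB_77": (15.0, 0.0, 0.0)},
    {"HA_10": (0.0, 0.0, 0.0), "HB_42": (5.0, 1.0, 0.0), "HB_77": (12.0, 3.0, 0.0)},
]
# One crosspeak with two possible assignments.
candidates = [("HA_10", "HB_42"), ("HA_10", "HB_77")]
surviving = filter_assignments(ensemble, candidates)
```

Here the second candidate is ruled out because the two atoms are never closer than 9 Å in any model, leaving a single assignment that can be added as a restraint in the next round of calculations.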
12. Criteria for accepting structures
- it is typical to generate 50 or more structures, but not all will converge to a final structure consistent with the restraints
- therefore one uses acceptance criteria for including calculated structures in the ensemble, such as
  - no more than 1 nOe distance restraint violation greater than 0.4 Å
  - no dihedral angle restraint violations greater than 5°
  - no gross violations of reasonable molecular geometry
- sometimes structures are rejected on other grounds as well, such as having multiple residues with backbone angles in disallowed regions of Ramachandran space or simply having high potential energy in rMD simulations
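
The first two acceptance criteria listed above can be expressed as a simple filter. This sketch assumes that the violations for each calculated structure have already been tabulated (nOe violations in Å, dihedral violations in degrees); the function name, model names, and numbers are hypothetical, and the geometry check is left out.

```python
def accept_structure(noe_violations, dihedral_violations,
                     noe_cutoff=0.4, max_noe_over=1, dihedral_cutoff=5.0):
    """Apply the nOe and dihedral acceptance criteria to one structure."""
    big_noe = sum(1 for v in noe_violations if v > noe_cutoff)
    big_dihedral = sum(1 for v in dihedral_violations if v > dihedral_cutoff)
    return big_noe <= max_noe_over and big_dihedral == 0


# Made-up violation lists: (nOe violations in Å, dihedral violations in degrees).
calculated = {
    "model_01": ([0.10, 0.45, 0.20], [1.0, 3.5]),   # one nOe violation > 0.4 Å
    "model_02": ([0.50, 0.60], [2.0]),              # two nOe violations > 0.4 Å
    "model_03": ([0.10], [7.2]),                    # one dihedral violation > 5
}
accepted = [name for name, (noe, dih) in calculated.items()
            if accept_structure(noe, dih)]
```

Only the first model passes: the second has too many large nOe violations and the third has a large dihedral violation.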
13. Precision of NMR Structures (Resolution)
- judged by RMSD of ensemble of accepted structures
- RMSDs for both backbone (Cα, N, carbonyl C) and all heavy atoms (i.e. everything except hydrogen) are typically reported, e.g.
  - bb 0.6 Å
  - heavy 1.4 Å
- sometimes only the more ordered regions are included in the reported RMSD, e.g. for a 58 residue protein you will see "RMSD (residues 5-58)" if residues 1-4 are completely disordered.
14. Reporting RMSD
- two major ways of calculating RMSD of the ensemble
  - pairwise: compute RMSDs for all possible pairs of structures in the ensemble, and calculate the mean of these RMSDs
  - from mean: calculate a mean structure from the ensemble and measure RMSD of each ensemble structure from it, then calculate the mean of these RMSDs
- pairwise will generally give a higher number, so be aware that these two ways of reporting RMSD are not equivalent. Usually the Materials and Methods, or a footnote somewhere in the paper, will indicate which is being used.
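
The two ways of reporting RMSD can be compared directly on a toy ensemble. This sketch assumes the structures have already been superimposed (no fitting is done here) and that each model is a list of (x, y, z) coordinates in the same atom order; the coordinates and function names are made up.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """RMSD between two already-superimposed coordinate lists."""
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(a, b)) / len(a))


def mean_structure(ensemble):
    """Average the coordinates of each atom over all models."""
    n = len(ensemble)
    return [tuple(sum(m[i][k] for m in ensemble) / n for k in range(3))
            for i in range(len(ensemble[0]))]


def pairwise_rmsd(ensemble):
    """Mean of the RMSDs over all pairs of models."""
    pairs = list(combinations(ensemble, 2))
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)


def rmsd_from_mean(ensemble):
    """Mean of the RMSDs of each model to the mean structure."""
    mean = mean_structure(ensemble)
    return sum(rmsd(m, mean) for m in ensemble) / len(ensemble)


# Toy ensemble: three models of a two-atom "molecule".
ensemble = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.3, 0.0, 0.0), (1.5, 0.4, 0.0)],
    [(0.0, 0.3, 0.0), (1.2, 0.0, 0.0)],
]
pairwise = pairwise_rmsd(ensemble)
from_mean = rmsd_from_mean(ensemble)
```

For this toy ensemble the pairwise value comes out larger than the from-mean value, which is the reason to check which convention a paper uses before comparing reported precisions.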
15. Minimized average
- a minimized average is just that: a mean structure is calculated from the ensemble and then subjected to energy minimization to restore reasonable geometry, which is often lost in the calculation of a mean
- this is NMR's way of generating a single representative structure from the data. It is much easier to visualize structural features from a minimized average than from the ensemble.
- for highly disordered regions a minimized average will not be informative and may even be misleading; such regions are sometimes left out of the minimized average
- sometimes when an NMR structure is deposited in the PDB, there will be separate entries for both the ensemble and the minimized average. It is nice when people do this. Alternatively, a member of the ensemble may be identified which is considered the most representative (often the one closest to the mean).
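
Picking the ensemble member closest to the mean can be sketched as below. As an assumption, the models are taken to be already superimposed lists of (x, y, z) coordinates; the function name `closest_to_mean` and the toy coordinates are hypothetical, and the energy minimization needed to turn the mean itself into a minimized average is not shown.

```python
import math


def closest_to_mean(ensemble):
    """Return the index of the ensemble member with the lowest RMSD to the mean."""
    n_models, n_atoms = len(ensemble), len(ensemble[0])
    # Mean structure: average position of each atom over all models.
    mean = [tuple(sum(m[i][k] for m in ensemble) / n_models for k in range(3))
            for i in range(n_atoms)]

    def rmsd_to_mean(model):
        total = sum(math.dist(p, q) ** 2 for p, q in zip(model, mean))
        return math.sqrt(total / n_atoms)

    return min(range(n_models), key=lambda idx: rmsd_to_mean(ensemble[idx]))


# Toy ensemble: three models of a two-atom "molecule".
ensemble = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.3, 0.0, 0.0), (1.5, 0.4, 0.0)],
    [(0.0, 0.3, 0.0), (1.2, 0.0, 0.0)],
]
representative = closest_to_mean(ensemble)
```

Unlike the raw mean, the selected member is a real calculated structure, so its covalent geometry is as good as that of any other ensemble member.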
16. What do we need to get a high-resolution NMR structure?
- usually 15-20 nOe distance restraints per residue, but the total is not as important as how many long-range restraints you have, meaning long-range in the sequence: |i-j| > 5, where i and j are the two residues involved
- good NMR structures usually have ≥ 3.5 long-range distance restraints per residue in the structured regions
- to get a very good quality structure, it is usually also necessary to have some stereospecific assignments, e.g. β hydrogens; Leu, Val methyls
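
Counting long-range restraints per residue is a simple bookkeeping exercise. In this sketch each restraint is reduced to the pair of residue numbers it connects, and "long-range" means a sequence separation greater than 5, as in the text; the restraint list, the residue count, and the function name are made up.

```python
def long_range_per_residue(restraints, n_residues, min_separation=5):
    """Number of long-range restraints (|i - j| > min_separation) per residue."""
    long_range = [(i, j) for i, j in restraints if abs(i - j) > min_separation]
    return len(long_range) / n_residues


# Made-up restraint list: (residue i, residue j) for each nOe restraint.
restraints = [(3, 4), (3, 7), (2, 20), (5, 30), (10, 16), (12, 40)]
density = long_range_per_residue(restraints, n_residues=40)
```

In a real analysis `n_residues` would count only the structured region, since disordered tails contribute residues but few long-range restraints.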
17. Assessing Structure Quality
- NMR spectroscopists usually run their ensemble through the program PROCHECK-NMR to assess its quality
- a high-resolution structure will have backbone RMSD ≤ 0.8 Å, heavy atom RMSD ≤ 1.5 Å
- low RMS deviation from restraints
- will have good stereochemical quality
  - ideally >90% of residues in core (most favorable) regions of Ramachandran plot
  - very few unusual side chain angles and rotamers (as judged by those commonly found in crystal structures)
  - low deviations from idealized covalent geometry
18. Structural Statistics Tables
- list of restraints, and type
- calculated energies
- agreement of ensemble structures with restraints (RMS)
- precision of structure (RMSD)
- sometimes also see listings of Ramachandran statistics, deviations from ideal covalent geometry, etc.