Assessment and Validation Tools - PowerPoint PPT Presentation

Slides: 49
Provided by: johnma93
Transcript and Presenter's Notes

Title: Assessment and Validation Tools


1
Assessment and Validation Tools for NMR
Structure Determinations
Thanks to Chris Spronk, Jurgen Doreleijers, and
Guy Montelione (for figures and results)
2
References
R. A. Laskowski, J. A. Rullmann, M. W. MacArthur, R. Kaptein, and J. M. Thornton, AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR, J. Biomol. NMR 8, 477-486 (1996).
S. B. Nabuurs, C. A. E. M. Spronk, G. Vriend, and G. W. Vuister, Concepts and Tools for NMR Restraint Analysis and Validation, Concepts in Magnetic Resonance 22A, 90-105 (2004).
J. F. Doreleijers, A. J. Nederveen, W. Vranken, J. Lin, A. M. J. J. Bonvin, R. Kaptein, J. L. Markley, and E. L. Ulrich, BioMagResBank databases DOCR and FRED with converted and filtered sets of experimental NMR restraints and coordinates from over 500 protein PDB structures, J. Biomol. NMR 32, 1-12 (2005).
A. J. Nederveen, J. F. Doreleijers, W. Vranken, Z. Miller, C. A. E. M. Spronk, S. B. Nabuurs, P. Güntert, M. Livny, J. L. Markley, M. Nilges, E. L. Ulrich, R. Kaptein, and A. M. J. J. Bonvin, RECOORD: a REcalculated COORdinates Database of 500 proteins generated from restraint data downloaded from the BioMagResBank, Proteins 59, 662-672 (2005).
L. Wang, H. R. Eghbalnia, A. Bahrami, and J. L. Markley, Linear analysis of carbon-13 chemical shift differences and its application to the detection and correction of errors in referencing and spin system identifications, J. Biomol. NMR 32, 13-22 (2005).
3
References continued
Zhang, H., Neal, S. & Wishart, D. S. (2003) RefDB: a database of uniformly referenced protein chemical shifts, J. Biomol. NMR 25, 173-195.
Nabuurs, S. B., Spronk, C. A., Krieger, E., Maassen, H., Vriend, G. & Vuister, G. W. (2003) Quantitative evaluation of experimental NMR restraints, J. Am. Chem. Soc. 125, 12026-12034.
Moseley, H. N., Sahota, G. & Montelione, G. T. (2004) Assignment validation software suite for the evaluation and presentation of protein resonance assignment data, J. Biomol. NMR 28, 341-355.
Nabuurs, S. B., Krieger, E., Spronk, C. A., Nederveen, A. J., Vriend, G. & Vuister, G. W. (2005) Definition of a new information-based per-residue quality parameter, J. Biomol. NMR 33, 123-134.
Nabuurs, S. B., Spronk, C. A., Vuister, G. W. & Vriend, G. (2006) Traditional biomolecular structure determination by NMR spectroscopy allows for major errors, PLoS Comput. Biol. 2, e9.
Ginzinger, S. W., Gerick, F., Coles, M. & Heun, V. (2007) CheckShift: automatic correction of inconsistent chemical shift referencing, J. Biomol. NMR 39, 223-227.
Vranken, W. (2007) A global analysis of NMR distance constraints from the PDB, J. Biomol. NMR 39, 303-314.
Bhattacharya, A., Tejero, R. & Montelione, G. T. (2007) Evaluating protein structures determined by structural genomics consortia, Proteins 66, 778-795.
CING (pronounced "king") stands for the Common Interface for NMR structure Generation: http://nmr.cmbi.ru.nl/cing/Home.html
4
Importance of structure validation
  • Means for determining the precision and accuracy
    of NMR structures
  • Benchmark for comparing different methods for
    structure determination
  • Needed for community-wide assessment of the
    validity of NMR structures
  • Standard by which improvements in technology can
    be gauged
  • Structures should be reliable: consistent with
    experimental data and of good local and overall
    quality

5
Precision vs. accuracy (often confused in the
literature)
  • Precision is the variation of X around <X>,
    expressed as standard deviation or variance
  • Accuracy is the closeness of <X> to the true
    value of X
  • Accuracy can only be measured relative to a gold
    standard (e.g. by reconstructing a known result
    with simulated data)

Adapted from Chris Spronk
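
The distinction above can be sketched numerically. This is a minimal illustration, not part of the original slides: the function name and the sample values are hypothetical, and the "true value" stands in for whatever gold standard is available.

```python
import statistics

def precision_and_accuracy(estimates, true_value):
    """Return (precision, accuracy_error) for repeated estimates of X.

    precision      : sample standard deviation of X around <X>
    accuracy_error : |<X> - true value|; requires a gold standard
    """
    mean_x = statistics.mean(estimates)
    precision = statistics.stdev(estimates)
    accuracy_error = abs(mean_x - true_value)
    return precision, accuracy_error

# Hypothetical example: tightly clustered estimates that all sit away
# from the true value (precise but not accurate).
estimates = [5.1, 5.0, 4.9, 5.0]
spread, offset = precision_and_accuracy(estimates, true_value=4.0)
```

Here the spread is small while the offset is large, which is the situation in which a tight ensemble gives a misleading impression of quality.
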
6
Precision vs. true variance
Precision underestimates true variance
Precision equals true variance
Precision overestimates true variance
Adapted from Chris Spronk
7
Limitations of the biomolecular NMR field
  • No standard convention for estimating precision
  • No standard convention for estimating accuracy
  • No standard convention for estimating true
    variance
  • Lack of objective reproducibility of manual data
    analysis steps
  • Recognition of these problems is coming to the
    forefront
  • Position paper on validation disseminated at a
    meeting in Florence in January 2007
  • Validation was the major issue addressed by the
    Worldwide Protein Databank (wwPDB) NMR Task Group
    (at the ISMAR meeting in Taiwan, 10/2007)

8
Approaches to assessing accuracy and their
limitations
  • Restraint violations
  • Restraints are interpreted data
  • No standard for calibrating constraints
  • Restraints per residue
  • Conformationally-restraining
  • Restraints per restrained residue
  • How to define restrained residues?
  • ProCheck / MAGE
  • Parameters derived from crystal structures
  • Question of which residues to include/exclude
  • Cross validation with RDC
  • Not measured universally
  • Not sensitive to rigid body translation multiple
    alignments

Adapted from Guy Montelione
9
Approaches to assessing accuracy and their
limitations
  • Comparison with crystal structures
  • Differences with x-ray structure may be
    biologically relevant
  • Comparisons with solid state NMR data may be
    better, but still could reflect real differences
  • Back calculation of NOEs - relaxation matrix
    analysis
  • Compare to NOESY peak list?
  • Compare to NOESY spectrum? (what does this mean?)
  • Exchange broadening, lineshape, differential
    relaxation effects
  • Diagonal, ridges, overlap, residual water,
    saturation transfer
  • Differential relaxation of heteroatoms
  • Back calculation of chemical shifts
  • Promising; used more and more
  • H-bond geometry
  • Interesting, but not comprehensive

10
Analysis of 151 pairs of NMR and crystal
structures
  • NMR overestimates precision of the ensemble
  • NMR provides inaccurate global structure
  • - Ensemble averaging
  • - Just plain wrong
  • X-ray is inaccurate
  • Crystallization shifts global conformational
    equilibria

Line: rmsd of superimposed NMR ensemble (PRECISION)
Shade: rmsd between median NMR conformer and
crystal structure (ACCURACY)
Filtered to be in same ligand state, similar pH
Analysis for FindCore core (bb and sc) atoms only
Andrec, Snyder, Montelione, Levy, et al. (2007)
Proteins 69, 449
11
Least biased representation of carbon chemical
shifts, irrespective of structure, is as the sum
of three Gaussian distributions
Data for all alanine residues in RefDB
Occurrences separately as a function of δ13Cα and
δ13Cβ
Occurrences as a function of δ13Cα − δ13Cβ
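
A sum of three Gaussian distributions can be written as a weighted mixture density. The sketch below is only an illustration of that functional form: the function names are hypothetical and the component parameters are placeholders, not values fitted to RefDB.

```python
import math

def gaussian(x, mean, sigma):
    """Normal probability density at x."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def three_gaussian_density(x, components):
    """Density of a weighted sum of Gaussians.

    components: list of (weight, mean, sigma); weights should sum to 1.
    """
    return sum(w * gaussian(x, m, s) for w, m, s in components)

# Placeholder parameters (weight, mean in ppm, sigma in ppm).
# These are assumptions for illustration, NOT fitted RefDB values.
ALA_CA_COMPONENTS = [(0.5, 55.0, 0.8), (0.3, 52.5, 1.0), (0.2, 51.0, 0.9)]
density_at_53 = three_gaussian_density(53.0, ALA_CA_COMPONENTS)
```
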
12
Linear Analysis of Chemical Shifts (LACS) plot
Data for valine from RefDB
L. Wang et al. (2005) J. Biomol. NMR 32, 13-22
13
LACS data for a particular protein: assigned 13Cα
and 13Cβ chemical shifts from BMRB
This intercept should be at (0,0) for properly
referenced data
L. Wang et al. (2005) J. Biomol. NMR 32, 13-22
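
The intercept idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the published LACS procedure, which treats the data more carefully: it assumes secondary shifts (observed minus random-coil) are already available, and it fits a single ordinary least-squares line. The function names are hypothetical. Because a uniform referencing offset cancels in the difference between the Cα and Cβ secondary shifts but remains in the Cα secondary shift itself, the y-intercept of the fitted line estimates the offset.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def referencing_offset(d_ca, d_cb):
    """Estimate a 13C referencing offset (ppm) from secondary shifts.

    d_ca, d_cb: observed minus random-coil shifts for 13Ca and 13Cb.
    x = d_ca - d_cb cancels a uniform offset; y = d_ca retains it,
    so the y-intercept of the fitted line estimates the offset.
    """
    xs = [a - b for a, b in zip(d_ca, d_cb)]
    _, intercept = fit_line(xs, d_ca)
    return intercept
```

An intercept near zero would indicate properly referenced data; a nonzero intercept suggests the correction to subtract from the deposited 13C shifts.
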
14
We have used LACS to re-reference the BMRB
database
  • 11 ( 1.0 ppm )
  • 26 ( 0.5 ppm )
  • 46 ( 0.3 ppm )

L. Wang et al. (2005) J. Biomol. NMR 32, 13-22
15
NMR structure determination
NMR experimental data
Structure ensemble
Experimental restraints
Structure calculation and selection
Assignment and conversion
Constraint violation and error analysis
Validated structure data
Structure quality checks and statistics
(often not done!)
Adapted from Chris Spronk
16
Distance restraints
(A) Ensemble of 30 structural models of GB1. The
α-helix is shown as a blue ribbon; the β-sheets
are indicated with red ribbons. Hydrogen atoms
have been omitted for clarity. (B) Restrained
minimized average structure of GB1, with the 659
experimental distance restraints in the
experimental dataset shown in yellow. Restraints
involving groups of hydrogen atoms are, for
clarity, shown for only one of the protons
involved. Figure made using YASARA
(http://www.yasara.org). (From Nabuurs et al.,
2004)
17
Sources of restraints (constraints)
NOE values
J-couplings (Karplus eq.)
Residual dipolar couplings
H-bonds: experimental (trans-H-bond coupling) or
inferred
Relaxation probes (relaxation or pseudocontact
shifts)
Chemical shifts
Biochemical information: crosslinking, ....
18
Ambiguously determined H-bonds in an α-helix from
NOEs
Three contributing distances are shown in yellow,
allowing for the formation of either the i, i+3,
the i, i+4, or the i, i+5 hydrogen bond. In this
case, the distance would be restrained to 2 Å.
(From Nabuurs et al., 2004)
19
Classification of restraints
  • Intra-residue
  • Information on side chain conformation
  • Sequential: residue i to residue i+1
  • Information on secondary structure
  • Medium range: residue i to residues i+2 through
    i+4
  • Information on secondary structure
  • Long range: residue i to residue i+5 and higher
  • Information on secondary and tertiary structure
  • Inter chain between subunits
  • Information on quaternary structure
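
The classification above amounts to a rule on sequence separation. A minimal sketch, assuming residues are identified by an integer number and a chain label (the function name and argument layout are hypothetical):

```python
def classify_restraint(res_i, res_j, chain_i="A", chain_j="A"):
    """Classify a distance restraint by the residues it connects."""
    if chain_i != chain_j:
        return "inter-chain"        # information on quaternary structure
    separation = abs(res_i - res_j)
    if separation == 0:
        return "intra-residue"      # side chain conformation
    if separation == 1:
        return "sequential"         # secondary structure
    if separation <= 4:
        return "medium range"       # secondary structure
    return "long range"             # secondary and tertiary structure

# Example: count restraint classes for a small, made-up restraint list.
restraints = [(10, 10), (10, 11), (10, 13), (10, 42)]
counts = {}
for i, j in restraints:
    label = classify_restraint(i, j)
    counts[label] = counts.get(label, 0) + 1
```
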

20
Redundancy of restraints
  • Redundant restraints shouldn't be counted because
    they don't add information to the structure
  • E.g. HN-HA distance of 3.5 Å

21
Restraints and NOE completeness per residue
  • NOE completeness = (observed / expected) × 100%,
    evaluated on a per-residue basis
  • Restraints per residue (useful for identifying
    regions with possible problems)
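
A per-residue completeness table can be sketched as below, taking completeness as the percentage of expected contacts that are actually observed. How the expected contacts are counted (for example, from a distance cutoff in the structure) is left open here; the function name and input layout are assumptions for illustration.

```python
def noe_completeness(observed, expected):
    """Per-residue NOE completeness in percent.

    observed, expected: dicts mapping residue number -> contact count.
    Residues with no expected contacts are reported as None.
    """
    result = {}
    for residue, n_expected in expected.items():
        if n_expected == 0:
            result[residue] = None
        else:
            result[residue] = 100.0 * observed.get(residue, 0) / n_expected
    return result

# Made-up counts: residue 2 has expected contacts but none observed,
# which flags it as a region with possible problems.
completeness = noe_completeness({1: 5, 2: 0}, {1: 10, 2: 4, 3: 0})
```
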

22
Common selection criteria for NMR structures
  • Violations cutoff
  • No distance restraint violations > 0.5 Å
  • No dihedral angle violations > 5°
  • Energy
  • Select a sub-ensemble consisting of the lowest
    energy structures
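
Both criteria can be sketched as simple filters over a list of conformers. The record layout (dict keys) and function names below are hypothetical; the default cutoffs are the ones quoted above.

```python
def select_by_violations(conformers, max_dist=0.5, max_dihedral=5.0):
    """Keep conformers with no violation above the cutoffs.

    Each conformer is a dict with keys 'name', 'energy',
    'max_dist_viol' (Angstrom) and 'max_dihedral_viol' (degrees).
    """
    return [c for c in conformers
            if c["max_dist_viol"] <= max_dist
            and c["max_dihedral_viol"] <= max_dihedral]

def select_by_energy(conformers, n_keep):
    """Keep the n_keep lowest-energy conformers."""
    return sorted(conformers, key=lambda c: c["energy"])[:n_keep]

# Made-up ensemble for illustration.
ensemble = [
    {"name": "m1", "energy": -120.0, "max_dist_viol": 0.2, "max_dihedral_viol": 3.0},
    {"name": "m2", "energy": -150.0, "max_dist_viol": 0.7, "max_dihedral_viol": 2.0},
    {"name": "m3", "energy": -100.0, "max_dist_viol": 0.4, "max_dihedral_viol": 6.5},
    {"name": "m4", "energy": -140.0, "max_dist_viol": 0.1, "max_dihedral_viol": 1.0},
]
by_violations = [c["name"] for c in select_by_violations(ensemble)]
by_energy = [c["name"] for c in select_by_energy(ensemble, 2)]
```

The two criteria need not agree: in this made-up ensemble the lowest-energy conformer is rejected by the violations cutoff.
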

23
Examples of selected conformers
rmsd = 3.04
rmsd = 0.82
rmsd = 0.77
energy cutoff
violations cutoff
24
An example of structure statistics
25
Protein structure properties used for validation
  • Bond lengths, bond angles, chirality, omega
    angles, side chain planarity
  • Ramachandran plot, rotameric states, packing
    quality, backbone conformation
  • Inter-atomic bumps, buried hydrogen-bonds,
    electrostatics

Adapted from Chris Spronk
26
Bonded geometry
Distorted Cα-chirality
L-amino acid
D-amino acid
27
Rotameric states
Eclipsed
Staggered
28
Inter-atomic bumps
Overlap of two backbone atoms
29
Omega angles
Trans-configuration (omega = 180°)
Cis-configuration (omega = 0°)
30
Side chain planarity
Planar Arg side-chain (good)
Non-planar Arg side-chain (bad)
31
Internal hydrogen bonding should be satisfied
Internal hydrogen bonding in crambin
32
Electrostatics should be reasonable
After energy minimization including electrostatics
Bad electrostatics
33
Packing quality
Bad packing
Good packing
34
Backbone conformation
Normal
Unique
35
Backbone angles should lie in favorable regions
of the Ramachandran plot
Phi and psi angles
Ramachandran plot
36
Example of decrease in number of violations
following refinement in explicit solvent
(From Nabuurs et al., 2004)
37
Examples of tools available for assessing
structural quality
  • AQUA and PROCHECK-NMR
  • Laskowski et al. (1996) J. Biomol. NMR 8, 477
  • Useful graphical and text output
  • WHAT IF
  • http://swift.cmbi.ru.nl/whatif/
  • More checks and more critical checks
  • QUEEN
  • Nabuurs et al. (2003) J. Am. Chem. Soc. 125, 12026
  • Check of input constraints
  • PSVS
  • Bhattacharya et al. (2007) Proteins 66, 778
  • Bundles several tools
  • Provides an extensive report

38
Information content of distance constraints from
QUEEN
  • QUantitative Evaluation of Experimental NMR
    constraints (QUEEN)
  • Method for evaluating distance constraints from
    distance matrices
  • Quantifies information contained in distance
    constraints
  • Identifies the relative contribution of each
    constraint to the structure determination
  • QUEEN identifies
  • Important restraints
  • Unique restraints
  • Redundant restraints

39
Example of a WHAT IF summary report
40
Protein Structure Validation Software (PSVS)
Bhattacharya, Tejero, Montelione (2007) Proteins
66, 778
41
Protein structure validation software suite (PSVS)
Bhattacharya, Tejero, Montelione (2007) Proteins
66, 778
42
Poorly defined regions are excluded from analysis
Bhattacharya, Tejero, Montelione, Proteins
(2007)
43
Example of PSVS report
Bhattacharya, Tejero, Montelione, Proteins
(2007)
44
Correlation between ProCheck and MolProbity Z
scores
45
ProCheck and MolProbity Z scores
Following NMR Structure Refinement
X-ray
NMR
  • Why NMR different from X-ray?
  • Solution structure
  • Multiple conformational states?
  • Less accurate structures?

Bhattacharya, Tejero, Montelione, Proteins
(2007)
46
RPF quality scores
3D Structure
NOESY Peak List / Assignment List
Global and Local measures of the fit of NOESY
peak list data with 3D structure.
Violations map to the 3D structure and to the
NOESY spectrum
Essentially, a comparison of calc and observed
contact maps
Huang, Powers, Montelione (2005) J. Am. Chem.
Soc. 127, 1665
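
The contact-map comparison can be sketched with set operations, reading RPF as recall, precision, and F-measure. This is a deliberately simplified illustration: it treats every contact as an unambiguous proton pair, whereas the published method also has to deal with ambiguous peak assignments. The function name and input layout are hypothetical.

```python
def rpf_scores(observed_contacts, structure_contacts):
    """Recall, precision and F-measure for two sets of contacts.

    observed_contacts : pairs supported by the NOESY peak list
    structure_contacts: pairs within a distance cutoff in the 3D model
    """
    observed = set(observed_contacts)
    predicted = set(structure_contacts)
    common = observed & predicted
    recall = len(common) / len(observed) if observed else 0.0
    precision = len(common) / len(predicted) if predicted else 0.0
    if recall + precision == 0.0:
        return recall, precision, 0.0
    f_measure = 2.0 * recall * precision / (recall + precision)
    return recall, precision, f_measure

# Made-up contacts: two of four observed contacts are explained by the
# structure, and two of three structure contacts are observed.
recall, precision, f_measure = rpf_scores(
    [(1, 2), (3, 4), (5, 6), (7, 8)],
    [(1, 2), (3, 4), (9, 10)],
)
```

Low recall points to peaks the structure cannot explain; low precision points to close contacts in the model that have no supporting peak.
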
47
Validation at the wwPDB
  • PDB
  • Completeness
  • Check of coordinates
  • Nomenclature, ligands
  • Accept restraints, but pass them directly to BMRB
  • BMRB
  • Completeness
  • Nomenclature and self consistency
  • Chemical shift ranges (AVS from Montelione)
  • Chemical shift referencing
  • Consistency of restraints and structure

48
Summary and prospects
  • Much of the work in developing approaches to
    validating NMR structures has taken place in
    Europe
  • Tools are now available that can avoid
    problems if used intelligently
  • Additional approaches are on the horizon
  • Authors should be encouraged to validate their
    constraints and structures prior to data
    deposition in the wwPDB
  • Centralized servers could facilitate this
  • Authors are strongly encouraged to deposit
    restraints, peak lists, NOESY spectra, and raw
    data (including time-domain data) to BMRB so that
    structures can be checked by others and
    recalculated as improved methods become available