First Aid - PowerPoint PPT Presentation

Learn more at: https://cci.lbl.gov

1
First Aid & Pathology: Data quality assessment in
PHENIX
  • Peter Zwart

2
Introduction
  • Structure solution can be enhanced by the
    knowledge of the quality and idiosyncrasies of
    the merged data
  • Anomalous signal?
  • Twinning
  • Pseudo centering
  • Data characterization should extend beyond
    standard quantities such as Rmerge and nominal
    resolution
  • A full characterization of a data set might
    provide expert systems, such as wizards, with
    useful information on how best to solve a
    structure

3
Introduction
  • Xtriage is a program that aims to characterize a
    merged X-ray dataset
  • Probabilistic unit cell content analyses
  • Likelihood based Wilson scaling
  • Analyses of mean intensity
  • Ice ring detection
  • Outlier analyses
  • Twinning / pseudo centering
  • Anomalous signal

4
Likelihood based Wilson Scaling
  • Both Wilson B and nominal resolution determine
    the appearance of the map
  • Zwart & Lamzin (2003). Acta Cryst. D59,
    2104-2113.

Bwil = 50 Å², dmin = 2 Å
Bwil = 9 Å², dmin = 2 Å
5
Likelihood based Wilson Scaling
  • Data can be anisotropic
  • Traditional straight-line fitting is not reliable
    at low resolution
  • Solution: likelihood-based Wilson scaling
  • Results in an estimate of the anisotropic overall
    B value.
  • Zwart, Grosse-Kunstleve & Adams, CCP4 Newsletter,
    2005.

6
Likelihood based Wilson Scaling
  • Likelihood-based scaling is not very sensitive
    to the resolution cut-off, whereas classic
    straight-line fitting is.
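
For comparison, the classic straight-line fit referred to above can be sketched as follows. This is only an illustration of the traditional approach, not the likelihood-based procedure in Xtriage. It assumes the model <I>/Σf² = k·exp(-B/(2d²)); the function name and the input layout (one value per resolution shell) are hypothetical.

```python
import math


def wilson_line_fit(d_spacings, mean_i_over_sigma_f2):
    """Classic Wilson plot: least-squares line through
    ln(<I>/Sigma f^2) versus 1/d^2.

    Assumed model: <I>/Sigma f^2 = k * exp(-B / (2 d^2)),
    so the slope of the line is -B/2 and the intercept is ln(k).
    Returns (k, B) with B in square Angstrom.
    """
    xs = [1.0 / (d * d) for d in d_spacings]
    ys = [math.log(v) for v in mean_i_over_sigma_f2]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -2.0 * slope
```

Because every shell carries equal weight in the fit, the result depends on which shells are included, which is one way to see why the straight-line estimate shifts with the resolution cut-off.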

7
Likelihood based Wilson Scaling
  • Anisotropy is easily detected and can be
    corrected for.
  • Useful for molecular replacement and possibly for
    substructure solution
  • Anisotropy correction cleans up your N(Z) plots

8
Likelihood based Wilson Scaling
  • For the ML Wilson scaling an expected Wilson
    plot is needed
  • Obtained from over 2000 high quality experimental
    datasets
  • Expected intensity and its standard deviation
    can be obtained

9
Likelihood based Wilson Scaling
  • Resolution dependent problems can be
    easily/automatically spotted
  • Ice rings
  • Empirical Wilson plots available for protein and
    DNA/RNA.

10
Outlier analyses
  • Assume amplitudes are distributed according to
    Wilson distribution
  • For a dataset of a given size, the cumulative
    distribution function of the largest E values
    in the dataset can be used to detect outliers
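
The extreme-value idea above can be sketched for acentric reflections, where normalized intensities E² follow the Wilson distribution P(E² > x) = exp(-x). The sketch below assumes independent reflections and ignores measurement errors; the function name is hypothetical and this is not necessarily how Xtriage implements the test.

```python
import math


def extreme_value_p(e_sq, n_reflections):
    """Probability that the largest E^2 among n_reflections acentric
    reflections is at least e_sq, assuming Wilson statistics.

    For one reflection P(E^2 > x) = exp(-x), so the CDF of the largest
    value is (1 - exp(-x))**N and the p-value is 1 minus that.
    """
    single = math.exp(-e_sq)
    if single >= 1.0:
        # e_sq <= 0: every reflection exceeds the threshold
        return 1.0
    # 1 - (1 - single)**N, evaluated without loss of precision
    return -math.expm1(n_reflections * math.log1p(-single))
```

A reflection whose E² gives a very small p-value for the size of the data set at hand is a candidate outlier.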

11
Pseudo Translational Symmetry
  • Can cause problems in refinement and MR
  • Incorrect likelihood function due to effects of
    extra translational symmetry on intensity
  • Can be helpful during MR
  • Effective ASU is smaller if T-NCS info is used.
  • The presence of pseudo centering can be detected
    from an analysis of the Patterson map.
  • An Fobs Patterson with truncated resolution should
    reveal a significant off-origin peak.

12
Pseudo Translational Symmetry
  • A database analysis reveals that the heights of
    the largest off-origin peaks in truncated X-ray
    data sets follow a cumulative distribution
    F(Qmax) (equation shown on the original slide)

13
Pseudo Translational Symmetry
  • 1 - F(Qmax): the probability that the largest
    off-origin peak in your Patterson map is not due
    to translational NCS. This is a so-called p-value.
  • If a significance level of 0.01 is set, all
    off-origin Patterson vectors larger than 20% of
    the height of the origin are suspected T-NCS
    vectors.

PDBID   Height (%)   P-value (%)
1sct    77           9×10^-6
1ihr    45           1×10^-3
1c8u    20           1
1ee2    10           5
14
Twinning
  • Merohedral twinning can occur when the lattice
    has a higher symmetry than the intensities.
  • When twinning does occur, the recorded
    intensities are the sum of two independent
    intensities.
  • Normal Wilson statistics break down
  • Detect twinning using intensity statistics
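
A small simulation illustrates why Wilson statistics break down. It assumes error-free acentric intensities drawn from an exponential distribution and a hemihedral twin with twin fraction alpha; all names are hypothetical. For this model the ratio <I²>/<I>² equals 2 - 2·alpha + 2·alpha², that is 2.0 for untwinned data and 1.5 for a perfect twin.

```python
import random


def twinned_intensity(alpha, rng):
    """Observed intensity: weighted sum of two independent
    acentric (exponentially distributed) intensities."""
    i1 = rng.expovariate(1.0)
    i2 = rng.expovariate(1.0)
    return (1.0 - alpha) * i1 + alpha * i2


def second_moment_ratio(intensities):
    """Return <I^2> / <I>^2 for a list of intensities."""
    n = len(intensities)
    mean_i = sum(intensities) / n
    mean_i2 = sum(i * i for i in intensities) / n
    return mean_i2 / (mean_i * mean_i)


def simulate_ratio(alpha, n=200000, seed=0):
    """Simulate n reflections with twin fraction alpha."""
    rng = random.Random(seed)
    observed = [twinned_intensity(alpha, rng) for _ in range(n)]
    return second_moment_ratio(observed)
```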

15
Twinning
  • The cumulative intensity distribution, N(Z), can
    be used to identify twinning (acentric data)
  • Curves shown: pseudo centering, normal, perfect
    twin
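
The two theoretical reference curves for acentric data can be written down directly. This is a sketch under the usual assumptions (error-free data, Z = I/<I>); the function names are hypothetical. The perfect-twin curve follows from averaging two independent exponentially distributed intensities.

```python
import math


def n_z_untwinned(z):
    """Fraction of acentric reflections with I/<I> below z,
    untwinned data: N(z) = 1 - exp(-z)."""
    return 1.0 - math.exp(-z)


def n_z_perfect_twin(z):
    """Same quantity for a perfect hemihedral twin:
    N(z) = 1 - (1 + 2z) * exp(-2z)."""
    return 1.0 - (1.0 + 2.0 * z) * math.exp(-2.0 * z)
```

At small z the perfect-twin curve lies below the untwinned curve: twinned data have fewer very weak reflections, which gives the characteristic sigmoidal N(Z) plot.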

16
Twinning
  • Pseudo centering + twinning: N(Z) looks normal
  • Anisotropy in diffraction data produces a similar
    trend to pseudo centering
  • Anisotropy can, however, be removed
  • How to detect twinning in the presence of T-NCS?
  • Partition Miller indices on the basis of detected
    T-NCS vectors
  • Intensities of subgroups follow normal Wilson
    statistics (approximately)
  • Use the L-test for twin detection
  • Not very sensitive to T-NCS if partitioning of
    Miller indices is done properly.
  • No need to know twin laws; not sensitive to
    pseudo symmetry or certain data processing
    problems.
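
The L statistic is L = (I1 - I2)/(I1 + I2), computed for pairs of reflections that are close in reciprocal space and not related by a twin law. For acentric data the expected values are <|L|> = 1/2 and <L²> = 1/3 when untwinned, and 3/8 and 1/5 for a perfect twin. The simulation below is a sketch with error-free intensities; the selection of neighbouring reflections and the partitioning by T-NCS vectors are not modelled, and all names are hypothetical.

```python
import random


def simulate_l_values(alpha, n, seed=0):
    """Simulate n values of L for data with twin fraction alpha.

    Each observed intensity is a weighted sum of two independent
    acentric intensities; the two members of a pair are independent
    of each other, as required for the L test.
    """
    rng = random.Random(seed)

    def observed():
        i1 = rng.expovariate(1.0)
        i2 = rng.expovariate(1.0)
        return (1.0 - alpha) * i1 + alpha * i2

    values = []
    for _ in range(n):
        a = observed()
        b = observed()
        values.append((a - b) / (a + b))
    return values


def l_moments(values):
    """Return (<|L|>, <L^2>)."""
    n = len(values)
    mean_abs = sum(abs(v) for v in values) / n
    mean_sq = sum(v * v for v in values) / n
    return mean_abs, mean_sq
```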

17
Twinning
18
Twinning
  • A database analysis of high-quality, untwinned
    datasets reveals that the values of the first and
    second moments of L follow a narrow distribution
  • This distribution can be used to determine a
    multivariate Z-score
  • Large values indicate twinning

19
Twinning
  • Determination of twin laws
  • From first principles
  • No twin law will be overlooked
  • PDB analysis: 36% of structures have at least one
    possible twin law
  • 50.9% merohedral, 48.2% pseudo-merohedral, 0.9%
    both
  • 27% of cases with twin laws have intensity
    statistics that warrant further investigation of
    whether or not the data are twinned
  • 10% of the whole PDB (!)
  • Determination of twin fraction
  • Fully automated Britton and H analyses, as well as
    an ML estimate of the twin fraction on the basis
    of the L statistic.
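
The H analysis can be sketched for acentric data. For twin-related intensities J1 and J2, H = |J1 - J2|/(J1 + J2) has the expected value <H> = 1/2 - alpha, so the twin fraction follows from the mean of H. The sketch assumes error-free intensities and no pseudo symmetry; the function names are hypothetical and this is not the Xtriage implementation.

```python
import random


def simulate_twin_pairs(alpha, n, seed=0):
    """Simulate n pairs of twin-related observed intensities."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        i1 = rng.expovariate(1.0)
        i2 = rng.expovariate(1.0)
        j1 = (1.0 - alpha) * i1 + alpha * i2
        j2 = alpha * i1 + (1.0 - alpha) * i2
        pairs.append((j1, j2))
    return pairs


def twin_fraction_from_h(pairs):
    """Estimate alpha from <H> = 1/2 - alpha (acentric data)."""
    h_values = [abs(a - b) / (a + b) for a, b in pairs if a + b > 0.0]
    return 0.5 - sum(h_values) / len(h_values)
```

Pseudo symmetry parallel to the twin operator also makes J1 and J2 more alike, so this estimate can be inflated when such symmetry is present.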

20
Conflicting information
  • PDBID 1???
  • Unit cell: 99.5 60.9 70.96 90 134.5 90
  • Space group: C 2
  • Twin laws and estimated twin fractions
  • H,-K,-H-L 0.44
  • H+2L,-K,-L 0.01
  • -H-2L, K, H+L 0.01
  • <I²>/<I>² = 2.10 (theory for untwinned data:
    2.0)
  • Data does not appear to be twinned
  • <|L|> = 0.49 (theory for untwinned data: 0.5)
  • Multivariate Z-score of L test: 0.963
  • Data does not appear to be twinned

21
Conflicting information
  • What is going on?
  • Estimated twin fraction is large, but data does
    not seem to be twinned
  • Twin law H,-K,-H-L is parallel to an existing NCS
    axis
  • or
  • Twin law H,-K,-H-L is a symmetry axis, and the
    space group symmetry is too low
  • It should be C2 + (H,-K,-H-L) = F222
  • http://www.phenix-online.org/cctbx
  • Need images to make a decision
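
As a small illustration of what the operator H,-K,-H-L does, the sketch below applies it to Miller indices and checks that applying it twice returns the original index, as expected for a two-fold operator. This is why the same operator can be read either as a twin law or, if the true symmetry is higher, as a crystallographic symmetry operator. The function names are hypothetical.

```python
def apply_twin_law(hkl):
    """Apply the operator H,-K,-H-L to one Miller index (h, k, l)."""
    h, k, l = hkl
    return (h, -k, -h - l)


def is_twofold(operator, test_indices):
    """True if applying the operator twice maps every test index
    back onto itself."""
    return all(operator(operator(hkl)) == hkl for hkl in test_indices)
```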

22
Conflicting information
  • A DNA example
  • Space group: P65
  • 1 twin law
  • Resolution: 1.87 Å
  • Native Patterson analysis indicates several
    significant off-origin peaks
  • Intensity statistics indicate pseudo translational
    symmetry
  • <I²>/<I>² = 4.243
  • N(Z) plot not very informative

23
Conflicting information
  • However:
  • L test: <|L|> = 0.46
  • Data might be twinned.
  • Partitioned data might not follow Wilson
    statistics, however.
  • Britton and H analyses: estimated twin fraction
    is about 40%
  • Wrong space group?
  • Monomer would not fit in ASU
  • Twinning, pseudo symmetry, or both?
  • Not clear from experimental data only; use
    deposited coordinates
  • Rwork = 28%, Rfree = 34%
  • Twin fractions via Britton plot
  • From Fcalc: 11% (due to pseudo symmetry only)
  • From Fobs: 41% (pseudo symmetry + twinning)
  • See Lebedev, Vagin & Murshudov (2006). Acta Cryst.
    D62, 83-95.
  • Data likely to be twinned.
  • Difficult to spot due to the effects of
    translational (TPS) and rotational (RPS) pseudo
    symmetry on intensity statistics

24
Anomalous data
  • Structure solution via experimental methods
    (especially SAD) is on the rise.
  • Presence of anomalous signal is indicated by a
    quantity called measurability
  • Fraction of Bijvoet differences for which
  • |ΔI|/σ(ΔI) > 3 and (I(+)/σI(+) and I(-)/σI(-) > 3)
  • Easy to interpret
  • At 3 Å, 6% of Bijvoet differences are
    significantly larger than zero
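
The measurability definition above can be sketched as a simple counting exercise. The sketch assumes that each Bijvoet pair is given as (I(+), σI(+), I(-), σI(-)) with positive sigmas and that σ(ΔI) is the quadrature sum of the two sigmas; the function name and input layout are hypothetical. In practice the fraction would be computed per resolution shell.

```python
import math


def measurability(bijvoet_pairs, cutoff=3.0):
    """Fraction of Bijvoet pairs with a significant difference.

    bijvoet_pairs: iterable of (i_plus, sig_plus, i_minus, sig_minus).
    A pair counts if |I(+) - I(-)| / sigma(delta I) > cutoff and both
    mates are themselves observed above the cutoff.
    """
    pairs = list(bijvoet_pairs)
    if not pairs:
        return 0.0
    count = 0
    for i_plus, sig_plus, i_minus, sig_minus in pairs:
        # Assumption: errors of the two mates are independent
        sig_delta = math.sqrt(sig_plus ** 2 + sig_minus ** 2)
        if (abs(i_plus - i_minus) / sig_delta > cutoff
                and i_plus / sig_plus > cutoff
                and i_minus / sig_minus > cutoff):
            count += 1
    return count / len(pairs)
```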

25
Anomalous data
  • Measurability and <ΔI/σΔI> are closely related
  • Measurability translates more directly to the
    number of useful Bijvoet differences in
    substructure solution/phasing

26
Anomalous data
6 (partially occupied) iodines in thaumatin at
λ = 1.5 Å.
Raw SAD phases, straight after PHASER (panels A
and B)
27
Anomalous data
6 (partially occupied) iodines in thaumatin at
λ = 1.5 Å.
Density modified phases (panels A and B)
28
Anomalous data
  • SAD phasing with PHASER
  • Very sensitive residual maps
  • Residual map indicates where a certain type of
    anomalous scatterer needs to be placed to improve
    the fit between observed and expected F(+) and
    F(-)
  • Lysozyme soaked with solution containing
    (NH4)2(OsCl6)
  • Wilson B = 13.7 Å², dmin = 1.7 Å
  • Data collected at Os L-III edge (f'' > 10)
  • Measurability at 3.0 Å is 67%
  • Anomalous signal is strong
  • Partial structure is large
  • Zheavy²/(Zheavy² + Zprotein²) = 35%

PHASER residual map indicating location of main
chain atoms
29
Anomalous data
Raw PHASER SAD phases
30
Anomalous data
  • Another extreme
  • 2 Fe4S4 clusters in 60 residues
  • Wilson B = 6.5 Å², dmin = 1.2 Å
  • Measurability at 3.0 Å: 6%
  • Data not terribly strong
  • ZFe²/(ZFe² + ZS² + Zprotein²) = 17%
  • Fe f'' = 1.25 e, S f'' = 0.35 e
  • PHASER residual map from Fe SAD phases clearly
    shows S positions

SAD on Fe, residual maps indicate S positions
(green balls)
31
Anomalous data
  • Inclusion of sulfurs improves phasing
  • (ZFe² + ZS²)/(ZFe² + ZS² + Zprotein²) = 32%
  • <FOM> = 0.67 (was 0.53)
  • Residual maps show almost all non-hydrogen atoms
  • Inclusion of non-hydrogen atoms results in
    <FOM> = 0.98.

SAD on Fe, S. Residual maps (purple) and FOM
weighted Fobs map (blue).
32
Discussion & Conclusions
  • Software tools are available to point out
    specific problems
  • mmtbx.xtriage <input_reflection_file> params
  • Log files are not just numbers; they also contain
    an extensive interpretation of the statistics
  • Knowing the idiosyncrasies of your X-ray data
    may help you avoid certain pitfalls.
  • Undetected twinning, for instance

33
First Aid
  • Analyses at the beamline
  • If problems are detected while at the beamline,
    they could be solved by re-collecting data or
    adapting the data collection strategy.

The Surgeon and the Peasant, 1524. Lucas van
Leyden
34
Pathology/Autopsy
  • Analyses at home

The anatomical lesson of dr. Nicolaes Tulp -
1632. Rembrandt van Rijn.
35
Acknowledgements
Cambridge: Randy Read, Airlie McCoy, Laurent Storoni
Los Alamos: Tom Terwilliger, Li Wei Hung, Thirumugan Rhadakanan
Texas A&M University: Jim Sacchettini, Tom Ioerger, Eric McKee
  • Paul Adams
  • Ralf Grosse-Kunstleve
  • Pavel Afonine
  • Nigel Moriarty
  • Nick Sauter
  • Michael Hohn

36
WWW
  • Phenix
  • www.phenix-online.org
  • Xtriage tutorials
  • www.phenix-online.org/tutorials
  • CCTBX
  • cctbx.sf.net
