Title: From Papertape Input to Forensic Crystallography: A History of the Program PLATON
1From Papertape Input to Forensic Crystallography
A History of the Program PLATON
- Ton Spek,
- Bijvoet Center
- Utrecht University
- The Netherlands
- K.N.Trueblood Award Lecture
- Chicago, July 29, 2010.
2Some History
- Back in 1966 I started crystallography as a
student in the Laboratory for Crystal and
Structural Chemistry at Utrecht University, which
was at that time headed by Prof. A.F. Peerdeman.
- Peerdeman (co-author of the famous Bijvoet,
Peerdeman and van Bommel paper on absolute
configuration) was the successor of Prof.
J.M. Bijvoet.
- Dorothy Hodgkin came over during that time to
talk about the Vitamin B12 structure and her
overseas collaboration with Ken Trueblood.
3After WWII, Bijvoet had managed to start a new
lab in a stately house (used by the Gestapo
during WWII) close to the centre of the city of
Utrecht. Part of the house was his private
domain.
After his retirement, he still kept a
pied-a-terre for when he was in Utrecht. As a
student, I shared the family bedroom, which
doubled as a student room.
4Former Crystal Palace and home of Prof. J.M.
Bijvoet
5Computing-I
- The Crystal Palace was the home of the first two
generations of computing platforms within
Utrecht University (Zebra and X1,
respectively).
- By 1966 computing had moved to a University
computing centre elsewhere in the city.
- From then on, computing was done on an
ALGOL-specific X8 computer (from the Dutch
company Electrologica, later part of Philips).
- Processing was essentially one job at a time.
6Electrologica X8 ALGOL60 Mainframe, 1966
(16kW, <1 MHz)
7Computing-II
- Jobs were run by an operator during daytime
shifts.
- Most of our crystallographic work was done during
the once-a-week 13-hour night shift, when we as
crystallographers had the computer to ourselves.
Half of the staff stayed overnight.
- During that night shift we were the scientist, the
software developer and the system operator in
one.
- I/O was paper tape based. One job at a time. Very
little memory. No stored binaries, thus
recompilation every time.
8Computing-III
- Programs and data were on paper tape.
- Programs and program input were prepared
on the so-called Flexowriter. This very
noisy electrical typewriter was also often used as
the output medium.
- Editing was done with a pair of scissors, to cut
unwanted material out of the source code, and
adhesive tape, to glue a substitute into the paper
tape.
9Flexowriter for the creation and editing of
programs and input data
10The Science
- My supervisor, Dr. J.A. Kanters, gave me an
interesting assignment to work on.
- He handed me a batch of white crystals of
unknown composition (code named M200).
- The assignment was to find out what the
structure was, using single crystal X-ray techniques
only.
11Data Collection for M200
- Preliminary investigations done with film data
pointed at space group P-1.
- A Patterson synthesis based on integrated
Weissenberg projection data subsequently
suggested a light atom structure.
- Eventually a three-dimensional data set was
collected with an Enraf-Nonius AD3 diffractometer
(two weeks of data collection!).
12Nonius AD3 Diffractometer
13Structure Determination of M200
- It took half a year to finally find the
structure.
- The laboratory had a tradition in Direct Methods
(Beurskens, de Vries, Kroon, Krabbendam).
- However, all available software failed to solve
my structure (these were pre-MULTAN days ...).
- In the end I had to write my own Direct Methods
program (AUDICE), which solved the triclinic
structure as well as many other unsolved
structures that were hanging around in the lab.
14The Structure
3-Methoxy-glutaconic acid
15The Program
16The Program AUDICE
- AUDICE was one of the Symbolic Addition programs
that were developed in that period.
- Its specialty was that, at the start of the
evaluation of the strong triple product
indications for a positive sign, 27 symbols were
introduced for strong starting reflections, rather
than the three or so used in some other
approaches. Eventually, 8 solutions were produced
by eliminating 24 symbols based on multiple
indications (the 3 remaining symbols give
2^3 = 8 sign combinations).
- In addition, the correlation method was used to
improve the reliability of triple phase relations.
17The Correlation Method
- P for the triple H, K, H+K depends on
E(H)E(K)E(H+K).
- Correlation Method: improved P on the basis of
the P of three adjacent triples:
E(H)E(L)E(H+L), E(K)E(L-K)E(L) and
E(H+K)E(L-K)E(H+L).
- I.e., strengthening of P(E(H)E(K)E(H+K)) when,
in addition, E(H+L), E(L-K) and E(L) are strong.
- (Note: theoretically formalized in terms of
neighbourhoods by Hauptman.)
Diagram labels: L, H, K, H+K
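As a rough illustration of why the product of the three E-values matters, the sketch below evaluates the standard Cochran-Woolfson estimate for the probability that a sign triple is positive in a centrosymmetric structure. It assumes a cell with N equal atoms; the function name and the example numbers are hypothetical, and this is neither AUDICE code nor an implementation of the correlation strengthening described above.

```python
import math


def positive_sign_probability(e_h, e_k, e_hk, n_atoms):
    """Cochran-Woolfson estimate of P(sign of E(H)E(K)E(H+K) is positive).

    Assumption: centrosymmetric structure with n_atoms equal atoms in the
    unit cell, so that the scale factor reduces to 1/sqrt(n_atoms).
    """
    strength = abs(e_h * e_k * e_hk) / math.sqrt(n_atoms)
    return 0.5 + 0.5 * math.tanh(strength)


if __name__ == "__main__":
    # Stronger reflections give a more reliable triple (illustrative values).
    for e in (1.0, 2.0, 3.0):
        print(e, round(positive_sign_probability(e, e, e, n_atoms=40), 3))
```

A weak triple stays close to P = 0.5 (no information), which is why additional strong neighbouring reflections are a useful extra source of reliability.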
18Epilogue
- The structure of M200 has been published.
- Unfortunately, attempts to publish AUDICE in Acta
Cryst. foundered on the referee's requirement to
compare its performance on non-ALGOL (real ...)
platforms.
- Anyway, AUDICE was superseded by the program
MULTAN (Fortran) on the new CDC University
Mainframe.
- The structure solves and refines in a matter of
seconds on current hardware with SYSTEM S (next slide).
19Automatic Structure Solution of M200 in the
No-Questions-Asked Mode
20Direct Methods Meetings
- Multiple meetings and schools were organized in
the 70s with Direct Methods (software and
theory) as their major subject.
- Examples are the NATO schools in Parma and York,
the schools in Erice (1974, 1978) and the
meetings at the Medical Foundation (Buffalo),
where I met Ken Trueblood.
- Also important were the CECAM workshops on
Direct Methods (5 weeks!, bringing together
people working in the field to work on current
issues) in the early 70s in Orsay (near Paris),
around a big IBM-360, with lectures by Hauptman
(participants: Germain, Main, Destro, Viterbo).
The program MULTAN was finalized there.
- Photos of the participants of the Parma 1973
meeting and the 1978 Erice School follow next.
21Hauptman Lectures Parma Spring 1973
23The National Facility
- In 1971, a national single crystal service
facility was started, with me to make it all
happen.
- I kept that position for 38 years, until my emeritus
status in 2009.
- The project is now continued by my former
co-worker Martin Lutz.
- My last postdoc was Maxime Siegler, now staff
crystallographer at the Johns Hopkins University.
- The program PLATON is a side product of the
national facility (note: never explicitly
funded!).
24PLATON
- Work on PLATON started in 1980.
- The idea was to produce, with a single CALC ALL
instruction, an exhaustive listing of derived
geometry to give to our clients.
- Over time numerous additional tools have been
added on the basis of the needs in our service
setting.
- PLATON is, in combination with SHELX, one of the
major tools for our service.
25PLATON Tools
- The available tools are shown as clickable
options on the opening window of the program.
- Examples are ADDSYM for the detection of missed
symmetry, TwinRotMat for automatic twinning
detection and SYSTEM S for guided/automated
structure determination.
- Here we will look in some detail at a few of the
tools:
- SQUEEZE for the handling of disordered solvents
- Structure Validation (used as part of the IUCr
CheckCIF)
- FLIPPER, a new approach to structure determination
27The Disordered Solvent Problem
- Molecules of interest often crystallize (only)
with the inclusion of a suitable solvent
molecule.
- Solvent molecules often fill voids in a structure
with little interaction, are often located on symmetry
sites and often have a population of less than 1.0.
- Often the nature of the included solvent (or
mixture of solvents) is unclear.
- Inclusion of the scattering contribution of the
solvent can be done either with a disorder model
or with SQUEEZE.
28THE MOLECULE THAT INVOKED THE BYPASS/SQUEEZE TOOL
Salazopyrin from DMF, R = 0.096
29Structure Modeling and Refinement Problem for the
Salazopyrin structure
Difference Fourier map shows disordered channels
rather than maxima.
How to handle this in the refinement? SQUEEZE!
30Looking down the Infinite Channels in the
Salazopyrin Structure
How to model this disorder in the L.S. refinement?
31The SQUEEZE Tool
- The SQUEEZE tool offers an alternative to the
refinement of a disorder model for a structure
containing disordered solvent.
- The contribution of the disordered solvent to the
calculated structure factors is taken into
account by back-Fourier transformation of the
electron density found in the solvent region of
the difference map.
- This requires an iterative series of difference
map improvements.
- Firstly, the solvent accessible region has to be
identified, to be used as a mask over the
difference density map.
32Solvent Accessible Voids
- A typical crystal structure has only in the order
of 65% of the available space filled.
- The remaining volume is in voids (cusps)
in between atoms (too small to accommodate an
H-atom).
- Solvent accessible voids can be defined as
regions in the structure that can accommodate at
least a sphere with radius 1.2 Angstrom without
intersecting any of the van der Waals
spheres assigned to each atom in the structure.
- Next slides: the void algorithm, cartoon style.
33DEFINE SOLVENT ACCESSIBLE VOID
STEP 1: EXCLUDE VOLUME INSIDE THE VAN DER
WAALS SPHERE
34DEFINE SOLVENT ACCESSIBLE VOID
White Area: Ohashi Volume. Location of possible
Atom centers
STEP 2: EXCLUDE AN ACCESS RADIAL VOLUME TO
FIND THE LOCATION OF ATOMS WITH THEIR CENTRE AT
LEAST 1.2 ANGSTROM AWAY
35DEFINE SOLVENT ACCESSIBLE VOID
STEP 3: EXTEND INNER VOLUME WITH POINTS
WITHIN 1.2 ANGSTROM FROM ITS OUTER BOUNDS
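The three steps can be sketched on a grid as follows. This is a minimal illustration, not PLATON's implementation: it assumes an orthogonal periodic cell, Cartesian atom coordinates and a brute-force search over grid points, and the names `find_voids` and `periodic_distance` are hypothetical.

```python
import itertools
import math

PROBE_RADIUS = 1.2  # Angstrom, the probe radius quoted on the slides


def periodic_distance(p, q, cell):
    """Shortest distance between p and q in an orthogonal periodic cell."""
    total = 0.0
    for a, b, length in zip(p, q, cell):
        d = abs(a - b) % length
        d = min(d, length - d)
        total += d * d
    return math.sqrt(total)


def find_voids(atoms, cell, step=0.5, probe=PROBE_RADIUS):
    """atoms: list of ((x, y, z), vdw_radius); cell: (a, b, c) in Angstrom.

    Returns (grid, ohashi, void) as lists of grid points.
    """
    counts = [max(1, round(length / step)) for length in cell]
    grid = [tuple(i * length / n for i, length, n in zip(index, cell, counts))
            for index in itertools.product(*(range(n) for n in counts))]
    # Steps 1 and 2: keep points at least (vdW radius + probe radius) away
    # from every atom centre: the possible probe centres (Ohashi volume).
    ohashi = [p for p in grid
              if all(periodic_distance(p, xyz, cell) >= radius + probe
                     for xyz, radius in atoms)]
    # Step 3: extend with all points within one probe radius of such a centre.
    void = [p for p in grid
            if any(periodic_distance(p, centre, cell) <= probe
                   for centre in ohashi)]
    return grid, ohashi, void


if __name__ == "__main__":
    # One atom of radius 1.7 Angstrom in an 8 Angstrom cubic cell (toy input).
    grid, ohashi, void = find_voids([((0.0, 0.0, 0.0), 1.7)], (8.0, 8.0, 8.0), step=1.0)
    print(len(ohashi), len(void), len(void) / len(grid))
```

The ratio `len(void) / len(grid)` gives a crude estimate of the void fraction of the cell; a finer grid step gives a better estimate at higher cost.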
36EXAMPLE OF A VOID ANALYSIS
Listing of all voids in the unit cell
The numbers in brackets refer to the Ohashi Volume
37VOID APPLICATIONS
- Detection of Solvent Accessible Voids in a
Structure
- Calculation of the Kitaigorodskii Packing Index
- Determination of the available space in solid
state reactions (Ohashi)
- Determination of pore volumes, pore shapes and
migration paths in microporous crystals
- As part of the SQUEEZE routine to handle the
contribution of disordered solvents in a crystal
structure.
38SQUEEZE
- Takes the contribution of disordered solvents to
the calculated structure factors into account by
back-Fourier transformation of density found in
the solvent accessible volume outside the
ordered part of the structure (iterated).
- Refine with SHELXL using the solvent free .hkl
- Or with CRYSTALS using the SQUEEZE solvent
contribution and the full Fobs
- Note: SHELXL lacks an option for a fixed contribution
to the Structure Factor Calculation.
39SQUEEZE Algorithm
- 1 Calculate a difference Fourier map (FFT).
- 2 Use the VOID-map as a mask on the FFT-map to set
all density outside the VOIDs to zero.
- 3 FFT^-1 this masked difference map -> contribution
of the disordered solvent to the structure
factors.
- 4 Calculate an improved difference map from F(obs),
with phases based on F(calc) including the recovered
solvent contribution, and F(calc) without the
solvent contribution.
- 5 Recycle to 2 until convergence.
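The loop can be illustrated with a one-dimensional toy model. The sketch below is not the PLATON code: it assumes a 1D periodic "cell" sampled on a small grid, a naive discrete Fourier transform instead of an FFT, an observed F(000), and hypothetical names (`squeeze_1d`, `r_factor`).

```python
import cmath


def dft(values, sign=-1):
    """Naive discrete Fourier transform (sign=+1 gives the unscaled inverse)."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * cmath.pi * k * x / n)
                for x, v in enumerate(values)) for k in range(n)]


def idft(coeffs):
    n = len(coeffs)
    return [c / n for c in dft(coeffs, sign=+1)]


def r_factor(f_obs, f_calc):
    return sum(abs(o - abs(c)) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def squeeze_1d(f_obs, f_model, void_mask, cycles=5):
    """f_obs: observed amplitudes; f_model: complex F(calc) of the ordered
    model; void_mask: True for grid points inside the solvent accessible void.
    Returns (solvent density, solvent structure factor contribution)."""
    f_solvent = [0j] * len(f_obs)
    density = [0.0] * len(f_obs)
    for _ in range(cycles):
        f_total = [m + s for m, s in zip(f_model, f_solvent)]
        # Difference map: F(obs) with the phase of the total F(calc),
        # minus the F(calc) of the ordered model only.
        diff = [o * cmath.exp(1j * cmath.phase(t)) - m
                for o, t, m in zip(f_obs, f_total, f_model)]
        # Mask: keep only the density inside the void.
        density = [d.real if inside else 0.0
                   for d, inside in zip(idft(diff), void_mask)]
        # Transform the masked density into the solvent contribution.
        f_solvent = dft(density)
    return density, f_solvent


if __name__ == "__main__":
    n = 16
    model = [10.0] + [0.0] * (n - 1)                   # one heavy "atom"
    solvent = [1.0 if i in (7, 8, 9) else 0.0 for i in range(n)]
    f_model = dft(model)
    f_obs = [abs(f) for f in dft([m + s for m, s in zip(model, solvent)])]
    mask = [6 <= i <= 10 for i in range(n)]
    density, f_solvent = squeeze_1d(f_obs, f_model, mask)
    print(r_factor(f_obs, f_model))
    print(r_factor(f_obs, [m + s for m, s in zip(f_model, f_solvent)]))
```

In this deliberately easy example the model phases are already correct, so the solvent density is recovered in the first cycle; with real data the phases improve gradually, which is why the procedure is iterated.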
40SQUEEZE In the Complex Plane
Diagram labels: Fc(solvent), Fc(total), Fc(model),
Fobs, Solvent Free Fobs
Black: Split Fc into a discrete and a solvent
contribution.
Red: For SHELX refinement,
temporarily subtract the recovered solvent
contribution from Fobs.
41Real World Example
- THF molecule disordered over a center of
inversion
- Comparison of the result of a disorder model
refinement with a SQUEEZE refinement
42Disorder Model Refinement, Final R = 0.033
43Comparison of the Results of the two Modeling
Procedures
Disorder Model: R = 0.033
SQUEEZE Model: R = 0.030
44LISTING OF FINAL SQUEEZE CYCLE RESULTS
45ANALYSIS OF R-VALUE IMPROVEMENT WITH RESOLUTION
46Concluding Remarks
- The CSD includes in the order of 1000 entries
where SQUEEZE was used.
- Care should be taken with issues such as charge
balance.
47Charge Flipping
- Charge Flipping as an alternative for structure
solution by Direct Methods was introduced by G.
Oszlányi & A. Sütő (2004). Acta Cryst. A60, 134.
- Similar to SQUEEZE, it involves iterated forward
and backward Fourier transforms.
- PLATON implements an experimental version of
Charge Flipping named FLIPPER.
- Following is an example of the P21, Z = 2 structure
of vitamin C solved by FLIPPER, starting with all
reflections assigned a phase of zero degrees.
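A single charge flipping cycle can be sketched as below. This is a simplified one-dimensional illustration under stated assumptions, not FLIPPER itself: it uses a naive discrete Fourier transform, treats F(000) like any other observed amplitude, and the names `charge_flip_cycle` and `delta` (the flipping threshold) are illustrative.

```python
import cmath


def dft(values, sign=-1):
    """Naive discrete Fourier transform (sign=+1 gives the unscaled inverse)."""
    n = len(values)
    return [sum(v * cmath.exp(sign * 2j * cmath.pi * k * x / n)
                for x, v in enumerate(values)) for k in range(n)]


def charge_flip_cycle(f_obs, phases, delta):
    """One cycle: amplitudes + current phases -> density -> flip -> new phases."""
    n = len(f_obs)
    coeffs = [a * cmath.exp(1j * p) for a, p in zip(f_obs, phases)]
    # Backward transform: density from observed amplitudes and current phases.
    density = [c.real / n for c in dft(coeffs, sign=+1)]
    # Flip the sign of all density below the threshold delta.
    flipped = [rho if rho >= delta else -rho for rho in density]
    # Forward transform: keep only the phases; the amplitudes are reset to
    # the observed values at the start of the next cycle.
    new_phases = [cmath.phase(g) for g in dft(flipped)]
    return new_phases, flipped


if __name__ == "__main__":
    f_obs = [1.0] * 8          # amplitudes of a single point atom (toy data)
    phases = [0.0] * 8         # start with all phases at zero degrees
    for _ in range(3):
        phases, density = charge_flip_cycle(f_obs, phases, delta=0.1)
    print([round(rho, 3) for rho in density])
```

In practice many cycles are needed and convergence is monitored with a figure of merit; the toy input above is already a fixed point and only demonstrates the mechanics of one cycle.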
49FLIPPER
- Charge Flipping is done with data in space group
P1.
- The space group is determined from the solution.
- The method can be used for automatic structure
determination of non-disordered structures.
- Following is the real time display of the
progress in the development of the structure
after each Fourier cycle, followed by a full
refinement.
51Automated Structure Validation
- It is easy to miss problems with a structure as a
busy author or as a referee.
- Increasingly, black-box style analyses are done by
non-experts.
- A limited number of expert referees is available.
- It is easy to hide problems with a ball-and-stick
style illustration.
- Sadly, fraudulent results and structures have now
been identified in the literature, thus
contaminating the assumed solid information in
the CSD.
52Structure Validation with PLATON
- Automated Structure Validation was pioneered and
pushed by Syd Hall as section editor of Acta
Cryst. C, by:
- The creation of the CIF Standard for data
archival and exchange (Hall et al. (1991). Acta
Cryst. A47, 655-685).
- Having CIF adopted by Sheldrick for SHELXL93
- Making CIF the Acta Cryst. submission standard
- Setting up early CIF checking procedures for Acta
- Inviting me to include PLATON checking tools such
as ADDSYM and VOID search.
53WHAT ARE THE VALIDATION QUESTIONS ?
- Single Crystal Structure Validation addresses
three simple but important questions:
- 1 Is the reported information complete?
- 2 What is the quality of the analysis?
- 3 Is the Structure Correct?
54How is Validation Currently Implemented ?
- Validation checks on CIF data can be executed at
any time, either in-house (PLATON/CHECK) or through
the WEB-based IUCr CHECKCIF server.
- A file, check.def, defines the issues that are
tested (currently more than 400), with levels of
severity and associated explanation and advice.
(www.cryst.chem.uu.nl/platon/CIF-VALIDATION.pdf)
- Most non-trivial tests on the IUCr CheckCIF
server are executed with routines in the program
PLATON (identified as PLATxyz).
55VALIDATION ALERT LEVELS
- CheckCIF/PLATON creates a report in the form of a
list of ALERTS with the following ALERT levels:
- ALERT A: Serious Problem
- ALERT B: Potentially Serious Problem
- ALERT C: Check & Explain
- ALERT G: Verify or Take Notice
56VALIDATION ALERT TYPES
- 1 CIF Construction/Syntax errors,
Missing or Inconsistent Data.
- 2 Indicators that the Structure Model
may be Wrong or Deficient.
- 3 Indicators that the quality of the results
may be low.
- 4 Info, Cosmetic Improvements, Queries and
Suggestions.
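To show how the level and type of an ALERT can be combined in practice, here is a small parsing sketch. It assumes that report lines start with an identifier of the form `PLATxyz_ALERT_<type>_<level>`, as commonly seen in checkCIF output; that layout, the helper name `parse_alert` and the sample message are assumptions for illustration, not a specification of the report format.

```python
import re

# Level and type descriptions as listed on the two slides above.
ALERT_LEVELS = {
    "A": "Serious Problem",
    "B": "Potentially Serious Problem",
    "C": "Check & Explain",
    "G": "Verify or Take Notice",
}
ALERT_TYPES = {
    1: "CIF Construction/Syntax errors, Missing or Inconsistent Data",
    2: "Indicators that the Structure Model may be Wrong or Deficient",
    3: "Indicators that the quality of the results may be low",
    4: "Info, Cosmetic Improvements, Queries and Suggestions",
}

# Assumed line layout: <test>_ALERT_<type>_<level> <free text message>
ALERT_PATTERN = re.compile(
    r"^(?P<test>\w+?)_ALERT_(?P<type>[1-4])_(?P<level>[ABCG])\b\s*(?P<message>.*)$"
)


def parse_alert(line):
    """Return a dict describing the ALERT, or None if the line does not match."""
    match = ALERT_PATTERN.match(line.strip())
    if match is None:
        return None
    alert_type = int(match.group("type"))
    level = match.group("level")
    return {
        "test": match.group("test"),
        "is_platon_test": match.group("test").startswith("PLAT"),
        "type": alert_type,
        "type_description": ALERT_TYPES[alert_type],
        "level": level,
        "level_description": ALERT_LEVELS[level],
        "message": match.group("message"),
    }


if __name__ == "__main__":
    # Hypothetical report line, for illustration only.
    print(parse_alert("PLAT601_ALERT_2_A Structure Contains Solvent Accessible VOIDS"))
```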
57PLATON/CHECK CIF + FCF Results
58Which Key Validation Issues are Addressed
- Missed Space Group symmetry (being Marshed).
- Wrong chemistry (mis-assigned atom types).
- Too many, too few or misplaced H-atoms.
- Unusual displacement parameters.
- Hirshfeld Rigid Bond test violations.
- Missed solvent accessible voids in the structure.
- Missed Twinning.
- Absolute structure.
- Data quality and completeness.
59Evaluation and Performance
- The validation scheme has been very successful
for Acta Cryst. C & E in setting standards for
quality and reliability.
- The missed symmetry problem has been solved for
the IUCr journals (unfortunately not yet
generally: there are still numerous Marshable
structures).
- Most major chemical journals now have
some form of validation scheme implemented.
- Recently included: FCF validation.
60FCF-VALIDATION
- Check of the CIF & FCF data consistency
(including R-values, cell dimensions).
- Check of completeness of the reflection data
set.
- Automatic detection of ignored twinning.
- Detection of an applied twinning correction
that has not been reported in the paper.
- Validity check of the reported Flack parameter
value against the Hooft parameter value.
- Analysis of the details of the difference
density Fourier map for unreported features.
61Sloppy, Novice or Fraudulent ?
- Errors are easily made and unfortunately not
always distinguishable from fraud.
- Wrong element type assignments can arise as
part of an incorrect analysis of an unintended
reaction product.
- Alternative element types can be (and have been)
substituted deliberately to create new
publishable structures.
- Reported and calculated R-values differing in the
first relevant digit!?
62Some Relevant ALERTS
- Wrong atom type assignments generally cause:
- Serious Hirshfeld Rigid Bond Violation ALERTS
- Larger than expected difference map minima and
maxima
- wR2 >> 2 x R1
- High values for the SHELXL refined weight
parameter
63Acta Cryst. (2007), E63, m1566.
Sn(IV)(NO3)4(C10H8N2)2
64Missing H in bridge
Sn(IV) -> Lanthanide(III)
Diagram label: 2.601 Ang.
65The Ultimate Shame
- Recently a whole series of isomorphous
substitutions was detected for an already published
structure.
- Similar series have now been detected for
coordination complexes (transition metals and
lanthanides).
- How could referees let those pass?
- Over 100 structures have now been retracted.
- The fraud was detected by looking at all papers by the
authors of a strange structure (and by their
institutions).
66Bogus Variations (with Hirshfeld ALERTS) on the
Published Structure 2-hydroxy-3,5-nitrobenzoic
acid (ZAJGUM)
67Comparison of the Observed data for two
isomorphous compounds.
Tool: platon d name1.fcf name2.fcf
Conclusion: The Same Data!
The Only Difference Is the SCALE!
SLOPPY or FRAUD?
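The kind of comparison behind such a conclusion can be sketched as follows: fit a single least-squares scale factor between two lists of observed intensities for the same reflections and look at the residual. This is a hedged illustration with a hypothetical function name and made-up numbers, not the comparison that PLATON performs on the .fcf files.

```python
def compare_scaled(intensities_1, intensities_2):
    """Fit i1 ~ k * i2 by least squares; return (k, residual R).

    Both lists are assumed to hold observed intensities for the same
    reflections in the same order. R near zero means the two data sets
    differ only by the scale factor k.
    """
    k = (sum(a * b for a, b in zip(intensities_1, intensities_2))
         / sum(b * b for b in intensities_2))
    residual = (sum(abs(a - k * b) for a, b in zip(intensities_1, intensities_2))
                / sum(abs(a) for a in intensities_1))
    return k, residual


if __name__ == "__main__":
    data_1 = [100.0, 250.0, 12.5, 900.0]          # made-up intensities
    data_2 = [value / 4.0 for value in data_1]     # same data, other scale
    print(compare_scaled(data_1, data_2))
```

Two genuinely independent measurements on different compounds would never agree to within rounding after scaling, because counting statistics alone make each data set unique.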
68Thanks !
- My former co-workers over 38 years, and in
particular my successor Dr. Martin Lutz.
- Dr. Louis Farrugia, for following my frequent
updates with his MS-Windows implementation.
- The users of the software, for ideas and bug
reports.
- Lachlan Cranswick, for promoting my software, who
is sadly no longer with us here.
69IUCr Crystallographic Computing School 2005 Siena