1
e-Science Technologies in the Simulation of
Complex Materials
L. Blanshard, R. Tyer, K. Kleese
S. A. French, D. S. Coombes, C. R. A. Catlow
B. Butchart, W. Emmerich (CS)
H. Nowell, S. L. Price (Chem)
eMaterials
2
Polymorphism
  • Prediction of polymorphs: a drug substance may
    exist as two or more crystalline phases in which
    the molecules are packed differently.
Combinatorial Computational Catalysis
  • Explore which sites are involved in catalysis.
  • Catalysts are used in diverse industries,
    including petroleum, chemicals, polymers,
    agrochemicals and environmental applications.
4
e-Science Issues to Address
  • simulations take too long to run
  • data are distributed across many sites and
    systems
  • no catalogue system
  • output in legacy text files, different for each
    program
  • few tools to access, manage and transfer data
  • workflow management is manual
  • licensing within distributed environment

5
Acid Sites in Zeolites
  • Determine the extra-framework cation positions
    within the zeolite framework.
  • Explore which proton sites are involved in
    catalysis and then characterise the active sites.
  • Produce a database of structural models and
    associated vibrational modes for different Si/Al
    ratios.
  • Improve understanding of the role of the Si/Al
    ratio in zeolite chemistry.

6
Chabazite: 1 T site, 12 Si centres per unit cell,
8-membered ring channels (3.8 Å × 3.8 Å).
11
The Problem
  • Si/Al = 11: 4 calculations
  • Si/Al = 5: 160 calculations
  • Si/Al = 3: 5,760 calculations
  • Si/Al = 2: 184,320 calculations

The number of calculations quickly becomes an
issue when realistic Si/Al ratios are considered.
A Si/Al ratio of 2 would require 184,320
calculations at 100 seconds each: 5,120 hours, or
213 days, of CPU time.
When substitution of a second Al is considered
there are now 4 × (10 × 4) = 160 possible
structures, as symmetry has been broken.
Note that this is for a very simple zeolite with
36 ions per unit cell; materials of interest have
296.
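The CPU-time estimate on this slide can be checked with a few lines of Python. This is only a sketch: the configuration counts and the 100 seconds per calculation are copied from the slide, while the function and variable names are made up for illustration.

```python
# Configuration counts per Si/Al ratio, copied from the slide.
CONFIGURATIONS = {11: 4, 5: 160, 3: 5_760, 2: 184_320}
SECONDS_PER_CALCULATION = 100  # per-calculation cost quoted on the slide


def cpu_time(n_calculations: int, seconds_each: int = SECONDS_PER_CALCULATION):
    """Return (hours, days) of serial CPU time for n_calculations."""
    total_seconds = n_calculations * seconds_each
    hours = total_seconds / 3600
    return hours, hours / 24


for ratio, count in CONFIGURATIONS.items():
    hours, days = cpu_time(count)
    print(f"Si/Al = {ratio:>2}: {count:>7,} calculations, "
          f"{hours:8.1f} h, {days:6.1f} days")
```

For Si/Al = 2 this gives 5,120 hours, a little over 213 days, on a single processor, which is the motivation for spreading the work over a pool of machines.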
12
MC/EM
A combined Monte Carlo (MC) and energy
minimisation (EM) approach has been developed to
model zeolitic materials with low and medium
Si/Al ratios. First, Al is inserted into a
siliceous unit cell; the charge is then
compensated with cations.
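The substitution step might be sketched as follows. This is an illustration under stated assumptions, not the MC_subs program used in the study: the function names are hypothetical, sites are chosen uniformly at random, and the energy minimisation step is not shown. The 96 T sites and 8 substitutions are the Mordenite numbers quoted later in the deck.

```python
import random
from typing import List, Optional


def substitute_al(n_t_sites: int, n_al: int,
                  seed: Optional[int] = None) -> List[str]:
    """Return the T-site occupants with n_al randomly chosen sites set to 'Al'."""
    if not 0 <= n_al <= n_t_sites:
        raise ValueError("n_al must be between 0 and n_t_sites")
    rng = random.Random(seed)
    occupants = ["Si"] * n_t_sites
    # sample() draws distinct indices, so no site is substituted twice.
    for index in rng.sample(range(n_t_sites), n_al):
        occupants[index] = "Al"
    return occupants


def cations_needed(occupants: List[str]) -> int:
    """One monovalent cation (e.g. Na+) compensates each Al-for-Si substitution."""
    return occupants.count("Al")


config = substitute_al(n_t_sites=96, n_al=8, seed=1)
print(config.count("Al"), "Al sites,", cations_needed(config), "Na+ needed")
```

Each call with a different seed yields one candidate configuration, which would then be passed to the minimiser.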
13
RI Condor Pool
Name                OpSys    Arch   State      Activity  LoadAv  Mem   ActvtyTime
vm1-8@faraday.r     IRIX65   SGI    Owner      Idle      1.192   128   3030102
vm1-14@tyndall.r    IRIX65   SGI    Unclaimed  Idle      0.000   507   0001509
ising2.ri.ac.       LINUX    INTEL  Unclaimed  Idle      0.200   501   ?????
vm1-16@strutt1-4    OSF1     ALPHA  Owner      Idle      1.113   1024  002646
xp2.ri.ac.uk        OSF1     ALPHA  Owner      Idle      1.113   256   49122646
xp3.ri.ac.uk        OSF1     ALPHA  Unclaimed  Idle      0.000   256   0005500
d8.ri.ac.uk         WINNT40  INTEL  Unclaimed  Idle      0.000   255   0020945
ATLANTIC            WINNT51  INTEL  Unclaimed  Idle      0.008   256   0010230
BABBLE.ri.ac.       WINNT51  INTEL  Unclaimed  Idle      0.252   512   0002257
D500.ri.ac.uk       WINNT51  INTEL  Owner      Idle      0.533   254   0052606
PCDAVIDC.ri.a       WINNT51  INTEL  Unclaimed  Idle      0.000   504   0035126
e-sam.ri.ac.u       WINNT51  INTEL  Unclaimed  Idle      0.001   512   0031639
pcalexey.ri.a       WINNT51  INTEL  Unclaimed  Idle      0.002   256   0003553

            Machines  Owner  Claimed  Unclaimed  Matched  Preempting
ALPHA/OSF1  18        1      0        1          0        0

We have set up and tested a Condor pool at the
RI, which has 50 heterogeneous nodes, ranging
from desktop PCs and machines controlling
instruments to the main servers of the DFRL.
14
RI Condor Pool
But where is PC-CRAC???
15
Level of Optimisation
50eV
16
Level of Optimisation
240eV
17
MOR
  • Mordenite
  • 1-dimensional channel system
  • simulation cell contains two unit cells
  • 296 atoms, with 96 Si centres (referred to as T
    sites)
  • substituting at 8 T sites, with 8 Na cations

18
Workflow
MC_subs
Gulp Files
Gulp WinXP
Perl script
MS Excel
SRB
19
Workflow II
C
MC_subs
Si-zeo structure, interatomic potentials, input file
Gulp Files
Batch of labelled Gulp files
Script for automatic batch submission, script for cleaning directories
Gulp WinXP
Perl script
f90
Subset of data in formatted file
Scommands
MS Excel
SRB
20
Condor Stats
Extensive use has been made of Condor pools (UCL:
950 nodes in teaching pools). 150 CPU-years of
previously unused compute resource have been
utilised in this study. Close collaboration with
the NERC e-minerals project has allowed access to
this resource. 150,000 calculations have been
performed, each with a varying number of
particles per simulation box, which means a total
of 75,000,000 particles have been included in our
simulations of Mordenite to date.
21
Condor Specifics
Jobs are submitted in batches of 1,000 because of
stability issues. Shadows are not my game, but it
is a pain when the Condor Master dies because too
many jobs hit the queue (with a guilty feeling,
as the Master was not solely running the pool but
was also being used for science by the pool
administrator). There is a maximum number of jobs
in the queue.
22
Condor Specifics
Handling of data and analysis becomes the
rate-determining step (RDS). However, keeping the
pool full of jobs is also tedious when jobs are
short, which is the ideal for the UCL pool (as
the pool is turned off once a day); hence drip
feeding.
Thought in application design is key: many
applications are TOTALLY unsuitable for the UCL
Condor pool.
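Drip feeding could be sketched along the following lines. This is a minimal illustration, not the scripts used in the study: the batch size of 1,000 comes from the previous slide, while the queue limit of 2,000, the job file names and the assumed 600 completions per polling cycle are all hypothetical. A real script would ask the scheduler for the current queue length instead of simulating it.

```python
from typing import List, Sequence


def next_batch(pending: Sequence[str], queued: int,
               batch_size: int = 1000, max_queued: int = 2000) -> List[str]:
    """Pick the jobs to submit now without exceeding the queue limit."""
    room = max(0, max_queued - queued)
    return list(pending[:min(batch_size, room)])


# Simulated polling loop with hypothetical job names.
pending = [f"job_{i:05d}.gin" for i in range(3500)]
queued = 0
cycles = 0
while pending:
    batch = next_batch(pending, queued)
    pending = pending[len(batch):]
    queued += len(batch)
    queued = max(0, queued - 600)  # assumed completions before the next poll
    cycles += 1

print(f"all jobs submitted after {cycles} polling cycles")
```

Keeping the decision in a small pure function makes the limit easy to tune for a given pool without touching the submission loop.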
24
100 Configurations
0
100
20eV
It can be seen that there are two distinct
regions, -12079 eV to -12076 eV and -12075 eV to
-12073 eV, but there is no obvious correlation
between total energy and cell volume.
25
10000 Configurations
0
10000
25eV
However, when 10,000 structures are considered it
is clear that the most stable structures
correspond to cation placements that do not cause
the cell to expand. This requires that the
cations sit in the large channel.
26
10000 Configurations
27
Comparison of Regions
-12079.5eV
-12075.04eV
28
Analysis
MySQL allows input from a text file, a C/C++
program, the mysql command line or a GUI.
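As an illustration of why a relational database helps with the analysis, the sketch below stores one row per configuration and ranks the rows by total energy. It uses Python's built-in sqlite3 module in place of MySQL so that it is self-contained; the table name, column names and the three rows are placeholders invented for the example, not results from the study.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE configuration ("
    " id INTEGER PRIMARY KEY,"
    " total_energy_ev REAL NOT NULL,"
    " cell_volume_a3 REAL NOT NULL)"
)

# Placeholder rows for illustration only; these are not results from the study.
rows = [(1, -12078.2, 5500.0), (2, -12074.1, 5560.0), (3, -12076.9, 5510.0)]
conn.executemany("INSERT INTO configuration VALUES (?, ?, ?)", rows)

# Rank configurations by stability (lowest total energy first).
ranked = conn.execute(
    "SELECT id, total_energy_ev, cell_volume_a3 FROM configuration"
    " ORDER BY total_energy_ev ASC"
).fetchall()
for row in ranked:
    print(row)
```

The same query pattern extends to selecting an energy window or correlating energy with cell volume, which is awkward to do across thousands of separate text output files.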
29
Workflow III
MC_subs
Gulp Files
Gulp WinXP
mysql
db
SRB
30
Building an Ensemble
 
 
31
Validation
Comparison with experiment is very promising,
showing a large difference in the quality of the
fit between the good set and the bad set.
32
Monitor
33
Drip Feeding and Interactive Steering using
Relational Databases
Distributed Computing Portal
User Input Structural model Si/Al, cation types,
H2O etc.
Model/Configuration Generator
Jobs
db
Analysis(geometry, energy, fit)
Steering
db
Improve generation / model strategy
Analysis
db
User Input Diffraction data, chemical analysis,
building units, Si/Al, cation types, H2O etc.
D. Lewis, R. Coates, S. French UCL Chem / RI
34
Workflow IV
The workflow service needs to be exposed to the
outside world as a web service.
SSH
CML
Since we require new WSDL interfaces for each
application, it is a perfect opportunity to employ
a standard representation for chemical
structures. The XML standard in chemistry is CML
(Chemical Markup Language).
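For readers unfamiliar with CML, the sketch below builds a small CML fragment with Python's standard library. It is an illustration only: the helper name and the water geometry are chosen for the example, and the fragment uses just the basic molecule, atomArray and atom elements rather than anything specific to the project's web service interfaces.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

CML_NS = "http://www.xml-cml.org/schema"
ET.register_namespace("", CML_NS)  # serialise CML as the default namespace

Atom = Tuple[str, float, float, float]  # element symbol and x, y, z in angstroms


def to_cml(molecule_id: str, atoms: Iterable[Atom]) -> str:
    """Serialise atoms as a CML <molecule> with an <atomArray>."""
    molecule = ET.Element(f"{{{CML_NS}}}molecule", {"id": molecule_id})
    atom_array = ET.SubElement(molecule, f"{{{CML_NS}}}atomArray")
    for index, (element, x, y, z) in enumerate(atoms, start=1):
        ET.SubElement(atom_array, f"{{{CML_NS}}}atom", {
            "id": f"a{index}",
            "elementType": element,
            "x3": f"{x:.4f}", "y3": f"{y:.4f}", "z3": f"{z:.4f}",
        })
    return ET.tostring(molecule, encoding="unicode")


# Illustrative water geometry, not data from the study.
water = [("O", 0.0, 0.0, 0.0), ("H", 0.9572, 0.0, 0.0), ("H", -0.2400, 0.9266, 0.0)]
cml_text = to_cml("water", water)
print(cml_text)
```

Because the structure travels as XML, the same document can be validated against a schema and passed unchanged between services.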
35
Key Achievement
We are now doing science that was not possible
before the advancements made within e-Science.
37
FER
  • Ferrierite
  • 2-dimensional channel system
  • simulation cell contains 115 atoms
  • substituting at 4 T sites, with 4 Na cations

38
100 Configurations
14eV
Again there are steps in the total energy, and
again there is no correlation with volume for the
low number of configurations.
Only 75 out of 100 configurations optimise.
39
10000 Configurations
15eV
However, this time when 10,000 structures are
considered there are no clear steps in the
volume. The volume still increases with
decreasing stability but this is due to cell
expansion caused by Al to Al interactions.
Only 7500 out of 10000 optimise
40
Comparison of Regions
41
Comparison of Regions
42
MFI
  • ZSM-5
  • 3-dimensional channel system
  • simulation cell contains 292 atoms
  • substituting at 4 sites, with 4 Na cations

43
10000 Configurations
10eV
There is a step in the total energy, but this
time only one, and from then on the trend is
smooth.
44
What Next
Once the lowest-energy positions of Al have been
confirmed, the cation is exchanged for a proton
and the structure is energy minimised again. This
method will allow us to construct realistic
models of low and medium Si/Al zeolites. Such
structures can be used for further simulations
and to aid the interpretation of experimental
data.
45
Solid Solutions
BaTiO3
46
Solid Solutions
BaSrTiO3
47
Solid Solutions
SrTiO3
48
Ongoing and Future Work
  • upload files as part of workflow to SRB
  • generate metadata
  • upload extracted data from files
  • more extensive use of CML

51
Achievements To Date
1. First use of CML schema for defining Web
   Service port types.
2. Calculation of 50,000 configurations of zeolite
   Mordenite (24,000,000 particles) to gain insight
   into structure when a realistic ratio of Al
   substitution is included in the model.
3. Successfully exposed Fortran codes as OGSI Web
   Services: prototype application deployed on 80
   nodes. The prototype computational polymorph
   application is being ported to a larger
   production machine.
4. First use of BPEL standard for orchestrating
   web services in a Grid application.
5. Open Source BPEL implementation in development,
   enabling late binding and dynamic deployment of
   large computational processes.
6. Integration of OGSI and BPEL with Sun Grid
   Engine.
7. Development of Graphic User Interface for
   polymorph application, which connects to a
   relational database via an EJB interface.
8. Infrastructure for metadata and data management.
9. SRB and dataportal are already being used to
   hold datasets and to transfer the data between
   different scientists and computer applications.
10. Implementation of Condor pool at the RI.
52
Polymorph Prediction
  • Different crystal structures of a molecule are
    called polymorphs.
  • Polymorphs may have considerably different
    properties (e.g. bioavailability, solubility,
    morphology).
  • Polymorph prediction is of great importance to
    the pharmaceutical industry, where the discovery
    of a new polymorph during production or storage
    of a drug may be disastrous.

Drug molecules are often flexible, and this makes
the polymorph prediction process more challenging.
53
Polymorph Prediction Workflow
  • For flexible molecules: conformational
    optimisation gives n feasible rigid molecular
    probes representing energetically plausible
    conformers (n = number of conformers).
  • MOLPAK: generation of 6000 densely packed
    crystal structures using a rigid molecular probe
    (repeated n times).
  • DMAREL: lattice energy optimisation.
  • Morphology.
  • Data: unit cell volume, density, lattice energy.
  • A restricted number of structures is selected;
    crystal structures and properties are stored in
    the database.
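The shape of this workflow can be sketched as a generate, optimise, rank and select loop. This is a hedged illustration only: it does not call MOLPAK or DMAREL, the energy function is a toy stand-in with no physical meaning, and all names are hypothetical. Only the figure of 6000 packings per conformer comes from the slide.

```python
from typing import Callable, List, NamedTuple


class CrystalStructure(NamedTuple):
    conformer: int
    packing: int
    lattice_energy: float  # arbitrary units in this sketch


def predict(n_conformers: int, packings_per_conformer: int,
            optimise: Callable[[int, int], float],
            keep: int) -> List[CrystalStructure]:
    """Generate, optimise and rank candidates; return the `keep` lowest in energy."""
    candidates = [
        CrystalStructure(c, p, optimise(c, p))
        for c in range(n_conformers)
        for p in range(packings_per_conformer)
    ]
    return sorted(candidates, key=lambda s: s.lattice_energy)[:keep]


def toy_energy(conformer: int, packing: int) -> float:
    """Toy stand-in for a lattice energy optimisation; not a physical model."""
    return -100.0 + ((conformer * 7 + packing * 13) % 50)


best = predict(n_conformers=3, packings_per_conformer=6000,
               optimise=toy_energy, keep=10)
for structure in best:
    print(structure)
```

With n conformers the optimisation step runs n × 6000 times, and each run is independent of the others, which is why the workflow suits distributed execution.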
54
Storage Resource Broker
Store data files from simulations in the Storage
Resource Broker