Title: Drug Discovery Grid -- A real grid application
1Drug Discovery Grid-- A real grid application
Zhang Wenju, Shen Jianhua
Shanghai Institute of Materia Medica, CAS
Shanghai Jiaotong University
Jiangnan Institute of Computing
The University of Hong Kong
2Agenda
- DDGrid Introduction
- DDGrid Architecture
- DDGrid Application
- DDGrid Demo
3Background
Large-scale High-throughput Virtual Screening
- in Silico
  - The computational analysis of chemical databases to identify compounds appropriate for a given biological receptor
- in Vitro
  - Identification of new compounds showing some activity against a target biological receptor, and the progressive optimization of these leads to yield a compound with improved potency and physicochemical properties in vitro
- in Vivo
  - Eventually, improved efficacy, pharmacokinetics, and toxicological profiles in vivo
4Process of Drug Discovery and Design
[Diagram: drug discovery pipeline]
Entry points: Random Screening (10,000-20,000 compounds); Computer-Aided Drug Design
Stages: Leads and Optimization, Drug Candidate, Pre-clinic, Clinic (phase I, II, III), Market
Stage durations shown: 2-3 years (three times), 3-4 years (once)
- Time: 10-12 years
- Money: several billion dollars
5DDGrid overview
- The Drug Discovery Grid project aims to build a collaboration platform for drug discovery using state-of-the-art grid computing technology.
- The project intends to support large-scale computation-intensive and data-intensive scientific applications in the fields of medicinal chemistry and molecular biology with the help of grid middleware developed by our team.
- A database of over one million compounds with 3-D structures and physicochemical properties is also provided to identify potential drug candidates. Users can also build and maintain their own customized ligand databases to share on this grid platform.
6DDGrid Architecture
7DDGrid Architecture
8DDGrid Architecture
9DDGrid Architecture
10DDGrid Workflow
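[Diagram: three-tier workflow]
Into the Global Server: Job Submit; out of the Global Server: ID and Result Return
Global Server (Monitoring, Work Pool, Resource Management, Assimilation of Results)
Global Server to Slave Server: Job Dispatch (xml); Slave Server to Global Server: Return of Result, New Job Request
Slave Server (Local Resource Management, Monitoring, Local Work Pool, Assimilation of Results)
Slave Server to Computational Client: Job Dispatch; Computational Client to Slave Server: Return of Result, New Job Request
Computational Client (Docking)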
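The tiers on this slide (a global work pool, slave servers with local work pools, and docking clients that return results upward) can be illustrated with a small in-memory sketch. This is not DDGrid's middleware: the class `WorkPool`, the functions `run_slave` and `fake_dock`, the batch size, and the ligand names are all hypothetical, and the "docking score" is a made-up placeholder.

```python
from collections import deque


class WorkPool:
    """A FIFO pool of pending jobs plus a store of assimilated results."""

    def __init__(self, jobs=()):
        self.pending = deque(jobs)
        self.results = {}

    def dispatch(self, n):
        """Hand out up to n pending jobs (job dispatch)."""
        batch = []
        while self.pending and len(batch) < n:
            batch.append(self.pending.popleft())
        return batch

    def assimilate(self, results):
        """Merge returned results into this pool's result store."""
        self.results.update(results)


def run_slave(global_pool, dock, batch_size=4):
    """One slave-server cycle: request a batch, run it, return the results."""
    local_pool = WorkPool(global_pool.dispatch(batch_size))  # new job request
    while local_pool.pending:
        job = local_pool.dispatch(1)[0]             # dispatch to a client
        local_pool.assimilate({job: dock(job)})     # client returns its result
    global_pool.assimilate(local_pool.results)      # return results upward


def fake_dock(ligand_id):
    # Hypothetical stand-in for a docking run; the score is not real data.
    return -float(len(ligand_id))


global_pool = WorkPool(f"ligand{i}" for i in range(10))
while global_pool.pending:
    run_slave(global_pool, fake_dock)
print(len(global_pool.results), "results assimilated")
```

In a real deployment each tier would run on a different machine and exchange messages over the network; the sketch only shows the pull-based flow of jobs down and results up.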
11DDGrid security
1. PKI-based security
2. All the sites involved should hold a certificate issued by our CA
3. All the databases deployed and results are encrypted
4. All message passing is SSL/TLS-enabled
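Points 2 and 4 together describe mutually authenticated TLS: a server accepts a connection only from a peer whose certificate chains to the project's CA. The slides do not show how DDGrid configures this, so the following is only a sketch using Python's standard `ssl` module; the function name `make_server_context` and the file-path parameters are hypothetical.

```python
import ssl


def make_server_context(ca_file=None, cert_file=None, key_file=None):
    """Build a server-side TLS context that requires a client certificate.

    The paths are placeholders for a site's own CA bundle, certificate
    and private key; nothing here is taken from the DDGrid code base.
    """
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED  # refuse peers without a valid certificate
    if ca_file:
        ctx.load_verify_locations(cafile=ca_file)  # add the grid CA as a trust anchor
    if cert_file:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


ctx = make_server_context()  # no files loaded here; a real site would pass its paths
print(ctx.verify_mode)
```

A listening socket wrapped with such a context would reject any client that cannot present a certificate signed by a CA loaded through `ca_file`.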
12DDGrid Web Portal
13Test Case 1
- Virtual Screening from 20,000 compounds
- Involved Sites
  - Shanghai Inst. of M. M. (SIMM): Alpha Cluster (32 CPU)
  - Beijing Mol. Ltd.: Sunway Cluster (224 CPU)
  - The Univ. of Hong Kong: Gideon Cluster (16 CPU)
  - Shanghai SuperComp. Centre: Dawning 4000A
  - Dalian Univ. of Tech.: Dawning 4000A
  - London e-Science Centre: Mars Cluster
- Time consumed
  - 5946 sec (approx. 99 min)
- Data Sets (CDB)
  - Specs
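The two figures reported for this test case (20,000 compounds, 5946 seconds) give an aggregate wall-clock throughput across all sites. The arithmetic below uses only those two numbers; per-CPU figures cannot be derived because the slide does not give CPU counts for every site.

```python
compounds = 20_000   # compounds screened (from the slide)
elapsed_s = 5946     # wall-clock seconds (from the slide)

minutes = elapsed_s / 60
throughput = compounds / elapsed_s      # compounds per second, all sites combined
per_compound = elapsed_s / compounds    # wall-clock seconds per compound

print(f"{minutes:.1f} min, {throughput:.2f} compounds/s, {per_compound:.3f} s per compound")
# prints: 99.1 min, 3.36 compounds/s, 0.297 s per compound
```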
14Job scheduling
15Visualisation of Docking Result
16DDGrid message passing
<scheduler_request>
  <authenticator>3333</authenticator>
  <hostid>102</hostid>
  <rpc_seqno>2401</rpc_seqno>
  <platform_name>i686-pc-linux-gnu</platform_name>
  <core_client_major_version>2</core_client_major_version>
  <core_client_minor_version>19</core_client_minor_version>
  <idle_ncpu>16</idle_ncpu>
  <project_disk_usage>5315768.000000</project_disk_usage>
  <total_disk_usage>68417940.000000</total_disk_usage>
  <code_sign_key> </code_sign_key>
  <projects>
    <project>
      <master_url>http://www.ddgrid.ac.cn/ddg/</master_url>
      <resource_share>100.000000</resource_share>
    </project>
  </projects>
  <result> </result>
  <host_info> </host_info>
</scheduler_request>
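A scheduler request of this shape is ordinary XML, so a server could read the fields it needs for dispatch decisions (host, platform, idle CPUs) with a standard parser. The sketch below uses Python's `xml.etree.ElementTree` on a shortened copy of the slide's message; the function name `parse_request` and the returned dictionary layout are hypothetical and say nothing about how the DDGrid server is implemented.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the request shown on the slide.
REQUEST = """<scheduler_request>
  <authenticator>3333</authenticator>
  <hostid>102</hostid>
  <platform_name>i686-pc-linux-gnu</platform_name>
  <idle_ncpu>16</idle_ncpu>
  <projects>
    <project>
      <master_url>http://www.ddgrid.ac.cn/ddg/</master_url>
      <resource_share>100.000000</resource_share>
    </project>
  </projects>
</scheduler_request>"""


def parse_request(text):
    """Extract the fields a scheduler would likely need from a request."""
    root = ET.fromstring(text)
    return {
        "hostid": int(root.findtext("hostid")),
        "platform": root.findtext("platform_name"),
        "idle_ncpu": int(root.findtext("idle_ncpu")),
        "projects": [p.findtext("master_url") for p in root.iter("project")],
    }


info = parse_request(REQUEST)
print(info)
```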
17DDGrid message passing
<scheduler_reply>
  <message priority="low">No work available</message>
  <project_name>Ddg</project_name>
  <user_name>sss</user_name>
  <code_sign_key> </code_sign_key>
  <workunit> </workunit>
  <preferences>
    <low_water_days>1.2</low_water_days>
    <high_water_days>2.5</high_water_days>
    <disk_max_used_gb>0.4</disk_max_used_gb>
    <disk_max_used_pct>50</disk_max_used_pct>
    <disk_min_free_gb>0.4</disk_min_free_gb>
  </preferences>
</scheduler_reply>
18DDGrid message passing
<workunit>
  <file_info> <number>0</number> </file_info>
  <file_info> <number>1</number> </file_info>
  <file_info> <number>2</number> </file_info>
  <file_ref>
    <file_number>0</file_number>
    <open_name>tabfile</open_name>
  </file_ref>
  <file_ref>
    <file_number>1</file_number>
    <open_name>infile</open_name>
  </file_ref>
  <file_ref>
    <file_number>2</file_number>
    <open_name>sphfile</open_name>
  </file_ref>
  <command_line>-business</command_line>
</workunit>
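In the workunit message, each `file_ref` points at a numbered `file_info` entry and gives the name under which the application opens that file (`tabfile`, `infile`, `sphfile`). A client could resolve and validate those references as sketched below. The helper name `open_names` and the error handling are hypothetical; only the element names and values come from the slide.

```python
import xml.etree.ElementTree as ET

# Copy of the workunit shown on the slide.
WORKUNIT = """<workunit>
  <file_info><number>0</number></file_info>
  <file_info><number>1</number></file_info>
  <file_info><number>2</number></file_info>
  <file_ref><file_number>0</file_number><open_name>tabfile</open_name></file_ref>
  <file_ref><file_number>1</file_number><open_name>infile</open_name></file_ref>
  <file_ref><file_number>2</file_number><open_name>sphfile</open_name></file_ref>
  <command_line>-business</command_line>
</workunit>"""


def open_names(text):
    """Map each open_name to its file number, checking that the file is declared."""
    root = ET.fromstring(text)
    declared = {int(fi.findtext("number")) for fi in root.findall("file_info")}
    mapping = {}
    for ref in root.findall("file_ref"):
        number = int(ref.findtext("file_number"))
        if number not in declared:
            raise ValueError(f"file_ref points at undeclared file {number}")
        mapping[ref.findtext("open_name")] = number
    return mapping, root.findtext("command_line")


names, command_line = open_names(WORKUNIT)
print(names, command_line)
```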
19DDGrid message passing
<project>
  <scheduler_url>http://www.ddgrid.ac.cn/ddg_cgi/cgi</scheduler_url>
  <master_url>http://www.ddgrid.ac.cn/ddg/</master_url>
  <project_name>Ddg</project_name>
</project>
<app>
  <name>gridapp</name>
</app>
<file_info>
  <name>gridapp/gridapp_2.19_i686-pc-linux-gnu</name>
  <nbytes>260754.000000</nbytes>
  <max_nbytes>0.000000</max_nbytes>
  <executable/>
  <signature_required/>
  <file_signature> </file_signature>
  <url>http://www.ddgrid.ac.cn/ddg/download/gridapp_2.19_i686-pc-linux-gnu</url>
</file_info>
<file_info> </file_info>
20DDGrid Resources
Computational and Data Resources Integration
Resources aggregated:
- SIMM: Sunway 32A Cluster
- Beijing Molecule Inc.: Sunway 256P Cluster
- HKU: Gideon 300 Cluster
- SSC: Dawning 4000A
- LeSC: Mars Cluster (test only)
- Singapore Poly-tech Univ.
- Dalian Univ. of Technology
- Shanghai Jiaotong Univ.
Heterogeneous resources:
- OS: IRIX, Digital Unix, Linux (IA32, x86_64)
- CPU: R12000, Alpha, Pentium, AMD
21DDGrid Resources
- DDGrid Apps.
  1. Docking pre-process software: Combimark
  2. Docking software: Dock (UCSF), gsDock (SIMM)
  3. CDB build and maintain S/W: Combilib
  4. AutoDock
  5. AutoGrid
  6. Visualisation
  7. Security-related tools
22DDGrid Resources
Chemical Databases (CDB)
Each ligand record in a chemical database represents the 3D structural information of a compound. The number of compounds in each CDB can be in the order of tens of thousands, and the database size can be anywhere from tens of megabytes to gigabytes and even terabytes.
1. Static databases purchased from commercial chemical companies: Available Chemicals Directory (ACD), Chinese natural product database (CNPD), SPECS database, chemical ADME/T database, etc.
2. Dynamic databases built by users themselves and deployed automatically.
23Deployed commercial CDB (approx. 700,000)
Name of Database    Description
Specs               Provides about 230,000 compounds
CMC-3D              Provides 3D models and important biochemical properties (including drug class, logP, and pKa values) for over 8,400 pharmaceutical compounds
ACD-3D              Provides 200,000 commercially available compounds with 3D structures
NCI-3D              213,000 compounds with 2D information from the National Cancer Institute
CNPD                12,000 Chinese natural products with chemical structures
TCMD                9,127 compounds and 3,922 herbs
24appr. 3,300,000 compounds
Vendor Num. of Mol. Vendor Num. of Mol.
ACB-Eurochem 98603 Maybridge 53042
Ambinter 533866 Nanosyn 68317
Asinex 293385 National Cancer Institute 223536
ChemBridge 562624 Otava 181195
ChemDiv 361859 Peakdale 9632
ComGenex 38590 Pharmeks 116355
Enamine 533111 PubChem 164031
IBScreen 452728 Ryan Scientific 64205
InterChim 288882 Sigma-Aldrich 49022
KeyOrganics 22294 Specs 307550
Life Chemicals 44762 TimTec 127173
25CDB example: CNPD - China Natural Products Database
26CDB example: CNPD
CNPD: The first and only comprehensive source of chemical, structural and bibliographic data on all known natural products in China.
CNPD serves as an information source for chemical, physical and biological properties and literature, which makes it useful to scientists within the pharmaceutical industry.
CNPD can be searched in flexible ways: structure, sub-structure, name, molecular formula, molecular weight, CAS registry number, category, etc.
CNPD: Traditional Chinese Medicine (TCM) applications are pre-indexed in CNPD to provide hints for lead compound discovery.
27CDB example: CNPD
28CDB example: TCMD
TCMD - Traditional Chinese Medicine Database
TCMD is a bibliographical database of approximately 20,000 records with abstracts of TCM articles. Relevant articles are selected from among 150-200 journals from Mainland China, Taiwan, and Hong Kong (most of them in Chinese). English abstracts are written for the selected articles, and other pertinent information is translated into English.
29CDB example: TCMD
30DDGrid applications in reality
- SIMM carried out anti-SARS and anti-diabetes drug research using the DDGrid
  - Anti-SARS drug research
  - Anti-diabetes drug research
31Research on Anti-SARS medicine
Virtual screening of the Comprehensive Medicinal Chemistry-3D (CMC-3D) database, which contains 7,900 compounds, found that cinanserin has a distinct anti-SARS effect.
Department of Virology, Bernhard-Nocht-Institute for Tropical Medicine, Germany
Research Department, Cantonal Hospital St Gallen, Switzerland
"Basically your inhibitor turned out to be the best compound we have tested so far!"
Have applied for domestic patent 03129071.x and PCT patent pi034248.
32Research on anti-diabetes medicine
Found an anti-diabetes lead better than Rosiglitazone by targeting PPAR, through virtual screening, optimization design, synthesis, and biological and pharmacological testing.
CADD process
33Research on anti-diabetes medicine
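[Figure: screening funnel for the anti-diabetes project]
Stages shown, in order: composite design, virtual screening (two rounds), manual screening, synthesis, protein testing (two rounds), cell testing, animal testing, comprehensive evaluation
Counts shown, in order: 2.4 m, 400 t, 10 t, 500, 85, 142, 76, 48, 8, 4, 1
Binding thresholds shown: 48 with KD<1 mM, 22 with KD<0.1 mM; KD<100 mM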
34New anti-diabetes drug
Current Progress
1. Applied for patent 200410016460.X and a PCT patent
2. Safety testing and pre-clinic research
35What does the DDGrid provide?
1. Drug Design Collaboration Platform: large-scale virtual screening platform; sharing large CDB
2. Computational Resources Sharing: SIMM/SSC/HKU/Mol. Ltd/SJTU/DUT
3. Data Resources Sharing: pre-deployed commercial CDB (ACD/CNPD); sharing self-made CDB
4. Medicinal chemistry text and structure search
5. Customization and Extension
36Collaboration
Selected Users of DDGrid
37DDGrid Demo
Demo
http://www.ddgrid.ac.cn
38Demo
39Demo
40Demo
41Demo
42Demo
43Thank you!
Q&A