D2OL and Community TSC

Provided by: jwhe6
Transcript and Presenter's Notes

1
D2OL and Community TSC
  • Informatics Staff
  • Wolfgang Hinz
  • Jian-Quan Chen
  • Ken Weilbacher
  • Daniel Yu

2
Presentation Outline
  • Introduction to D2OL
  • AutoDock 3.0
  • Community TSC
  • D2OL SARS, Bioterrorism
  • Libraries
  • Results
  • The Project in Numbers
  • D2OL Version 2.0

3
Introduction to D2OL
  • D2OL is a distributed computing platform for
    Flexible Ligand Docking on PCs

  • Designed to harness underutilized computing
    power for computationally demanding work
  • Allows the fast and efficient screening of
    millions of compounds against hundreds of
    targets
  • Allows quick optimization of parameters based
    on large data sets

4
AutoDock 3.0
  • D2OL is based on AutoDock 3.0
  • Two broad categories for automated docking
  • Matching methods: create a model of the active
    site, typically including sites of hydrogen
    bonding and steric accessibility, then attempt to
    fit the ligand as a rigid body to the structure
    of the active site by matching its geometry
    (e.g. DOCK)
  • Docking simulation: the ligand begins at a random
    position outside the pocket, and the program
    explores translations, orientations and
    conformations until an ideal state is found
    (e.g. AutoDock)
  • Disadvantages of Docking Simulations
  • Generally slower
  • Advantages of Docking Simulations
  • Utilize more detailed molecular mechanics to
    calculate the energy of ligand in the context of
    the putative active/binding site
  • Molecular docking is a difficult optimization
    problem, requiring efficient sampling across the
    entire range of positional, orientational, and
    conformational possibilities.

5
AutoDock 3.0
  • Hybrid Search Methods in AutoDock
  • Genetic algorithms for global search aspects
  • Local search methods for refined search aspects
  • A combination of both methods is used, and is
    generally referred to as a Lamarckian genetic
    algorithm
  • This method applies a Lamarckian model of
    genetics, in which environmental adaptations of
    an individual's phenotype are reverse-transcribed
    into its genotype and become inheritable traits.
  • Chromosome: a string of real-valued genes
  • Three Cartesian coordinates for ligand
    translation
  • Four variables defining a quaternion for ligand
    orientation
  • One real value for each ligand torsion (defined
    by the torsion tree, which is generated by
    AutoTors)
  • There is a one-to-one mapping from the ligand's
    state variables to the genes of the individual's
    chromosome.
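
The one-to-one mapping between state variables and genes can be illustrated with a minimal Python sketch. This is not AutoDock's code: the names `LigandState`, `encode`, `decode` and `random_state`, the search-box size, and the choice to store the orientation as a unit quaternion are all assumptions made for illustration.

```python
import math
import random
from dataclasses import dataclass
from typing import List


@dataclass
class LigandState:
    """State variables of one ligand pose (field names are illustrative)."""
    translation: List[float]  # x, y, z
    quaternion: List[float]   # four values defining the orientation
    torsions: List[float]     # one angle in radians per rotatable bond


def normalize(q: List[float]) -> List[float]:
    """Scale a 4-vector to unit length so it encodes a pure rotation."""
    norm = math.sqrt(sum(c * c for c in q))
    return [c / norm for c in q]


def encode(state: LigandState) -> List[float]:
    """Flatten the state into a chromosome of 3 + 4 + n_torsions genes."""
    return state.translation + state.quaternion + state.torsions


def decode(chromosome: List[float]) -> LigandState:
    """Inverse of encode: gene i always maps to the same state variable."""
    return LigandState(
        translation=chromosome[0:3],
        quaternion=chromosome[3:7],
        torsions=chromosome[7:],
    )


def random_state(n_torsions: int, box: float = 10.0) -> LigandState:
    """Random starting pose inside a cube of half-width `box` (assumed size)."""
    return LigandState(
        translation=[random.uniform(-box, box) for _ in range(3)],
        quaternion=normalize([random.gauss(0.0, 1.0) for _ in range(4)]),
        torsions=[random.uniform(-math.pi, math.pi) for _ in range(n_torsions)],
    )
```

A ligand with five torsions therefore gives a chromosome of 12 genes, and `decode(encode(state))` returns the original state.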

6
AutoDock 3.0
  • A generation consists of the following steps:
  • Mapping and fitness evaluation
  • Selection (which individuals will reproduce)
  • Crossover (performed on random members,
    user-defined rates, two-point crossover with
    breaks between genes)
  • Mutation (addition of a random real number with
    Cauchy distribution)
  • Elitism (user-defined number of top candidates
    that automatically make it into the next
    generation)
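
The steps on this slide can be sketched as one generation of a plain genetic algorithm. This is a minimal sketch, not AutoDock's implementation: binary tournament selection, the default rates, and the `energy` callback are assumptions chosen for illustration, and the Lamarckian local search step is left out.

```python
import math
import random
from typing import Callable, List, Tuple

Chromosome = List[float]


def cauchy(scale: float = 1.0) -> float:
    """Draw from a Cauchy distribution using the inverse CDF."""
    return scale * math.tan(math.pi * (random.random() - 0.5))


def two_point_crossover(a: Chromosome, b: Chromosome) -> Tuple[Chromosome, Chromosome]:
    """Swap the segment between two cut points; cuts fall between genes."""
    i, j = sorted(random.sample(range(1, len(a)), 2))
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]


def mutate(c: Chromosome, rate: float, scale: float) -> Chromosome:
    """Add a Cauchy-distributed random number to each gene with probability `rate`."""
    return [g + cauchy(scale) if random.random() < rate else g for g in c]


def next_generation(
    population: List[Chromosome],
    energy: Callable[[Chromosome], float],
    crossover_rate: float = 0.8,   # illustrative value
    mutation_rate: float = 0.02,   # illustrative value
    mutation_scale: float = 1.0,   # illustrative value
    elitism: int = 1,
) -> List[Chromosome]:
    # Mapping and fitness evaluation: lower energy means fitter.
    ranked = sorted(population, key=energy)
    # Elitism: the top candidates pass into the next generation unchanged.
    elite = [list(c) for c in ranked[:elitism]]

    # Selection: binary tournament (an assumption, not AutoDock's scheme).
    def select() -> Chromosome:
        a, b = random.sample(ranked, 2)
        return a if energy(a) <= energy(b) else b

    children: List[Chromosome] = []
    n_children = len(population) - elitism
    while len(children) < n_children:
        p1, p2 = select(), select()
        if random.random() < crossover_rate:
            c1, c2 = two_point_crossover(p1, p2)
        else:
            c1, c2 = list(p1), list(p2)
        children.append(mutate(c1, mutation_rate, mutation_scale))
        children.append(mutate(c2, mutation_rate, mutation_scale))
    return elite + children[:n_children]
```

Because the elite individuals are copied unchanged, the best energy in the population never gets worse from one generation to the next.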

7
Community TSC
  • Potential drug targets in PI3K-Akt-Tuberin
    Pathway
  • 4 Targets have been identified so far
  • PTEN
  • FRAP (mTOR)
  • EIF4E
  • PI3K
  • A fifth target is being prepared (Akt)
  • The area of interest for small molecule docking
    is defined by active sites identified for each
    target.

8
D2OL SARS, Bioterrorism
  • Targets originate from potential biological
    warfare agents and include
  • Anthrax Lethal Factor
  • Smallpox Topoisomerase
  • Ebola
  • The recent outbreak of SARS gave an opportunity
    to help with the development of antiviral agents
  • The first target released was the TGEV (pig
    coronavirus) main protease
  • The second target released was a homology
    structure derived from the TGEV and human
    coronavirus main proteases
  • The third and fourth targets, to be released
    next week, will be the human coronavirus main
    protease and the rhinovirus main protease.

9
Libraries
  • Libraries are prioritized according to
    accessibility
  • Immediately accessible to TRI are
  • NCI Open compound library
  • ICCB Commercial Libraries
  • Recipe for Library preparation
  • 2D to 3D conversion
  • Removal of counter ions
  • Molecular weight filter (200-700)
  • Removal of compounds with elements other than C,
    N, S, O, H, P, F, Cl, Br, I
  • Removal of compounds that do not contain H and C
  • Add partial charges using AutoDock tools
  • Convert files from mol2 to pdbq (AutoTors)
  • Remove compounds that have more than 16
    rotatable bonds
  • Generate docking and grid parameter files
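
The filtering steps of this recipe can be sketched as a single predicate over a compound record. This is a minimal sketch under stated assumptions: the `Compound` record and its fields are hypothetical, the molecular weight bounds are treated as inclusive, and "do not contain H and C" is read as requiring both elements. The 2D to 3D conversion, counter-ion removal, charge assignment and file conversion are done by external tools and are not shown.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Values taken from the recipe above.
ALLOWED_ELEMENTS = {"C", "N", "S", "O", "H", "P", "F", "Cl", "Br", "I"}
MW_MIN, MW_MAX = 200.0, 700.0
MAX_ROTATABLE_BONDS = 16


@dataclass
class Compound:
    """Hypothetical record holding precomputed properties of one compound."""
    name: str
    element_counts: Dict[str, int]  # e.g. {"C": 20, "H": 25, "N": 3}
    molecular_weight: float
    rotatable_bonds: int


def passes_filters(c: Compound) -> bool:
    """Apply the weight, element and rotatable-bond filters."""
    elements = {e for e, n in c.element_counts.items() if n > 0}
    return (
        MW_MIN <= c.molecular_weight <= MW_MAX       # molecular weight filter
        and elements <= ALLOWED_ELEMENTS             # no other elements
        and {"C", "H"} <= elements                   # must contain C and H
        and c.rotatable_bonds <= MAX_ROTATABLE_BONDS
    )


def filter_library(compounds: Iterable[Compound]) -> List[Compound]:
    """Keep only the compounds that pass every filter."""
    return [c for c in compounds if passes_filters(c)]
```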

10
Results
  • Target: PI3K (Community TSC)
  • Candidates: 22,000 NCI compounds
  • 6 Control compounds (known inhibitors)

11
Results
  • The final model used in the evaluation of the
    docking energy in AutoDock 3.0 has a residual
    standard error of 9.11 kJ mol⁻¹ (2.177 kcal mol⁻¹)
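
The two values quoted above agree under the thermochemical calorie (1 kcal = 4.184 kJ). A quick check in Python, where the function name `kj_to_kcal` is only illustrative:

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie


def kj_to_kcal(value_kj: float) -> float:
    """Convert an energy from kJ/mol to kcal/mol."""
    return value_kj / KJ_PER_KCAL


print(f"{kj_to_kcal(9.11):.3f} kcal/mol")  # prints "2.177 kcal/mol"
```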

12
The Project in Numbers
  • Community TSC
  • 15,000 registered users
  • Daily average of 50 new registrations (peak
    125/day)
  • 1,500 nodes on-line per day (peak 2400)
  • 12,000 downloads in total (this figure is lower
    due to mirror sites)
  • Daily average of 75 new downloads
  • 500,000 candidates processed per week (all
    targets)
  • 4 Targets deployed
  • NCI Library of 180,000 deployed for all targets
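
As a rough back-of-the-envelope reading of these figures, the sketch below estimates how long one pass of the library over all targets would take and how many candidates an average node handles per day. It assumes that each compound is docked once per target and that the weekly rate and daily node count are steady averages; neither assumption is stated on the slide, and the function names are illustrative.

```python
def weeks_for_one_pass(library_size: int, n_targets: int, per_week: int) -> float:
    """Weeks needed to dock every compound once against every target."""
    return library_size * n_targets / per_week


def candidates_per_node_per_day(per_week: int, nodes_per_day: int) -> float:
    """Average number of candidates one on-line node processes per day."""
    return per_week / (nodes_per_day * 7)


# Community TSC figures from the slide above.
print(weeks_for_one_pass(180_000, 4, 500_000))       # about 1.44 weeks
print(candidates_per_node_per_day(500_000, 1_500))   # about 47.6 per node per day
```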

13
The Project in Numbers
  • The distribution of Operating Systems
  • (Community TSC)
  • For 112,011 docking results returned
  • Windows 107,930
  • MacOSX 515
  • Linux 3,566
  • Solaris 0

[Chart: percentage of docking results by operating system]
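
The percentages shown in the chart can be recomputed from the counts on this slide. A minimal sketch, with variable names chosen for illustration:

```python
# Docking results returned per operating system (Community TSC).
results_by_os = {
    "Windows": 107_930,
    "MacOSX": 515,
    "Linux": 3_566,
    "Solaris": 0,
}

total = sum(results_by_os.values())  # matches the 112,011 results quoted
shares = {name: 100.0 * n / total for name, n in results_by_os.items()}

for name, share in shares.items():
    print(f"{name:8s} {share:6.2f} %")
```

Windows accounts for roughly 96.4 % of the returned results, Linux for about 3.2 %, and MacOSX for under 0.5 %.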
14
The Project in Numbers
  • D2OL SARS/Bioterrorism
  • 40,000 registered users (24,000 have registered
    since release of SARS candidates)
  • 1,000 new registrations per day (peak 3,000)
  • 4,000 users on-line per day (700 before SARS)
  • Peak values 5,500 (1,100 before SARS)
  • 85,000 downloads (57,000 since release of SARS)
  • 1,700 downloads per day (peak once at 8,300 and
    twice at 6,000)
  • 900,000 candidates processed per week (all
    targets)
  • 5 Targets deployed
  • NCI Library of 180,000 deployed for all targets

15
D2OL Version 2.0
  • Completely redesigned system
  • The new design is intended to remove all
    possible software-related bottlenecks. The
    scalability of the project will depend only on
    the number of servers and the bandwidth available
    to the project.
  • Bandwidth requirements will be further scaled
    down by preprocessing results on the client.
  • Improved monitoring capabilities.
  • Improved statistics.
  • Results analysis system (TRI Development).
  • Improved documentation of code and system
    architecture.
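
The slide does not say how results are preprocessed on the client. One way such preprocessing could reduce bandwidth is to upload only the best few docking runs per candidate instead of every run. The sketch below illustrates that idea only; the `DockingRun` record, the `summarize_for_upload` function and the `keep` parameter are hypothetical and do not describe the D2OL 2.0 client.

```python
import heapq
from typing import Iterable, List, NamedTuple


class DockingRun(NamedTuple):
    """Hypothetical record for one docking run of one candidate."""
    run_index: int
    energy: float  # docking energy; lower is better


def summarize_for_upload(runs: Iterable[DockingRun], keep: int = 3) -> List[DockingRun]:
    """Return the `keep` lowest-energy runs, best first."""
    return heapq.nsmallest(keep, runs, key=lambda r: r.energy)
```

With ten runs per candidate and `keep=3`, the client would send three records rather than ten.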

16
Acknowledgements
  • Sengent
  • Scott McFarland
  • Anthony Glaviano
  • Antonio Pila
  • Gioel Molinari
  • The Rothberg Institute
  • Bonnie Rothberg
  • Jonathan Rothberg
  • Yale University
  • Tian Xu

17
An Environment for Creativity
18
Chemistry
  • Wolfgang Hinz

19
Chemistry - Lead Compound
  • Lead obtained from primary screen of the
    Microsource library.
  • Lead confirmed by several secondary screens
    including dose response

20
Chemistry - Analogues
  • Three analogues were synthesized to explore the
    activity of puromycin.

21
Chemistry - Analogues
22
Chemistry - Analogues
23
An Environment for Creativity