A Genetic Algorithm for Flexible Molecular Overlay and Pharmacophore Elucidation

1
A Genetic Algorithm for Flexible Molecular
Overlay and Pharmacophore Elucidation
  • Gareth Jones
  • OpenEye Cup Feb 22, 2005, Santa Fe NM

2
What is a GA?
  • A Genetic Algorithm is based on the process of
    Darwinian Evolution and is good for complex
    search problems.
  • A Chromosome is an encoded representation of a
    problem, typically an integer or binary string.
  • A chromosome encodes the conformation of each
    active and the superimposition of the set.
  • Each chromosome can be assigned a fitness value,
    which is a measure of how good a solution the
    chromosome is.
  • A GA starts with a random population of
    chromosomes.
  • Genetic operators such as crossover or mutation
    are repeatedly applied to the population.
  • Over time the average fitness of the population
    increases.

3
Overview of the GA
4
Roulette Wheel Parent Selection.
  • Each member of the population gets a slice of the
    roulette wheel that is proportional to its
    (ranked) fitness, i.e. selection is biased
    towards fitter individuals.
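
A minimal sketch of rank-based roulette wheel selection as described on this slide. The function name `rank_roulette_select` and the linear weighting (slice width equal to rank) are assumptions for illustration; the slides do not say how ranks are converted to slice sizes.

```python
import random

def rank_roulette_select(fitnesses, rng=random):
    """Return the index of one parent chosen by a rank-proportional wheel."""
    n = len(fitnesses)
    # order[0] is the index of the least fit individual, order[-1] the fittest.
    order = sorted(range(n), key=lambda i: fitnesses[i])
    # Assumed weighting: slice width equals rank (1..n), so the fittest
    # individual gets the widest slice.
    weights = list(range(1, n + 1))
    spin = rng.uniform(0, sum(weights))
    cumulative = 0
    for rank_pos, idx in enumerate(order):
        cumulative += weights[rank_pos]
        if spin <= cumulative:
            return idx
    return order[-1]  # guard against floating-point edge cases
```

Because slices depend on rank rather than raw fitness, a single very high score cannot dominate the wheel.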

5
Genetic Operators
  • Mutation. Randomly choose some portion of the
    chromosome string and randomly alter it.
  • Crossover. Choose a random cross point after
    which the string is copied from the other parent.
  • Mixing. Crossover for sparse strings
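
The mutation and crossover operators can be sketched on a plain bit list. This is a simplified illustration, not the GAPE code: `mutate` flips a single bit rather than altering a longer portion of the string, and both function names are hypothetical.

```python
import random

def mutate(bits, rng=random):
    """Flip one randomly chosen bit (a deliberately simple mutation)."""
    child = list(bits)
    pos = rng.randrange(len(child))
    child[pos] = 1 - child[pos]
    return child

def crossover(parent_a, parent_b, rng=random):
    """One-point crossover: copy one parent up to a random cross point,
    then copy the rest of the string from the other parent."""
    point = rng.randrange(1, len(parent_a))
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b
```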

6
Genetic Algorithm for Pharmacophore Elucidation
  • GAPE
  • Based on GASP
      • Distributed by Tripos Associates
      • G. Jones, P. Willett and R. C. Glen, Journal of
        Computer-Aided Molecular Design 9 (1995) p532
  • Incorporates ideas from GOLD
      • Distributed by CCDC
      • G. Jones, P. Willett, R. C. Glen, A. R. Leach and
        R. Taylor, Journal of Molecular Biology 267
        (1997) p727
  • And new stuff
  • Java code base (some JNI)

7
Molecular Overlay and Pharmacophore elucidation
  • The problem is to overlay a series of actives
    (without any prior knowledge of the
    pharmacophore) such that common molecule-receptor
    interactions can be identified.
  • We use a GA to search simultaneously the
    conformational space of a series of actives as
    well as the possible superimpositions of each
    conformation.

8
Design of the Algorithm
  • Identify important features in each molecule and
    label them.
  • GA chromosome contains conformational information
    in binary bitstrings and encodes mappings as
    integer strings.
  • When decoding select one molecule as the base
    molecule (the one with the most features) and use
    GA mappings to fit other molecules on top of the
    base molecule.
  • Fitness function comprises a score for the
    overlay of similar features, internal steric
    energies and common molecular volume.
  • Similarity scores based on hydrogen bond
    distributions (Mills and Dean).

9
Chromosome Encoding
  • An overlay of N molecules requires a chromosome
    of 2N-1 strings.
  • N binary strings encode conformational
    information, one string for each molecule. Each
    byte encodes an angle of rotation about one
    rotatable bond, and bits are used to flip free
    corners.
  • N-1 integer strings map each of the other
    molecules onto the base molecule.
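
A small sketch of the layout described above. It assumes a plain (non-Gray) binary code and a mapping of each byte value 0..255 onto the range [0, 360) degrees; neither detail is stated on the slide, and the function names are hypothetical.

```python
def string_count(n_molecules):
    """N binary strings plus N-1 integer strings gives 2N-1 in total."""
    return 2 * n_molecules - 1

def decode_torsions(bitstring):
    """Split a binary string into bytes and map each byte to an angle.

    Assumption: byte value v (0..255) maps linearly to v * 360 / 256 degrees.
    """
    if len(bitstring) % 8 != 0:
        raise ValueError("bitstring length must be a multiple of 8")
    angles = []
    for start in range(0, len(bitstring), 8):
        byte = bitstring[start:start + 8]
        value = int("".join(str(b) for b in byte), 2)
        angles.append(value * 360.0 / 256.0)
    return angles
```

For example, a molecule with two rotatable bonds would use a 16-bit string, which decodes to two torsion angles.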

10
Chromosome Decoding
  • The N binary strings are decoded to produce
    molecular conformations for each molecule.
  • Each of the N-1 integer strings is decoded to
    give a mapping onto the base molecule.
  • Two passes of least-squares fitting are used to
    overlay molecules on top of the base molecule.
    Fitting is applied so that functional groups are
    positioned to interact with the same point in the
    receptor. The chromosome is then rebuilt.

11
GA Mapping
12
Scoring Function
13
Geometric Weight
  • R (fitting radius) is annealed from 3.5 to 1.5
  • Additional geometric constraints (linear drop-off
    between maximum and minimum angles)
  • Acceptor geometries: LP, Plane, Cone, None
  • Donor, fitting point, donor angle.
  • Desolvation correction

[Figure: geometry diagram with labels H1, H2, D1, D2, r12 and R]
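
The annealed fitting radius and the linear angle drop-off from this slide can be sketched as follows. The linear annealing schedule, the interpretation of the angle as a deviation that is penalized as it grows, and both function names are assumptions; the slide gives only the start and end radii (3.5 and 1.5) and states that the drop-off is linear.

```python
def annealed_radius(step, n_steps, start=3.5, end=1.5):
    """Fitting radius R at a given step, assuming a linear schedule."""
    if n_steps < 2:
        return end
    frac = step / (n_steps - 1)
    return start + (end - start) * frac

def angle_weight(angle, min_angle, max_angle):
    """Full weight up to min_angle, zero from max_angle, linear in between."""
    if angle <= min_angle:
        return 1.0
    if angle >= max_angle:
        return 0.0
    return (max_angle - angle) / (max_angle - min_angle)
```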
14
Donor/Acceptor Weights
  • J.E.J Mills and P.M. Dean, Three-Dimensional
    Hydrogen-bond Geometry and Probability
    Information from a Crystal Survey, Journal of
    Computer-Aided Molecular Design 10 (1996)
    pp.607-622.
  • Ionization term

15
Other Terms
  • Volume Integral
      • Uses atomic Gaussians
      • Grant and Pickup, J. Phys. Chem. 1995, v99, p3503.
  • VDW
      • Parameters from TAFF
      • Clark et al., J. Comp. Chem. 1989, v10, p892.
  • User defined Feature Sets
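
A sketch of a common-volume term built from atomic Gaussians. Each atom is modelled as a Gaussian whose exponent is chosen so that it integrates to the volume of a hard sphere of the same radius, and the overlap of two Gaussians follows from the Gaussian product rule. The amplitude `P`, the first-order (pairs only) sum, and all function names are assumptions for illustration; the actual GAPE volume term may differ.

```python
import math

P = 2.0 * math.sqrt(2.0)  # assumed Gaussian amplitude

def alpha(radius):
    """Exponent for which a single Gaussian integrates to (4/3)*pi*radius**3."""
    return math.pi * (3.0 * P / (4.0 * math.pi)) ** (2.0 / 3.0) / radius ** 2

def pair_overlap(center_a, radius_a, center_b, radius_b):
    """Analytic overlap integral of two atomic Gaussians."""
    a, b = alpha(radius_a), alpha(radius_b)
    d2 = sum((x - y) ** 2 for x, y in zip(center_a, center_b))
    return P * P * math.exp(-a * b * d2 / (a + b)) * (math.pi / (a + b)) ** 1.5

def common_volume(mol_a, mol_b):
    """First-order common volume: sum of overlaps over all atom pairs.

    Each molecule is a list of (center, radius) tuples.
    """
    return sum(pair_overlap(ca, ra, cb, rb)
               for ca, ra in mol_a for cb, rb in mol_b)
```

The pairwise sum is smooth in the atomic coordinates, which is one reason Gaussian volumes are convenient inside an optimizer.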

16
Torsional Distributions
  • Restrict conformational search to observed
    torsions
  • Examples from GOLD
  • Can also use MIMUMBA, Klebe and Mietzner, J.
    CAMD, 1994, v8, p583

17
Incorporation of Activities
  • Activities can guide creation of the overlay
  • Use pKi to specify activity.
  • Weight pair-wise terms in volume integral and
    pharmacophore feature score.
  • pKi values are used as initial weights then the
    weights are normalized so that the average weight
    is 1.
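
The normalization step can be sketched in a few lines. The function name `activity_weights` is hypothetical, and how the per-molecule weights are combined into pair-wise weights is not stated on the slide, so that step is left out.

```python
def activity_weights(pki_values):
    """Scale pKi values so that the average weight is exactly 1."""
    mean = sum(pki_values) / len(pki_values)
    return [value / mean for value in pki_values]

# Illustrative pKi values; more active compounds receive weights above 1.
weights = activity_weights([8.76, 7.94, 7.56, 7.16, 5.75, 5.38])
```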

18
Examples
  • Examples use standard protocol
  • Steady state operator-based GA
  • Selection pressure of 1.001
  • Score Weights
      • Donor Hydrogen Weight 1500
      • Acceptor Atom Weight 1750
      • Aromatic Ring Weight 2500
      • Volume Weight 10
      • Energy Weight 10
  • 60000 operations (crossovers 47.5%, mutations
    47.5% and migrations 5%)
  • Island model: 5 populations of 100 chromosomes
  • Input structures minimized and randomized
  • Best guess at atom protonation
  • 10 runs per test system (avg run time 256s)
  • Display best scoring system.
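
A toy sketch of a steady-state, operator-based island GA using the operation count, operator mix and population sizes listed above. Everything else is an assumption made to keep the example short: the bit-counting `fitness` stands in for the GAPE score, parents are picked uniformly rather than by roulette wheel, migration copies a random member from the neighbouring island, and a child replaces the worst member only if it scores higher.

```python
import random

def fitness(bits):
    """Toy fitness (number of 1 bits); a stand-in for the real score."""
    return sum(bits)

def run_island_ga(n_islands=5, pop_size=100, n_bits=32, n_ops=60000, seed=0):
    rng = random.Random(seed)
    islands = [[[rng.randint(0, 1) for _ in range(n_bits)]
                for _ in range(pop_size)]
               for _ in range(n_islands)]
    initial_best = max(fitness(c) for pop in islands for c in pop)
    operators = ["crossover", "mutation", "migration"]
    weights = [47.5, 47.5, 5.0]  # operator mix from the protocol
    for _ in range(n_ops):
        op = rng.choices(operators, weights=weights)[0]
        i = rng.randrange(n_islands)
        pop = islands[i]
        if op == "crossover":
            a, b = rng.choice(pop), rng.choice(pop)
            point = rng.randrange(1, n_bits)
            child = a[:point] + b[point:]
        elif op == "mutation":
            child = list(rng.choice(pop))
            pos = rng.randrange(n_bits)
            child[pos] = 1 - child[pos]
        else:
            # Migration: copy a member of the neighbouring island.
            neighbour = islands[(i + 1) % n_islands]
            child = list(rng.choice(neighbour))
        # Steady state: one child per operation, replacing the worst member.
        worst = min(range(pop_size), key=lambda k: fitness(pop[k]))
        if fitness(child) > fitness(pop[worst]):
            pop[worst] = child
    final_best = max(fitness(c) for pop in islands for c in pop)
    return initial_best, final_best
```

The default arguments mirror the protocol; smaller values such as `run_island_ga(n_islands=3, pop_size=10, n_bits=16, n_ops=2000)` run much faster for experimentation.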

19
5-HT3 Antagonists
20
Angiotensin II Antagonists
21
5-HT2A Antagonists
22
Incorporation of Activities
  • 5HT1A human pKi activities (http://www.gpcr.org)
  • 8-OH-DPAT: 8.76
  • Buspirone: 7.94
  • Spiperone: 7.56
  • MCPP: 7.16
  • Quipazine: 5.75
  • Remoxipride: 5.38
23
Overlay with Activities
Overlay without Activities
24
Examples from PDB
  • A comparison of the pharmacophore identification
    programs Catalyst, DISCO and GASP. Patel,
    Gillet, Bravi and Leach, JCAMD v16 (2002) 653-681.
  • DHFR: 1boz, 1drf, 1ohk, 1dlr, 1hfp, 2dhf
  • CDK2: 1aq1, 1di8, 1e1v, 1e1x, 1fin, fvv
  • Thrombin: 1c4v, 1d4p, 1d6w, 1dwd, 1fpc
  • HIV RT: 1bqm, 1tvr, 1ddt, 1fk9, 1rt5, 1rt1, 1ep4,
    1klm, 1rt3, 1vru
  • Thermolysin: 1hyt, 4tmn, 5tmn, 1qf1, 5tln, 7tln

25
DHFR
26
DHFR
27
Thrombin
28
Thrombin
29
CDK 2
30
CDK 2
31
HIV RT
  • Shared pharmacophore: 1bqm, 1tvr, 1ddt, 1fk9,
    1rt5, 1rt1
  • Others: 1ep4, 1klm, 1rt3, 1vru.

32
HIV RT
33
HIV RT
34
Thermolysin
35
Thermolysin
36
Pharmacophore Search
  • Query molecules are held rigid.
  • Query can be the output of a GAPE overlay.
  • Database of target compounds sequentially matched
    using GAPE against queries.
  • Sort alignments using GAPE score
  • Test on 20 5HT2A ligands, overlaid on a
    conformation of Sarpogrelate.
  • Used in-house on datasets of 800 compounds.

37
5HT2A Example
38
Rigid Pharmacophore Search
  • Query pharmacophore and structure(s) from GAPE.
  • Use clique detection to screen conformations.
  • Rank structures that pass using GAPE feature and
    volume score.
  • Example 5HT3 query. Database of 18M conformers
    (ChemNavigator collection).
  • 650 conformers/second.
  • Proximity filter.
  • Work in progress.

39
5-HT3 Example
40
Conclusions
  • The GA is a good method for molecular overlay and
    pharmacophore generation.
  • Development of chromosome representation and
    scoring function required considerable effort.
  • Believe that GAPE is a significant improvement
    over GASP.
  • Pharmacophore features do not have to be present
    in all molecules.
  • Incorporation of activities.
  • More improvements.

41
Acknowledgements
  • Paul Watson, Carleton Sage, Runtong Wang (Arena
    Pharmaceuticals).
  • Val Gillet, Simon Cottrell (University of
    Sheffield, UK): PDB test systems.
  • Robin Taylor (CCDC): GOLD torsional library.
  • PyMol
  • AstexViewer