1
Novel Approaches To Molecular Similarity
  • Andreas Bender, ab454_at_cam.ac.uk
  • Unilever Centre for Molecular Informatics,
  • University of Cambridge, UK

2
Outline
  • Molecular Similarity: What is it and why is it
    relevant?
  • Our approach to molecular similarity: Finding
    many active compounds vs. generalizability?
  • How to compare molecules
  • The algorithm
  • Results
  • Special Feature: Discovery of Binding Patterns

3
Similarity Searching
  • Complementary approach to substructural searching
  • In substructure searching, exact retrieval of a
    subgraph of a molecule is performed
  • In similarity searching, an abstract molecular
    representation in descriptor space is calculated
    and compared to the abstract representations of
    other molecules
  • For reviews see e.g.
  • Bender, A. and Glen, R.C., Org. Biomol. Chem.,
    2004, 2, 3204-3218.
  • (freely available from www.cheminformatics.org)

4
Molecular Similarity: The way it should be and
the way it is
  • The God of Molecular Similarity (according to
    Google Picture Search)
  • The Molecular Similarity Principle
  • Small structural changes cause small property
    differences
  • Basis of all current structure-property
    predictions
  • A is active, B is not: how about C?
  • Solubility of A is known: how does it change if
    we add group X here?

5
Does the Molecular Similarity Principle work (1)?
6
Does the Molecular Similarity Principle work (2)?
7
The importance of shape
Slides courtesy of Hugo Kubinyi, Erlangen
Lectures: see http://www.cheminformatics.org ->
Links -> Education
8
Is it possible and sensible to define molecular
similarity?
  • YES, but one needs to be careful
  • Similarity depends on the context (e.g. the
    particular receptor; easy in the case of
    non-directional properties, e.g. logP)
  • Similar changes may have different (even
    detrimental!) effects, depending on the system
  • Chiral molecules may have totally different
    activities, which is sometimes problematic to capture

9
Our approaches to molecular similarity
  • How to describe and compare molecules
  • Description of the system in a suitable form
  • Selection of important features
  • Model generation and prediction
  • Results
  • Special Feature: Discovery of Binding Patterns

10
Descriptor Choice
11
2D Environment around an atom (MOLPRINT 2D,
a.k.a. Atom Environments)
  • E.g. 6-aminoquinoline

  • Assign Sybyl mol2 atom types
  • Find connections
  • Find connections to connections
  • Create a tree down to n levels
  • Bin the atom types for each level
  • Create a fingerprint for this atom

[Figure: atom environment tree for one atom of 6-aminoquinoline, with columns Level 0, Level 1 and Level 2; labels N2; Car, Car; Car, Car, Car; counts 1, 2, 1, 1]
These features are created for every (heavy) atom
in the molecule (J. Chem. Inf. Comput. Sci. 2004,
44, 170-178; 2004, 44, 1708-1718)
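The tree-building steps can be sketched in a few lines of Python. This is a minimal illustration, not the MOLPRINT 2D implementation: the molecule is entered by hand as an atom list and bond list, the Sybyl-style type labels ("N.ar", "C.ar", "N.pl3") are assumed assignments for 6-aminoquinoline, and the function name atom_environment is hypothetical. Levels are taken to be shortest-path bond distances from the central atom.

```python
from collections import Counter, deque

def atom_environment(atom_types, bonds, center, max_level=2):
    """Count atom types at each bond distance 0..max_level from `center`."""
    neighbours = {i: set() for i in range(len(atom_types))}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Breadth-first search gives the shortest bond distance to each atom.
    distance = {center: 0}
    queue = deque([center])
    while queue:
        atom = queue.popleft()
        if distance[atom] == max_level:
            continue  # do not expand beyond the last level
        for nxt in neighbours[atom]:
            if nxt not in distance:
                distance[nxt] = distance[atom] + 1
                queue.append(nxt)

    # Bin the atom types per level; the sorted tuples form a hashable feature.
    levels = [Counter() for _ in range(max_level + 1)]
    for atom, d in distance.items():
        levels[d][atom_types[atom]] += 1
    return tuple(tuple(sorted(c.items())) for c in levels)

# 6-aminoquinoline heavy atoms (assumed typing):
# 0 ring N, 1-9 ring carbons (C2, C3, C4, C4a, C5, C6, C7, C8, C8a), 10 amino N.
atom_types = ["N.ar"] + ["C.ar"] * 9 + ["N.pl3"]
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7),
         (7, 8), (8, 9), (9, 0), (4, 9), (6, 10)]

# One feature per heavy atom; the set of features is the molecule's fingerprint.
fingerprint = {atom_environment(atom_types, bonds, i) for i in range(len(atom_types))}
ring_n_feature = atom_environment(atom_types, bonds, 0)
```

For the ring nitrogen this yields two aromatic carbons at level 1 and three at level 2. Storing each feature as a tuple keeps it hashable, so whole molecules can be compared as sets of features.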
12
Feature Selection
  • E.g. comparing faces first requires the
    identification of key features.
  • How do we identify these?
  • The same applies to molecules.

13
B) Information-Gain Feature Selection
  • How can we select the active compounds (red)?

Red: Active; Green: Inactive
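A minimal sketch of how information gain could be computed for a single present/absent feature, assuming the usual entropy-based definition (class entropy before the split minus the weighted entropy after it). The function names and the counts in the example are hypothetical.

```python
from math import log2

def entropy(n_active, n_inactive):
    """Shannon entropy (in bits) of a two-class set."""
    total = n_active + n_inactive
    result = 0.0
    for count in (n_active, n_inactive):
        if count:
            p = count / total
            result -= p * log2(p)
    return result

def information_gain(active_with, active_without, inactive_with, inactive_without):
    """Entropy reduction from splitting the set on presence of one feature."""
    n_with = active_with + inactive_with
    n_without = active_without + inactive_without
    total = n_with + n_without
    before = entropy(active_with + active_without,
                     inactive_with + inactive_without)
    after = ((n_with / total) * entropy(active_with, inactive_with)
             + (n_without / total) * entropy(active_without, inactive_without))
    return before - after

# A feature present in all actives and no inactives separates the classes fully.
perfect = information_gain(5, 0, 0, 5)
# A feature equally common in both classes carries no information.
useless = information_gain(2, 2, 2, 2)
```

Features would then be ranked by this value and only the top-scoring ones kept for classification.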
14
C) Naïve Bayesian Classifier (classification by
presumptive evidence)
  • The next step is to identify which molecules
    belong to which class.
  • Example from e-mail classification (spam
    detection)
  • Training set from a nerdy chemist's inbox
  • Assign weighting factors to individual features,
    depending on relative frequencies in training set
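The weighting step can be illustrated with the e-mail example. This is a sketch under stated assumptions: each message is reduced to a set of words, the weight of a word is the log ratio of its relative frequencies in the two classes, a Laplace correction (add one) is an assumption made here to avoid division by zero, and only features present in the new message are scored. All names and messages are made up.

```python
from math import log

def feature_weights(class1_items, class2_items):
    """Per-feature log ratio of smoothed relative frequencies in two classes."""
    vocabulary = set().union(*class1_items, *class2_items)
    weights = {}
    for feature in vocabulary:
        # Laplace-corrected fraction of items in each class containing the feature.
        p1 = (sum(feature in item for item in class1_items) + 1) / (len(class1_items) + 2)
        p2 = (sum(feature in item for item in class2_items) + 1) / (len(class2_items) + 2)
        weights[feature] = log(p1 / p2)
    return weights

def log_ratio(features, weights, log_prior_ratio=0.0):
    """Sum of weights of the features present; > 0 favours class 1."""
    return log_prior_ratio + sum(weights.get(f, 0.0) for f in features)

spam = [{"viagra", "offer"}, {"offer", "free"}]
ham = [{"nmr", "synthesis"}, {"synthesis", "offer"}]
weights = feature_weights(spam, ham)

spam_score = log_ratio({"viagra", "free"}, weights)
ham_score = log_ratio({"nmr", "synthesis"}, weights)
```

Working with logarithms turns the product of ratios into a sum, so a positive score corresponds to a ratio above 1. For molecules, the words would be replaced by the selected atom environments.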

15
Classification
  • To do this we use a Naïve Bayesian Classifier
    using the features (atom environments) we have
    identified as being important.
  • Ratio > 1: Class membership 1
  • Ratio < 1: Class membership 2
  • F: feature vector
  • f_i: feature elements
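The ratio itself did not survive in this transcript. For a naïve Bayesian classifier it normally takes the textbook form below, given here as an assumption about what the slide showed; CL_1 and CL_2 are placeholder names for the two classes.

```latex
\frac{P(CL_1 \mid F)}{P(CL_2 \mid F)}
  = \frac{P(CL_1)}{P(CL_2)} \prod_{i} \frac{P(f_i \mid CL_1)}{P(f_i \mid CL_2)}
```

The first factor is the ratio of class priors. Each factor in the product is the relative frequency of feature f_i in one class divided by its relative frequency in the other, treating the features as independent.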

16
Application: lead discovery
  • If I have one active compound, will I find
    another one?
  • Database: MDL Drug Data Report (MDDR)
  • 957 ligands selected from MDDR
  • 49 5HT3 receptor antagonists,
  • 40 Angiotensin Converting Enzyme inhibitors (ACE),
  • 111 HMG-CoA reductase inhibitors (HMG),
  • 134 PAF antagonists and
  • 49 Thromboxane A2 antagonists (TXA2)
  • 574 inactives
  • Briem and Lessel, Perspect. Drug Disc. Des.,
    2000, 20, 245-264.
  • Calculated Hit rate among ten nearest neighbours
    for each molecule
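The hit-rate measure can be sketched as follows, assuming each molecule is a (fingerprint, class label) pair, the query has already been removed from the database, and a hit is a neighbour with the same class label as the query. The function name and the toy data are hypothetical; any similarity function can be passed in.

```python
def hit_rate_at_k(query, database, similarity, k=10):
    """Fraction of the k most similar database entries sharing the query's class."""
    query_fp, query_class = query
    ranked = sorted(database,
                    key=lambda entry: similarity(query_fp, entry[0]),
                    reverse=True)
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for _, label in top if label == query_class) / len(top)

# Toy data: fingerprints are sets of feature labels.
database = [({"a", "b"}, "TXA2"),
            ({"a", "c"}, "TXA2"),
            ({"x"}, "ACE"),
            ({"a", "b", "c"}, "inactive")]
query = ({"a", "b", "c"}, "TXA2")

def shared_features(p, q):
    """Placeholder similarity: number of features in common."""
    return len(p & q)

rate_top3 = hit_rate_at_k(query, database, shared_features, k=3)
```

Averaging this value over every active used as a query gives one hit rate per activity class.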

17
Comparison: Single Queries
Using Tanimoto Coefficient
Using Bayesian
  • Grey bars: Briem and Lessel, Perspect. Drug Disc.
    Des., 2000, 20, 245-264.
  • Black bars: Bender, A., et al., J. Chem. Inf. Comput.
    Sci., 2004, 44, 1708-1718.
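For reference, the Tanimoto coefficient between two fingerprints held as sets of features is the number of shared features divided by the number of distinct features in either set. A minimal sketch; returning 0.0 for two empty fingerprints is a convention chosen here, not something the slides specify.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient between two sets of features."""
    a, b = set(features_a), set(features_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0

# Two fingerprints sharing two of four distinct features.
example = tanimoto({1, 2, 3}, {2, 3, 4})
```

The value ranges from 0 (no features in common) to 1 (identical feature sets).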

18
Combining Information of 5 Actives
Bender, A., et al., J. Chem. Inf. Comput. Sci.,
2004, 44, 170-178.
19
TXA2, Graph-based Descriptors
[Figure: query structure and hits 1-7]
Very little diversity in heterocyclic systems:
no patents, no money!
20
3D Environment around a surface point:
solvent-accessible surface using local surface properties
  • Central Point (Layer 0)
  • Points in Layer 1
  • Points in Layer 2
  • Etc.
21
Overall Performance: Comparable to 2D methods
Bender, A. et al., J. Chem. Inf. Comput. Sci.,
2004, 44, 170-178.
22
TXA2, 7 Hits among Top 10 by MOLPRINT 3D
[Figure: query structure and hits 1-7]
23
Which features are selected for classification?
  • Even if your classifier works, do the selected
    features make sense?
  • Set of active vs. inactive molecules
  • Information Gain is calculated for each feature;
    those which are much more frequent among actives
    are suspicious and might constitute the
    pharmacophore
  • Look at features from HMG and TXA2

24
Selected Features - HMG
  • Binding site HMG: rigid lipophilic ring

25
HMG-15
26
TXA2
Yellow: lipophilic side chains
  • Yamamoto et al., J. Med. Chem. 1993 (36) 820

27
TXA2-44
28
TXA2-7
29
Summary
  • 2D method: finds lots of active molecules, but
    they are similar to what is known already
  • (Bender, A., et al., J. Chem. Inf. Comput. Sci.,
    2004, 44, 170-178; Bender, A., et al., J. Chem.
    Inf. Comput. Sci., 2004, 44, 1708-1718.)
  • 3D method: finds fewer active compounds but
    enables discovery of new chemotypes
  • (Bender, A. et al., J. Med. Chem., 2004, 47,
    6569-6583.)
  • Features shown to correlate with binding patterns

30
Acknowledgements
  • Robert C Glen (Unilever Centre, Cambridge, UK)
  • Hamse Y. Mussa (Unilever Centre, Cambridge, UK)
  • Stephan Reiling (Aventis, Bridgewater, USA)
  • David Patterson (Tripos)
  • Software
  • GRID, CACTVS, gOpenMol and many, many others
  • Funding
  • The Gates Cambridge Trust, Unilever, Tripos