Title: SAESAR Shape And Electrostatics in SAR
1SAESAR: Shape And Electrostatics in SAR
Norah E. MacCuish, Anthony Nicholls, John D. MacCuish
CUP V Santa Fe, NM, Monday, March 1, 2004
OpenEye Scientific Software, Inc
2Introduction
- Shape and Electrostatics: What role do they play in determining Structure Activity Relationships?
- Is there a protocol using OpenEye and Mesa software that can be applied to drug design problems which takes advantage of Shape and Electrostatics?
3Outline
- Shape and Electrostatics: generation and comparisons
- Types of problems which will be discussed
  - Bound ligand data: Xray (Protein Data Bank) versus assay ligand data (Wombat Database, Sunset Molecular Discovery, LLC) and decoys
    - Cox2
    - Progesterone
  - Assay Ligand Data (Wombat Database) versus decoys
    - Dopamine
    - Ca Ion Channel
- Wombat Database contains ligands with published activities across many receptor types.
4Tanimoto Measure For Shape and Electrostatics
- Tanimoto Shape Comparison for a,b volume overlaps
  - $T_{a,b} = O_{a,b} / (I_a + I_b - O_{a,b})$, range (0,1)
- Tversky Shape Comparison for i,j (subshape)
  - $T_{a,b} = O_{a,b} / (\alpha I_a + \beta I_b - O_{a,b})$, range (0,1)
- Tanimoto Electrostatic Comparison for a,b field overlaps
  - range (-1/3, 1) (using MMFF charges, continuum solvent)
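As a worked illustration of the shape measures on this slide, here is a minimal Python sketch. The function names and the numeric volumes are hypothetical; `i_a` and `i_b` are taken to be the self-overlap volumes and `o_ab` the overlap volume. The Tversky function uses the standard set-theoretic Tversky form, which is an assumption: it reduces to the Tanimoto when alpha = beta = 1 and may be parameterized differently from the expression on the slide.

```python
def shape_tanimoto(o_ab, i_a, i_b):
    """Tanimoto on volumes: O_ab / (I_a + I_b - O_ab)."""
    return o_ab / (i_a + i_b - o_ab)


def shape_tversky(o_ab, i_a, i_b, alpha=1.0, beta=0.0):
    """Standard Tversky index (assumed form, see the note above).

    O_ab / (O_ab + alpha * (I_a - O_ab) + beta * (I_b - O_ab)).
    With alpha=1, beta=0 this is O_ab / I_a: the fraction of A inside B.
    """
    return o_ab / (o_ab + alpha * (i_a - o_ab) + beta * (i_b - o_ab))


# Hypothetical volumes (arbitrary units), not taken from the slides.
i_a, i_b, o_ab = 300.0, 400.0, 280.0

tani = shape_tanimoto(o_ab, i_a, i_b)              # 280 / 420
a_in_b = shape_tversky(o_ab, i_a, i_b, 1.0, 0.0)   # 280 / 300
b_in_a = shape_tversky(o_ab, i_a, i_b, 0.0, 1.0)   # 280 / 400
print(round(tani, 3), round(a_in_b, 3), round(b_in_a, 3))
```

The asymmetric settings show why a subshape measure can be high while the symmetric Tanimoto is only moderate: the smaller shape A sits almost entirely inside B, but B is not well covered by A.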
5Shape Tanimoto Example
Tanimoto 0.874
ROCS, Shape Toolkit
6Does a Maximum Shape Match Assure a Maximum
Electrostatic Match? NO! (EON Spin)
xray structure
Shape Tanimoto = 0.956, Electrostatic Tanimoto = 0.108
Shape Tanimoto = 0.942, Electrostatic Tanimoto = 0.293
Conformers of an active ligand - Omega
Electrostatic Overlay - Eon
Eon Omega
7Electrostatic Value: Conformer Choice Based on Geometric Mean (of Electrostatic Tanimoto and Shape Tanimoto) or Maximum Shape
Geometric Mean = Maximum Shape 40% of the time
60% of the time, using the Geometric Mean, the Electrostatic Tanimoto is larger with only a small change in Shape Tanimoto
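The two selection rules can be sketched in a few lines of Python. This is an illustration only: the function names and conformer labels are hypothetical, the two score pairs are the ones quoted on slide 6 (treated here as two conformers of the same ligand), and clamping negative Electrostatic Tanimoto values at zero before taking the square root is an assumption, since the slides do not say how negative values are handled.

```python
from math import sqrt


def geometric_mean(shape_tani, et_tani):
    # The Electrostatic Tanimoto can be negative (range -1/3 to 1), so it is
    # clamped at 0 here to keep the square root real. This is an assumption.
    return sqrt(shape_tani * max(et_tani, 0.0))


def pick_conformer(conformers, rule):
    """conformers: dict name -> (shape_tani, et_tani). Returns the best name."""
    if rule == "max_shape":
        key = lambda name: conformers[name][0]
    elif rule == "geometric_mean":
        key = lambda name: geometric_mean(*conformers[name])
    else:
        raise ValueError(f"unknown rule: {rule}")
    return max(conformers, key=key)


# Score pairs quoted on slide 6; the labels are hypothetical.
conformers = {
    "conf_1": (0.956, 0.108),
    "conf_2": (0.942, 0.293),
}

print(pick_conformer(conformers, "max_shape"))       # conf_1
print(pick_conformer(conformers, "geometric_mean"))  # conf_2
```

With these numbers the geometric mean gives up 0.014 in Shape Tanimoto to gain 0.185 in Electrostatic Tanimoto, which is the trade-off the slide describes.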
8Conformational Energy Analysis using Xray
structure and Wombat HIVRT ligands
- All conformer similarity at 3 kcal, 5 kcal, 10 kcal
- Geometric Mean conformer similarity at 3 kcal, 5 kcal, 10 kcal
- For an xray structure of a HIVRT ligand versus ligands active for HIVRT from Wombat.
Xray structure
9HIV-RT Actives All conformer similarities at 3,
5, 10 kcal
10HIV-RT Actives Geometric Mean conformer similarity at 3, 5, and 10 kcal
11Electrostatic Tanimoto vs RMS from Crystal
Structure After aligning to MKC-442 (HEPT
analogues)
Poor conformation (Shape Tani < 0.6)
Conformation match (Shape Tani > 0.88)
Plot labels: MKC-442, 1rt2, 1rti, 1c1b, 1c1c
12Crystal Structure of Target with bound ligand vs
reported actives and decoys
- Generate conformations of Wombat ligands at 5 kcal with Omega
- Transform conformer space and electrostatic space to single conformer and electrostatics per structure vs. xray structure (Using Geometric Mean)
- Combine with 2D structure space
- Develop simple model against decoys, validate
- Variables (shape, electrostatics, 2D (320 MACCS keys))
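The reduction from many conformers to one row of variables per structure might look like the sketch below. Everything here is illustrative: the function names, key indices, and scores are hypothetical, the 2D variable is assumed to be a Tanimoto over the "on" MACCS key indices (the slides do not say whether the keys enter the model individually or as a similarity), and negative Electrostatic Tanimoto values are clamped at zero as an assumption.

```python
from math import sqrt


def tanimoto_2d(keys_a, keys_b):
    """Tanimoto on sets of 'on' key indices: |A & B| / |A | B|."""
    union = len(keys_a | keys_b)
    return len(keys_a & keys_b) / union if union else 0.0


def reduce_conformers(conformer_scores):
    """Collapse (shape, ET) pairs for one ligand to the geometric-mean best pair."""
    return max(conformer_scores, key=lambda s: sqrt(s[0] * max(s[1], 0.0)))


def feature_row(conformer_scores, ligand_keys, query_keys):
    """One row of model variables: (shape, electrostatics, 2D similarity)."""
    shape, et = reduce_conformers(conformer_scores)
    return (shape, et, tanimoto_2d(ligand_keys, query_keys))


# Hypothetical inputs: key indices for the xray query and one ligand, and
# per-conformer (Shape Tanimoto, Electrostatic Tanimoto) against the query.
query_keys = {3, 17, 42, 101, 250}
ligand_keys = {3, 17, 42, 99}
scores = [(0.91, 0.05), (0.88, 0.40), (0.70, 0.45)]

row = feature_row(scores, ligand_keys, query_keys)
print(row)
```

Rows built this way for actives and decoys are what a simple classifier would then be trained and validated on.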
13 43 Cox2 Ligands - Highly Active Structures from Wombat: Shape and Electrostatic Tanimoto Similarities to Cox Crystal Ligand
Xray structure Cox2 SC-558
14Decoys plus Highly Actives Shape and
Electrostatic Tanimoto Similarities to Cox
Crystal Ligand
Xray structure Cox2 SC-558
15Decoys plus Highly, Moderately, and Weakly
Actives Shape and Electrostatic Tanimoto
Similarities to Cox Crystal Ligand
Xray structure Cox2 SC-558
16Cox2 Classification Error
Plot label: class means
Classification error 0.81, STD of error 0.03
Classification error measure: Geo-mean2, for uneven class sizes
100-fold cross validation: data randomly sampled and divided into 3/5 training set, 2/5 testing set
Linear Decision Boundary (Fisher's Linear Discriminant)
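A minimal pure-Python sketch of this kind of validation follows, using two variables (shape and electrostatic similarity). It is not the authors' code. Assumptions: "Geo-mean" is read as the geometric mean of the per-class accuracies, a common score for uneven class sizes; the 3/5 and 2/5 split is drawn separately within each class; the decision threshold is the midpoint of the projected class means; and the data are synthetic, generated only to make the example run.

```python
import random
from math import sqrt


def mean2(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter2(points, m):
    """Within-class scatter matrix entries (sxx, sxy, syy) about mean m."""
    sxx = sum((p[0] - m[0]) ** 2 for p in points)
    sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
    syy = sum((p[1] - m[1]) ** 2 for p in points)
    return sxx, sxy, syy


def fit_fisher(actives, decoys):
    """Fisher's linear discriminant in 2D: w = Sw^-1 (m_active - m_decoy)."""
    ma, md = mean2(actives), mean2(decoys)
    axx, axy, ayy = scatter2(actives, ma)
    dxx, dxy, dyy = scatter2(decoys, md)
    sxx, sxy, syy = axx + dxx, axy + dxy, ayy + dyy
    det = sxx * syy - sxy * sxy
    dx, dy = ma[0] - md[0], ma[1] - md[1]
    w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)
    # Threshold at the midpoint of the projected class means (an assumption).
    c = 0.5 * (w[0] * (ma[0] + md[0]) + w[1] * (ma[1] + md[1]))
    return w, c


def predict_active(model, p):
    w, c = model
    return w[0] * p[0] + w[1] * p[1] > c


def geo_mean_accuracy(model, actives, decoys):
    """Geometric mean of the accuracy on actives and the accuracy on decoys."""
    tpr = sum(predict_active(model, p) for p in actives) / len(actives)
    tnr = sum(not predict_active(model, p) for p in decoys) / len(decoys)
    return sqrt(tpr * tnr)


def cross_validate(actives, decoys, rounds=100, train_frac=0.6, seed=0):
    """Repeated random 3/5 train, 2/5 test splits; returns (mean, std) of scores."""
    rng = random.Random(seed)
    scores = []
    for _ in range(rounds):
        a, d = actives[:], decoys[:]
        rng.shuffle(a)
        rng.shuffle(d)
        ka, kd = int(len(a) * train_frac), int(len(d) * train_frac)
        model = fit_fisher(a[:ka], d[:kd])
        scores.append(geo_mean_accuracy(model, a[ka:], d[kd:]))
    mean = sum(scores) / len(scores)
    std = sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return mean, std


# Synthetic (shape, electrostatic) similarities; NOT data from the slides.
data_rng = random.Random(1)
actives = [(data_rng.gauss(0.8, 0.05), data_rng.gauss(0.5, 0.1)) for _ in range(40)]
decoys = [(data_rng.gauss(0.5, 0.1), data_rng.gauss(0.1, 0.1)) for _ in range(200)]

mean_score, std_score = cross_validate(actives, decoys)
print(f"geo-mean accuracy: {mean_score:.2f} +/- {std_score:.2f}")
```

The geometric mean is used instead of plain accuracy because, with far more decoys than actives, a model that labels everything a decoy would still score well on plain accuracy but would score zero here.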
17Enrichments and Classification Error
18Progesterone Receptor with bound Progesterone
FASP
19Progesterone Study
20Progesterone Receptor
FASP
21Group Average Clustering of All (100 SMILES) Progesterone Actives
Cut made at 0.68 similarity to pick up 25% of SMILES in Cluster A and 65% of SMILES in Cluster B
Grouping Module, Shape Module
12 remaining outliers
0.68 Similarity
Cluster A (24 SMILES)
Cluster B (64 SMILES)
Progesterone
Cluster A centroid
Cluster B centroid
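For readers unfamiliar with group-average clustering and a similarity cut, the sketch below shows the idea on a toy similarity matrix. It is a naive illustration, not the Mesa Grouping Module: the function name and the matrix values are hypothetical, and merging simply stops once no pair of clusters has an average similarity at or above the cut.

```python
def group_average_clusters(sim, cut):
    """Agglomerative group-average clustering on a similarity matrix.

    sim: symmetric list-of-lists of pairwise similarities.
    Merging stops when no pair of clusters has average similarity >= cut.
    Returns a sorted list of clusters, each a sorted list of item indices.
    """
    clusters = [[i] for i in range(len(sim))]

    def avg_sim(c1, c2):
        return sum(sim[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        best, pair = float("-inf"), None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_sim(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if best < cut:
            break
        x, y = pair
        merged = sorted(clusters[x] + clusters[y])
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return sorted(clusters)


# Toy shape-similarity matrix for five hypothetical structures.
sim = [
    [1.00, 0.90, 0.80, 0.30, 0.20],
    [0.90, 1.00, 0.85, 0.30, 0.25],
    [0.80, 0.85, 1.00, 0.35, 0.30],
    [0.30, 0.30, 0.35, 1.00, 0.75],
    [0.20, 0.25, 0.30, 0.75, 1.00],
]

print(group_average_clusters(sim, cut=0.68))  # [[0, 1, 2], [3, 4]]
```

Items left as singletons after the cut play the role of the outliers on the slide.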
22Cluster A (Conformers chosen via the geometric
mean of Shape and ET values)
Cluster B (Conformers chosen via the geometric
mean of Shape and ET values)
23Cluster A and B Centroids vs Wombat Decoys and
Cluster Members
Cluster A Centroid
Cluster B Centroid
24Cluster A, comparing 2D Tani with ET and Shape
Cluster B, comparing 2D Tani with ET and Shape
25Decoys plus Highly, Moderately, Weakly
Actives Shape and Electrostatic Tanimoto
Similarities to Cox Crystal Ligand
Xray structure Cox2 SC-558
(When actives in the outlier region are removed, the Geo-mean error is 0.95)
26Cox Outlier Study
- Outliers
  - Moderately High Shape Similarity to COX2 SC-558
  - No Electrostatic Similarity to SC-558
- Found Strong Shape (> 0.75) Cluster that Covers all Structures
- Similarity to Centroid (or representative compound) of Cluster Shows
  - Most have moderate to strong ET similarity
  - Wide Spread of 2D Structure.
27Xray Structure compared with outlier cluster
centroid
COX2 SC-558
Centroid
28Outlier Cox Ligands versus Outlier Centroid
29Cox2 Outlier Ligands vs. Outlier Centroid 2D
structure analysis
30Summary of xray data analysis
- Shape and Electrostatics from xray queries will classify actives versus decoys
- Shape clustering provides a way of exploring anomalies, other binding modes, etc. unexplained by the classification model
31Dopamine D2 Ligands
- Clustering of 28 active structures (4188 conformers) with Taylor-Butina clustering at 0.7 Tanimoto threshold.
- The three largest clusters contain 20 (455 conformers), 18 (210 conformers), and 18 (185 conformers) structures respectively, but the union of these three clusters contains only 22 of the 28 structures, so the clusters overlap heavily.
- 1000 Wombat Decoys were used against each centroid for each of the three clusters.
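The Taylor-Butina procedure can be sketched as follows. This is a generic, disjoint (exclusion sphere) version written for illustration, with a hypothetical function name and a toy similarity matrix; it is not the implementation used for the slides. The clusters it returns are disjoint over the items clustered; if those items are conformers, one structure can still contribute conformers to several clusters, which is one possible reading of the overlap reported above.

```python
def taylor_butina(sim, threshold):
    """Exclusion-sphere clustering on a symmetric similarity matrix.

    Returns a list of (centroid, members) tuples in the order found,
    where members is a sorted list of item indices including the centroid.
    """
    n = len(sim)
    neighbors = {
        i: {j for j in range(n) if j != i and sim[i][j] >= threshold}
        for i in range(n)
    }
    unassigned = set(range(n))
    clusters = []
    while unassigned:
        # Centroid: the unassigned item with the most unassigned neighbors
        # (ties go to the lowest index).
        centroid = max(sorted(unassigned), key=lambda i: len(neighbors[i] & unassigned))
        members = (neighbors[centroid] & unassigned) | {centroid}
        clusters.append((centroid, sorted(members)))
        unassigned -= members
    return clusters


# Toy similarity matrix for six hypothetical conformers.
sim = [
    [1.00, 0.80, 0.80, 0.80, 0.10, 0.10],
    [0.80, 1.00, 0.75, 0.40, 0.10, 0.10],
    [0.80, 0.75, 1.00, 0.40, 0.10, 0.10],
    [0.80, 0.40, 0.40, 1.00, 0.10, 0.10],
    [0.10, 0.10, 0.10, 0.10, 1.00, 0.90],
    [0.10, 0.10, 0.10, 0.10, 0.90, 1.00],
]

print(taylor_butina(sim, threshold=0.7))  # [(0, [0, 1, 2, 3]), (4, [4, 5])]
```

Each centroid returned this way is an actual item from the input, so it can be used directly as a query against decoys.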
32Dopamine D2 Cluster 1 Centroid vs Highly Actives
and Wombat Decoys
3 active structures
Centroid
33Dopamine D2 Cluster 2 Centroid vs Highly Actives
and Wombat Decoys
2 active structures
Centroid
34Dopamine D2 Cluster 3 Centroid vs Highly Actives
and Wombat Decoys
1 active structure
Centroid
35Ward's Clustering of 14 Ca Ion Structures with 185 Conformers
All but one structure represented
Remaining conformers from just one structure, not represented in group on left
Cluster 1: one centroid, 6 other compounds
Cluster 3: one centroid, 3 other compounds
Cluster 2: just 1 compound
Cluster 4: just 1 compound
36Example Shape Comparison with Tversky
Tversky = 0.974
Tversky = 0.872
How well does A fit into B, and how well does B fit into A?
37Ca Ion Channel Overlay Result
Representative Shape
One finding: an inflexible conformer which is contained in at least one conformer for every SMILES in the input dataset
Shape Overlay for Flexible Ligand Conformers with Tversky > 0.85
"Investigations in shape clustering for lead hopping", N. E. MacCuish, J. D. MacCuish, 226th ACS National Meeting, New York, Sept. 10, 2003
38Ca Ion Channel Subshape and Electrostatic
Overlays
ETani = 0.18, STversky = 0.99, Shape Tani = 0.78, 2D Tani = 0.58
ETani = 0.15, STversky = 0.91, Shape Tani = 0.71, 2D Tani = 0.40
ETani = 0.18, STversky = 0.99, Shape Tani = 0.76, 2D Tani = 0.62
ETani = 0.55, STversky = 0.96, Shape Tani = 0.85, 2D Tani = 0.89
39Conclusions
- Shape and Electrostatics are discriminating descriptors for predictive modeling of activity relationships
- A protocol which combines OpenEye and Mesa software can facilitate SAR analysis
- Starting with SMILES, 3D shape and electrostatic patterns can be derived
- Shape, partial shape, and electrostatics are effective tools
- Negative information is significant.
- Future work: more receptors, clustering on both shape and electrostatics
40Acknowledgments
- OpenEye Scientific Software
- Roger Sayle
- Geoff Skillman
- Bob Tolbert
- Stanislaw Wlodek
- Jeremy Yang
- Tudor Oprea, Sunset Molecular Discovery, LLC
- Protein Data Bank