Title: The Basic Technology Research Programme
1The Basic Technology Research Programme
- Proof of Concept Studies and Consortia Building Networks
2Background
- Cross research council endeavour
- administered by EPSRC
- Funding for research to create a new technology
- Change the way we do science
- Underpin the future industrial base
3Background
- 15 research projects funded up to April 2003
- Total funding for this period: £41M
- To support large, long term, high risk, high impact research consortia
- Encourage investigation of speculative ideas
4Background
- Two levels of funding
- One year start up
- Full grant up to five years
- Two types of start up funding
- Proof of concept
- Consortia building and networking
5Proof of Concept Studies
- One year funding up to £100K
- Research to investigate feasibility of developing the new technology
- Output: a business case for the next step of investigation, to be submitted in May 2004
- Basic Technology Programme
- Existing Research Council initiatives
- DTI programmes
6Consortia Building Networks
- Involvement of the users of the new technology at a very early stage
- Funding to form networks and hold workshops
7ParaSurf in silico Screening Technology
- Basic Technology Funding for October 2003 to September 2004
- Proof of concept
- Consortia building and networking
- Academic partners
- University of Portsmouth
- University of Erlangen
- University of Southampton
- University of Oxford
- University of Aberdeen
8ParaSurf Proof of Concept Research Programme
- Development of techniques to describe the surfaces of irregular solids
- Development of projection and pattern recognition techniques for non-planar colour-coded surfaces
- spherical harmonics, molecular topology
- Conformational analysis
- Rigid body dynamics incorporating surface features
- rigid parts of molecule treated as anisotropic solids linked by rotatable bonds
- Investigate how best to generate prediction models using surface properties that define a low dimensional chemical space
- QSAR, pattern recognition, artificial intelligence, analysis of surfaces
- Benchmarking using Grid computing
9ParaSurf Proof of Concept Research Programme
10Potential applications of the in silico screening technology
- High throughput virtual docking
- Physical property mapping
- ADMET prediction
- Long time-period simulation techniques
- Crystallisation and solubility
- Prediction of tautomers
- Chemical reactivity and metabolism
12ParaSurf Progress Report
- Letchworth, 16th March 2004
13Main Areas
- Molecular Surfaces and Property Calculation
- RGB Encoding and Pattern Recognition
- Conformational Analysis
- Rigid Body Molecular Dynamics
- Analysis of Variables and QSAR models
- Grid Computing
- Consortium Building
14Datasets
- Small
- Consensus Set of 74 Drug Molecules (diverse)
- QSAR set (31 CoMFA steroids)
- Medium
- WDI subset (2,400 compounds)
- Harvard Chembank dataset (2,000 compounds)
- Large
- WDI (50,000)
- Maybridge (50,000)
15Example Molecule
Allopurinol
16Surface Definition and Local Property Calculation
17Calculations
- 3D co-ordinates from CORINA
- QM calculations with VAMP
- Local Properties and surfaces from ParaSurf
18ParaSurf v1.0
- Surfaces
- Isodensity Surfaces
- Shrink Wrap
- Marching Cube
- Surfaces fit to Spherical Harmonics
- Properties
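- MEP (molecular electrostatic potential), LIE (local ionisation energy), LEA (local electron affinity) and LP (local polarisability)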
- Encoded at points on the surface
- Encoded as Spherical Harmonic Expansions
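For reference, a spherical harmonic expansion of a surface or of a property on it usually takes the form below. This is the generic textbook form, not ParaSurf's documented convention: the truncation order $L$ and the coefficient names $a_{lm}$, $b_{lm}$ are assumptions made for illustration.

```latex
% Radial distance of the surface from the molecular centre, and a local
% property p on that surface, both expanded to a finite order L.
r(\theta,\phi) \approx \sum_{l=0}^{L} \sum_{m=-l}^{l} a_{lm}\, Y_l^m(\theta,\phi),
\qquad
p(\theta,\phi) \approx \sum_{l=0}^{L} \sum_{m=-l}^{l} b_{lm}\, Y_l^m(\theta,\phi)
```

The coefficients then act as a compact description of the shape and of each property field.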
19Small molecule
20RGB Encoding and Pattern Recognition
21RGB Encoding
- Each Local Property encoded as a colour
- LIE encoded on Red channel
- LEA encoded on Green Channel
- LP encoded on Blue Channel
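The channel assignment above can be sketched as follows. The min-max scaling to 0..255 and the sample property values are assumptions for illustration; the slides do not say how ParaSurf scales each property.

```python
def scale_to_byte(values):
    """Linearly rescale a list of floats to integers in 0..255 (assumed scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant property: avoid division by zero
        return [0 for _ in values]
    return [round(255 * (v - lo) / (hi - lo)) for v in values]


def rgb_encode(lie, lea, lp):
    """Return one (R, G, B) tuple per surface point.

    LIE -> red, LEA -> green, LP -> blue, following the slide.
    """
    return list(zip(scale_to_byte(lie), scale_to_byte(lea), scale_to_byte(lp)))


# Hypothetical property values at three surface points.
colours = rgb_encode(lie=[10.0, 12.0, 14.0],
                     lea=[-2.0, 0.0, 2.0],
                     lp=[0.1, 0.3, 0.2])
print(colours)
```

Each surface point then carries a single colour, so image-style pattern recognition can be applied to the coloured surface.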
22Allopurinol RGB Surface
23RGB Encoding
- Alternative Encoding
- LIE
- LEA
- Absolute value of MEP
24Allopurinol RGB Surface
25Conformational Analysis
26Conformational Analysis
- Efficient All Atom MD analysis (DASH)
- Treated as time series (not Cluster Analysis)
- Scales linearly with simulation length
- No need for arbitrary choice of number of clusters
- Can be analysed using Markov Chain methodology
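A minimal sketch of the Markov chain view: once each MD frame has been assigned a conformational state, transition probabilities can be estimated from the time series. The state labels and the short trajectory are hypothetical, and how DASH assigns states is not shown here.

```python
from collections import Counter, defaultdict


def transition_matrix(states):
    """Estimate P(next state | current state) from an ordered list of states."""
    counts = defaultdict(Counter)
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }


# Hypothetical state label per MD frame, in time order.
trajectory = ["A", "A", "B", "B", "B", "A", "C", "C", "A"]
P = transition_matrix(trajectory)
for state, row in sorted(P.items()):
    print(state, {k: round(v, 2) for k, v in sorted(row.items())})
```

A single pass over the trajectory is enough, which is consistent with the linear scaling noted above.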
27MD studies of Rosiglitazone
29Rigid Body Molecular Dynamics
30Rigid body molecular dynamics
- Well founded methodology e.g. CNS / XPLOR (Axel T. Brunger, Stanford University)
- Idea is to use rigid groups to model flexibility
- In the ligand
- and the protein binding site.
- Allows time-steps of 10 fs to 20 fs.
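To see what the longer time-step buys, the number of integration steps for a fixed simulation length can be compared. The 1 fs baseline for conventional all-atom MD is an assumption added here for comparison; it is not stated on the slide.

```python
def n_steps(sim_ns, dt_fs):
    """Number of integration steps for sim_ns nanoseconds at a dt_fs time-step."""
    return round(sim_ns * 1_000_000 / dt_fs)  # 1 ns = 1,000,000 fs


for dt in (1.0, 10.0, 20.0):  # 1 fs is an assumed all-atom baseline
    print(f"dt = {dt:4.1f} fs -> {n_steps(1.0, dt):>9,} steps per ns")
```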
31QSAR models
32Distribution of Properties
33Correlation Matrix
34Descriptors
- 34 descriptors based on Normal Distribution
- Principal Components
- Spherical Harmonic Coefficients
35Descriptors for LIE
36Other Descriptors
- Moments
- Order 1: Mean
- Order 2: Variance
- Order 3: Skewness
- Order 4: Kurtosis
- Overlapping Gaussians
- Derived from previous work on MD analysis
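The four moment descriptors can be sketched for one local property sampled at the surface points. The population (1/n) normalisation and the standardised, non-excess forms of skewness and kurtosis are assumptions; the slides do not specify which variants were used.

```python
import math


def moment_descriptors(values):
    """Return (mean, variance, skewness, kurtosis) of a list of property values."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    if std == 0.0:  # constant property: higher moments are undefined
        return mean, variance, 0.0, 0.0
    skewness = sum((v - mean) ** 3 for v in values) / n / std ** 3
    kurtosis = sum((v - mean) ** 4 for v in values) / n / std ** 4
    return mean, variance, skewness, kurtosis


# Hypothetical property values at five surface points.
mean, variance, skewness, kurtosis = moment_descriptors([1.0, 2.0, 3.0, 4.0, 5.0])
print(mean, variance, skewness, kurtosis)
```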
37QSAR models
- Models derived from Local Properties
- Surface Integral Model for Solvation Energy
- RMS error 0.75 kcal
- Drug Likeness
- SOMs trained on WDI (drugs) and Maybridge (general)
- Parameters from PC of Local Property Descriptors
- Medium sized datasets superimposed on SOMs
40GRID Computing
41GRID Computing
- ParaSurf compiled on
- SGI IRIX
- Windows
- Linux (SUSE)
- IBM AIX
- Future Platforms
- SUN Solaris
- GRID enabling at Portsmouth (Mark Baker), Southampton and Oxford.
42Provisional Timings
- SGI R10k, 256MB
- VAMP 30s/compound
- ParaSurf 10s/compound
- Intel Xeon 1.8 GHz / AMD Athlon XP-2000
- ParaSurf 2s/compound
- SGI FUEL Workstation R14K
- ParaSurf 2s/compound
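As a rough guide to what these timings mean for the large datasets (50,000 compounds each), the arithmetic below assumes a serial run on a single processor with no I/O overhead. Both assumptions are for illustration only.

```python
def serial_hours(n_compounds, seconds_per_compound):
    """Wall-clock hours for a serial run at a fixed per-compound cost."""
    return n_compounds * seconds_per_compound / 3600.0


n = 50_000  # size of the WDI and Maybridge sets
for label, secs in [("SGI R10k, VAMP + ParaSurf", 30 + 10),
                    ("Xeon / Athlon, ParaSurf only", 2)]:
    print(f"{label}: {serial_hours(n, secs):.1f} h")
```

Since each compound is processed independently, the work divides naturally across Grid nodes.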
43Conclusions
44Conclusions
- Properties can be calculated
- Properties can be RGB encoded
- Properties are local
- Properties can be used for QSAR models
46Computer vision methods for comparing molecular surfaces
- Comparing and recognising 3D objects is an active research area in robotics and AI.
- Fast methods have been developed for database indexing.
- Rotationally invariant descriptors of 3D objects are possible.
47Pattern matching on molecular surfaces
- Can we recognise similar surfaces?
- Can we recognise similar surface fragments?
- Can we identify the most similar surface to our target?
- How do we compare field descriptors on the molecular surface?
48Rotationally invariant 3D object descriptors
- Internal coordinates e.g. a distance matrix.
- Energy distributions based on the spherical harmonics.
- The spherical harmonic coefficients.
- Radial integration, radial scanning, and invariant moments.
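The first descriptor in the list can be checked directly: a distance matrix of internal coordinates does not change when the object is rotated. A minimal sketch with four hypothetical points and a 45° rotation about the z axis:

```python
import math


def distance_matrix(points):
    """All pairwise Euclidean distances between 3D points (tuples)."""
    return [[math.dist(p, q) for q in points] for p in points]


def rotate_z(points, angle_deg):
    """Rotate every point about the z axis by angle_deg degrees."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]


# Hypothetical surface points.
points = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0), (1.0, 1.0, 1.0)]
d0 = distance_matrix(points)
d1 = distance_matrix(rotate_z(points, 45.0))
max_diff = max(abs(a - b) for r0, r1 in zip(d0, d1) for a, b in zip(r0, r1))
print(f"largest change in any distance: {max_diff:.2e}")
```

The remaining differences are floating-point rounding only.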
49Surface comparison
- Two different approaches
- Using spherical harmonic molecular surfaces: Ritchie and Kemp, J. Comp. Chem. 20(4) 383-395 (1999), University of Aberdeen.
- Partial molecular alignment via local structure analysis: Robinson, Lyne and Richards, J. Chem. Inf. Comput. Sci. 40(2) 503-512 (2000), University of Oxford.
50An example grid of surface points
A grid is placed on a ParaSurf surface in order to reduce the number of surface points from 4038 to 55.
51Partial molecular alignment
- We do not know which points on the two surfaces need to be aligned with each other.
- The essential approach is
- all surface points on one surface are compared with all points on the other.
- For two surfaces, with M and N points, M × N alignments are possible
- we want to reduce this large search space!
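A quick count shows why the surface grid matters. It uses the 4038 and 55 point counts from the grid example and assumes, for illustration, that both surfaces have the same number of points.

```python
def n_point_pairs(m, n):
    """Number of point-to-point pairings between surfaces with m and n points."""
    return m * n


full = n_point_pairs(4038, 4038)   # all ParaSurf surface points
reduced = n_point_pairs(55, 55)    # grid points only
print(f"{full:,} pairs -> {reduced:,} pairs (about {full // reduced:,}x fewer)")
```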
52Voting pairs are possible alignments
The voting pairs can have a critical effect on the quality of the surface alignment.
53The voting table
- A voting table may list all matching pairs of surface points (i.e. all possible alignments).
- Careful editing of votes within the voting table can improve speed and accuracy.
- We want to only consider alignments between similar local features on the surfaces.
- The more false votes we have in the voting table the harder it is to find the optimum alignment.
54A distance matrix can be used to describe local surface features
[Figure: surface points P1, P2 and P3]
The internal distance matrix can be used to distinguish between surface points.
By comparing rows and columns from distance matrices of different surfaces we can detect similar surface features.
55Selecting the voting pairs
- Similar local features, or interest points, on the molecular surface can be identified using a distance matrix.
- For a point on each surface
- Arrays of internal surface point distances are calculated for both points i.e. dist1, dist2.
- After a crude alignment, the absolute difference of dist1 and dist2 indicates the similarity of this pair of points.
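The selection step can be sketched as below. Sorting each distance array stands in for the "crude alignment" of dist1 and dist2, and the mean absolute difference with a fixed tolerance is an assumed similarity test. The slides do not specify either choice, and the sketch also assumes both surfaces have the same number of grid points.

```python
import math


def distance_profile(points, i):
    """Sorted distances from point i to every other point on the same surface."""
    return sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)


def profile_difference(dist1, dist2):
    """Mean absolute difference of two equal-length distance arrays."""
    return sum(abs(a - b) for a, b in zip(dist1, dist2)) / len(dist1)


def select_voting_pairs(surface1, surface2, tolerance):
    """Return (p, q) index pairs whose distance profiles agree within tolerance."""
    profiles2 = [distance_profile(surface2, q) for q in range(len(surface2))]
    pairs = []
    for p in range(len(surface1)):
        dist1 = distance_profile(surface1, p)
        for q, dist2 in enumerate(profiles2):
            if len(dist1) == len(dist2) and profile_difference(dist1, dist2) <= tolerance:
                pairs.append((p, q))
    return pairs


# Hypothetical grid points; surface2 is surface1 shifted by (5, 5, 5).
surface1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 4.0)]
surface2 = [(x + 5.0, y + 5.0, z + 5.0) for x, y, z in surface1]
pairs = select_voting_pairs(surface1, surface2, tolerance=1e-9)
print(pairs)
```

A looser tolerance admits more voting pairs, at the cost of more false votes in the voting table.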
56Scoring the possible alignments
- The optimum alignment is composed of a rotation R and a translation T.
- Apply the current rotation r
- Score the translation vectors t_pq of all voting pairs (p,q) using a gravitational potential
- High potentials identify clusters of similar translation vectors.
- The vector with the highest potential is the optimum translation T.
- Scoring all r gives R and T.
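The scoring step for a single rotation can be sketched as follows. The slides say only "gravitational potential", so the softened 1/d form and the `softening` parameter are assumptions. The sketch also assumes that surface 1 has already been rotated by r.

```python
import math


def translation_vectors(surface1, surface2, voting_pairs):
    """t_pq = q - p for each voting pair (surface1 is assumed already rotated)."""
    return [tuple(b - a for a, b in zip(surface1[p], surface2[q]))
            for p, q in voting_pairs]


def potentials(vectors, softening=0.1):
    """Score each vector by the sum of 1 / (distance + softening) to all others."""
    return [sum(1.0 / (math.dist(ti, tj) + softening)
                for j, tj in enumerate(vectors) if j != i)
            for i, ti in enumerate(vectors)]


def best_translation(vectors, softening=0.1):
    """Return the vector with the highest potential, and that potential."""
    scores = potentials(vectors, softening)
    best = max(range(len(vectors)), key=scores.__getitem__)
    return vectors[best], scores[best]


# Hypothetical translation vectors: three agree closely, two are false votes.
vectors = [(1.0, 1.0, 1.0), (1.05, 1.0, 1.0), (0.95, 1.0, 1.0),
           (5.0, -3.0, 2.0), (-4.0, 0.0, 7.0)]
T, score = best_translation(vectors)
print(T, round(score, 2))
```

Repeating this for every trial rotation r and keeping the highest-scoring case would give both R and T.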
57Scoring with a gravitational potential
Translation vectors (x,y coordinates plotted)
Some voting pairs for example rotations
58Can we use the potential to compare aligned structures?
59Can we get better alignments with more voting pairs?
60Example alignments
[Figure: four example alignments, numbered 1 to 4]
61Example 1: RMSD 0.75 (panels A and B)
62Example 2: RMSD 1.05 (panels A and B)
63Example 3: RMSD 1.20 (panels A and B)
64Example 4: RMSD 1.89 (panels A and B)
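An RMSD of the kind quoted for the example alignments can be computed as below. The sketch assumes the two point sets are already superimposed and listed in matching order; the slides do not say which points were paired.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized lists of 3D points."""
    if len(coords_a) != len(coords_b):
        raise ValueError("both point sets must have the same length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical aligned point sets, offset by one unit along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
value = rmsd(a, b)
print(value)
```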
65Matching with the surface field descriptors example 1
- Surfaces are aligned (using a quick search method e.g. 45° rotations).
- Best N alignments are selected.
- Each alignment is gently perturbed and optimised using the field descriptors.
66Matching with the surface field descriptors example 2
- Align using the field descriptor values to identify suitable voting pairs
- only match on similar field descriptors.
- Filtering can be achieved by aligning the fields separately.
- More accurate alignments can be generated by combining field values.
67Parameterisation
- Voting pairs
- The distance between points in surface grid.
- The number of voting pairs.
- Identifying and selecting local features.
- How to represent the fields at interest points.
- Scoring
- Scoring function to identify the correct rotation and translation (e.g. gravitational potential).
- Target function to compare different surface alignments (e.g. RMSD).
- Optimising the alignments.
69Molecular Surface Property Graphs
Characterize the behaviour of a property $f : S \to \mathbb{R}$ on a molecular surface $S$, in terms of a directed graph $G$ on $S$ derived from the gradient vector field $x \mapsto \operatorname{grad} f(x)$.
Vertices($G$) = fixed points of $\operatorname{grad} f$ (= critical points of $f$).
Edges($G$) = stable and unstable manifolds of the saddle points.
70Gradient Flow
71Molecular Surface Property Graph
72Applications
- Similarity
- Pattern recognition methods
- Maximal common subgraphs
- Complementarity
- Compare ligand graph with graph induced on ligand by receptor
- QSAR
- Topological indices
73Example
- $S$ = Connolly Surface
- $f(x)$ = Electrostatic Potential = $\sum_i q(i) / d(x,i)$
- Method
- Locate critical points of $f$ (Newton-Raphson).
- Linearize at saddles, find eigenvectors of Hessian($f$).
- Integrate gradient vector field forward in time from 2 points on unstable eigenvector, backward in time from 2 points on stable eigenvector (Runge-Kutta).
- Integrate to boundary of Connolly surface patch, then continue on adjacent patch until reaching another critical point.
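The potential in this example, and the gradient that drives the flow, can be sketched for point charges as below. The finite-difference gradient, the step `h`, and the projection onto the tangent plane using a supplied unit normal are assumptions for illustration. The Newton-Raphson search and the Runge-Kutta integration from the method list are not shown.

```python
import math


def potential(x, charges):
    """f(x) = sum_i q_i / d(x, i) for point charges given as (q_i, position_i)."""
    return sum(q / math.dist(tuple(x), pos) for q, pos in charges)


def gradient(x, charges, h=1e-5):
    """Central finite-difference gradient of f at x in 3D space."""
    grad = []
    for k in range(3):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        grad.append((potential(xp, charges) - potential(xm, charges)) / (2 * h))
    return tuple(grad)


def surface_gradient(x, unit_normal, charges):
    """Gradient of f projected onto the tangent plane of the surface at x."""
    g = gradient(x, charges)
    g_dot_n = sum(gi * ni for gi, ni in zip(g, unit_normal))
    return tuple(gi - g_dot_n * ni for gi, ni in zip(g, unit_normal))


# Hypothetical single unit charge at the origin, probed on a sphere of radius 2.
charges = [(1.0, (0.0, 0.0, 0.0))]
x = (2.0, 0.0, 0.0)
print(potential(x, charges), gradient(x, charges),
      surface_gradient(x, (1.0, 0.0, 0.0), charges))
```

Critical points of f on the surface are the points where the projected gradient vanishes.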
74Allopurinol
8 maxima, 7 minima, 13 saddles
$\#\text{maxima} - \#\text{saddles} + \#\text{minima} = \chi(S) = 2$
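The critical point counts for allopurinol can be checked against the Euler characteristic. This assumes the surface is closed and sphere-like, so that $\chi(S) = 2$, and that all critical points are non-degenerate.

```latex
% Alternating count of critical points on a closed surface S
\#\text{maxima} - \#\text{saddles} + \#\text{minima} = 8 - 13 + 7 = 2 = \chi(S)
```

A count that did not sum to 2 would suggest that a critical point had been missed by the search.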
75Work in Progress
- Implementation for
- $S$ = spherical harmonic surface
- $f$ = MEP, LIE, LEA and LP
- Use images of triangulation points as starting points for Newton-Raphson search for critical points.
- Automatic differentiation.
77Summary
Compound screening
Spherical harmonic representation (Dave Ritchie)
78Future directions
- High-throughput ligand docking
- Superimposition of ligand and a negative of the receptor
- Use of the fields to drive simulation
- Use of the fields to derive intermolecular forces
- Rigid-body motions, long time-step MD
- Free energy calculations
79A hierarchy of methods
- Rapid screening using computationally fast approaches
- 3D fields (Andy Vinter)
- On reduced set
- Semi-empirical property calculations and alignments
- On most interesting molecules
- Density-functional or ab-initio calculations and alignment
- More accurate molecular representations are used as appropriate, as resources allow