Shape and ROCS

1
Shape and ROCS
  • Bob Tolbert
  • OpenEye Scientific Software, Inc.
  • ACS - NYC
  • September 10th, 2003

2
Agenda
  • Shape Theory
  • Basic ROCS usage
  • Using VIDA to visualize results
  • Hands-on examples
  • Using the Color Force Field
  • Hands-on examples with color

3
Gaussian Description of Shape
  • Atoms are represented as Gaussians instead of
    hard spheres
  • Easily integrable, analytic derivatives
  • Product of 2 Gaussians is another Gaussian
  • Easily calculate overlap between two atoms
  • Easily calculate overlap between two collections
    of Gaussians
  • Shape Tanimoto and Tversky as measure of
    similarity
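
As a rough illustration of the points above, the sketch below evaluates the closed-form overlap integral of two spherical Gaussians and sums it over every atom pair of two molecules. The function names, the (center, alpha) tuple layout, and the amplitude and width values are assumptions made for this example, not ROCS internals, and only pairwise (first-order) terms are included.

```python
import math

def gaussian_overlap(center_a, alpha_a, center_b, alpha_b, amp_a=1.0, amp_b=1.0):
    """Overlap integral of two spherical Gaussians amp * exp(-alpha * |r - center|^2)."""
    dist_sq = sum((a - b) ** 2 for a, b in zip(center_a, center_b))
    alpha_sum = alpha_a + alpha_b
    # The product of two Gaussians is another Gaussian, so the integral is analytic.
    prefactor = amp_a * amp_b * math.exp(-alpha_a * alpha_b * dist_sq / alpha_sum)
    return prefactor * (math.pi / alpha_sum) ** 1.5

def molecule_overlap(atoms_a, atoms_b):
    """First-order overlap: sum over all atom pairs of two collections of Gaussians."""
    return sum(
        gaussian_overlap(ca, aa, cb, ab)
        for ca, aa in atoms_a
        for cb, ab in atoms_b
    )

# Hypothetical two-atom "molecules": (center, alpha) pairs with made-up widths.
mol_a = [((0.0, 0.0, 0.0), 1.0), ((1.5, 0.0, 0.0), 1.0)]
mol_b = [((0.2, 0.0, 0.0), 1.0), ((1.7, 0.0, 0.0), 1.0)]
print(molecule_overlap(mol_a, mol_b))
```

Because the overlap is a smooth function of the atom positions, its derivatives are also available in closed form, which is what makes gradient-based overlay optimization practical.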

4
What is shape Tanimoto?
  • Shape-specific version of Tanimoto that uses 3D
    overlap instead of bits for what is in common.
  • Formula: Tanimoto(A,B) = O(A,B) / (O(A,A) + O(B,B) - O(A,B)),
    where O is the volume overlap
  • Sensitive to large size differences between two
    structures
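
The size sensitivity can be seen with a small numeric sketch. It assumes the usual overlap form of the shape Tanimoto, overlap divided by the sum of the two self-overlaps minus the overlap; the function name and the overlap volumes are made up for illustration.

```python
def shape_tanimoto(overlap_ab, self_a, self_b):
    """Shape Tanimoto from the A-B overlap and the two self-overlaps."""
    return overlap_ab / (self_a + self_b - overlap_ab)

# Made-up volumes: a small molecule (100) and a large one (400).
small_self, large_self = 100.0, 400.0

# Identical shapes, perfectly overlaid.
print(shape_tanimoto(100.0, small_self, small_self))  # 1.0

# The small molecule sits entirely inside the large one.
print(shape_tanimoto(100.0, small_self, large_self))  # 0.25
```

Even though the small molecule is completely covered in the second case, the score is low because the unmatched volume of the large molecule is penalized.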

5
What is shape Tversky?
  • Tversky is an alternate similarity measure.
  • Introduced for 2D similarity by John Bradshaw at
    Daylight
  • Has weighting factor to deal with size
    differences.
  • Useful for small structure vs large structure
    similarity

6
Tversky
  • Tversky is asymmetric with α = 0.95 and β = 0.05
  • ROCS reports 2 values Tversky-d and Tversky-q
    with the 0.95 weighting for query molecule and
    database molecule, respectively
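
A minimal sketch of the asymmetry, assuming the overlap-based form O(A,B) / (α·O(A,A) + β·O(B,B)) with α + β = 1. The function name and the volumes are hypothetical, and the sketch does not assume which of the two reported values corresponds to which weighting.

```python
def shape_tversky(overlap_ab, self_a, self_b, alpha=0.95, beta=0.05):
    """Overlap-based Tversky; alpha weights molecule A, beta weights molecule B."""
    return overlap_ab / (alpha * self_a + beta * self_b)

# Made-up volumes: a small molecule (100) fully inside a large one (400).
overlap, small_self, large_self = 100.0, 100.0, 400.0

# Heavy weight on the small molecule: "is the small one contained in the large one?"
print(shape_tversky(overlap, small_self, large_self))  # about 0.87

# Heavy weight on the large molecule: "is the large one contained in the small one?"
print(shape_tversky(overlap, large_self, small_self))  # about 0.26
```

With the same made-up volumes the symmetric Tanimoto is 0.25, so the heavily weighted Tversky is the one that recognizes the small structure as a good match for part of the large one.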

7
What is ROCS
  • Rapid Overlay of Chemical Structures
  • Small molecule vs. small molecule
  • Uses
  • Scaffold jumping
  • HTS Rescue
  • Patent busting

8
ROCS Status
  • Version 2.0 announced this week
  • Completely rebuilt on top of OEChem
  • Platforms include Linux, Linux-IA64, HPUX,
    MacOSX, AIX, Tru64(Alpha), Solaris, Win32 and
    Irix.
  • PVM support for all platforms except Win32

9
ROCS in actual use
  • How many 'hits' has ROCS helped to identify?
  • 15-20, 4 classes, 15-20, 1, 1, 1, 50-75
  • How many Targets attempted?
  • 5, 4, 3, 1, numerous
  • What are typical hit rates?
  • 1-2, 6-15, 5-10, 2-5
  • How many hits have made it into chemistry?
  • 0, 1-2, 3, 1, 1, 1
  • How many HTS programs have been cancelled?
  • > 1?

10
Minimum example
  • ROCS requires only 2 inputs: a file containing
    shape query molecules and a file containing the
    database of interesting structures, e.g. a
    corporate database, ACS-Screen or vendor
    databases.
  • rocs -dbase vendor.oeb.gz -query 6cox.mol2

11
The query file
  • Can be one or more structures, each with one or
    more conformers. By default, each
    conformer/structure is treated as a separate
    query
  • File format can be any of SDF, MOL2, PDB, XYZ,
    MMOD, OEB
  • Molecule(s) must already be 3D.

12
The database file
  • Normally a pre-generated multi-conformer file.
  • Can be one of several formats: SDF, MOL2, PDB,
    XYZ, MMOD, OEB (OEBinary v2) or BIN (OEBinary v1)
  • Contiguous conformers in a file format like SDF or
    MOL2 will be combined into a single,
    multi-conformer molecule (by default)
  • OpenEye's tool of choice for generating dbase
    files is OMEGA.

13
Other default settings
  • -besthits 500
  • number of results to keep in hitlist
  • -cutoff 0.0
  • minimum score to even consider
  • -rankby tanimoto
  • alternatives: tverskyd, tverskyq, scaledcolor,
    combo
  • -prefix rocs
  • text prefix for output file names
  • -oformat sdf
  • format of output structure file
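
For scripted runs, the defaults above can be kept in one place and overridden per job. The sketch below only assembles an argument list from the flags named on these slides; the helper name and the dictionary are hypothetical, and nothing is executed.

```python
# Defaults as listed on the slide; values are kept as strings for the command line.
DEFAULTS = {
    "besthits": "500",
    "cutoff": "0.0",
    "rankby": "tanimoto",
    "prefix": "rocs",
    "oformat": "sdf",
}

def build_rocs_command(dbase, query, **overrides):
    """Return an argv-style list for rocs, with slide defaults plus overrides."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unsupported settings in this sketch: {sorted(unknown)}")
    settings = {**DEFAULTS, **{name: str(value) for name, value in overrides.items()}}
    command = ["rocs", "-dbase", dbase, "-query", query]
    for name, value in settings.items():
        command += [f"-{name}", value]
    return command

print(" ".join(build_rocs_command("vendor.oeb.gz", "6cox.mol2", rankby="combo", cutoff=0.5)))
```

Spelling the defaults out explicitly is redundant for ROCS itself, but it records the exact settings of each run alongside the results.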

14
Initial orientations for optimization
  • Default - Inertial Frame alignment
  • Overlay COM of query and target
  • Large MOI aligned, then second largest. Including
    2-fold degeneracy of each yields 4 starting
    points.
  • Extra 2 or 4 axes for top symmetry.
  • Random
  • -randomstarts N

15
A couple of simple examples
  • rocs -dbase spam.bin -query acetsali.sdf -prefix
    ACET -cutoff 0.5 -besthits 100 -outputquery
  • rocs -dbase spam.bin -query aminopy.sdf -rankby
    tverskyd -cutoff 0.4 -maxhits 10 -outputquery

16
Using VIDA to visualize results
  • ROCS outputs (by default) two main files, a
    structure file and a report file.
  • Structure file is SDF by default; scores are
    stored in SD tags and automatically loaded into
    VIDA's spreadsheet
  • Report file is tab-delimited text, can be loaded
    into any spreadsheet for analysis or parsed by
    splitting each line on tabs.

17
Report File Format
  • Tab-delimited
  • Fields
  • Name
  • ShapeQuery
  • Rank
  • ShapeTanimoto
  • Scaled Color
  • ComboScore
  • ColorScore
  • SubTan
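
Because the report is tab-delimited, it can be read with a standard CSV parser. The sketch below assumes the file starts with a header row carrying the field names listed above; the sample rows and their values are made up for illustration.

```python
import csv
import io

# Made-up sample report; a header row with these field names is an assumption.
SAMPLE_REPORT = (
    "Name\tShapeQuery\tRank\tShapeTanimoto\tScaled Color\tComboScore\tColorScore\tSubTan\n"
    "mol_001\tquery_1\t1\t0.80\t0.50\t1.30\t4.0\t0.0\n"
    "mol_002\tquery_1\t2\t0.65\t0.25\t0.90\t2.0\t0.0\n"
)

def read_report(handle):
    """Split each line on tabs and key the values by the header row."""
    return list(csv.DictReader(handle, delimiter="\t"))

def hits_above(rows, cutoff, field="ShapeTanimoto"):
    """Names of hits whose score in `field` is at least `cutoff`."""
    return [row["Name"] for row in rows if float(row[field]) >= cutoff]

rows = read_report(io.StringIO(SAMPLE_REPORT))
print(hits_above(rows, 0.7))  # ['mol_001']
```

For a real run, the in-memory sample would be replaced by an open file handle on the report file.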

18
Hands-on examples
  • Example molecules
  • Several example molecules in data directory
  • spam.bin is example multi-conformer database
  • Documentation is available in HTML and PDF format.

19
Color Force Field
  • Use SMARTS to describe color atoms or groups of
    atoms
  • Post-shape scoring
  • Color gradients can be used as part of
    optimization process

20
Color Force-field definition
  • Define some atom types

TYPE donor
TYPE acceptor
TYPE cation
TYPE anion
TYPE rings

21
Color Force-field
  • Define patterns (SMARTS) that match the types

These definitions of donor and acceptor are the general definitions of
Mills and Dean, JCAMD 10:607-622, 1996.

Donor: an electronegative atom with a proton (no S or C, see above).

PATTERN donor 7,8h,H

Acceptor: a lone pair on an electronegative atom (O or N; S was removed,
see reference). Note: N in an amide or in an alkyl-aniline system is too
conjugated to accept; however, anilinic NH2 is a potential acceptor.

PATTERN acceptor OD1(O-,,6,15,16)!(1,2,3)
PATTERN acceptor nH0,N,8!(nX3)((-, ,ee)-,,ee)! (NC)!(ND2,D3-a)!( 1,2,3)
PATTERN acceptor NX3((-,,ee)(-,,ee)-,,ee)!( 1,2,3)

22
Color Force-field
  • Define interactions between types
  • Weight is strength of interaction, relative to
    shape gradients.
  • Radius affects range of interaction

INTERACTION donor donor attractive gaussian weight=1.0 radius=1.0
INTERACTION acceptor acceptor attractive gaussian weight=1.0 radius=1.0
INTERACTION rings rings attractive gaussian weight=1.0 radius=1.0

23
Example Force-field
  • ROCS includes a very simple color force field
    (simple.cff)
  • Also included is a more complete force field that
    includes the Mills-Dean definitions of donors and
    acceptors as well as types for rings, anions and
    cations (MillsDean.cff)

24
Using Color
  • To just score shape hits
  • rocs -dbase spam.bin -query aminopy.sdf -chemff
    MillsDean.cff
  • To use color gradients
  • rocs -dbase spam.bin -query aminopy.sdf -chemff
    MillsDean.cff -optchem
  • To also rank by color
  • rocs -dbase spam.bin -query aminopy.sdf -chemff
    MillsDean.cff -optchem -rankby scaledcolor

25
Two extra scores with Color
  • Scaledcolor
  • Actual color score is the sum of each best color
    interaction. Scaled color divides the actual
    score by the self-color of the query, giving a
    score between 0 and 1.0
  • Comboscore
  • To use shape and color together for ranking,
    comboscore is the sum of shape tanimoto and
    scaled color giving a score between 0.0 and 2.0
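
The two derived scores reduce to a division and a sum. A minimal sketch with hypothetical function names and made-up numbers:

```python
def scaled_color(color_score, query_self_color):
    """Color score of a hit divided by the query's color score against itself."""
    if query_self_color <= 0:
        raise ValueError("query self-color must be positive")
    return color_score / query_self_color

def combo_score(shape_tanimoto, scaled):
    """Shape Tanimoto (0 to 1) plus scaled color (0 to 1), so 0.0 to 2.0 overall."""
    return shape_tanimoto + scaled

# Made-up values: the hit matches 3 of the query's 4 units of self-color.
scaled = scaled_color(3.0, 4.0)
print(scaled)                    # 0.75
print(combo_score(0.5, scaled))  # 1.25
```

Because both terms span the same range, shape and color contribute with equal weight to the combined ranking.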

26
Advanced Features
  • -maxconfs
  • retain more than one db molecule conformer
  • -scdbase
  • treat dbase as single conformer, separate
    molecules
  • -report each, one, none
  • -stats hits, best, all
  • -nostructs
  • don't write structure file at all
  • -pvmconf, -pvmpass

27
Roadmap I
  • MOCS
  • Multiple Overlay of Chemical Structures
  • result is grid for query
  • General grid query
  • Pharmacophore pre-screening
  • User-directed starting positions

28
Roadmap II
  • ElectroROCS
  • Electrostatic tanimoto
  • Electrostatic gradients
  • Shape fingerprint pre-screening
  • Torsion tweak with MMFF
  • Query Optimization
  • Database Optimization

29
Acknowledgements
  • Anthony Nicholls - OpenEye El Presidente
  • author of Shape toolkit
  • The OEChem hive-mind
  • Geoff Skillman, Roger Sayle, and Matt Stahl
  • Hewlett Packard
  • computer loans for booth and seminar