AMBIT Chemoinformatics Software for Data Management - PowerPoint PPT Presentation


1
AMBIT Chemoinformatics Software for Data
Management
  • Joanna Jaworska, P&G Brussels, Belgium
  • Nina Jeliazkova, Ideaconsult Ltd., Bulgaria

2
Introduction: why AMBIT?
  • The shortage of free, publicly accessible,
    methodologically transparent software was
    identified as one of the roadblocks to
    broader use of in-silico methods (ICCA
    Workshop in Setubal 2002, OECD)
  • Realization that efficient use of existing
    information on chemicals requires better ways of
    storing and exploiting it
  • Storage: standardized formats, computer-automated
    verification of structures, capability to store
    large amounts of data
  • Exploitation: taking advantage of the rapidly
    evolving field of data mining and extraction of
    relevant information

3
IT strategy
  • AMBIT: building blocks for a Decision Support
    System
  • High emphasis on
  • Interoperability: plug and play
  • Flexibility: modular design
  • Transparency
  • Open source, relying on open standards. Open
    source software lowers the barrier for users,
    facilitates dissemination and enables the
    reproducibility of models and results
  • The cheminformatics functionality relies on the
    open source Java library The Chemistry
    Development Kit (http://cdk.sourceforge.net/)
  • The software is based on the MySQL database
    (www.mysql.com), which is the most popular open
    source relational database.
  • Chemical Markup Language (CML): an acknowledged
    method of encoding chemical data in XML. It is
    being adopted by a large number of chemical
    organisations, from government, through
    commercial to academia.
  • The choice of CML for the internal format makes
    the database independent of the software which is
    able to access it, in contrast to some
    proprietary solutions.

4
Ambit - Overview
  • AMBIT software is a set of libraries and tools,
    providing various cheminformatics functionalities
    for data management.
  • The AMBIT system consists of a database and
    functional modules allowing a variety of flexible
    searches and mining of the data stored in the
    database.
  • The unique feature of AMBIT is the ability to
    store multifaceted information about chemical
    structures and provide a searchable interface
    linking these diverse components.

5
Ambit overview
  • The AMBIT database
  • stores chemical structures, their identifiers
    (such as CAS numbers and InChI), attributes such
    as molecular descriptors, experimental data
    together with test descriptions, and literature
    references. The database can also store QSAR
    models. In addition, the software can generate a
    suite of 2D and 3D molecular descriptors.
  • can be searched by identifiers, attribute value
    or range, experimental data value or range, user
    defined structure and substructure, structural
    similarity
  • The AMBIT database contains over 450 000 chemical
    compounds with data imported from over a dozen
    databases (http://ambit.acad.bg/ambit/stats/).
    The number of compounds is growing all the time,
    and one of the system's great strengths is that
    any dataset can be imported for comparison and
    analysis. AMBITDatabaseTools 1.10 allows users
    to create a local database and to import their
    own sets of chemical compounds.
  • AMBIT Discovery performs chemical grouping and
    assesses the applicability domain of a QSAR. It
    offers several approaches to similarity
    assessment: statistical approaches that rely on
    descriptor space, approaches based on mechanistic
    understanding, and approaches based on structural
    similarity.
  • ToxTree is a flexible, user-friendly
    application which integrates structure-based
    (classification) schemes. Currently 3 schemes are
    available: Verhaar for fish toxicity, Cramer for
    human acute toxicity, BfR rules for skin
    irritation. ToxTree implements a plug-in
    mechanism, allowing it to be extended by modules
    developed at a future time, without recompiling
    the application. ToxTree and AMBIT modules can be
    integrated one within another.
  • Toxmatch: standalone application for pairwise
    similarity assessments, intended for
    read-across.
  • QSAR database: under development. Will store
    information in QMRF format. Large effort on
    standardization.
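
As an illustration of the descriptor-space approach to applicability domain mentioned for AMBIT Discovery, the sketch below uses a simple range (bounding box) check. This is only one of several possible statistical approaches and is not necessarily the method AMBIT Discovery implements; the function names and the training values are hypothetical.

```python
from typing import List, Sequence, Tuple


def descriptor_ranges(training: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return the (min, max) range of each descriptor column in the training set."""
    return [(min(col), max(col)) for col in zip(*training)]


def in_domain(query: Sequence[float], ranges: Sequence[Tuple[float, float]]) -> bool:
    """True if every descriptor of the query lies inside the training range."""
    return all(low <= x <= high for x, (low, high) in zip(query, ranges))


# Hypothetical training set: two descriptors per compound (values are made up).
training_set = [(1.2, 120.0), (2.5, 180.5), (0.8, 95.1), (3.1, 210.3)]
ranges = descriptor_ranges(training_set)

print(in_domain((2.0, 150.0), ranges))  # both descriptors inside the ranges
print(in_domain((4.5, 150.0), ranges))  # first descriptor outside its range
```

A compound outside the box is one for which the QSAR would be extrapolating, so its prediction deserves less confidence.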

6
AMBIT Database Today
Not restricted to these datasets! Any dataset can
be imported (e.g. DSSTox, AQUIRE, LLNA).
7
AMBIT Database Schema
8
Experimental results repository
9
Ambit database
  • Two user interfaces to the database: online and
    standalone
  • Online: a more restricted interface
  • Standalone: full interface. Can be used for
    storing and managing confidential data
  • Common to both: can link with other databases and
    pull information via web services

10
AMBIT database functionalities
  • Storage: information about chemicals (name and
    structure), descriptors, experimental data and
    QSAR models
  • Example with a tailored template: BCF golden
    database, LRI project (EURAS), Q2 2007
  • QSAR database with QMRF (ECB funded)
  • Conversion: between different computer formats of
    structure, and from CAS number to structure
  • Calculation: a variety of descriptors. The
    available list is growing thanks to
    contributions to CDK
  • Search
  • Identification search (CAS, SMILES, chemical
    name)
  • Descriptor search
  • Experimental data search
  • Substructure and similarity search
  • Complex searches with multiple criteria
    (standalone)

13
What kinds of searches are desired?
  • Detailed analyses for pairwise similarity
  • Similarity of a compound to compounds in the
    database
  • Similarity of a compound to a reference set
  • Similarity of a set of compounds to compounds in
    the database
  • Grouping based on chemical class

14
Ambit online
  • Searching for basic information

15
AMBIT Online: Similarity search (replace with
new search results!)
16
AMBIT Online: Query result
17
Links to other databases (example: KEGG)
18
Link to AQUIRE
19
Information about QSAR models
20
Ambit Database Tools 1.20: standalone
application, available at
http://ambit.acad.bg/downloads
21
Ambit converter (Batch search)
  • Ambit converter can open CML, CSV, HIN, ICHI,
    INCHI, MDL MOL, MDL SDF, MOL2, PDB, SMI, TXT and
    XYZ file types
  • Ambit converter can save SDF, MOL, CSV, TXT,
    SMI file types.
  • CAS-SMILES conversion based on a database lookup
  • Descriptor calculation
  • Cramer rules
  • Verhaar scheme
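
The CAS-SMILES conversion by database lookup can be sketched as follows. This is a minimal illustration, not the Ambit converter's actual code: a small dictionary stands in for the database, and the function names are hypothetical. The check-digit test follows the standard CAS Registry Number rule (digits weighted 1, 2, 3, ... from the right, sum modulo 10).

```python
import re
from typing import Optional

# Stand-in for the database lookup table (a real system would query the database).
CAS_TO_SMILES = {
    "50-00-0": "C=O",       # formaldehyde
    "64-17-5": "CCO",       # ethanol
    "71-43-2": "c1ccccc1",  # benzene
}


def is_valid_cas(cas: str) -> bool:
    """Check the format and the check digit of a CAS Registry Number."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas)
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(i * int(d) for i, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def cas_to_smiles(cas: str) -> Optional[str]:
    """Return the SMILES for a CAS RN, or None if the number is not in the table."""
    if not is_valid_cas(cas):
        raise ValueError(f"not a valid CAS Registry Number: {cas!r}")
    return CAS_TO_SMILES.get(cas)


print(cas_to_smiles("71-43-2"))
```

Validating the check digit before the lookup separates typing errors from compounds that are simply missing from the database.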

22
Ambit Database Tools 1.20
  • Import to Database
  • Compounds: several file formats
  • Descriptors: SDF, CSV, TXT
  • Experimental data: SDF, CSV, TXT
  • QSAR models: SDF, CSV, TXT
  • Database processing
  • Calculate SMILES/fingerprints/atom environments:
    necessary in order to perform substructure and
    similarity searches. Should be invoked after
    importing compounds into the database
  • several file formats
  • Descriptor calculation
  • Distance calculation: used to speed up queries on
    the distance between heavy atoms

23
Ambit Database Tools 1.20
  • perform a CAS RN search in the database (submenu
    "Search -> CAS RN search")
  • perform a SMILES search in the database (submenu
    "Search -> SMILES")
  • perform a molecular formula search in the
    database (submenu "Search -> Molecular
    formula")
  • define structure, descriptor, distance-based and
    experimental data criteria and perform searches
    in the database
  • Output
  • On screen
  • To file

The user can select between the different
datasets existing in the AMBIT database.
Subsequent searches will be performed only
within the selected dataset
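
A search restricted to a selected dataset and combined with a descriptor range can be sketched as below. This is only an illustration under stated assumptions: the table and column names are hypothetical and do not reproduce the AMBIT schema, and an in-memory SQLite database stands in for MySQL so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-table schema, not the actual AMBIT database schema.
conn.executescript("""
CREATE TABLE compound (id INTEGER PRIMARY KEY, cas TEXT, smiles TEXT, dataset TEXT);
CREATE TABLE descriptor_value (compound_id INTEGER, name TEXT, value REAL);
""")
conn.executemany("INSERT INTO compound VALUES (?, ?, ?, ?)", [
    (1, "64-17-5", "CCO", "demo_A"),
    (2, "71-43-2", "c1ccccc1", "demo_A"),
    (3, "50-00-0", "C=O", "demo_B"),
])
# Molecular weights of ethanol, benzene and formaldehyde.
conn.executemany("INSERT INTO descriptor_value VALUES (?, ?, ?)", [
    (1, "MW", 46.07), (2, "MW", 78.11), (3, "MW", 30.03),
])


def search(dataset, descriptor, low, high):
    """Return (cas, smiles, value) rows of one dataset with a descriptor in [low, high]."""
    return conn.execute(
        """SELECT c.cas, c.smiles, d.value
           FROM compound c JOIN descriptor_value d ON d.compound_id = c.id
           WHERE c.dataset = ? AND d.name = ? AND d.value BETWEEN ? AND ?
           ORDER BY d.value""",
        (dataset, descriptor, low, high),
    ).fetchall()


for row in search("demo_A", "MW", 40.0, 100.0):
    print(row)
```

Passing the dataset name as a query parameter mirrors the behaviour described above: once a dataset is selected, every subsequent criterion is applied only within it.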
24
AMBIT User Interface. Example: Search by structure
  • Exact search
  • Substructure search
  • Similarity search: by fingerprints or by atom
    environments
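
A fingerprint-based similarity search can be sketched with the Tanimoto coefficient, a widely used measure for comparing fingerprints. The slides do not state which coefficient AMBIT uses, so this is an assumption; the fingerprints below are toy sets of "on" bit positions, not real CDK fingerprints, and the function names are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit indices."""
    union = fp_a | fp_b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / len(union)


def similarity_search(query_fp, database, threshold):
    """Return (name, similarity) pairs at or above the threshold, most similar first."""
    scored = ((name, tanimoto(query_fp, fp)) for name, fp in database.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy fingerprints (made up for illustration).
database = {
    "compound_A": {1, 2, 3, 4},
    "compound_B": {1, 2, 5},
    "compound_C": {7, 8},
}
query = {1, 2, 3}

print(similarity_search(query, database, threshold=0.5))
```

Precomputing and storing the fingerprints, as the database processing step does, means a search only has to compare bit sets rather than re-derive them from structures.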

25
AMBIT User Interface. Example: Search by
descriptors
26
AMBIT User Interface. Example: Search by
experimental data
27
Similarity based on toxicity mechanism: Verhaar
scheme
Verhaar H.J.M., Van Leeuwen C.J., Hermens
J.L.M., Classifying Environmental Pollutants. 1:
Structure-Activity Relationships for Prediction
of Aquatic Toxicity, Chemosphere, Vol. 25, No. 4,
pp. 471-491, 1992
  • 34 rules
  • 5 classes
  • Class 1: Narcosis or baseline toxicity
  • Class 2: Less inert compounds
  • Class 3: Unspecific reactivity
  • Class 4: Compounds and groups of compounds acting
    by a specific mechanism
  • Class 5: Not possible to classify according to
    these rules

29
Chemical similarity assessment using the database
  • Exact substructure search based on 2D
  • Structural Similarity search (various methods)
  • Criteria on descriptors
  • Based on mechanistic understanding (Verhaar
    scheme)

30
Another view on similarity assessments with
Toxmatch and Discovery
  • Discovery: similarity to a set (summary
    representation)
  • Toxmatch: pairwise similarities, and similarity
    to a set (nearest neighbours)
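
The two views of similarity to a set can be contrasted in a short sketch. It assumes that "summary representation" means comparing a compound with a single summary of the set (here the centroid in descriptor space) and that "nearest neighbours" means ranking individual set members by distance. Both readings, the Euclidean metric, the function names and the data are assumptions made for illustration, not a description of Discovery or Toxmatch internals.

```python
import math


def centroid(points):
    """Mean descriptor vector of a set of compounds."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def distance_to_centroid(query, reference):
    """Summary view: one distance between the query and the centre of the set."""
    return math.dist(query, centroid(list(reference.values())))


def nearest_neighbours(query, reference, k):
    """Nearest-neighbour view: the k set members closest to the query."""
    ranked = sorted(reference.items(), key=lambda item: math.dist(query, item[1]))
    return [(name, math.dist(query, vec)) for name, vec in ranked[:k]]


# Hypothetical reference set with two descriptors per compound.
reference = {
    "ref1": (0.0, 0.0),
    "ref2": (2.0, 0.0),
    "ref3": (0.0, 2.0),
    "ref4": (2.0, 2.0),
}
query = (0.0, 1.0)

print(distance_to_centroid(query, reference))
print(nearest_neighbours(query, reference, k=2))
```

The summary view gives a single number for the whole set, while the nearest-neighbour view identifies which individual compounds could support a read-across argument.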

31
Thank you
Questions?