QSAR Application Toolbox: Third Step Data Gap Filling Read-Across by Molecular Similarity

Provided by: schu3
1
QSAR Application Toolbox: Third Step - Data Gap
Filling (Read-Across by Molecular Similarity)
2
Background
  • This is a step-by-step presentation designed to
    take you through the workflow of the Toolbox in a
    data-gap filling exercise using read-across
    based on molecular similarity with data pruning.
  • If you are a novice user of the Toolbox, you may
    wish to review the Getting Started document
    available at
    www.oecd.org/env/existingchemicals/qsar

3
Objectives-1
  • This presentation reviews a number of
    functionalities of the Toolbox:
  • Entering and Profiling a target chemical,
  • Identifying analogues for a target chemical,
  • Retrieving experimental results available for
    those analogues, and
  • Filling data gaps by read-across.

4
Objectives-2
  • This presentation also introduces several other
    functionalities of the Toolbox:
  • Using the Flexible Track,
  • Entering a target chemical by SMILES notation,
  • Identifying analogues for a target chemical by
    molecular similarity, and
  • Retrieving experimental results for multiple
    endpoints.

5
Specific Aims
  • To review the work flow of the Toolbox.
  • To review the use of the six modules of the
    Toolbox.
  • To review the basic functionalities within each
    module.
  • To introduce the user to new functionalities
    within selected modules.
  • To explain the rationale behind each step of
    the exercise.

6
The Exercise
  • In this exercise we will predict the Ames
    mutagenicity potential for an untested compound
    (n-hexanal, SMILES CCCCCC=O), which is the
    target chemical.
  • This prediction will be accomplished by
    collecting a small set of test data for chemicals
    considered to be in the same category as the
    target molecule.
  • The category will be defined by molecular
    similarity, in particular Organic functional
    groups.
  • The prediction itself will be made by
    read-across.

7
Read-across: the Analogue Approach
  • Read-across can be used to estimate missing data
    from a single or limited number of chemicals
    using an analogue approach.
  • In the analogue approach, endpoint information
    for a single or small number of tested chemicals
    is used to predict the same endpoint for an
    untested chemical that is considered to be
    similar.

8
Analogous Chemicals
  • Previously you learned that analogous sets of
    chemicals are often selected based on the
    hypothesis that the toxicological effects of each
    member of the set will show a common behavior.
  • For this reason mechanistic profilers and
    grouping methods have been shown to be of great
    value in using the Toolbox.
  • However, there are cases where the mechanistic
    profilers and grouping methods are inadequate and
    one is forced to rely on molecular similarity to
    form a category.
  • The Toolbox allows one to develop a category by
    using either organic functional groups or
    structural similarity.
  • Since there is no preferred way of identifying
    structural similarity the user is guided to use
    organic functional groups as a first option.

9
Side-Bar On Mutagenesis
  • Mutagens do not create mutations.
  • Mutagens create DNA damage.
  • Mutations are changes in nucleotide sequence.
  • Mutagenesis is a cellular process requiring
    enzymes and/or DNA replication, thus cells create
    mutations.

10
Tracks
  • After opening the Toolbox, the user has to choose
    among three use tracks (or workflows):
  • (Q)SAR Track
  • Category Track
  • Flexible Track
  • Since you are becoming more familiar with the
    functionalities of the Toolbox, select the
    Flexible Track.

11
Tracks and Workflow
12
Workflow
  • Remember that each track follows the same workflow:
  • Chemical Input
  • Profiling
  • Endpoints
  • Category Definition
  • Filling Data Gaps
  • Reporting

13
Chemical Input
  • Click on the Flexible Track.
  • This takes you to the first module, which is
    Chemical input.
  • This module provides the user with several means
    of entering the chemical of interest or the
    target chemical.
  • Since all subsequent functions are based on
    chemical structure, the goal here is to make sure
    the molecular structure assigned is the correct
    one.

14
Chemical Input Screen
15
Ways of Entering a Chemical
  • Remember there are several ways to enter a target
    chemical; the most often used are
  • CAS number,
  • SMILES (Simplified Molecular Input Line Entry
    System) notation, and
  • Drawing the structure.
  • Click on SMILES.
  • This opens the window entitled Structure
    editor (see next slide).

16
Blank Structure Editor Screen
17
Entering a SMILES
  • In the aqua-colored area, type in the SMILES; in
    this example, enter CCCCCC=O.
  • Note that as you type the SMILES code, the
    structure is drawn in the center of the field
    (see next slide).
  • Click OK to accept the target chemical.
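
Because every later step depends on the structure typed here, it can help to sanity-check a SMILES string outside the Toolbox. The sketch below is a hypothetical helper, not a Toolbox function: it derives a molecular formula from a simple acyclic SMILES, and it assumes only C, N and O atoms, explicit bond symbols, parenthesised branches and standard valences. A single missing "=" turns the aldehyde hexanal into the alcohol 1-hexanol, which the formula makes visible.

```python
from collections import Counter

# Standard valences assumed for the small atom subset this sketch supports.
VALENCE = {"C": 4, "N": 3, "O": 2}
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def molecular_formula(smiles: str) -> str:
    """Return a formula for a simple acyclic SMILES (C, N, O only, no rings)."""
    symbols = []   # element symbol of each atom, in reading order
    used = []      # total bond order already used by each atom
    stack = []     # atoms at which a branch was opened
    prev = None    # index of the atom the next atom will bond to
    order = 1      # order of the pending bond (single unless stated)
    for ch in smiles:
        if ch in VALENCE:
            symbols.append(ch)
            used.append(0)
            idx = len(symbols) - 1
            if prev is not None:
                used[prev] += order
                used[idx] += order
            prev, order = idx, 1
        elif ch in BOND_ORDER:
            order = BOND_ORDER[ch]
        elif ch == "(":
            stack.append(prev)
        elif ch == ")":
            prev = stack.pop()
        else:
            raise ValueError(f"unsupported SMILES character: {ch!r}")
    counts = Counter(symbols)
    # Implicit hydrogens fill whatever valence the explicit bonds leave open.
    counts["H"] = sum(VALENCE[s] - u for s, u in zip(symbols, used))
    # Carbon first, then hydrogen, then the remaining elements alphabetically.
    keys = [k for k in ("C", "H") if counts[k]] + sorted(
        k for k in counts if k not in ("C", "H") and counts[k]
    )
    return "".join(k + (str(counts[k]) if counts[k] > 1 else "") for k in keys)


print(molecular_formula("CCCCCC=O"))  # hexanal (aldehyde)
print(molecular_formula("CCCCCCO"))   # 1-hexanol (alcohol)
```

For real work a cheminformatics toolkit is the better choice; this sketch only illustrates why the exact SMILES matters.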

18
SMILES STRUCTURE
19
Target Chemical
  • You have now selected your target chemical.
  • Click on the box next to Substance Information;
    this displays the chemical identification
    information.
  • It is important to remember that from here on the
    workflow will be based on the structure coded in
    SMILES.
  • The workflow on the first module is now complete;
    click Profiling to move to the next module.

20
Chemical Identification Information
21
Profiling
  • Profiling refers to the electronic process of
    retrieving relevant information on the target
    compound, other than environmental fate,
    ecotoxicity, and toxicity data, which are stored
    in the Toolbox.
  • Available information includes likely
    mechanism(s) of action and a survey of the organic
    functional groups that make up the target
    chemical.

22
Profiling Target Chemical
  • Select the Profiling methods you wish to use by
    red-checking the box before the name of the
    profiler you wish to use.
  • For this example, select all the profilers for
    the mechanistic methods (see next slide).
  • Click on Apply.

23
Profilers for 1-Hexanal
24
Profiling
  • The results of profiling automatically appear as
    a dropdown box under the target chemical (see
    next slide).

25
Profiles of 1-Hexanal (1)
26
Profiles of 1-Hexanal (2)
  • Very specific profiling results are obtained for
    the target compound.
  • Please note that no DNA-binding mechanism was
    identified (see side-bar on mutagenesis above).
  • These results will be used to search for suitable
    analogues in the next steps of the exercise.

27
Side-Bar on the Data Tree
  • As one moves through the different modules of the
    workflow the information on the target chemical
    increases.
  • One may find it advantageous to conceal some of
    that information.
  • For example, we can hide the substance information
    by double clicking on the small box next to it.

28
Side-Bar on Retrieving Concealed Information
  • One can retrieve hidden information by double
    clicking on the small box next to the concealed
    item.
  • This is demonstrated in the next two slides.


29
Double click on small box next to substance
information
30
Substance information reappears on screen
31
Endpoints
  • Click on Endpoints to move to the next module.
  • Endpoints refer to the electronic process of
    retrieving the environmental fate, ecotoxicity
    and toxicity data that are stored in the Toolbox.
  • Data gathering can be executed in a global
    fashion (i.e., collecting all data of all
    endpoints) or on a more narrowly defined basis
    (e.g., collecting data for a single or limited
    number of endpoints).
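
The difference between global and narrowly defined data gathering can be pictured as an optional filter over stored records. The sketch below is illustrative only: the `Record` type, the `gather` function and the sample records are made up for this example and do not reflect the Toolbox's internal data model.

```python
from typing import List, NamedTuple, Optional, Set


class Record(NamedTuple):
    chemical: str
    endpoint: str
    database: str
    result: str


def gather(records: List[Record],
           endpoints: Optional[Set[str]] = None,
           databases: Optional[Set[str]] = None) -> List[Record]:
    """Keep records matching the selected endpoints and databases.

    Passing None for a filter means "no restriction" (global gathering).
    """
    return [
        r for r in records
        if (endpoints is None or r.endpoint in endpoints)
        and (databases is None or r.database in databases)
    ]


# Made-up records for illustration only.
records = [
    Record("analogue 1", "Ames mutagenicity", "OASIS Genotox", "negative"),
    Record("analogue 1", "Fish acute toxicity", "hypothetical ecotox DB", "not shown"),
    Record("analogue 2", "Ames mutagenicity", "ISSCAN Genotox", "negative"),
]

everything = gather(records)  # global: all data for all endpoints
ames_only = gather(records,
                   endpoints={"Ames mutagenicity"},
                   databases={"OASIS Genotox", "ISSCAN Genotox"})
print(len(everything), len(ames_only))
```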

32
Side-Bar on Gene Mutation
  • Mutations within a gene are generally
    base-substitutions or small deletions/insertions
    (i.e., frameshifts).
  • Such alterations are generally called point
    mutations.
  • The Ames scheme, based on strains of Salmonella,
    provides the corresponding experimental data.

33
This Example
  • In this example, we focus our data gathering on
    the multi-endpoint of mutagenicity and the
    databases OASIS Genotox and ISSCAN Genotox.
  • Click on the boxes next to all the databases
    except those entitled ISSCAN Genotox and OASIS
    Genotox.
  • This leaves a black check mark in the box next to
    these two databases (the ones we want to search).
  • Click on Gather data.

34
Oasis Genotox Data Gathering
35
Next Step in Data Gathering
  • Toxicity information on the target chemical is
    electronically collected from the selected
    dataset(s).
  • In this example, an insert window appears stating
    there was no data found for the target chemical
    (see next slide).
  • Close the insert window.

36
No data for Target Chemical
37
Recap
  • You have entered the target chemical by SMILES
    and found it to be 1-hexanal with the CAS
    66-25-1.
  • You have profiled the target chemical and found
    no experimental data is currently available for
    1-hexanal.
  • In other words, you have identified a data gap,
    which you would like to fill.
  • Click on Category definition to move to the
    next module.

38
Category Definition
  • This module provides the user with several means
    of grouping chemicals into a toxicologically
    meaningful category that includes the target
    molecule.
  • This is the critical step in the workflow.
  • Several options are available in the Toolbox to
    assist the user in refining the category
    definition.

39
Grouping Methods
  • Allow the user to group chemicals into chemical
    categories according to different measures of
    similarity so that within a category data gaps
    can be filled.
  • For example, starting from a target chemical for
    which a specific DNA binding mechanism is
    identified, analogues can be found which can bind
    by the same mechanism and for which experimental
    results are available.

40
Side-Bar on Mutagens
  • It is important to remember that mutagens are
    really cell-damaging agents, which can create a
    wide array of adverse effects beyond damage to
    DNA.
  • Let's take a moment to review our mechanistic
    profile of the target chemical (see next slide).

41
No DNA Binding
42
Defining the Category
  • In the case of 1-hexanal there is no structural
    evidence that it is a DNA binding compound.
  • Therefore, no grouping by a DNA mechanism is
    possible.
  • We elect to define the category by using
    molecular similarity.
  • Highlight Organic functional groups.
  • Click on Defining Category.
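
Conceptually, defining a category by organic functional groups means collecting the chemicals whose functional groups match those of the target. The sketch below assumes an exact match on the set of groups, which may differ from the Toolbox's own matching rules; the `INVENTORY` dictionary and the `define_category` function are hypothetical names used only for illustration.

```python
# Hypothetical inventory: chemical name -> set of organic functional groups.
INVENTORY = {
    "pentanal": {"aldehyde", "methyl"},
    "heptanal": {"aldehyde", "methyl"},
    "1-hexanol": {"alcohol", "methyl"},
    "benzaldehyde": {"aldehyde", "aryl"},
}


def define_category(target_groups, inventory):
    """Return the names of chemicals whose groups equal the target's groups."""
    return sorted(
        name for name, groups in inventory.items() if groups == target_groups
    )


hexanal_groups = {"aldehyde", "methyl"}
category = define_category(hexanal_groups, INVENTORY)
print(category)
```

Only the straight-chain aldehydes share the target's full set of groups, so the alcohol and the aromatic aldehyde fall outside the category.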

43
Defining the Category
44
Confirmation of Groups
  • An insert window listing the organic functional
    groups of the target chemical appears.
  • Click on OK.

45
Naming Category
  • Another insert window listing the default
    category name appears.
  • Click OK.

46
Analogues Identified
47
Recap
  • You have identified a structurally similar
    category for the target chemical
    (1-hexanal).
  • There were 34 similar chemicals identified.
  • Available data on these similar chemicals can now
    be collected.

48
Next Step in Gathering Data
  • Highlight the 35 Aldehydes <AND> Methyl under
    Single Chemical in the Defined Categories
    box.
  • The insert window entitled Read Data?
    appears (see next slide).

49
What data to collect?
50
Side-Bar to Data Collection
  • Data can be collected for a wide variety of
    endpoints or for narrowly defined ones (e.g., a
    single endpoint or test scheme).
  • Since data is endpoint specific, the data
    selection is presented in a drop-down menu.
  • By double clicking on an endpoint, the data tree
    is expanded.

51
Data Selection
  • To select the data to be read, click on the
    box(es) before the name of the data type.
  • This selects (a red check mark appears) or
    deselects (red check disappears) the data type.
  • Click on the box next to Toxicological
    Information.
  • This places a red check mark in the box next to
    this data type (the one we want to read).
  • Click on OK (see next slide).

52
Reading the Selected Data
53
Analogues
  • The data is automatically collated.
  • There is genotox data on only 15 of the 34
    structurally similar analogues.
  • However, multiple entries of the same test result
    were found and one wants to eliminate
    duplications (see next slide).
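
Removing repeated entries of the same test result amounts to keeping one copy of each identical record. The sketch below shows that idea under the assumption that duplicates are verbatim repeats; what the Toolbox's Select Single option does internally may differ, and the function name and sample entries are made up for illustration.

```python
def drop_duplicate_entries(entries):
    """Collapse verbatim duplicates while keeping first-seen order."""
    # dict keys preserve insertion order (Python 3.7+), so the first copy wins.
    return list(dict.fromkeys(entries))


# Made-up entries: (chemical, Salmonella strain, result).
entries = [
    ("analogue 1", "TA100", "negative"),
    ("analogue 1", "TA100", "negative"),
    ("analogue 1", "TA98", "negative"),
]
unique = drop_duplicate_entries(entries)
print(unique)
```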

54
Click Select Single then Click OK
55
Summary of Toxicological Information for Analogues
56
Side-Bar on Data
  • Note the structure of the compounds with
    experimental results is shown.
  • Double clicking on any structure enlarges the
    view of the structure.
  • Details on the experimental results can be
    retrieved by double-clicking on any cell in the
    data matrix line.

57
Navigating Through the Data Tree
  • The user can navigate through the data tree by
    closing or opening the nodes of the tree.
  • In this example, results from genotox testing are
    available.
  • By double clicking on a cell in the data matrix,
    additional information on the test result (Ames)
    is made available (see next slide).

58
Data Tree
59
Side-Bar on Data Tree
  • Details about the specific assays, in this case
    the different strains of Salmonella typhimurium
    can be observed at the bottom of the screen by
    placing the cursor on the text fragment of the
    test you want more information about (see next
    slide).

60
(No Transcript)
61
Filling Data Gap
  • You are ready to fill the data gap. Click on
    Filling data gap.
  • In this step in the work flow the user is
    provided three options for making a prediction
    for the target molecule.
  • In this example with qualitative mutagenicity
    data we can only use read-across. Click on
    Read-across.
  • Highlight the blank space in the
    AMES_mutagenicity line under the column for the
    target chemical.
  • Click on All values under data.
  • Click on Apply (see next slide).

62
Filling Data Gap
63
Possible Data Inconsistencies
  • An insert window alerting you to possible data
    inconsistencies appears.
  • Click on the small box before Endpoint.

64
Multiple Endpoint Data
  • In this example what appears is a red checked
    listing of all the Ames data, which you are
    trying to model at the same time.
  • Click OK.

65
Results of Read-across
66
Interpreting the Read-across Figure
  • The resulting plot shows the experimental results
    of all analogues (Y axis) against a descriptor (X
    axis); the default descriptor is log Kow.
  • Note the dots along the bottom of the previous
    screen. The RED dot represents the target
    chemical. The PURPLE dots represent the
    experimental results for the analogues that are
    used for the read-across, while the BLUE dots
    represent the experimental results for the
    analogues that are not used for read-across.
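
One plausible way to split analogues into those used and those not used for read-across is to take the nearest neighbours of the target along the descriptor axis. The sketch below assumes that rule; the neighbour count `k`, the function name and every log Kow value are made up for illustration, and the Toolbox's actual selection rule should be checked in its documentation.

```python
def split_by_proximity(target_x, analogues, k=5):
    """Rank analogues by descriptor distance to the target.

    analogues: list of (name, descriptor value, result) tuples.
    Returns (used, unused): the k nearest analogues and the rest.
    """
    ranked = sorted(analogues, key=lambda a: abs(a[1] - target_x))
    return ranked[:k], ranked[k:]


# Made-up descriptor values; -1.0 encodes a negative Ames result.
analogues = [
    ("analogue 1", 1.2, -1.0),
    ("analogue 2", 1.9, -1.0),
    ("analogue 3", 3.4, -1.0),
    ("analogue 4", 0.3, -1.0),
]
used, unused = split_by_proximity(1.8, analogues, k=2)
print([a[0] for a in used], [a[0] for a in unused])
```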

67
Interpreting the Read-across Figure
  • On further examination of the read-across
    results, an aqua-colored dot can be seen in an
    upper corner.
  • This represents the lone positive result in the
    Ames tests.
  • Place the cursor on this dot and double left
    click; the structural details of this chemical
    appear (see next slide).

68
Details of Outlier
69
Can One Prune the Outlier?
  • Clearly this outlier is structurally dissimilar
    to all the other compounds in the category,
    including the target chemical.
  • It contains a number of functional groups which
    are not present in the target molecule.
  • This dissimilarity is justification for deleting
    (pruning) it from the category.
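
The pruning rationale above can be expressed as a simple rule: flag any analogue that carries functional groups absent from the target. The sketch below assumes that rule; the function names and the group assignments for the placeholder analogues are hypothetical, and the final decision to prune remains an expert judgment.

```python
def extra_groups(target_groups, analogue_groups):
    """Functional groups present in the analogue but not in the target."""
    return set(analogue_groups) - set(target_groups)


def prune(target_groups, category):
    """Split a category into kept analogues and pruned outliers."""
    kept, pruned = {}, {}
    for name, groups in category.items():
        bucket = pruned if extra_groups(target_groups, groups) else kept
        bucket[name] = groups
    return kept, pruned


target = {"aldehyde", "methyl"}
# Made-up category members for illustration only.
category = {
    "analogue 1": {"aldehyde", "methyl"},
    "analogue 2": {"aldehyde", "methyl"},
    "outlier": {"aldehyde", "methyl", "epoxide", "ether"},
}
kept, pruned = prune(target, category)
print(sorted(kept), sorted(pruned))
```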

70
Pruning the Outlier
  • Close the structure detail window.
  • Place the cursor on the outlying dot and right
    click, then click on Remove focused.
  • This removes the outlier.
  • Note the read-across results are automatically
    re-tabulated (see next slide).

71
Re-evaluated Read-across
72
Interpretation of Read-across
  • In pruned data, all 10 analogues are
    non-mutagenic in all the Ames assays.
  • The same non-mutagenic potential (value of -1.0)
    is, therefore, predicted with confidence for the
    target chemical.
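
For qualitative data, the read-across reasoning above reduces to taking the result shared by the analogues and noting whether they all agree. The sketch below is a minimal illustration, not the Toolbox's algorithm: it assumes -1.0 encodes a negative Ames result as in the slides, treats 1.0 as the positive encoding (an assumption), and uses a made-up function name.

```python
from collections import Counter


def read_across(values):
    """Return (prediction, unanimous) from the analogues' qualitative results.

    The prediction is the most frequent value; on a tie, the value seen
    first wins, so a non-unanimous result deserves a closer look.
    """
    if not values:
        raise ValueError("no analogue data to read across from")
    prediction, count = Counter(values).most_common(1)[0]
    return prediction, count == len(values)


pruned_results = [-1.0] * 10           # ten analogues, all negative
before_pruning = [-1.0] * 10 + [1.0]   # same set plus one positive outlier

print(read_across(pruned_results))
print(read_across(before_pruning))
```

With the outlier still present the prediction is unchanged but no longer unanimous, which is why the pruning step affects the confidence placed in the result.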

73
Filled Data Gap
  • Click Accept.
  • By accepting the prediction the data gap is
    filled (see next slide).
  • You are now ready to complete the final module
    and download the report.
  • Click on Report to move to the last module.

74
Filled Data Gap
75
Report
  • The final step in the workflow, report, provides
    the user with a downloadable written audit trail
    of what the Toolbox did to arrive at the
    prediction.
  • Click on Show History.
  • This study history can be printed or copied to be
    inserted in a more detailed report (see next
    slide).

76
Report