QSAR Application Toolbox: Third Step - Data Gap Filling (Read-Across by Molecular Similarity)

Slides: 77

Provided by: schu3


1
QSAR Application Toolbox: Third Step - Data Gap
Filling (Read-Across by Molecular Similarity)
2
Background

This is a step-by-step presentation designed to
take you through the workflow of the Toolbox
in a data-gap filling exercise using read-across
based on molecular similarity with data pruning.
If you are a novice user of the Toolbox, you may
wish to review the Getting Started document
available at www.oecd.org/env/existingchemicals/qsar

3
Objectives-1

This presentation reviews a number of
functionalities of the Toolbox
Entering and Profiling a target chemical,
Identifying analogues for a target chemical,
Retrieving experimental results available for
those analogues, and
Filling data gaps by read-across.

4
Objectives-2

This presentation also introduces several other
functionalities of the Toolbox:
Using the Flexible Track,
Entering a target chemical by SMILES notation,
Identifying analogues for a target chemical by
molecular similarity, and
Retrieving experimental results for multiple
endpoints.

5
Specific Aims

To review the work flow of the Toolbox.
To review the use of the six modules of the
Toolbox.
To review the basic functionalities within each
module.
To introduce the user to new functionalities within
selected modules.
To explain the rationale behind each step of
the exercise.

6
The Exercise

In this exercise we will predict the Ames
mutagenicity potential for an untested compound,
n-hexanal (SMILES CCCCCC=O), which is the
target chemical.
This prediction will be accomplished by
collecting a small set of test data for chemicals
considered to be in the same category as the
target molecule.
The category will be defined by molecular
similarity, in particular Organic functional
groups.
The prediction itself will be made by
read-across.

7
Read-across: the Analogue Approach

Read-across can be used to estimate missing data
from a single or limited number of chemicals
using an analogue approach.
In the analogue approach, endpoint information
for a single or small number of tested chemicals
is used to predict the same endpoint for an
untested chemical that is considered to be
similar.
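
The analogue approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the algorithm the Toolbox uses: the function name `read_across` and the majority-vote rule are assumptions, and the coding of results as -1 (negative) and 1 (positive) is chosen for the example.

```python
from collections import Counter


def read_across(analogue_results):
    """Predict a qualitative endpoint for an untested chemical.

    analogue_results: list of results measured for the tested analogues,
    e.g. -1 for a negative result and 1 for a positive one (assumed coding).
    Returns the most common result, or None if there are no analogues
    or the vote is tied.
    """
    if not analogue_results:
        return None
    ranked = Counter(analogue_results).most_common()
    # A tie between the two most frequent results gives no clear prediction.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Three negative analogues and one positive analogue (made-up values).
prediction = read_across([-1, -1, -1, 1])
```

Returning None on a tie or an empty list keeps the sketch from reporting a prediction that the analogue data do not support.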

8
Analogous Chemicals

Previously you learned that analogous sets of
chemicals are often selected based on the
hypothesis that the toxicological effects of each
member of the set will show a common behavior.
For this reason mechanistic profilers and
grouping methods have been shown to be of great
value in using the Toolbox.
However, there are cases where the mechanistic
profilers and grouping methods are inadequate and
one is forced to rely on molecular similarity to
form a category.
The Toolbox allows one to develop a category by
using either organic functional groups or
structural similarity.
Since there is no preferred way of identifying
structural similarity the user is guided to use
organic functional groups as a first option.

9
Side-Bar On Mutagenesis

Mutagens do not create mutations.
Mutagens create DNA damage.
Mutations are changes in nucleotide sequence.
Mutagenesis is a cellular process requiring
enzymes and/or DNA replication, thus cells create
mutations.

10
Tracks

After opening the Toolbox, the user has to choose
between three use tracks (or workflows)
(Q)SAR Track
Category Track
Flexible Track
Since you are becoming more familiar with the
functionalities of the Toolbox, select the
Flexible Track.

11
Tracks and Workflow
12
Workflow

Remember each track follows the same workflow
Chemical Input
Profiling
Endpoints
Category Definition
Filling Data Gaps
Reporting
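
As a memory aid, the fixed order of the six modules can be written down as a small Python enumeration. The names `Module` and `next_module` are invented for this sketch and are not part of the Toolbox.

```python
from enum import IntEnum


class Module(IntEnum):
    """The six workflow modules, numbered in the order they are visited."""
    CHEMICAL_INPUT = 1
    PROFILING = 2
    ENDPOINTS = 3
    CATEGORY_DEFINITION = 4
    FILLING_DATA_GAPS = 5
    REPORTING = 6


def next_module(current):
    """Return the module that follows `current`, or None after Reporting."""
    if current < Module.REPORTING:
        return Module(current + 1)
    return None
```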

13
Chemical Input

Click on the Flexible Track.
This takes you to the first module, which is
Chemical input.
This module provides the user with several means
of entering the chemical of interest or the
target chemical.
Since all subsequent functions are based on
chemical structure, the goal here is to make sure
the molecular structure assigned is the correct
one.

14
Chemical Input Screen
15
Ways of Entering a Chemical

Remember there are several ways to enter a target
chemical; the most often used are
CAS number,
SMILES (simplified molecular input line
entry system) notation, and
Drawing the structure.
Click on SMILES.
This opens the window entitled Structure
editor (see next slide).

16
Blank Structure Editor Screen
17
Entering a SMILES

In the aqua-colored area, type in the SMILES; in
this example, enter CCCCCC=O.
Note that as you type the SMILES code, the structure
is drawn in the center of the field (see next
slide).
Click OK to accept the target chemical.
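
Because every later step depends on the structure, it is worth checking the SMILES string before accepting it. The Python sketch below is a rough text-level check, not a SMILES parser: it only understands unbranched chains written with the single-letter symbols C and O, and the function names are made up for this example. It shows that a single "=" separates hexanal (CCCCCC=O, an aldehyde) from hexan-1-ol (CCCCCCO, an alcohol).

```python
import re
from collections import Counter


def heavy_atom_counts(smiles):
    """Count atoms in a SMILES limited to single-letter uppercase symbols."""
    return dict(Counter(ch for ch in smiles if ch.isalpha()))


def is_linear_alkanal(smiles):
    """True for an unbranched carbon chain ending in an aldehyde (C=O)."""
    return re.fullmatch(r"C+=O|O=C+", smiles) is not None


def is_linear_alkanol(smiles):
    """True for an unbranched carbon chain ending in a hydroxyl (C-O)."""
    return re.fullmatch(r"C+O|OC+", smiles) is not None


hexanal = "CCCCCC=O"
hexanol = "CCCCCCO"
```

Both strings contain six carbons and one oxygen, so an atom count alone cannot tell them apart; only the bond order does.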

18
SMILES STRUCTURE
19
Target Chemical

You have now selected your target chemical.
Click on the box next to Substance Information;
this displays the chemical identification
information.
It is important to remember that from here on the
workflow will be based on the structure coded in
SMILES.
The workflow on the first module is now complete;
click Profiling to move to the next module.

20
Chemical Identification Information
21
Profiling

Profiling refers to the electronic process of
retrieving relevant information on the target
compound, other than environmental fate,
ecotoxicity, and toxicity data, which are stored
in the Toolbox.
Available information includes likely
mechanism(s) of action and a survey of the organic
functional groups that form the target chemical.

22
Profiling Target Chemical

Select the Profiling methods you wish to use by
red-checking the box before the name of the
profiler you wish to use.
For this example, select all the profilers for
the mechanistic methods (see next slide).
Click on Apply.

23
Profilers for 1-Hexanal
24
Profiling

The results of profiling automatically appear as
a dropdown box under the target chemical (see
next slide).

25
Profiles of 1-Hexanal (1)
26
Profiles of 1-Hexanal (2)

Very specific profiling results are obtained for
the target compound.
Please note that no DNA-binding mechanism was
identified (see side-bar on mutagenesis above).
These results will be used to search for suitable
analogues in the next steps of the exercise.

27
Side-Bar on the Data Tree

As one moves through the different modules of the
workflow the information on the target chemical
increases.
One may find it advantageous to conceal some of
that information.
For example, we can hide the substance information
by double clicking on the small box next to it.

28
Side-Bar on Retrieving Concealed Information

One can retrieve hidden information by double
clicking on the same small box.
This is demonstrated in the next two slides.

29
Double click on small box next to substance
information
30
Substance information reappears on screen
31
Endpoints

Click on Endpoints to move to the next module.
Endpoints refers to the electronic process of
retrieving the environmental fate, ecotoxicity,
and toxicity data that are stored in the Toolbox.
Data gathering can be executed in a global
fashion (i.e., collecting all data of all
endpoints) or on a more narrowly defined basis
(e.g., collecting data for a single or limited
number of endpoints).
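
The difference between global and narrowly defined data gathering can be pictured as an optional filter. The sketch below is hypothetical: the record layout, the function `gather_data`, and all values are invented for illustration and do not reflect how the Toolbox stores or queries its databases.

```python
def gather_data(records, endpoints=None, databases=None):
    """Return the records that match the requested endpoints and databases.

    Passing None for a filter means "no restriction", which corresponds
    to gathering data in a global fashion.
    """
    selected = []
    for record in records:
        if endpoints is not None and record["endpoint"] not in endpoints:
            continue
        if databases is not None and record["database"] not in databases:
            continue
        selected.append(record)
    return selected


# Made-up records; chemical names, databases and values are placeholders.
records = [
    {"chemical": "analogue A", "endpoint": "Ames mutagenicity",
     "database": "OASIS Genotox", "value": -1},
    {"chemical": "analogue A", "endpoint": "Acute fish toxicity",
     "database": "hypothetical ecotox database", "value": 12.0},
    {"chemical": "analogue B", "endpoint": "Ames mutagenicity",
     "database": "ISSCAN", "value": -1},
]

everything = gather_data(records)
ames_only = gather_data(records, endpoints={"Ames mutagenicity"})
```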

32
Side-Bar on Gene Mutation

Mutations within a gene are generally
base-substitutions or small deletions/insertions
(i.e., frameshifts).
Such alterations are generally called point
mutations.
The Ames scheme, based on strains of Salmonella,
provides the corresponding experimental data.

33
This Example

In this example, we focus our data gathering on
the multi-endpoint of mutagenicity and the
databases OASIS Genotox and ISSCAN.
Click on the boxes next to all the databases
except those entitled ISSCAN Genotox and OASIS
Genotox.
This leaves a black check mark in the box next to
these two databases (the ones we want to search).
Click on Gather data.

34
Oasis Genotox Data Gathering
35
Next Step in Data Gathering

Toxicity information on the target chemical is
electronically collected from the selected
dataset(s).
In this example, an insert window appears stating
there was no data found for the target chemical
(see next slide).
Close the insert window.

36
No data for Target Chemical
37
Recap

You have entered the target chemical by SMILES
and found it to be 1-hexanal with the CAS
66-25-1.
You have profiled the target chemical and found
no experimental data is currently available for
1-hexanal.
In other words, you have identified a data gap,
which you would like to fill.
Click on Category definition to move to the
next module.

38
Category Definition

This module provides the user with several means
of grouping chemicals into a toxicologically
meaningful category that includes the target
molecule.
This is the critical step in the workflow.
Several options are available in the Toolbox to
assist the user in refining the category
definition.

39
Grouping Methods

Grouping methods allow the user to group chemicals into chemical
categories according to different measures of
similarity so that within a category data gaps
can be filled.
For example, starting from a target chemical for
which a specific DNA binding mechanism is
identified, analogues can be found which can bind
by the same mechanism and for which experimental
results are available.

40
Side-Bar on Mutagens

It is important to remember that mutagens are
really cell-damaging agents, which can create a
wide array of adverse effects beyond damage to
DNA.
Let's take a moment to review our mechanistic
profile of the target chemical (see next slide).

41
No DNA Binding
42
Defining the Category

In the case of 1-hexanal there is no structural
evidence that it is a DNA binding compound.
Therefore, no grouping by a DNA mechanism is
possible.
We elect to define the category by using
molecular similarity.
Highlight Organic functional groups.
Click on Defining Category.
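
Grouping by organic functional groups can be thought of as a set comparison. The Python sketch below assumes that a chemical joins the category when it carries every functional group found in the target; this matching rule, the function `define_category`, the group labels, and the small inventory are all assumptions made for illustration, not the Toolbox's documented behavior.

```python
def define_category(target_groups, inventory):
    """Return the names of chemicals whose groups include all target groups.

    inventory: dict mapping a chemical name to its set of functional groups.
    """
    target = set(target_groups)
    return sorted(
        name for name, groups in inventory.items() if target <= set(groups)
    )


# Illustrative inventory; the last entry is a placeholder chemical.
inventory = {
    "pentanal": {"Aldehyde", "Methyl"},
    "heptanal": {"Aldehyde", "Methyl"},
    "hexan-1-ol": {"Alcohol", "Methyl"},
    "hypothetical chloro-aldehyde": {"Aldehyde", "Methyl", "Alkyl halide"},
}

category = define_category({"Aldehyde", "Methyl"}, inventory)
```

Under this rule a chemical with additional functional groups still enters the category, which is one reason a category may need pruning later in the workflow.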

43
Defining the Category
44
Confirmation of Groups

An insert window listing the organic functional
groups of the target chemical appears.
Click on OK.

45
Naming Category

Another insert window listing the default
category name appears.
Click OK.

46
Analogues Identified
47
Recap

You have identified a structurally similar
category for the target chemical
(1-hexanal).
There were 34 similar chemicals identified.
Available data on these similar chemicals can now
be collected.

48
Next Step in Gathering Data

Highlight the 35 Aldehydes <AND> Methyl under
Single Chemical in the Defined Categories
box.
The insert window entitled Read Data?
appears (see next slide).

49
What data to collect?
50
Side-Bar to Data Collection

Data can be collected for a wide variety of
endpoints or for narrowly defined ones (e.g., a
single endpoint or test scheme).
Since data is endpoint specific the data
selection is presented in a drop-down menu.
By double clicking on an endpoint, the data tree
is expanded.

51
Data Selection

To select the data to be read, click on the
box(es) before the name of the data type.
This selects (a red check mark appears) or
deselects (red check disappears) the data type.
Click on the box next to Toxicological
Information.
This places a red check mark in the box next to
this data type (the one we want to read).
Click on OK (see next slide).

52
Reading the Selected Data
53
Analogues

The data is automatically collated.
There is genotox data on only 15 of the 34
structurally similar analogues.
However, multiple entries of the same test result
were found and one wants to eliminate
duplications (see next slide).
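
Eliminating duplicated entries amounts to keeping the first copy of each identical result. The sketch below is a generic illustration of that idea; the record fields and the function `drop_duplicates` are assumptions, and it is not a description of what the Select Single option does internally.

```python
def drop_duplicates(records):
    """Keep the first occurrence of each (chemical, assay, value) entry."""
    seen = set()
    unique = []
    for record in records:
        key = (record["chemical"], record["assay"], record["value"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Made-up entries: the first two are the same test result reported twice.
entries = [
    {"chemical": "analogue A", "assay": "Ames TA100", "value": -1},
    {"chemical": "analogue A", "assay": "Ames TA100", "value": -1},
    {"chemical": "analogue A", "assay": "Ames TA98", "value": -1},
]

cleaned = drop_duplicates(entries)
```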

54
Click Select Single then Click OK
55
Summary of Toxicological Information for Analogues
56
Side-Bar on Data

Note the structure of the compounds with
experimental results is shown.
Double clicking on any structure enlarges the
view of the structure.
Details on the experimental results can be
retrieved by double-clicking on any cell in the
data matrix line.

57
Navigating Through the Data Tree

The user can navigate through the data tree by
closing or opening the nodes of the tree.
In this example, results from genotox testing are
available.
By double clicking on a cell in the data matrix,
additional information on the test result (Ames)
is made available (see next slide).

58
Data Tree
59
Side-Bar on Data Tree

Details about the specific assays, in this case
the different strains of Salmonella typhimurium
can be observed at the bottom of the screen by
placing the cursor on the text fragment of the
test you want more information about (see next
slide).

60
61
Filling Data Gap

You are ready to fill the data gap. Click on
Filling data gap.
In this step in the work flow the user is
provided three options for making a prediction
for the target molecule.
In this example, with qualitative mutagenicity
data, we can only use read-across. Click on
Read-across.
Highlight the blank space in the
AMES_mutagenicity line under the column for the
target chemical.
Click on All values under data.
Click on Apply (see next slide).

62
Filling Data Gap
63
Possible Data Inconsistencies

An insert window alerting you to possible data
inconsistencies appears.
Click on the small box before Endpoint.

64
Multiple Endpoint Data

In this example what appears is a red checked
listing of all the Ames data, which you are
trying to model at the same time.
Click OK.

65
Results of Read-across
66
Interpreting the Read-across Figure

The resulting plot shows the experimental results of
all analogues (Y axis) against a descriptor (X
axis), with log Kow as the default descriptor.
Note the dots along the bottom of the previous
screen. The RED dot represents the target
chemical. The PURPLE dots represent the experimental
results available for the analogues that are
used for the read-across. The BLUE dots represent
the experimental results available for the
analogues but not used for read-across.
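
One way to picture the split between analogues that are used and not used is to rank them by how close their descriptor value lies to that of the target. The sketch below assumes a nearest-neighbour rule on log Kow with a user-chosen count k; the rule, the function `split_by_proximity`, and all numbers are illustrative assumptions rather than the Toolbox's actual settings.

```python
def split_by_proximity(target_log_kow, analogues, k=5):
    """Split analogues into the k closest to the target and the rest.

    analogues: list of (name, log_kow, result) tuples.
    Returns (used, not_used), each ordered by increasing distance.
    """
    ranked = sorted(analogues, key=lambda a: abs(a[1] - target_log_kow))
    return ranked[:k], ranked[k:]


# Made-up analogues: (name, log Kow, Ames result coded as -1 for negative).
analogues = [
    ("A", 0.5, -1),
    ("B", 1.6, -1),
    ("C", 2.3, -1),
    ("D", 4.0, -1),
]

used, not_used = split_by_proximity(1.8, analogues, k=2)
```

In terms of the plot, `used` would correspond to the purple dots and `not_used` to the blue dots.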

67
Interpreting the Read-across Figure

Upon further examination of the read-across
results, an aqua-colored dot can be seen in an
upper corner.
This represents the lone positive result in the
Ames tests.
By placing the cursor on this dot and double left
clicking, the structural details of this chemical
appear (see next slide).

68
Details of Outlier
69
Can One Prune the Outlier?

Clearly this outlier is structurally dissimilar
to all the other compounds in the category,
including the target chemical.
It contains a number of functional groups which
are not present in the target molecule.
This dissimilarity is justification for deleting
(pruning) it from the category.

70
Pruning the Outlier

Close the structure detail window.
Place the cursor on the outlying dot and right click,
then click on Remove focused.
This removes the outlier.
Note the read-across results are automatically
re-tabulated (see next slide).
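
The effect of pruning on the prediction can be sketched as follows. The function `prune_and_repredict`, the rule that a prediction is returned only when the remaining analogues agree, and the example values are assumptions for illustration; -1 stands for a negative Ames result and 1 for a positive one.

```python
def prune_and_repredict(analogues, to_remove):
    """Remove the named analogues and recompute a unanimous prediction.

    analogues: dict mapping analogue name to its qualitative result.
    Returns (kept, prediction); prediction is None unless every
    remaining analogue has the same result.
    """
    kept = {name: res for name, res in analogues.items() if name not in to_remove}
    distinct = set(kept.values())
    prediction = distinct.pop() if len(distinct) == 1 else None
    return kept, prediction


# Made-up category with one positive outlier.
category_results = {"analogue 1": -1, "analogue 2": -1, "outlier": 1}

_, before = prune_and_repredict(category_results, set())
kept, after = prune_and_repredict(category_results, {"outlier"})
```

Pruning should rest on a structural or mechanistic justification, such as the dissimilarity noted for the outlier, and not on the result alone.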

71
Re-evaluated Read-across
72
Interpretation of Read-across

In the pruned data, all 10 analogues are
non-mutagenic in all the Ames assays.
The same non-mutagenic potential (value of -1.0)
is, therefore, predicted with confidence for the
target chemical.

73
Filled Data Gap

Click Accept.
By accepting the prediction the data gap is
filled (see next slide).
You are now ready to complete the final module
and download the report.
Click on Report to move to the last module.

74
Filled Data Gap
75
Report

The final step in the workflow, report, provides
the user with a downloadable written audit trail
of what the Toolbox did to arrive at the
prediction.
Click on Show History.
This study history can be printed or copied to be
inserted in a more detailed report (see next
slide).

76
Report
