QSAR PREDICTIONS OF ORGANIC COMPOUND TROPOSPHERIC DEGRADATIONS
Paola Gramatica, Pamela Pilutti, Ester Papa, Francesca Battaini, Luca Pozzoli
Dep. Struct. Funct. Biol., QSAR Research Unit, University of Insubria (Varese, Italy)
Web: http://fisio.dipbsf.uninsubria.it/qsar/
e-mail: paola.gramatica_at_uninsubria.it
ABSTRACT
The reactions of chemicals with OH and NO3 radicals and ozone are the principal degradation processes in the troposphere; thus an upper limit of the atmospheric persistence of a chemical can be assessed by determining the rate constants of these reactions. Our goal is to develop QSAR models that make it possible to rapidly predict the atmospheric degradability of organic chemicals from a simple description of the molecular structure. The molecular descriptors used are 1D-, 2D- and 3D-descriptors (constitutional, topological, WHIM, GETAWAY, quantum chemical and others). The best descriptor subset for each model was selected using the Genetic Algorithm-Variable Subset Selection strategy (GA-VSS), and model calculations were performed by Ordinary Least Squares regression (OLS). In order to make the predictions for new compounds more reliable, the models were validated using leave-one-out and leave-more-out (50%) procedures, external validation and the scrambling of responses; the reliability of the predictions was always checked by the leverage approach. All the models obtained were satisfactory, at different levels (Q2 = 83-90%). A PCA model based on the three principal degradation processes is proposed to evaluate the overall atmospheric persistence of chemicals; the PC1 score obtained is proposed as an Atmospheric Persistence Index (ATPIN) and is also modelled by theoretical molecular descriptors. This model can be used as an evaluative model for the screening and ranking of chemicals according to their atmospheric persistence, starting just from their chemical structure.
INTRODUCTION
The troposphere is the principal recipient of volatile organic chemicals (VOCs) of both anthropogenic and biogenic origin. The persistence of these chemicals is one of the most relevant factors for evaluating their fate and possible negative effects in environmental risk assessment. The tropospheric lifetimes of most organic compounds from terrestrial emissions are controlled by the degradation reactions with the OH radical and ozone during the day and with the NO3 radical at night. New predictive QSAR models for the oxidation rate constants (kOH, kNO3 and kO3), based on different theoretical molecular descriptors selected by a Genetic Algorithm as the variable subset selection method, are proposed. To make a reliable risk assessment of a large group of chemicals, model prediction capability (Q2) is considered of primary importance and was evaluated using internal (leave-one-out and leave-more-out) and external validation procedures. The original data set was split by experimental design.
MATERIALS and METHODS
EXPERIMENTAL DATA
Reaction rate constants with the OH radical, the NO3 radical and ozone for a total of 504 different organic compounds at 298 K were taken from Atkinson (1-3). Experimental data are reported in cm3 s-1 molecule-1 and were transformed into logarithmic units, then multiplied by -1 in order to obtain positive values.
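The transformation of the rate constants into positive logarithmic units can be sketched in Python as follows. This is only a minimal illustration: the function name and the two rate constants are hypothetical and are not data from the poster.

```python
import math

def neg_log_rate(k):
    """Return -log10(k) for a positive rate constant k."""
    if k <= 0:
        raise ValueError("rate constant must be positive")
    return -math.log10(k)

# Hypothetical rate constants, for illustration only (not taken from the data set)
example_k = {"compound_A": 1.0e-12, "compound_B": 2.5e-11}

# Rate constants much smaller than 1 become positive numbers after the transformation
neg_log_k = {name: neg_log_rate(k) for name, k in example_k.items()}
```

With this convention a larger value of -log k corresponds to a slower reaction, that is, to a more persistent chemical.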
MOLECULAR DESCRIPTORS
The molecular descriptors were calculated by the software DRAGON of Todeschini et al. (4). A total of 1150 molecular descriptors of different kinds were used to describe the chemical diversity of the compounds. The descriptor typology covers 1D-, 2D- and 3D-descriptors (constitutional, topological, WHIM, GETAWAY and others). In addition, four quantum-chemical descriptors (HOMO, LUMO and (HOMO-LUMO) GAP energies and ionization potential Eiv), calculated by MOPAC (PM3 method) (5), were always added as molecular descriptors.
CHEMOMETRIC METHODS
Multiple Linear Regression analysis and variable selection were performed by the software MOBY-DIGS of Todeschini et al. (6), using the Ordinary Least Squares regression (OLS) method and GA-VSS (9). External validation was performed on two validation sets obtained by splitting the original data set at 50% and 75% with the Experimental Design procedure, applying the software DOLPHIN of Todeschini et al. (7). Regression diagnostic tools such as residual plots and Williams plots were used to check the quality of the best models and to define their applicability with regard to the chemical domain, using the chemometric package SCAN (8). RMS (residual mean squares) values are also reported for model comparison.
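As a rough illustration of the quantities named in this section, the sketch below fits an OLS model with NumPy and computes the leverages (the horizontal axis of a Williams plot), R2 and the leave-one-out Q2. It is not the MOBY-DIGS or SCAN implementation. The function name and the synthetic data are hypothetical, and the warning leverage h* = 3p'/n is a commonly used cut-off assumed here, since the poster does not state one.

```python
import numpy as np

def ols_loo_diagnostics(X, y):
    """Fit OLS with an intercept; return coefficients, leverages, R2 and Q2 (LOO)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])        # design matrix with intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    # Leverages: diagonal of the hat matrix A (A'A)^-1 A'
    hat_diag = np.einsum("ij,ji->i", A, np.linalg.pinv(A))
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - np.sum(residuals ** 2) / tss
    # For OLS the leave-one-out residual is exactly e_i / (1 - h_i)
    press = np.sum((residuals / (1.0 - hat_diag)) ** 2)
    q2_loo = 1.0 - press / tss
    return coef, hat_diag, r2, q2_loo

# Synthetic data, for illustration only: 30 "compounds", 2 "descriptors"
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + 0.1 * rng.normal(size=30)

coef, hat_diag, r2, q2_loo = ols_loo_diagnostics(X, y)

# Assumed warning leverage h* = 3p'/n, with p' = number of model parameters
h_star = 3 * (X.shape[1] + 1) / len(y)
high_leverage = hat_diag > h_star    # compounds structurally far from the training domain
```

The hat-matrix shortcut avoids refitting the model n times. Leave-more-out validation and the scrambling of responses have no such shortcut and need explicit refits.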
RESULTS and DISCUSSION
In order to find a relation between the three reaction rate constants (kOH, kNO3 and kO3) and the structural features of chemicals, a wide set of theoretical molecular descriptors was calculated by the software DRAGON (4); the calculated descriptors capture and represent different aspects of the molecular structures, viewing the molecules in one-dimensional, two-dimensional and three-dimensional ways. An advantage of the exclusive use of theoretical descriptors is that they are free of the uncertainty of experimental measurements (apart from the noise introduced into the model by the training data set). For a stronger evaluation of model applicability to predictions for new chemicals, external validation (verified by Q2ext) of all models is also recommended (10, 11) and was performed here. Experimental design provides a strategy for selecting the most dissimilar molecular structures in a data set, taking into account the complete structural information obtained from all the molecular descriptors used and also the response value. It therefore guarantees that the training and external validation sets have well-balanced structural diversity and are also representative of the entire range of the response.
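The external validation parameter can be sketched as follows. The poster does not give the formula for Q2ext, so the definition used here is an assumption: one minus the ratio of the external prediction error sum of squares to the sum of squared deviations of the external responses from the training-set mean. The function name and the numbers are hypothetical.

```python
def q2_ext(y_train, y_ext, y_ext_pred):
    """External explained variance, assuming Q2ext = 1 - PRESS_ext / SS_ext.

    SS_ext is computed around the mean of the training responses (an assumption).
    """
    mean_train = sum(y_train) / len(y_train)
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_ext, y_ext_pred))
    ss = sum((obs - mean_train) ** 2 for obs in y_ext)
    return 1.0 - press / ss

# Hypothetical -log k values, for illustration only
y_train = [10.2, 11.0, 11.8, 12.5]
y_ext = [10.6, 12.1]
y_ext_pred = [10.8, 11.9]     # predictions from a model fitted on the training set only
q2_external = q2_ext(y_train, y_ext, y_ext_pred)
```

Under this definition, a model that predicts the external compounds no better than the training mean gives Q2ext = 0, and perfect predictions give Q2ext = 1.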
OH REACTIVITY MODEL obtained on a selected
training set of 306 chemicals
NO3 REACTIVITY MODEL obtained on a selected
training set of 47 chemicals
DESCRIPTORS (in order of significance)
Eiv: ionization energy (nucleophilicity)
MATS1m: 2D-autocorrelation of Moran (dimension and shape)
nDB: number of double bonds
nO: number of oxygen atoms
H-048: number of hydrogen atoms on carbon
DESCRIPTORS (in order of significance)
Eiv: ionization energy (nucleophilicity)
ATS1m: 2D-autocorrelation of Moreau-Broto (shape and dimension)
RTe: GETAWAY descriptor weighted by electronegativity (charge distribution)
O3 REACTIVITY MODEL obtained on a selected
training set of 63 chemicals
ATMOSPHERIC PERSISTENCE INDEX (ATPIN)
The behaviour of 66 heterogeneous chemicals for which experimental kOH, kNO3 and kO3 values were available was studied in the Principal Component space. The first component was found to be the most important, with 88.3% of explained variance; it is easily interpreted as an indicator of the global tropospheric degradability of chemicals. The PC1 score can be proposed to rank organic chemicals according to their global tropospheric degradability (ATPIN), and this index is also modelled by theoretical molecular descriptors.
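A PC1 score of this kind could be computed along the following lines. This is a minimal sketch, assuming that the three -log k columns are autoscaled before the analysis (the poster does not state the pretreatment). The function name and the data are hypothetical.

```python
import numpy as np

def pc1_scores(data):
    """Return the PC1 scores and the fraction of variance explained by PC1."""
    X = np.asarray(data, dtype=float)
    # Autoscaling: zero mean and unit variance per column (assumed pretreatment)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[0]                        # projection on the first loading vector
    explained = s[0] ** 2 / np.sum(s ** 2)    # share of total variance on PC1
    return scores, explained

# Hypothetical values, for illustration only.
# Rows are chemicals; columns are -log kOH, -log kNO3, -log kO3.
example = np.array([
    [10.5, 11.8, 16.9],
    [11.2, 13.0, 17.8],
    [12.1, 15.1, 19.5],
    [12.9, 16.4, 20.7],
    [11.6, 14.2, 18.3],
])
scores, explained = pc1_scores(example)
ranking = np.argsort(scores)    # order of the chemicals along PC1
```

The sign of a principal component is arbitrary, so the direction of the ranking (most to least persistent) has to be fixed by inspecting the loadings.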
DESCRIPTORS (in order of significance)
(HOMO-LUMO)GAP: reactivity
-nAB: number of aromatic bonds
AMW: average molecular weight
-nDB: number of double bonds
MATS7e and R3e: 2D- (Moran) and 3D- (GETAWAY) autocorrelations weighted by electronegativity (charge distribution)
[Figure axis label: -Log kO3]
CONCLUSIONS
- New predictive models for kOH, kNO3 and kO3 are proposed.
- The proposed models for the tropospheric degradation of chemicals take advantage of input parameters that are calculated very simply and rapidly.
- A Genetic Algorithm is applied for Variable Subset Selection.
- The models have good predictive power, verified by internal and external validation.
- Reaction rate constants with OH and NO3 radicals and ozone can be predicted for new chemicals (even those not yet synthesized).
- The ranking of organic chemicals for tropospheric degradability, obtained by Principal Component Analysis on the oxidation rate constants, was satisfactory.
- The first component highlights the atmospheric behaviour of the molecules; the PC1 score is modelled by theoretical molecular descriptors. This model can be used for the screening and ranking of chemicals according to their atmospheric persistence, starting just from their structure.
[Figure axis labels: -Log kNO3; -Log kOH; Atmospheric Persistence]
DESCRIPTORS (in order of significance)
HOMO: highest occupied molecular orbital (nucleophilicity)
nBnz: number of aromatic rings
AMW: average molecular weight
DELS: molecular electrotopological variation (charge distribution)
REFERENCES
(1) Atkinson, R., 1989. J. Phys. Chem. Ref. Data, Monograph 1, 1-246.
(2) Atkinson, R., 1991. J. Phys. Chem. Ref. Data, 20, 461-506.
(3) Atkinson, R., 1984. Chem. Rev., 84, 437-470.
(4) Todeschini, R., Consonni, V. and Pavan, M., 2001. DRAGON - Software for the calculation of molecular descriptors, rel. 1.12 for Windows. Free download available at http://www.disat.unimib/chm
(5) CHEM 3D, Cambridge Soft, 1997, MA, USA.
(6) Todeschini, R., 2001. Moby Digs - Software for multilinear regression analysis and variable subset selection by Genetic Algorithm, rel. 2.3 for Windows, Talete srl, Milan (Italy).
(7) Todeschini, R., Mauri, A., 2000. DOLPHIN - Software for Optimal Distance-based Experimental Design, rel. 1.1 for Windows, Talete srl, Milan (Italy).
(8) SCAN - Software for Chemometric Analysis, rel. 1.1 for Windows, Jerll. Inc., Standard, CA, 1992.
(9) Leardi, R., Boggia, R., Terrile, M., 1992. J. Chemom., 6, 267-281.
(10) Wold, S., Eriksson, L., 1995. Chemometric Methods in Molecular Design, VCH, Germany, 309-318.
(11) Golbraikh, A., Tropsha, A., 2002. J. Mol. Graph. and Mod., 20, 269-276.