Bioinformatics IV: Quantitative Structure-Activity Relationships (QSAR) and Comparative Molecular Field Analysis (CoMFA)

1
Bioinformatics IV
Quantitative Structure-Activity Relationships (QSAR)
and Comparative Molecular Field Analysis (CoMFA)
Martin Ott
2
Outline
  • Introduction
  • Structures and activities
  • Regression techniques: PCA, PLS
  • Analysis techniques: Free-Wilson, Hansch
  • Comparative Molecular Field Analysis

3
QSAR: The Setting
Quantitative structure-activity relationships are
used when there is little or no receptor
information, but there are measured activities of
(many) compounds. They are also useful to
supplement docking studies, which take much more
CPU time.
4
From Structure to Property
EC50
5
From Structure to Property
LD50
6
From Structure to Property
7
QSAR: Which Relationship?
Quantitative structure-activity relationships
correlate chemical/biological activities with
structural features or with atomic, group, or
molecular properties within a range of
structurally similar compounds.
8
Free Energy of Binding
$$\Delta G_{binding} = \Delta G_0 + \Delta G_{hb} + \Delta G_{ionic} + \Delta G_{lipo} + \Delta G_{rot}$$
Term      Contribution                                Energy
ΔG0       entropy loss (translational, rotational)    +5.4
ΔGhb      ideal hydrogen bond                         -4.7
ΔGionic   ideal ionic interaction                     -8.3
ΔGlipo    lipophilic contact                          -0.17
ΔGrot     entropy loss (rotatable bonds)              +1.4
(Energies in kJ/mol per unit feature)
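The additive scheme above can be sketched in a few lines. This is only an illustration: the signs (entropy losses unfavourable, interactions favourable) and the feature counts of the example ligand are assumptions, not values from the slides.

```python
# Per-feature contributions in kJ/mol, magnitudes as listed on the slide.
# ASSUMPTION: entropy-loss terms are positive, interaction terms negative.
CONTRIB = {
    "dg0": 5.4,     # translational/rotational entropy loss, counted once
    "hbond": -4.7,  # per ideal hydrogen bond
    "ionic": -8.3,  # per ideal ionic interaction
    "lipo": -0.17,  # per unit of lipophilic contact
    "rot": 1.4,     # per rotatable bond
}

def binding_free_energy(n_hbond, n_ionic, lipo_contact, n_rot):
    """Sum the linear contributions of all features of one ligand."""
    return (CONTRIB["dg0"]
            + n_hbond * CONTRIB["hbond"]
            + n_ionic * CONTRIB["ionic"]
            + lipo_contact * CONTRIB["lipo"]
            + n_rot * CONTRIB["rot"])

# Hypothetical ligand: 2 H-bonds, 1 ionic contact, 100 units of
# lipophilic contact, 3 rotatable bonds.
dg_example = binding_free_energy(2, 1, 100.0, 3)
print(f"{dg_example:.1f} kJ/mol")
```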
9
Free Energy of Binding and Equilibrium Constants
The free energy of binding is related to the
equilibrium and rate constants of ligand-receptor
complex formation:
$$\Delta G_{binding} = -2.303\,RT \log K = -2.303\,RT \log (k_{on} / k_{off})$$
K: equilibrium constant
k_on, k_off: rate constants of association and dissociation
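As a worked example of this relation, the sketch below converts an association constant into a free energy. It assumes that K is the association constant k_on / k_off, so that tight binding gives a negative free energy; the numerical inputs are hypothetical.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol K)

def delta_g_from_k(k_assoc, temperature=298.15):
    """Free energy of binding (kJ/mol) from an association constant."""
    return -2.303 * R_KJ * temperature * math.log10(k_assoc)

def delta_g_from_rates(k_on, k_off, temperature=298.15):
    """Same quantity from the rate constants, using K = k_on / k_off."""
    return delta_g_from_k(k_on / k_off, temperature)

# Hypothetical nanomolar binder: K = 1e9 per molar at 25 degrees Celsius.
dg = delta_g_from_k(1e9)
print(f"{dg:.1f} kJ/mol")
```

Each factor of ten in K changes the free energy by 2.303 RT, about 5.7 kJ/mol at room temperature.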
10
Concentration as Activity Measure
  • A critical molar concentration C that produces
    the biological effect is related to the
    equilibrium constant K
  • Usually log (1/C) is used (cf. pH)
  • For meaningful QSARs, activities need to be
    spread out over at least 3 log units
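A minimal sketch of the conversion to log (1/C) and of the spread check follows. The function names and the example values are hypothetical, chosen only for illustration.

```python
import math

def log_inv_c(conc_molar):
    """Return log(1/C) for a molar concentration C."""
    return -math.log10(conc_molar)

def spans_enough(activities, min_span=3.0):
    """True if the log(1/C) values cover at least min_span log units."""
    return max(activities) - min(activities) >= min_span

# Hypothetical data set: concentrations from 100 micromolar to 10 nanomolar.
concentrations = [1e-4, 3e-6, 5e-7, 1e-8]
activities = [log_inv_c(c) for c in concentrations]
print([round(a, 2) for a in activities], spans_enough(activities))
```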

11
Molecules Are Not Numbers!
Where are the numbers? Numerical descriptors
12
An Example: Capsaicin Analogs
X        EC50 (µM)   log(1/EC50)
H          11.80        4.93
Cl          1.24        5.91
NO2         4.58        5.34
CN         26.50        4.58
C6H5        0.24        6.62
NMe2        4.39        5.36
I           0.35        6.46
NHCHO       ?           ?
13
An Example: Capsaicin Analogs
X        log(1/EC50)    MR       π       σ       Es
H           4.93        1.03    0.00    0.00    0.00
Cl          5.91        6.03    0.71    0.23   -0.97
NO2         5.34        7.36   -0.28    0.78   -2.52
CN          4.58        6.33   -0.57    0.66   -0.51
C6H5        6.62       25.36    1.96   -0.01   -3.82
NMe2        5.36       15.55    0.18   -0.83   -2.90
I           6.46       13.94    1.12    0.18   -1.40
NHCHO       ?          10.31   -0.98    0.00   -0.98
MR: molar refractivity (polarizability) parameter
π: hydrophobicity parameter
σ: electronic sigma constant (para position)
Es: Taft size parameter
14
An Example: Capsaicin Analogs
$$\log(1/EC_{50}) = -0.89 + 0.019\,MR + 0.23\,\pi - 0.31\,\sigma - 0.14\,E_s$$
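How such an equation is obtained can be sketched with an ordinary least-squares fit on the seven measured analogs from the table. This is an illustration only: with seven compounds and five parameters the fit is nearly saturated, and the coefficients it returns need not match those quoted on the slide.

```python
import numpy as np

# Descriptor columns: MR, pi, sigma, Es (rows H, Cl, NO2, CN, C6H5, NMe2, I).
X = np.array([
    [1.03, 0.00, 0.00, 0.00],
    [6.03, 0.71, 0.23, -0.97],
    [7.36, -0.28, 0.78, -2.52],
    [6.33, -0.57, 0.66, -0.51],
    [25.36, 1.96, -0.01, -3.82],
    [15.55, 0.18, -0.83, -2.90],
    [13.94, 1.12, 0.18, -1.40],
])
y = np.array([4.93, 5.91, 5.34, 4.58, 6.62, 5.36, 6.46])  # log(1/EC50)

# Design matrix with a leading column of ones for the constant term.
A = np.column_stack([np.ones(len(y)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

fitted = A @ coef
r2 = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)

# Estimate for the unmeasured NHCHO analog from its descriptors.
x_new = np.array([1.0, 10.31, -0.98, 0.00, -0.98])
pred = float(x_new @ coef)
print(np.round(coef, 3), round(r2, 3), round(pred, 2))
```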
15
Basic Assumption in QSAR
The structural properties of a compound
contribute in a linearly additive way to its
biological activity, provided there are no
non-linear dependencies of transport or binding
on some properties.
16
Molecular Descriptors
  • Simple counts of features, e.g. of atoms,
    rings, H-bond donors, molecular weight
  • Physicochemical properties, e.g. polarisability,
    hydrophobicity (logP), water-solubility
  • Group properties, e.g. Hammett and Taft
    constants, volume
  • 2D Fingerprints based on fragments
  • 3D Screens based on fragments

17
2D Fingerprints
C   N   O   P   S   X   F   Cl  Br  I   Ph  CO  NH  OH  Me  Et  Py  CHO SO  C=C C≡C CN  Am  Im
1   1   1   0   0   1   0   0   1   0   1   1   1   1   1   0   0   0   0   1   0   0   1   0
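The bit string above can be produced by checking each fragment key in turn. The sketch assumes the fragment detection has already been done (the set of present fragments is given by hand), and the spellings "C=C" and "C#C" for the double- and triple-bond keys are assumptions.

```python
# Fragment keys in the order used on the slide.
KEYS = ["C", "N", "O", "P", "S", "X", "F", "Cl", "Br", "I", "Ph", "CO",
        "NH", "OH", "Me", "Et", "Py", "CHO", "SO", "C=C", "C#C", "CN",
        "Am", "Im"]

def fingerprint(present):
    """Return one bit per key: 1 if the fragment occurs in the molecule."""
    return [1 if key in present else 0 for key in KEYS]

# Fragments found in the example molecule (entered by hand here).
present = {"C", "N", "O", "X", "Br", "Ph", "CO", "NH", "OH", "Me",
           "C=C", "Am"}
bits = fingerprint(present)
print(" ".join(str(b) for b in bits))
```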
18
Principal Component Analysis (PCA)
  • Many (> 3) variables to describe objects: high
    dimensionality of descriptor data
  • PCA is used to reduce dimensionality
  • PCA extracts the most important factors
    (principal components or PCs) from the data
  • Useful when correlations exist between
    descriptors
  • The result is a new, small set of variables (PCs)
    which explain most of the data variation

19
PCA: From 2D to 1D
20
PCA: From 3D to 3D-
21
Different Views on PCA
  • Statistically, PCA is a multivariate analysis
    technique closely related to eigenvector analysis
  • In matrix terms, PCA is a decomposition of matrix
    X into two smaller matrices plus a set of
    residuals: $X = T P^T + R$
  • Geometrically, PCA is a projection technique in
    which X is projected onto a subspace of reduced
    dimensions
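The matrix view can be sketched with a singular value decomposition. Assumptions: X is mean-centered before the decomposition, T holds the scores and P the loadings, and the data are synthetic.

```python
import numpy as np

def pca(X, n_components):
    """Decompose centered X into scores T, loadings P and residuals R."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]  # scores (projected objects)
    P = Vt[:n_components].T                      # loadings (PC directions)
    R = Xc - T @ P.T                             # what the PCs do not explain
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return T, P, R, explained

# Synthetic data: three correlated descriptors driven by one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=(20, 1))
X = latent @ np.array([[1.0, 2.0, -1.0]]) + 0.05 * rng.normal(size=(20, 3))

T, P, R, explained = pca(X, n_components=1)
print(T.shape, P.shape, np.round(explained, 3))
```

Because the three descriptors are strongly correlated, a single principal component accounts for almost all of the variation.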

22
Partial Least Squares (PLS)
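$$\begin{aligned}
y_1 &= a_0 + a_1 x_{11} + a_2 x_{12} + a_3 x_{13} + e_1 && \text{(compound 1)} \\
y_2 &= a_0 + a_1 x_{21} + a_2 x_{22} + a_3 x_{23} + e_2 && \text{(compound 2)} \\
y_3 &= a_0 + a_1 x_{31} + a_2 x_{32} + a_3 x_{33} + e_3 && \text{(compound 3)} \\
&\;\;\vdots \\
y_n &= a_0 + a_1 x_{n1} + a_2 x_{n2} + a_3 x_{n3} + e_n && \text{(compound n)}
\end{aligned}$$
$$Y = XA + E$$
X: independent variables
Y: dependent variables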
PLS Cross-validation
  • Squared correlation coefficient R²
  • Value between 0 and 1 (should be > 0.9)
  • Indicates the explanatory power of the
    regression equation

With cross-validation
  • Cross-validated squared correlation coefficient Q²
  • Value at most 1, negative if the model predicts
    worse than the mean (should be > 0.5)
  • Indicates the predictive power of the
    regression equation
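The difference between the two statistics can be sketched with leave-one-out cross-validation. Assumptions: ordinary least squares stands in for PLS to keep the example short, Q² is computed as 1 - PRESS / SS_total with the overall mean in SS_total, and the data are synthetic.

```python
import numpy as np

def r2_q2(X, y):
    """Return (R2, leave-one-out Q2) for a linear model with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    ss_tot = np.sum((y - y.mean()) ** 2)

    # R2: fit on all compounds, evaluate on the same compounds.
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    r2 = 1.0 - np.sum((y - A @ coef) ** 2) / ss_tot

    # Q2: predict each compound from a model fitted without it.
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        c, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ c) ** 2
    return r2, 1.0 - press / ss_tot

rng = np.random.default_rng(2)
X_cv = rng.normal(size=(15, 2))
y_cv = 2.0 + X_cv @ np.array([1.0, -0.5]) + 0.1 * rng.normal(size=15)

r2, q2 = r2_q2(X_cv, y_cv)
print(round(r2, 3), round(q2, 3))
```

For this kind of model Q² never exceeds R², which is why a high R² alone says little about predictive power.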

24
Free-Wilson Analysis
$$\log(1/C) = \sum_i a_i x_i + \mu$$
x_i: presence of group i (0 or 1)
a_i: activity contribution of group i
μ: activity value of the unsubstituted compound
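A Free-Wilson fit amounts to least squares on an indicator matrix. The sketch below uses a hypothetical series with three groups and noise-free synthetic activities, so the fit should recover the contributions used to generate them.

```python
import numpy as np

# Rows = compounds, columns = groups; 1 means the group is present.
# The first row is the unsubstituted compound. (Hypothetical series.)
X = np.array([
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
], dtype=float)

# Synthetic activities built from assumed group contributions.
true_mu, true_a = 5.0, np.array([0.8, -0.3, 0.4])
y = true_mu + X @ true_a

# Solve log(1/C) = sum_i a_i x_i + mu for a_i and mu.
A = np.column_stack([X, np.ones(len(y))])
sol, *_ = np.linalg.lstsq(A, y, rcond=None)
a_hat, mu_hat = sol[:-1], sol[-1]

# Predict a compound carrying all three groups (not in the training set).
pred = mu_hat + np.array([1.0, 1.0, 1.0]) @ a_hat
print(np.round(a_hat, 3), round(mu_hat, 3), round(pred, 3))
```

The prediction only works because every group in the new compound already occurs in the training series.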
25
Free-Wilson Analysis
  • Computationally straightforward
  • Predictions only for substituents already
    included
  • Requires large number of compounds

26
Hansch Analysis
Drug transport and binding affinity depend
nonlinearly on lipophilicity:
$$\log(1/C) = a\,(\log P)^2 + b \log P + c \sum \sigma + k$$
P: n-octanol/water partition coefficient
σ: Hammett electronic parameter
a, b, c: regression coefficients
k: constant term
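The parabolic model is still linear in its coefficients, so it can be fitted by least squares. The sketch below uses synthetic, noise-free data generated from assumed coefficients and also reports the optimum lipophilicity, log P = -b / (2a), which is a maximum when a is negative.

```python
import numpy as np

# Hypothetical descriptors for eight compounds.
log_p = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
sigma = np.array([0.00, 0.23, -0.17, 0.78, 0.00, 0.66, -0.83, 0.18])

# Synthetic activities from assumed coefficients a, b, c, k.
activity = -0.5 * log_p**2 + 2.0 * log_p + 0.6 * sigma + 3.0

# Columns: (log P)^2, log P, sigma, constant term.
A = np.column_stack([log_p**2, log_p, sigma, np.ones_like(log_p)])
(a, b, c, k), *_ = np.linalg.lstsq(A, activity, rcond=None)

log_p_opt = -b / (2.0 * a)  # vertex of the parabola in log P
print(round(a, 3), round(b, 3), round(c, 3), round(k, 3), round(log_p_opt, 3))
```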
27
Hansch Analysis
  • Fewer regression coefficients needed for
    correlation
  • Interpretation in physicochemical terms
  • Predictions for other substituents possible

28
Pharmacophore
  • Set of structural features in a drug molecule
    recognized by a receptor
  • Sample features:
    • H-bond donor
    • charge
    • hydrophobic center
  • Distances, 3D relationship
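The 3D relationship between features is often summarized as a table of pairwise distances. A minimal sketch, with made-up coordinates (in ångström) for three feature points:

```python
import math
from itertools import combinations

# Hypothetical feature positions; not taken from any real ligand.
features = {
    "donor": (0.0, 0.0, 0.0),
    "charge": (3.0, 4.0, 0.0),
    "hydrophobic": (0.0, 0.0, 6.0),
}

def feature_distances(features):
    """Return the distance between every pair of pharmacophore features."""
    return {
        (a, b): math.dist(features[a], features[b])
        for a, b in combinations(sorted(features), 2)
    }

distances = feature_distances(features)
for pair, d in distances.items():
    print(pair, round(d, 2))
```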

29
Pharmacophore Selection
Pharmacophore
Dopamine
L: lipophilic site
A: H-bond acceptor
D: H-bond donor
PD: protonated H-bond donor
30
Pharmacophore Selection
Pharmacophore
Dopamine
L: lipophilic site
A: H-bond acceptor
D: H-bond donor
PD: protonated H-bond donor
31
Comparative Molecular Field Analysis (CoMFA)
  • Set of chemically related compounds
  • Common pharmacophore or substructure required
  • 3D structures needed (e.g., Corina-generated)
  • Flexible molecules are folded
    into pharmacophore constraints and aligned

32
CoMFA Alignment
33
CoMFA Grid and Field Probe
(Only one molecule shown for clarity)
34
Electrostatic Potential Contour Lines
35
CoMFA Model Derivation
  • Molecules are positioned in a regular
    grid according to alignment
  • Probes are used to determine the molecular field

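Van der Waals field (probe is neutral carbon):
$$E_{vdw} = \sum_i \left( A_i\, r_{ij}^{-12} - B_i\, r_{ij}^{-6} \right)$$
Electrostatic field (probe is charged atom):
$$E_c = \sum_i \frac{q_i q_j}{D\, r_{ij}}$$
(sums run over the atoms i of the molecule; j is the probe at a grid point, D the dielectric constant)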
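The two field expressions can be evaluated at every grid point in a few lines. This is a sketch under assumptions: the sums run over the atoms i of one molecule with the probe j placed at each grid point, units are arbitrary, and the parameters A, B, the charges and the dielectric constant D are hypothetical inputs.

```python
import numpy as np

def comfa_fields(atom_xyz, atom_q, A, B, grid_xyz, probe_q=1.0, dielectric=1.0):
    """Return (E_vdw, E_c) of one molecule at each grid point."""
    # Distances r_ij between every grid point j and every atom i.
    r = np.linalg.norm(grid_xyz[:, None, :] - atom_xyz[None, :, :], axis=2)
    e_vdw = np.sum(A * r**-12 - B * r**-6, axis=1)            # steric field
    e_c = np.sum(atom_q * probe_q / (dielectric * r), axis=1)  # electrostatic
    return e_vdw, e_c

# One hypothetical atom at the origin with unit charge and A = B = 1.
atom_xyz = np.array([[0.0, 0.0, 0.0]])
atom_q = np.array([1.0])
A = np.array([1.0])
B = np.array([1.0])

# Two probe positions on the x axis; grid points must not coincide with
# atoms, because both terms diverge at r = 0.
grid_xyz = np.array([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]])

e_vdw, e_c = comfa_fields(atom_xyz, atom_q, A, B, grid_xyz)
print(np.round(e_vdw, 4), np.round(e_c, 4))
```

In a full analysis these per-grid-point energies, computed for every aligned molecule, would form the columns of the X matrix used in the PLS regression.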
36
3D Contour Map for Electronegativity
37
CoMFA Pros and Cons
  • Suitable to describe receptor-ligand interactions
  • 3D visualization of important features
  • Good correlation within related set
  • Predictive power within scanned space
  • Alignment is often difficult
  • Training required