QSAR

The document discusses quantitative structure-activity relationships (QSAR) and methods for predicting molecular properties and biological activities based on molecular structure. It covers linear and nonlinear modeling approaches, variable selection techniques like principal component analysis, molecular descriptors, pharmacophores, docking simulations, and other QSAR methods and considerations.

Uploaded by

Quty Papa Kanna
QSAR

 Quantitative Structure-Activity Relationships


 Can one predict activity (or properties in
QSPR) simply on the basis of knowledge of
the structure of the molecule?
 In other words, if one systematically changes
a component, will it have a systematic effect
on the activity?
Choice of Model
 Can approach in two directions:
 Simple to complex model
 Complex to simple model
Simplest Model
 Linear relationship between x and y
 $Y = mX + b$
 Minimize error by least squares:
 $\sum_i (Y_i - Y'_i)^2 = \sum_i [Y_i - (mX_i + b)]^2$

$Y'_i$ is the predicted value


Least Squares
Correlation coefficient

$-1 \le r \le 1$
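The least-squares fit and the correlation coefficient can be sketched in a few lines. This is a minimal illustration in plain Python, not code from the course; the function name `fit_line` and the data points are made up for the example.

```python
import math

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b, r)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    m = sxy / sxx                      # slope that minimizes the squared error
    b = y_mean - m * x_mean            # intercept
    r = sxy / math.sqrt(sxx * syy)     # correlation coefficient
    return m, b, r

# Made-up points lying exactly on y = 2x + 1
m, b, r = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(m, b, r)
```

For a one-variable linear fit, r squared is the R2 value that spreadsheet trend lines report.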
Another test
Is the line better than the mean?
[Figure: four scatter plots with least-squares lines]
 A circle: y = 0.0676x - 0.3882, R² = 0.0045
 2 lines: y = 2.9562x - 0.2597, R² = 0.8686
 One bad point: y = 2.8515x - 31.647, R² = 0.9179
 Wrong model: y = 0.0008x + 275.11, R² = 0.978



Multiple Regression
 $Y = f(X_1, X_2, \dots, X_n)$
 Problems:
 Choice of model – linear, polynomial, etc.
 Visualization
 Interpretation
 Computationally demanding
Variable reduction
 Principal Component Analysis
Principal Component
 $PC_1 = a_{1,1}x_1 + a_{1,2}x_2 + \dots + a_{1,n}x_n$
 $PC_2 = a_{2,1}x_1 + a_{2,2}x_2 + \dots + a_{2,n}x_n$
 Keep only those components that possess the largest variation
 PCs are orthogonal to each other
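As a rough sketch of how the coefficients of the first principal component can be obtained, the snippet below builds the covariance matrix of the descriptors and applies power iteration. Power iteration is one simple choice among several, and the function name and data are hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return the unit coefficient vector of PC1 (direction of largest variance)."""
    n_obs, n_var = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n_obs for j in range(n_var)]
    centered = [[r[j] - means[j] for j in range(n_var)] for r in rows]
    # Sample covariance matrix of the descriptors
    cov = [[sum(c[i] * c[j] for c in centered) / (n_obs - 1)
            for j in range(n_var)] for i in range(n_var)]
    # Asymmetric start vector; power iteration fails only if it happens
    # to be exactly orthogonal to PC1
    v = [1.0 / (j + 1) for j in range(n_var)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(n_var)) for i in range(n_var)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("no variance along the current direction")
        v = [x / norm for x in w]
    return v

# Made-up data: two descriptors that rise together
data = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
print(first_principal_component(data))
```

Later components would be found the same way after removing the variance already explained, which is what keeps them orthogonal to PC1.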
Exploring QSAR
 Pick up the NONLIN program
 http://www.trinity.edu/sbachrac/drugdesign2007/
 Unzip and install it on your computer
 Read the Read.Me and Nonlin.doc documentation
 Look at the HeatForm.NLR file with any word processor
Running NONLIN
 Start an MSDOS window
 Change to directory where the code is
 Cd /d d:\nonlin
 Execute the program with data file
 Nonlin heatForm > output
Assignment
 Propose a QSAR scheme to predict the $\Delta H_f$ of the alkanes
Early Examples
 Hammett (1930s-1940s)
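$\mathrm{C_6H_5COOH} \rightleftharpoons \mathrm{C_6H_5COO^-} + \mathrm{H^+}$, equilibrium constant $K_0$

$p\text{-}\mathrm{XC_6H_4COOH} \rightleftharpoons p\text{-}\mathrm{XC_6H_4COO^-} + \mathrm{H^+}$, equilibrium constant $K_p$

$m\text{-}\mathrm{XC_6H_4COOH} \rightleftharpoons m\text{-}\mathrm{XC_6H_4COO^-} + \mathrm{H^+}$, equilibrium constant $K_m$

$\sigma_{para} = \log_{10}(K_p / K_0)$

$\sigma_{meta} = \log_{10}(K_m / K_0)$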
Hammett (cont.)
 Now suppose have a related series
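$\mathrm{XC_6H_4CH_2COOH} \rightleftharpoons \mathrm{XC_6H_4CH_2COO^-} + \mathrm{H^+}$, equilibrium constant $K'_x$

$\log_{10}(K'_x / K'_0) = \rho\sigma$

 $\sigma$ reflects sensitivity to substituent
 $\rho$ reflects sensitivity to different system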
Hammett (cont.)
 Linear Free Energy Relationship
$\Delta G = -2.303RT\log_{10}K$
So
$\Delta G - \Delta G_0 = -2.303RT\sigma$
and
$\Delta G' - \Delta G'_0 = -2.303RT\rho\sigma$
Therefore
$\Delta G' - \Delta G'_0 = \rho(\Delta G - \Delta G_0)$
Free-Wilson Analysis
 $\log(1/C) = \sum_i a_i + \mu$
where $C$ = predicted activity,
$a_i$ = contribution per group, and
$\mu$ = activity of reference
Free-Wilson example
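[Structure: activity of analogs bearing ring substituents X and Y; the drawing also shows Br, N and HCl]

$\log(1/C) = -0.30\,[m\text{-F}] + 0.21\,[m\text{-Cl}] + 0.43\,[m\text{-Br}] + 0.58\,[m\text{-I}] + 0.45\,[m\text{-Me}] + 0.34\,[p\text{-F}] + 0.77\,[p\text{-Cl}] + 1.02\,[p\text{-Br}] + 1.43\,[p\text{-I}] + 1.26\,[p\text{-Me}] + 7.82$

Problems: at least two substituent positions are necessary, and the analysis can only predict new combinations of the substituents already used in it.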
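To show how the Free-Wilson equation is applied, the sketch below evaluates Log 1/C for one combination of substituents, using the group contributions and reference value listed in the example. It assumes that an unsubstituted position (H) contributes zero; the function name is hypothetical.

```python
# Group contributions and reference term taken from the example equation
CONTRIB = {
    "m-F": -0.30, "m-Cl": 0.21, "m-Br": 0.43, "m-I": 0.58, "m-Me": 0.45,
    "p-F": 0.34, "p-Cl": 0.77, "p-Br": 1.02, "p-I": 1.43, "p-Me": 1.26,
}
MU = 7.82  # activity of the reference compound

def free_wilson_log_inv_c(substituents):
    """Predicted Log 1/C: reference term plus the contribution of each group present.

    Assumption: positions left as H are omitted from the list and add nothing.
    """
    return MU + sum(CONTRIB[s] for s in substituents)

print(round(free_wilson_log_inv_c(["m-Cl", "p-Br"]), 2))  # 9.05
```

A substituent missing from the table raises a `KeyError`, which mirrors the limitation noted above: the model cannot score groups that were not in the analysis.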
Hansch Analysis

$\log(1/C) = a\pi + b\sigma + c$

where
$\pi(X) = \log P_{RX} - \log P_{RH}$

and $\log P$ is the octanol/water partition coefficient

This is also a linear free energy relation


Molecular Descriptors
 Simple rules for describing some aspect of a molecule
 Structure
 Property
 2D descriptors only use the atoms and connection
information of the molecule
 Internal 3D descriptors use 3D coordinate information
about each molecule; however, they are invariant to
rotations and translations of the conformation
 External 3D descriptors also use 3D coordinate
information but also require an absolute frame of
reference (e.g., molecules docked into the same
receptor).
Descriptor examples
 Physical Properties
 MW
 log P (octanol/water partition)
 bp, mp
 Dipole moment
 solubility
Descriptor examples
 Structural descriptors
 2D
 Atom/Bond counts
 Number non-H atoms
 Number of rotatable bonds
 Number of each functional group
 2C chains, 3C chains, 4C chains, 5C chains, etc.
 Rings and their size
 3D
 Number of accessible conformations
 Surface area
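Topological Descriptors
 Wiener Path Index
[Structure: seven-atom carbon skeleton with atoms numbered 1-7]
Distance Matrix
0 1 2 3 4 2 3
1 0 1 2 3 1 2
2 1 0 1 2 2 1
3 2 1 0 1 3 2
4 3 2 1 0 4 3
2 1 2 3 4 0 3
3 2 1 2 3 3 0

$w = \sum_i \sum_{j>i} d_{ij}$, $w = 46$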
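The Wiener index can be computed directly from the bonding pattern by counting bonds along the shortest path between every pair of atoms. The sketch below uses a breadth-first search; the connectivity (chain 1-2-3-4-5 with atom 6 on atom 2 and atom 7 on atom 3) is inferred from the distance matrix, and the function name is hypothetical.

```python
from collections import deque

def wiener_index(adjacency):
    """Sum of shortest-path bond counts over all atom pairs (i < j)."""
    total = 0
    for start in adjacency:
        # Breadth-first search gives the bond distance from `start` to every atom
        dist = {start: 0}
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for nbr in adjacency[atom]:
                if nbr not in dist:
                    dist[nbr] = dist[atom] + 1
                    queue.append(nbr)
        # Count each pair once by keeping only atoms numbered above `start`
        total += sum(d for atom, d in dist.items() if atom > start)
    return total

# Hydrogen-suppressed skeleton inferred from the distance matrix
skeleton = {1: [2], 2: [1, 3, 6], 3: [2, 4, 7], 4: [3, 5], 5: [4], 6: [2], 7: [3]}
print(wiener_index(skeleton))  # 46
```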
Topological Descriptors
Randić Index
[Figure: worked example on a branched carbon skeleton]
 Valence at each vertex: 1, 2 or 3 in the example
 Bond values as product of the above: 3, 3, 9, 2, 6, 3
 Edge terms as reciprocal of square root of the above bond values: .577, .577, .333, .707, .408, .577
 Sum of edge terms: 3.179
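The same steps can be scripted. This sketch assumes a seven-atom skeleton whose bond values reproduce those listed (3, 3, 9, 2, 6, 3); the connectivity and the function name are assumptions, not taken from the slide.

```python
import math

def randic_index(adjacency):
    """Sum over bonds of 1 / sqrt(valence_i * valence_j)."""
    total = 0.0
    for i, neighbors in adjacency.items():
        for j in neighbors:
            if i < j:  # visit each bond once
                bond_value = len(adjacency[i]) * len(adjacency[j])
                total += 1.0 / math.sqrt(bond_value)
    return total

# Assumed skeleton: chain 1-2-3-4-5 with atom 6 on atom 2 and atom 7 on atom 3
skeleton = {1: [2], 2: [1, 3, 6], 3: [2, 4, 7], 4: [3, 5], 5: [4], 6: [2], 7: [3]}
print(round(randic_index(skeleton), 3))
```

Summing the unrounded edge terms gives about 3.181; the 3.179 quoted above comes from adding the terms after rounding each to three decimals.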
Predict bp of alkanes
[Figure: bp vs. Wiener index for alkanes, with linear fit y = 1.5225x + 7.2917, R² = 0.9547]
3D Molecular Descriptors
 Potential energy
 Solvation energy
 Water accessible surface area
 Water accessible surface area of all
atoms with positive (negative) partial
charge
Pharmacophore
 Specification of the spatial arrangement
of a small number of atoms or
functional groups
 With the model in hand, search
databases for molecules that fit this
spatial environment
Creating a Pharmacophore

[Figure: example structures highlighting their O and OH groups]
3D Pharmacophore searching
 With the pharmacophore in hand,
search databases containing 3-D
structure of molecules for molecules
that fit
 Can rank these “hits” using scoring
system described later
Pharmacophore Descriptors
 Number of acidic atoms
 Number of basic atoms
 Number of hydrogen bond donor atoms
 Number of hydrophobic atoms
 Sum of VDW surface areas of hydrophobic atoms
Lipinski’s Rule of 5
 Potential drug candidates should
 Have 5 or fewer H-bond donors (expressed as the sum of OHs and NHs)
 Have a MW < 500
 Have a log P less than 5
 Have 10 or fewer H-bond acceptors (expressed as the sum of Ns and Os)

Adv. Drug Delivery Rev., 1997, 23, 3
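The four criteria are easy to apply as a filter once the descriptors are known. The sketch below takes precomputed descriptor values and reports which rules fail, using the thresholds exactly as stated above; the function name and the example values are hypothetical.

```python
def lipinski_violations(h_donors, mol_weight, log_p, h_acceptors):
    """Return the list of Rule of 5 criteria that a molecule fails."""
    violations = []
    if h_donors > 5:
        violations.append("more than 5 H-bond donors")
    if not mol_weight < 500:
        violations.append("MW not below 500")
    if not log_p < 5:
        violations.append("log P not below 5")
    if h_acceptors > 10:
        violations.append("more than 10 H-bond acceptors")
    return violations

# Hypothetical descriptor values, for illustration only
print(lipinski_violations(h_donors=2, mol_weight=350.0, log_p=3.1, h_acceptors=6))   # []
print(lipinski_violations(h_donors=7, mol_weight=620.0, log_p=3.1, h_acceptors=6))
```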


Docking
 Interact a ligand with a receptor
 Need to do the following
 A) select appropriate ligands
 B) select appropriate conformation of receptor
 C) select appropriate conformations of ligands
 D) combine the ligand and receptor (docking)
 E) evaluate these combinations and rank order
them
Selection of Ligands
 Want drug-like molecules
 250< MW < 500
 Lipinski’s rules
 Search through databases
 Available Chemicals Directory (ACD)
 World Drug Index
 NCI Drug database
 In-house databases
Receptor Conformation
 Usually Receptor is assumed to be static
 Get structure from X-ray or NMR
experiment
 Protein Data Bank (http://www.rcsb.org/pdb/)
41385 Structures
Ligand Conformation
 Rigid or flexible
 If rigid, optimize the structure then use it
throughout the docking procedure
 If flexible, can
 A) create a set of low energy conformations and
then use this set as a collection of rigid structures
in docking
 B) optimize structure within active site of receptor,
i.e. dock and optimize together
Docking
 Place ligand in appropriate location for
interacting with the receptor
 Methodological problem:
 1) No best method for defining shape
 2) No general solution for packing irregular
objects (the knapsack problem)
Docking Algorithmic Components
 Receptor and Ligand Description (keep in mind
relative errors of structures, etc.)
 Bind the Ligand to Receptor
(configuration/conformation search)
 Geometric search (match ligand and receptor site
descriptions)
 Search for minimum energy - molecular dynamics (MD) or Monte Carlo (MC)
 Evaluation of the dock ($\Delta G_{bind}$), also called scoring
Descriptor Matching Method
DOCK program
 1) Generate molecular surface for receptor
 2) Generate spheres to fill the active site (usually 30-50 spheres)
 3) Match sphere centers to the ligand atoms (originally just the lowest-energy conformer; now multiple conformers are used, but each is still rigid). Generates 10K orientations per ligand. Shape-driven!
 4) Score the interaction
Fragment-Joining Method
FlexX, LUDI
 Place base fragments into microstates of the active site (fragments can be small molecules like benzene, formaldehyde, formamide, naphthol, etc.)
 Optimize position of the Base fragment
 Join fragments with small connecting
chains made of CH2, CO, CONH, etc.
Scoring (evaluation of the dock)
 Want to quickly evaluate the strength of
the interaction between ligand and
receptor
 Full free energy computation
 Expensive
 Requires excellent force fields
 Empirical method
 Fast and cheap
 Requires fitting to a broad set of ligand/receptor
complexes
Empirical Scoring
 Method of Bohm (LUDI, FlexX, etc.)
$\Delta G_{bind} = \Delta G_0 + \sum_{\text{h-bonds}} \Delta G_{hb}\, f(\Delta R, \Delta\alpha) + \sum_{\text{ion}} \Delta G_{ion}\, f(\Delta R, \Delta\alpha) + \Delta G_{lipo} A_{lipo} + \Delta G_{rot}\, NROT$

$\Delta G_0$: reduction in binding energy due to loss of rotation and translation of ligand
$\Delta G_{hb}$: contribution from ideal hydrogen bond
$\Delta G_{ion}$: contribution from ionic interactions
$\Delta G_{lipo}$: contribution from lipophilic interactions
$\Delta G_{rot}$: contribution from freezing rotations within ligand
Bohm Method (cont.)
 f(R,) are penalty functions for non-ideal
interactions – distances too short/long, angles
not linear
f (R,) = f1(R)f2()

f1(R) = 1, R<0.2 Å f2() = 1, <30°


1-(R-0.2)/0.4, R<0.6 Å 1-(-30)/50, <80°
0, R>0.6 Å 0, >80°

R is deviation from ideal H...O/N distance of 1.9 Å


 is deviation from ideal N/O-H…O/N angle of 180°
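The penalty functions and the empirical score can be sketched as follows. The piecewise limits come from the definitions above; the function names are hypothetical, the coefficients are passed in as arguments, and the values in the usage lines are placeholders rather than fitted parameters.

```python
def f1(delta_r):
    """Distance penalty; delta_r is the deviation from the ideal distance in angstroms."""
    delta_r = abs(delta_r)
    if delta_r <= 0.2:
        return 1.0
    if delta_r <= 0.6:
        return 1.0 - (delta_r - 0.2) / 0.4
    return 0.0

def f2(delta_alpha):
    """Angle penalty; delta_alpha is the deviation from 180 degrees."""
    delta_alpha = abs(delta_alpha)
    if delta_alpha <= 30.0:
        return 1.0
    if delta_alpha <= 80.0:
        return 1.0 - (delta_alpha - 30.0) / 50.0
    return 0.0

def penalty(delta_r, delta_alpha):
    """f(dR, d_alpha) = f1(dR) * f2(d_alpha)."""
    return f1(delta_r) * f2(delta_alpha)

def empirical_score(hbonds, ionic, a_lipo, n_rot,
                    dg0, dg_hb, dg_ion, dg_lipo, dg_rot):
    """Binding score; hbonds and ionic are lists of (delta_r, delta_alpha) pairs."""
    hb_sum = sum(penalty(dr, da) for dr, da in hbonds)
    ion_sum = sum(penalty(dr, da) for dr, da in ionic)
    return dg0 + dg_hb * hb_sum + dg_ion * ion_sum + dg_lipo * a_lipo + dg_rot * n_rot

# Placeholder coefficients and geometry, for illustration only
print(penalty(0.4, 55.0))  # 0.5 * 0.5 = 0.25
print(empirical_score([(0.0, 0.0)], [], a_lipo=0.0, n_rot=0,
                      dg0=1.0, dg_hb=-2.0, dg_ion=-3.0, dg_lipo=-0.1, dg_rot=0.5))
```

An ideal interaction keeps its full contribution, while one that is stretched or bent past the outer limits contributes nothing.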
Bohm Method (cont.)
 $A_{lipo}$ is the lipophilic contact surface, evaluated by a coarse grid of boxes
 NROT is the number of rotatable bonds: acyclic sp³-sp³, sp³-sp² and sp²-sp². No terminal groups or flexibility of rings incorporated.

H.-J. Bohm, J. Comput.-Aided Mol. Des., 1994, 8, 243-256


Scoring alternatives
 Many variations on Bohm scheme
 Buried Polar term, desolvation term, different
forms for the lipophilic term, include metal
bonding, etc.
 Combine scoring functions, i.e. QSAR with
scoring functions as variables
 Use empirical score to select set of hits, then
refine with free energy minimization
