Fundamentals of Medicinal Chemistry


A.6 Regression analysis


In medicinal chemistry, it is often desirable to obtain mathematical relationships
in the form of equations between sets of data, which have been obtained from
experimental work or calculated using theoretical considerations. Regression
analysis is a group of mathematical methods used to obtain such relationships.
The data is fed into a suitable computer program, which on execution produces
an equation that represents the line that is the best fit for that data. For example,
an investigation indicated that the relationship between the activity and the
partition coefficients of a number of related compounds appeared to be linear
(Figure A6.1). Consequently, this data could be represented mathematically in
the form of the straight line equation y = mx + c. Regression analysis would
calculate the values of m and c that gave the line of best fit to the data. When one
is dealing with a linear relationship, the analysis is usually carried out using the
method of least squares.
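As a minimal sketch of the method of least squares for a straight line, the following Python function computes m and c from paired observations using the standard closed-form expressions. The function name `least_squares_line` and the data values are hypothetical, invented purely for illustration; they are not taken from Figure A6.1 or from any real compound series.

```python
def least_squares_line(xs, ys):
    """Return (m, c) for the line y = m*x + c that minimises
    the sum of squared vertical distances to the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of squared deviations in x, and sum of cross-deviations
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    m = sxy / sxx             # slope
    c = y_mean - m * x_mean   # intercept
    return m, c


# Hypothetical values, for illustration only
log_p = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
log_inv_c = [1.1, 1.6, 1.9, 2.6, 2.9, 3.5]

m, c = least_squares_line(log_p, log_inv_c)
print(f"log 1/C = {m:.3f} log P + {c:.3f}")
```

In practice a statistics package would normally be used instead of hand-written code; the sketch is only meant to show what such a program calculates.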

Regression equations do not indicate the accuracy and spread of the data.
Consequently, they are normally accompanied by additional data, which as a
minimum requirement should include the number of observations used (n), the
standard deviation of the observations (s) and the correlation coefficient (r).
The value of the correlation coefficient is a measure of how closely the data
matches the equation. Its magnitude varies from zero to one. A value of r = 1
indicates a perfect match. In medicinal chemistry, r values greater than 0.9 are
usually regarded as representing an acceptable degree of accuracy, provided they
are obtained using a reasonable number of results with a suitable standard
deviation.
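The three accompanying statistics can be sketched in the same way. The example below assumes that s means the standard deviation of the residuals about the fitted line, calculated with n - 2 degrees of freedom, and that r is the Pearson correlation coefficient; both readings are assumptions, since the text does not give formulas. The function name `regression_summary` and the data are hypothetical.

```python
import math


def regression_summary(xs, ys):
    """Fit y = m*x + c by least squares and return n, m, c, s and r."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x or y values are all identical; r is undefined")
    m = sxy / sxx
    c = y_mean - m * x_mean
    # Assumption: s is the residual standard deviation with n - 2
    # degrees of freedom (two fitted parameters, m and c).
    ss_res = sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(ss_res / (n - 2))
    # Assumption: r is the Pearson correlation coefficient. It is signed:
    # negative when the fitted line slopes downwards.
    r = sxy / math.sqrt(sxx * syy)
    return {"n": n, "m": m, "c": c, "s": s, "r": r}


# Hypothetical values, for illustration only
log_p = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
log_inv_c = [1.1, 1.6, 1.9, 2.6, 2.9, 3.5]

summary = regression_summary(log_p, log_inv_c)
print(f"n = {summary['n']}, s = {summary['s']:.3f}, r = {summary['r']:.3f}")
```

Because this r is signed, it is its magnitude that would be compared with the 0.9 guideline, and the comparison is only meaningful alongside n and s.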

[Figure A6.1: scatter plot of log 1/C (vertical axis) against log P (horizontal axis)]

Figure A6.1 A hypothetical plot of the activity (log 1/C) of a series of compounds against the
logarithm of their partition coefficients (log P)

