Computational Drug Discovery and Design

3.4 Performance Evaluation


Various threshold-dependent and threshold-independent performance
evaluation metrics can be used to judge the performance of
machine learning algorithms (see Note 3).
Sensitivity: This can be defined as % of correctly predicted drug
targets.

\[
\text{Sensitivity} = \frac{TP}{TP + FN} \times 100 \tag{1}
\]

Specificity: This can be defined as % of correctly predicted
nondrug targets.

\[
\text{Specificity} = \frac{TN}{TN + FP} \times 100 \tag{2}
\]

Accuracy: This can be defined as the % of correctly predicted
drug targets and nondrug targets.

\[
\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \times 100 \tag{3}
\]

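The three threshold-dependent metrics in Eqs. (1)-(3) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the label encoding (1 = drug target, 0 = nondrug target) and the example labels are all assumptions made for this sketch, with TP, TN, FP and FN read in the usual way as true positives, true negatives, false positives and false negatives.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels.

    Assumed encoding (not from the text): 1 = drug target, 0 = nondrug target.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """Eq. (1): percentage of drug targets predicted correctly."""
    return 100.0 * tp / (tp + fn)


def specificity(tn, fp):
    """Eq. (2): percentage of nondrug targets predicted correctly."""
    return 100.0 * tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Eq. (3): percentage of all proteins predicted correctly."""
    return 100.0 * (tp + tn) / (tp + fp + tn + fn)


# Hypothetical labels for ten proteins (illustrative data only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)  # 3, 4, 2, 1
sens = sensitivity(tp, fn)       # 75.0
spec = specificity(tn, fp)       # about 66.67
acc = accuracy(tp, tn, fp, fn)   # 70.0
```

All three values depend on the decision threshold used to turn classifier scores into the 0/1 predictions, which is why they are called threshold dependent.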
Matthews Correlation Coefficient (MCC): This is a useful performance
evaluation metric for binary classification problems. Its
value ranges from −1 to +1 (worst to best).

\[
\text{MCC} = \frac{(TP \times TN) - (FP \times FN)}{\sqrt{(TP + FN)(TP + FP)(TN + FP)(TN + FN)}} \tag{4}
\]
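Eq. (4) translates directly into code. The sketch below is illustrative only: the function name `mcc` is hypothetical, and returning 0 when the denominator vanishes (for example, when no protein is predicted as a drug target) is a common convention assumed here, not something stated in the text.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts, Eq. (4)."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fn) * (tp + fp) * (tn + fp) * (tn + fn))
    # Assumed convention: if any marginal total is zero the ratio is
    # undefined, so report 0 (no better than chance).
    if denominator == 0:
        return 0.0
    return numerator / denominator


perfect = mcc(tp=5, tn=5, fp=0, fn=0)     # every prediction correct
inverted = mcc(tp=0, tn=0, fp=5, fn=5)    # every prediction wrong
mixed = mcc(tp=3, tn=4, fp=2, fn=1)       # 10 / sqrt(600)
```

The first two calls reproduce the extremes of the range, +1 and −1, and the third gives an intermediate value of about 0.41.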

Youden’s index: This performance evaluation metric gives an
indication about the model’s ability to avoid failures. Higher values
are better.

\[
Y = \text{Sensitivity} - (1 - \text{Specificity}) \tag{5}
\]

Area under the Curve (AUC): The area under the receiver
operating characteristic (ROC) curve is known as the AUC and can be
used to summarize the ROC curve by a single numerical quantity. Its
value ranges from 0 to 1, and it is threshold independent [39].
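One way to see why the AUC is threshold independent is to compute it from the ranking of the scores alone. The area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses this pairwise formulation; the function name `auc_from_scores` and the example scores are hypothetical, and the quadratic loop is chosen for clarity rather than speed.

```python
def auc_from_scores(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Assumed encoding (not from the text): 1 = drug target, 0 = nondrug target.
    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Hypothetical classifier scores for six proteins (illustrative data only).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = auc_from_scores(y_true, scores)  # 8 of 9 pairs ranked correctly
```

No cutoff on the scores appears anywhere in the computation, so the result summarizes the classifier across all possible thresholds. A value of 0.5 corresponds to random ranking.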
g-means: This is the geometric mean of sensitivity and
specificity.

\[
g\text{-means} = \sqrt{\text{Sensitivity} \times \text{Specificity}} \tag{6}
\]
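Youden's index and g-means both combine sensitivity and specificity into one number, which makes them informative when drug targets are much rarer than nondrug targets. The sketch below is a hedged illustration: the function names are hypothetical, and sensitivity and specificity are assumed to be passed as fractions between 0 and 1 (the percentages of Eqs. (1) and (2) divided by 100) so that the "1 − Specificity" term in Eq. (5) is on the same scale.

```python
import math


def youden_index(sensitivity, specificity):
    """Eq. (5), with both inputs assumed to be fractions in [0, 1]."""
    return sensitivity - (1.0 - specificity)


def g_means(sensitivity, specificity):
    """Eq. (6): geometric mean of sensitivity and specificity (fractions)."""
    return math.sqrt(sensitivity * specificity)


# Hypothetical imbalanced test set: 10 drug targets, 90 nondrug targets,
# and a trivial model that predicts "nondrug target" for every protein.
tp, fn, tn, fp = 0, 10, 90, 0
sens = tp / (tp + fn)                  # 0.0
spec = tn / (tn + fp)                  # 1.0
acc = (tp + tn) / (tp + fn + tn + fp)  # 0.9, looks good in isolation

y = youden_index(sens, spec)           # 0.0
g = g_means(sens, spec)                # 0.0
```

In this example accuracy is 90% even though no drug target is found, while both Youden's index and g-means drop to 0, exposing the failure on the minority class.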

3.5 Conclusion and Future Perspective


Machine learning methods have an advantage over sequence
alignment-based methods because they can take into account hidden
similarities between features when generating successful prediction
models. The sequence feature generation step should cover as much of
the chemical and genomic space as possible. Protein–protein
interaction data, notably from databases like STRING [40],
BioGRID [41] and the Human Protein Reference Database (HPRD)

Human Drug Targets and Their Interactions 27