3.4 Performance Evaluation
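Various threshold-dependent and threshold-independent performance
evaluation metrics can be used for judging the performance
of the machine learning algorithms (see Note 3). In the formulas
that follow, TP, TN, FP, and FN denote the numbers of true positives,
true negatives, false positives, and false negatives, respectively,
with drug targets as the positive class.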
Sensitivity: This can be defined as the % of correctly predicted drug targets.
$$\mathrm{Sensitivity} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}} \times 100 \tag{1}$$
Specificity: This can be defined as the % of correctly predicted nondrug targets.
$$\mathrm{Specificity} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}} \times 100 \tag{2}$$
Accuracy: This can be defined as the % of correctly predicted drug targets and nondrug targets.
$$\mathrm{Accuracy} = \frac{\mathrm{TP} + \mathrm{TN}}{\mathrm{TP} + \mathrm{FP} + \mathrm{TN} + \mathrm{FN}} \times 100 \tag{3}$$
Matthews Correlation Coefficient (MCC): This is a useful performance evaluation metric for binary classification problems. Its value ranges from -1 to +1 (worst to best).
$$\mathrm{MCC} = \frac{(\mathrm{TP} \times \mathrm{TN}) - (\mathrm{FP} \times \mathrm{FN})}{\sqrt{(\mathrm{TP} + \mathrm{FN})(\mathrm{TP} + \mathrm{FP})(\mathrm{TN} + \mathrm{FP})(\mathrm{TN} + \mathrm{FN})}} \tag{4}$$
Youden's index: This performance evaluation metric gives an indication of the model's ability to avoid failures. Higher values are better. Here sensitivity and specificity are taken as fractions between 0 and 1 rather than as percentages.
$$Y = \mathrm{Sensitivity} - (1 - \mathrm{Specificity}) \tag{5}$$
Area under the Curve (AUC): The area under the receiver operating characteristic (ROC) curve is known as the AUC and can be used to summarize the ROC curve by a single numerical quantity. Its value ranges from 0 to 1, and it is threshold independent [39].
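As an illustration of why the AUC is threshold independent, the area under the ROC curve equals the probability that a randomly chosen positive example (drug target) receives a higher score than a randomly chosen negative example (nondrug target), with ties counted as one half. The Python sketch below computes the AUC this way. The function name `roc_auc`, the labels, and the scores are all hypothetical and chosen only for illustration; the pairwise loop is simple rather than efficient.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    labels: 1 for the positive class (drug target), 0 for the negative class.
    scores: classifier scores, higher meaning "more likely positive".
    A tie between a positive and a negative score counts as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    # No classification threshold appears anywhere in this computation.
    return wins / (len(pos) * len(neg))


# Made-up example data, not taken from any real classifier.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

In this toy example, eight of the nine positive/negative pairs are ranked correctly, so the AUC is 8/9. A value of 0.5 corresponds to random ranking and 1 to perfect ranking.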
g-means: This is the geometric mean of sensitivity and specificity.
$$\text{g-means} = \sqrt{\mathrm{Sensitivity} \times \mathrm{Specificity}} \tag{6}$$
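The threshold-dependent metrics in Eqs. 1 to 6 can all be computed from the four confusion-matrix counts obtained at a chosen classification threshold. The Python sketch below is one minimal way to do this. The names `ConfusionCounts` and `threshold_metrics` and the example counts are hypothetical, and two choices are assumptions rather than part of the definitions: Youden's index and g-means are computed on fractions (0 to 1), and MCC is reported as 0.0 when its denominator is zero.

```python
import math
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Counts for a binary classifier with drug targets as the positive class."""
    tp: int  # drug targets predicted as drug targets
    tn: int  # nondrug targets predicted as nondrug targets
    fp: int  # nondrug targets predicted as drug targets
    fn: int  # drug targets predicted as nondrug targets


def threshold_metrics(c: ConfusionCounts) -> dict:
    """Evaluate Eqs. 1 to 6 for one confusion matrix.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    sens = c.tp / (c.tp + c.fn)  # sensitivity as a fraction
    spec = c.tn / (c.tn + c.fp)  # specificity as a fraction
    acc = (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)

    denom = math.sqrt(
        (c.tp + c.fn) * (c.tp + c.fp) * (c.tn + c.fp) * (c.tn + c.fn)
    )
    # Assumption: report MCC as 0.0 when the denominator is zero.
    mcc = (c.tp * c.tn - c.fp * c.fn) / denom if denom else 0.0

    return {
        "sensitivity": 100 * sens,         # Eq. 1, in %
        "specificity": 100 * spec,         # Eq. 2, in %
        "accuracy": 100 * acc,             # Eq. 3, in %
        "mcc": mcc,                        # Eq. 4, from -1 to +1
        "youden": sens - (1 - spec),       # Eq. 5, on fractions
        "g_means": math.sqrt(sens * spec), # Eq. 6, on fractions
    }


# Made-up counts for illustration only.
metrics = threshold_metrics(ConfusionCounts(tp=80, tn=90, fp=10, fn=20))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because every quantity here depends on the counts at one threshold, changing the threshold changes all six values, which is why they are usually reported alongside a threshold-independent measure such as the AUC.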
3.5 Conclusion and Future Perspective
Machine learning methods have an advantage over sequence
alignment-based methods, as they can take into account the hidden
similarities between features when generating successful prediction
models. The sequence feature generation step should cover as much
of the chemical and genomic space as possible. Protein–protein
interaction data, notably from databases like STRING [40],
BioGRID [41], and the Human Protein Reference Database (HPRD)