correct bound conformation of a ligand–protein complex (redocking), its ability to assign better scores to high-affinity ligands than to decoys (the Directory of Useful Decoys is a practical resource for obtaining such decoys), and its ability to produce scores that show some correlation with the measured affinities of known ligands.
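To make the last two checks concrete, below is a minimal sketch of decoy discrimination and score–affinity correlation, assuming numpy, scipy, and scikit-learn; all docking scores and affinity values are invented for illustration.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.metrics import roc_auc_score

# Hypothetical docking scores (lower = better, as in most scoring functions)
active_scores = np.array([-9.8, -8.7, -9.1, -7.9, -8.3])       # known actives
decoy_scores = np.array([-6.1, -5.4, -7.0, -4.9, -6.6, -5.8])  # decoys

# Decoy discrimination: does the function rank actives above decoys?
labels = np.concatenate([np.ones(len(active_scores)),
                         np.zeros(len(decoy_scores))])
scores = np.concatenate([active_scores, decoy_scores])
auc = roc_auc_score(labels, -scores)  # negate scores: higher = more active
print(f"ROC AUC, actives vs. decoys: {auc:.2f}")

# Correlation between scores and measured affinities of the known ligands
measured_pki = np.array([8.9, 7.5, 8.1, 6.8, 7.2])  # invented pKi values
rho, pval = spearmanr(-active_scores, measured_pki)
print(f"Spearman rho vs. measured affinity: {rho:.2f}")
```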
Regarding the validation of supervised machine learning techniques, it can be classified into internal and external validation. In internal validation approaches, the training set itself is used to assess model stability and predictive power; in external validation, a holdout sample fully independent of the training set is used to test predictive ability. Although a diversity of techniques is used for internal validation purposes, the most frequent are cross-validation and Y-randomization. In cross-validation, different proportions of training examples are iteratively held out from the training set used for model development; the model is then regenerated without the removed examples, and the regenerated model is applied to predict the dependent variable for the held-out compound(s). The process is repeated at least until every training compound has been removed from the training set once. When only one compound is held out in each cross-validation round, we speak of leave-one-out cross-validation. If larger subsets of training samples are removed in each round, we speak of leave-group-out, multifold cross-validation, leave-many-out cross-validation, or leave-some-out cross-validation. Obviously, the more compounds removed per cycle, the more challenging the cross-validation test. Cross-validation in general, and leave-one-out cross-validation in particular, tends to be overoptimistic.
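As an illustration of the procedures just described, the following is a minimal sketch of leave-one-out and leave-group-out cross-validation on a synthetic regression dataset, assuming scikit-learn; the descriptors, responses, and model choice are placeholders, not any particular published protocol.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, ShuffleSplit, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))  # 40 compounds, 5 molecular descriptors
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.3]) + rng.normal(scale=0.2, size=40)

model = LinearRegression()

# Leave-one-out: each compound is held out exactly once
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut(),
                             scoring="neg_mean_squared_error")
print(f"LOO mean squared error: {-loo_scores.mean():.3f}")

# Leave-group-out: repeatedly hold out a larger fraction (here 30%)
lgo = ShuffleSplit(n_splits=20, test_size=0.3, random_state=0)
lgo_scores = cross_val_score(model, X, y, cv=lgo, scoring="r2")
print(f"Leave-group-out mean R^2: {lgo_scores.mean():.3f}")
```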
Y-randomization involves scrambling the values of the experimental/observed dependent variable across the training instances, thus abolishing the relationship between the response and the molecular structure. Since the response is now randomly assigned to the training cases, poor statistical parameters are expected if the model is regenerated from the scrambled data.
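Below is a minimal sketch of Y-randomization on the same kind of synthetic data, again assuming scikit-learn: the fit obtained on the real responses is compared with fits obtained after repeatedly scrambling them; good scores on scrambled data would signal chance correlation.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))  # synthetic descriptors
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.3]) + rng.normal(scale=0.2, size=40)

true_r2 = LinearRegression().fit(X, y).score(X, y)
print(f"R^2 on real responses: {true_r2:.3f}")

# Regenerate the model many times on scrambled responses
scrambled_r2 = []
for _ in range(100):
    y_perm = rng.permutation(y)  # break the structure-response link
    scrambled_r2.append(LinearRegression().fit(X, y_perm).score(X, y_perm))
print(f"Mean R^2 after Y-randomization: {np.mean(scrambled_r2):.3f}")
```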
With regard to external validation, i.e., using an independent test set to establish the predictive power of the model, it has been regarded as the most rigorous validation step, although some conditions should be met for the results to be reliable: the test sample should be representative of the training sample, and at least 20 held-out examples are advised when the test set is randomly chosen from the dataset (if possible, at least 50). Some authors suggest that only internal validation is advisable for small (<50 examples) datasets: in that case, not only would valuable and scarce training cases be lost by resorting to external validation, but the reduced test set would also give dubious results. In that scenario, leave-group-out cross-validation using folds comprising 30% of the training set has provided robust results across several small datasets.
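For completeness, the following is a minimal sketch of external validation with a randomly chosen holdout set of the suggested size, assuming scikit-learn; the dataset is synthetic and the 50-compound test set simply follows the rule of thumb mentioned above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))  # 200 compounds, 5 descriptors
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 0.3]) + rng.normal(scale=0.2, size=200)

# Hold out 50 compounds as a fully independent external test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=50, random_state=0)

model = LinearRegression().fit(X_train, y_train)
print(f"External-test R^2: {r2_score(y_test, model.predict(X_test)):.3f}")
```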
