To evaluate supervised learning, a training-testing framework is used
in which the labeled dataset is partitioned into two parts, one for training
and the other for testing. Different approaches for evaluating supervised
learning, such as leave-one-out or k-fold cross-validation, were discussed.
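
As a minimal sketch of how these evaluation schemes partition a labeled dataset, the following Python function generates the training and testing index sets for k-fold cross-validation. The function name `k_fold_splits` and the choice to split contiguous indices without shuffling are assumptions made for illustration; in practice the instances are usually shuffled first. Setting k equal to the number of instances yields leave-one-out.

```python
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) for each of k folds over n instances.

    Hypothetical helper for illustration: folds are contiguous index ranges,
    so shuffle the data beforehand if its order is not random.
    """
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    indices = list(range(n))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


# Three folds over six instances: each instance is tested exactly once.
splits = list(k_fold_splits(6, 3))

# Leave-one-out is the special case k = n.
leave_one_out = list(k_fold_splits(4, 4))
```

Each instance appears in exactly one test set, so averaging the k test scores uses every labeled instance for evaluation once.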
Any clustering algorithm requires the selection of a distance measure.
We discussed partitional clustering algorithms, k-means in particular,
as well as methods of evaluating clustering algorithms. To
evaluate clustering algorithms, one can use clustering quality measures such
as cohesiveness, which measures how close instances are inside clusters, or
separateness, which measures how separate different clusters are from one
another. The silhouette index combines cohesiveness and separateness into
one measure.
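
To make the combination concrete, here is a rough Python sketch of a silhouette computation. It assumes plain Euclidean distance and the common convention that an instance alone in its cluster scores 0; the chapter's own formulation may differ in these details (for example, by using squared distances), and the function name `silhouette` is hypothetical. For each instance, a is its average distance to the other members of its own cluster (cohesiveness) and b is the smallest average distance to the members of any other cluster (separateness).

```python
import math


def silhouette(points, labels):
    """Average silhouette value over all instances (illustrative sketch).

    Assumes Euclidean distance and at least two clusters.
    """
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # Cohesiveness: the point's zero distance to itself adds nothing,
        # so dividing by len(own) - 1 averages over the other members only.
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # Separateness: average distance to the nearest other cluster.
        b = min(
            sum(math.dist(point, q) for q in other) / len(other)
            for other_label, other in clusters.items()
            if other_label != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(points, [0, 0, 1, 1])  # tight, well-separated clusters
bad = silhouette(points, [0, 1, 0, 1])   # clusters that mix distant points
```

Values near 1 indicate cohesive, well-separated clusters, while negative values indicate that instances are on average closer to another cluster than to their own.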
5.7 Bibliographic Notes
In addition to the data mining categories covered in this chapter, there
are other important categories in the area of data mining and machine
learning. In particular, an interesting category is semi-supervised learning.
In semi-supervised learning, the label is available for some instances,