Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

SimpleKMeans clusters data using k-means; the number of clusters is specified
by a parameter. Cobweb implements both the Cobweb algorithm for nominal
attributes and the Classit algorithm for numeric attributes. The ordering and
priority of the merging and splitting operators differ between the original
Cobweb and Classit papers (where it is somewhat ambiguous). This
implementation always compares four different ways of treating a new instance
and chooses the best: adding it to the best host, making it into a new leaf,
merging the two best hosts and adding it to the merged node, and splitting the
best host and adding it to one of the splits. Acuity and cutoff are parameters.
FarthestFirst implements the farthest-first traversal algorithm of Hochbaum
and Shmoys (1985), cited by Sanjoy Dasgupta (2002); it is a fast, simple,
approximate clusterer modeled on k-means. MakeDensityBasedClusterer is a
metaclusterer that wraps a clustering algorithm to make it return a probability
distribution and density. To each cluster it fits a discrete distribution or a
symmetric normal distribution (whose minimum standard deviation is a
parameter).
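
To make the farthest-first idea concrete, here is a minimal Python sketch. It is
not Weka's FarthestFirst code: the function name, the Euclidean distance, and
the caller-chosen first center are assumptions made for illustration. Each new
center is the point farthest from all centers chosen so far, and every point is
then assigned to its nearest center.

```python
import math

def farthest_first(points, k, first=0):
    """Choose k centers by farthest-first traversal and label every point.

    Assumptions for this sketch: points are numeric tuples, distance is
    Euclidean, and `first` is the index of the initial center.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    centers = [points[first]]
    # nearest[i] is the distance from points[i] to its closest chosen center.
    nearest = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from every center so far.
        i = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[i])
        nearest = [min(d, math.dist(p, points[i]))
                   for p, d in zip(points, nearest)]
    # Assign each point to the index of its nearest center.
    labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
              for p in points]
    return centers, labels

pts = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
centers, labels = farthest_first(pts, 3)
```

Unlike k-means, the centers are never re-estimated, so the data is scanned
once per center; that is where the speed comes from, at the cost of an
approximate clustering.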

10.7 Association-rule learners


Weka has three association-rule learners, listed in Table 10.8. Apriori
implements the Apriori algorithm (Section 4.5). It starts with a minimum support of
100% of the data items and decreases this in steps of 5% until there are at least
10 rules with the required minimum confidence of 0.9 or until the support has



Table 10.7 Clustering algorithms.

Name                       Function

EM                         Cluster using expectation maximization
Cobweb                     Implements the Cobweb and Classit clustering algorithms
FarthestFirst              Cluster using the farthest-first traversal algorithm
MakeDensityBasedClusterer  Wrap a clusterer to make it return distribution and density
SimpleKMeans               Cluster using the k-means method

Table 10.8 Association-rule learners.

Name               Function

Apriori            Find association rules using the Apriori algorithm
PredictiveApriori  Find association rules sorted by predictive accuracy
Tertius            Confirmation-guided discovery of association or classification rules
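
The support schedule that Section 10.7 describes for Apriori can be sketched in
Python as follows. This is an illustration, not Weka's code: count_rules is a
hypothetical callback standing in for the rule miner, and lower_bound is a
stopping value the caller must supply, because the passage above does not give
one. The starting support (100%), the step (5%), the target of 10 rules, and
the confidence of 0.9 are taken from the text.

```python
def find_min_support(count_rules, lower_bound, start=1.0, step=0.05,
                     wanted=10, min_confidence=0.9):
    """Lower the minimum support until enough rules are found.

    count_rules(support, confidence) is a hypothetical callback returning
    how many rules meet both thresholds. lower_bound is an assumed stopping
    value for the support. Returns (support, rules_found).
    """
    n = 0
    while True:
        # Recompute from the start value to avoid accumulating float error.
        support = round(start - n * step, 10)
        found = count_rules(support, min_confidence)
        if found >= wanted or support <= lower_bound:
            return support, found
        n += 1

# Toy stand-in for a rule miner: more rules appear as support is relaxed.
toy_counter = lambda support, confidence: int(round((1.0 - support) * 40))
support, found = find_min_support(toy_counter, lower_bound=0.1)
```

Passing the miner in as a callback keeps the schedule separate from the rule
generation itself, which is the expensive part of each iteration.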