regression to subsamples of the data and outputs the solution that has the smallest median-squared error.
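The scheme described here corresponds to Weka's LeastMedSq class (an assumption based on the description; the class lives in weka.classifiers.functions in the Weka 3.4 distribution). A minimal sketch of training it on a numeric-class ARFF file, using the cpu.arff sample dataset shipped with Weka:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import weka.classifiers.functions.LeastMedSq;
    import weka.core.Instances;

    public class LeastMedSqDemo {
      public static void main(String[] args) throws Exception {
        // Load the data; the numeric class is assumed to be the last attribute
        Instances data = new Instances(
            new BufferedReader(new FileReader("cpu.arff")));
        data.setClassIndex(data.numAttributes() - 1);
        LeastMedSq lms = new LeastMedSq();
        lms.buildClassifier(data); // fits regressions to subsamples internally
        System.out.println(lms);   // prints the chosen linear model
      }
    }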
SMO implements the sequential minimal optimization algorithm for training a support vector classifier (Section 6.3), using polynomial or Gaussian kernels (Platt 1998, Keerthi et al. 2001). Missing values are replaced globally, nominal attributes are transformed into binary ones, and attributes are normalized by default; note that the coefficients in the output are based on the normalized data. Normalization can be turned off, or the input can be standardized to zero mean and unit variance. Pairwise classification is used for multiclass problems. Logistic regression models can be fitted to the support vector machine output to obtain probability estimates. In the multiclass case the predicted probabilities are coupled pairwise (Hastie and Tibshirani 1998). When working with sparse instances, turn normalization off for faster operation. SMOreg implements the sequential minimal optimization algorithm for regression problems (Smola and Schölkopf 1998).
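As an illustration, here is a minimal sketch of training SMO with probability estimates and evaluating it by cross-validation; it assumes the Weka 3.4 class names, the setBuildLogisticModels() setter, and the iris.arff sample dataset shipped with Weka:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.functions.SMO;
    import weka.core.Instances;

    public class SMODemo {
      public static void main(String[] args) throws Exception {
        // Load the data; the class is assumed to be the last attribute
        Instances data = new Instances(
            new BufferedReader(new FileReader("iris.arff")));
        data.setClassIndex(data.numAttributes() - 1);
        SMO smo = new SMO();
        smo.setBuildLogisticModels(true); // fit logistic models for probabilities
        // Because iris has three classes, SMO trains pairwise binary classifiers
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(smo, data, 10, new Random(1));
        System.out.println(eval.toSummaryString());
      }
    }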
VotedPerceptron is the voted perceptron algorithm (Section 6.3, pages 222–223). Winnow (Section 4.6, pages 126–128) modifies the basic perceptron to use multiplicative updates. The implementation allows a second multiplier, b (different from 1/a), to be used in place of the divisions in Figure 4.11, and also provides the balanced version of the algorithm.
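To make the multiplicative update concrete, here is a minimal sketch of one Winnow update step (not Weka's weka.classifiers.functions.Winnow implementation itself), assuming binary attributes and a fixed threshold; alpha and beta play the roles of a and b above:

    // Sketch of one Winnow update on a possibly misclassified binary instance x.
    // alpha > 1 promotes; 0 < beta < 1 demotes, so multiplying by beta
    // replaces the division by alpha in Figure 4.11.
    public class WinnowSketch {
      public static void update(double[] w, int[] x, int actual,
                                double theta, double alpha, double beta) {
        double sum = 0;
        for (int i = 0; i < w.length; i++) sum += w[i] * x[i];
        int predicted = (sum > theta) ? 1 : 0;
        if (predicted == actual) return;          // mistake-driven: no change
        for (int i = 0; i < w.length; i++) {
          if (x[i] == 1) {                        // only active attributes change
            w[i] *= (actual == 1) ? alpha : beta; // promote misses, demote false alarms
          }
        }
      }
    }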
PaceRegression builds linear regression models using the new technique of Pace regression (Wang and Witten 2002). When there are many attributes, Pace regression is particularly good at determining which ones to discard; indeed, under certain regularity conditions it is provably optimal as the number of attributes tends to infinity.
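A minimal usage sketch, assuming the weka.classifiers.functions.PaceRegression class from the Weka 3.4 distribution and a numeric class attribute:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import weka.classifiers.functions.PaceRegression;
    import weka.core.Instances;

    public class PaceDemo {
      public static void main(String[] args) throws Exception {
        Instances data = new Instances(
            new BufferedReader(new FileReader("cpu.arff")));
        data.setClassIndex(data.numAttributes() - 1);
        PaceRegression pace = new PaceRegression();
        pace.buildClassifier(data);
        // Attributes judged uninformative are effectively discarded from the model
        System.out.println(pace);
      }
    }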
SimpleLogistic builds logistic regression models (Section 4.6, pages 121–124), fitting them using LogitBoost with simple regression functions as base learners and determining how many iterations to perform using cross-validation, a process that supports automatic attribute selection (Landwehr et al. 2003). Logistic is an alternative implementation for building and using a multinomial logistic regression model with a ridge estimator to guard against overfitting by penalizing large coefficients, based on work by le Cessie and van Houwelingen (1992).
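For example, a ridge-penalized model can be fitted as follows (a minimal sketch assuming Weka's Logistic class and its setRidge() setter, with the diabetes.arff sample dataset shipped with Weka):

    import java.io.BufferedReader;
    import java.io.FileReader;
    import weka.classifiers.functions.Logistic;
    import weka.core.Instances;

    public class LogisticDemo {
      public static void main(String[] args) throws Exception {
        Instances data = new Instances(
            new BufferedReader(new FileReader("diabetes.arff")));
        data.setClassIndex(data.numAttributes() - 1);
        Logistic logistic = new Logistic();
        logistic.setRidge(1e-8); // larger values penalize large coefficients more
        logistic.buildClassifier(data);
        System.out.println(logistic); // prints the fitted coefficients
      }
    }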
RBFNetwork implements a Gaussian radial basis function network (Section 6.3, page 234), deriving the centers and widths of hidden units using k-means and combining the outputs obtained from the hidden layer using logistic regression if the class is nominal and linear regression if it is numeric. The activations of the basis functions are normalized to sum to one before they are fed into the linear models. You can specify k, the number of clusters; the maximum number of logistic regression iterations for nominal-class problems; the minimum standard deviation for the clusters; and the ridge value for regression. If the class is nominal, k-means is applied separately to each class to derive k clusters for each class.
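A sketch of setting these parameters programmatically; the setter names below follow the Weka 3.4 API and should be treated as assumptions if you are using another release:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import weka.classifiers.functions.RBFNetwork;
    import weka.core.Instances;

    public class RBFDemo {
      public static void main(String[] args) throws Exception {
        Instances data = new Instances(
            new BufferedReader(new FileReader("iris.arff")));
        data.setClassIndex(data.numAttributes() - 1);
        RBFNetwork rbf = new RBFNetwork();
        rbf.setNumClusters(3); // k: clusters (basis functions) per class here
        rbf.setMinStdDev(0.1); // minimum width of each Gaussian
        rbf.setRidge(1e-8);    // ridge value for the logistic/linear model
        rbf.buildClassifier(data);
        System.out.println(rbf);
      }
    }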
