Data Mining: Practical Machine Learning Tools and Techniques, Second Edition


522 INDEX


support vector, 216
support vector machine, 39, 188, 214, 340
support vector machine (SVM) classifier, 341
support vector machines with Gaussian kernels, 234
support vector regression, 219–222
surrogate splitting, 247
SVMAttributeEval, 423
SVM classifier (Support Vector Machine), 341
SwapValues, 398
SymmetricalUncertAttributeEval, 423
symmetric uncertainty, 291
systematic data errors, 59–60

T
tabular input format, 119
TAN (Tree Augmented Naïve Bayes), 279
television preferences/channels, 28–29
tenfold cross-validation, 150, 151
Tertius, 420
test set, 145
TestSetMaker, 431
text mining, 351–356
text summarization, 352
text to attribute vectors, 309–311
TextViewer, 430
TF × IDF, 311
theory, 180
threat detection systems, 357
3-point average recall, 172
threefold cross-validation, 150
ThresholdSelector, 418
time series, 311
TimeSeriesDelta, 400
TimeSeriesTranslate, 396, 399–400
timestamp, 311
TN (True Negatives), 162
tokenization, 310
tokenization in Weka, 399
top-down induction of decision trees, 105
toSource(), 453
toString(), 453, 481, 483
toy problems. See example problems
TP (True Positives), 162

training and testing, 144–146
training set, 296
TrainingSetMaker, 431
TrainTestSplitMaker, 431
transformations. See attribute transformations
transforming a multiclass problem into a two-class one, 334–335
tree
  AD (All Dimensions), 280–283
  alternating decision, 329, 330, 343
  ball, 133–135
  decision. See decision tree
  logistic model, 331
  metric, 136
  model, 76, 243. See also model tree
  numeric prediction, 76
  option, 328–331
  regression, 76, 243
Tree Augmented Naïve Bayes (TAN), 279
tree classifier in Weka, 404, 406–408
tree diagrams, 82
Trees (subpackages), 451, 453
Tree Visualizer, 389, 390
true negative (TN), 162
true positive (TP), 162
true positive rate, 162–163
True positive rate, 378
t-statistic, 156
t-test, 154
TV preferences/channels, 28–29
two-class mixture model, 264
two-class problem, 73
two-tailed test, 156
two-way split, 63
typographic errors, 59

U
ubiquitous data mining, 358–361
unacceptable contracts, 17
Unclassified instances, 377
Undo, 383
unit, 224
univariate decision tree, 199
universal language, 32

