Data Mining: Practical Machine Learning Tools and Techniques, Second Edition


INDEX


sigmoid kernel, 219
Simple CLI, 371, 449, 450
SimpleKMeans, 418–419
simple linear regression, 326
SimpleLinearRegression, 409
SimpleLogistic, 410
simplest-first ordering, 34
simplicity-first methodology, 83, 183
single-attribute evaluators in Weka, 421, 422–423
single-consequent rules, 118
single holdout procedure, 150
sister-of-relation, 46–47
SMO, 410
smoothing
    locally weighted linear regression, 252
    model tree, 244, 251
SMOreg, 410
software programs. See Weka workbench
sorting, avoiding repeated, 190
soybean data, 18–22
spam, 356–357
sparse data, 55–56
sparse instance in Weka, 401
SparseToNonSparse, 401
specificity, 173
specific-to-general search bias, 34
splitData(), 480
splitter nodes, 329
splitting
    clustering, 254–255, 257
    decision tree, 62–63
    entropy-based discretization, 301
    massive datasets, 347
    model tree, 245, 247
    subexperiments, 447
    surrogate, 247
SpreadSubsample, 403
squared-error loss function, 227
squared error measures, 177–179
stacked generalization, 332
stacking, 332–334
Stacking, 417
StackingC, 417
stale data, 60

standard deviation reduction (SDR), 245
standard deviations from the mean, 148
Standardize, 398
standardizing, 56
statistical modeling, 88–97
    document classification, 94–96
    missing values, 92–94
    normal-distribution assumption, 92
    numeric attributes, 92–94
statistics, 29–30
Statusbox, 380
step function, 227, 228
stochastic algorithms, 348
stochastic backpropagation, 232
stopping criterion, 293, 300, 326
stopwords, 310, 352
stratification, 149, 151
stratified holdout, 149
StratifiedRemoveFolds, 403
stratified cross-validation, 149
StreamableFilter, 456
string attributes, 54–55
string conversion in Weka, 399
string table, 55
StringToNominal, 399
StringToWordVector, 396, 399, 401, 462
StripChart, 431
structural patterns, 6
structure learning by conditional independence tests, 280
student’s distribution with k–1 degrees of freedom, 155
student’s t-test, 154, 184
subexperiments, 447
subsampling in Weka, 400
subset evaluators in Weka, 421, 422
subtree raising, 193, 197
subtree replacement, 192–193, 197
success rate, 173
supervised attribute filters in Weka, 402–403
supervised discretization, 297, 298
supervised filters in Weka, 401–403
supervised instance filters in Weka, 402, 403
supervised learning, 43
support, 69, 113
