Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

INDEX 521

sigmoid kernel, 219
Simple CLI, 371, 449, 450
SimpleKMeans, 418–419
simple linear regression, 326
SimpleLinearRegression, 409
SimpleLogistic, 410
simplest-first ordering, 34
simplicity-first methodology, 83, 183
single-attribute evaluators in Weka, 421, 422–423
single-consequent rules, 118
single holdout procedure, 150
sister-of-relation, 46–47
SMO, 410
smoothing
  locally weighted linear regression, 252
  model tree, 244, 251
SMOreg, 410
software programs. See Weka workbench
sorting, avoiding repeated, 190
soybean data, 18–22
spam, 356–357
sparse data, 55–56
sparse instance in Weka, 401
SparseToNonSparse, 401
specificity, 173
specific-to-general search bias, 34
splitData(), 480
splitter nodes, 329
splitting
  clustering, 254–255, 257
  decision tree, 62–63
  entropy-based discretization, 301
  massive datasets, 347
  model tree, 245, 247
  subexperiments, 447
  surrogate, 247
SpreadSubsample, 403
squared-error loss function, 227
squared error measures, 177–179
stacked generalization, 332
stacking, 332–334
Stacking, 417
StackingC, 417
stale data, 60

standard deviation reduction (SDR), 245
standard deviations from the mean, 148
Standardize, 398
standardizing, 56
statistical modeling, 88–97
  document classification, 94–96
  missing values, 92–94
  normal-distribution assumption, 92
  numeric attributes, 92–94
statistics, 29–30
Statusbox, 380
step function, 227, 228
stochastic algorithms, 348
stochastic backpropagation, 232
stopping criterion, 293, 300, 326
stopwords, 310, 352
stratification, 149, 151
stratified holdout, 149
StratifiedRemoveFolds, 403
stratified cross-validation, 149
StreamableFilter, 456
string attributes, 54–55
string conversion in Weka, 399
string table, 55
StringToNominal, 399
StringToWordVector, 396, 399, 401, 462
StripChart, 431
structural patterns, 6
structure learning by conditional independence tests, 280
student’s distribution with k–1 degrees of freedom, 155
student’s t-test, 154, 184
subexperiments, 447
subsampling in Weka, 400
subset evaluators in Weka, 421, 422
subtree raising, 193, 197
subtree replacement, 192–193, 197
success rate, 173
supervised attribute filters in Weka, 402–403
supervised discretization, 297, 298
supervised filters in Weka, 401–403
supervised instance filters in Weka, 402, 403
supervised learning, 43
support, 69, 113
