Data Mining: Practical Machine Learning Tools and Techniques, Second Edition


506 INDEX


anomaly detection systems, 357
antecedent, of rule, 65
AODE, 405
Apriori, 419
Apriori method, 141
area under the curve (AUC), 173
ARFF format, 53–55
    converting files to, 380–382
    Weka, 370, 371
ARFFLoader, 381, 427
arithmetic underflow, 276
assembling the data, 52–53
assessing performance of learning scheme, 286
assignment of key phrases, 353
Associate panel, 392
association learning, 43
association rules, 69–70, 112–119
    binary attributes, 119
    generating rules efficiently, 117–118
    item sets, 113, 114–115
    Weka, 419–420
association-rule learners in Weka, 419–420
attackers, 357
Attribute, 451
attribute(), 480
attribute discretization. See discretizing numeric attributes
attribute-efficient, 128
attribute evaluation methods in Weka, 421, 422–423
attribute filters in Weka, 394, 395–400, 402–403
attributeIndices, 382
attribute noise, 313
attribute-relation file format. See ARFF format
attributes, 49–52
    adding irrelevant, 288
    Boolean, 51
    class, 53
    as columns in tables, 49
    combinations of, 65
    continuous, 49
    discrete, 51
    enumerated, 51
    highly branching, 86
    identification code, 86
    independent, 267
    integer-valued, 49
    nominal, 49
    non-numeric, 17
    numeric, 49
    ordinal, 51
    relevant, 289
    rogue, 59
    selecting, 288
    subsets of values in, 80
    types in ARFF format, 56
    weighting, 237
    See also orderings
AttributeSelectedClassifier, 417
attribute selection, 286–287, 288–296
    attribute evaluation methods in Weka, 421, 422–423
    backward elimination, 292, 294
    beam search, 293
    best-first search, 293
    forward selection, 292, 294
    race search, 295
    schemata search, 295
    scheme-independent selection, 290–292
    scheme-specific selection, 294–296
    searching the attribute space, 292–294
    search methods in Weka, 421, 423–425
    Weka, 392–393, 420–425
AttributeSelection, 403
attribute subset evaluators in Weka, 421, 422
AttributeSummarizer, 431
attribute transformations, 287, 305–311
    principal components analysis, 306–309
    random projections, 309
    text to attribute vectors, 309–311
    time series, 311
attribute weighting method, 237–238
AUC (area under the curve), 173
audit logs, 357
authorship ascription, 353
AutoClass, 269–270, 271
automatic data cleansing, 287, 312–315
    anomalies, 314–315
    improving decision trees, 312–313
    robust regression, 313–314
