Data Mining: Practical Machine Learning Tools and Techniques, Second Edition

INDEX

anomaly detection systems, 357
antecedent, of rule, 65
AODE, 405
Apriori, 419
Apriori method, 141
area under the curve (AUC), 173
ARFF format, 53–55
    converting files to, 380–382
    Weka, 370, 371
ARFFLoader, 381, 427
arithmetic underflow, 276
assembling the data, 52–53
assessing performance of learning scheme, 286
assignment of key phrases, 353
Associate panel, 392
association learning, 43
association rules, 69–70, 112–119
    binary attributes, 119
    generating rules efficiently, 117–118
    item sets, 113, 114–115
    Weka, 419–420
association-rule learners in Weka, 419–420
attackers, 357
Attribute, 451
attribute(), 480
attribute discretization. See discretizing numeric attributes
attribute-efficient, 128
attribute evaluation methods in Weka, 421, 422–423
attribute filters in Weka, 394, 395–400, 402–403
attributeIndices, 382
attribute noise, 313
attribute-relation file format. See ARFF format
attributes, 49–52
    adding irrelevant, 288
    Boolean, 51
    class, 53
    as columns in tables, 49
    combinations of, 65
    continuous, 49
    discrete, 51
    enumerated, 51
    highly branching, 86
    identification code, 86
    independent, 267
    integer-valued, 49
    nominal, 49
    non-numeric, 17
    numeric, 49
    ordinal, 51
    relevant, 289
    rogue, 59
    selecting, 288
    subsets of values in, 80
    types in ARFF format, 56
    weighting, 237
    See also orderings
AttributeSelectedClassifier, 417
attribute selection, 286–287, 288–296
    attribute evaluation methods in Weka, 421, 422–423
    backward elimination, 292, 294
    beam search, 293
    best-first search, 293
    forward selection, 292, 294
    race search, 295
    schemata search, 295
    scheme-independent selection, 290–292
    scheme-specific selection, 294–296
    searching the attribute space, 292–294
    search methods in Weka, 421, 423–425
    Weka, 392–393, 420–425
AttributeSelection, 403
attribute subset evaluators in Weka, 421, 422
AttributeSummarizer, 431
attribute transformations, 287, 305–311
    principal components analysis, 306–309
    random projections, 309
    text to attribute vectors, 309–311
    time series, 311
attribute weighting method, 237–238
AUC (area under the curve), 173
audit logs, 357
authorship ascription, 353
AutoClass, 269–270, 271
automatic data cleansing, 287, 312–315
    anomalies, 314–315
    improving decision trees, 312–313
    robust regression, 313–314


