Data Mining: Practical Machine Learning Tools and Techniques, Second Edition


CONTENTS



  • 1.4 Machine learning and statistics

  • 1.5 Generalization as search

    • Enumerating the concept space

    • Bias



  • 1.6 Data mining and ethics

  • 1.7 Further reading

• 2 Input: Concepts, instances, and attributes



  • 2.1 What’s a concept?

  • 2.2 What’s in an example?

  • 2.3 What’s in an attribute?

  • 2.4 Preparing the input

    • Gathering the data together

    • ARFF format

    • Sparse data

    • Attribute types

    • Missing values

    • Inaccurate values

    • Getting to know your data



  • 2.5 Further reading

• 3 Output: Knowledge representation



  • 3.1 Decision tables

  • 3.2 Decision trees

  • 3.3 Classification rules

  • 3.4 Association rules

  • 3.5 Rules with exceptions

  • 3.6 Rules involving relations

  • 3.7 Trees for numeric prediction

  • 3.8 Instance-based representation

  • 3.9 Clusters

  • 3.10 Further reading

• 4 Algorithms: The basic methods



  • 4.1 Inferring rudimentary rules

    • Missing values and numeric attributes

    • Discussion



  • 4.2 Statistical modeling

    • Missing values and numeric attributes

    • Bayesian models for document classification

    • Discussion



  • 4.3 Divide-and-conquer: Constructing decision trees

    • Calculating information

    • Highly branching attributes

    • Discussion



  • 4.4 Covering algorithms: Constructing rules

    • Rules versus trees

    • A simple covering algorithm

    • Rules versus decision lists



  • 4.5 Mining association rules

    • Item sets

    • Association rules

    • Generating rules efficiently

    • Discussion



  • 4.6 Linear models

    • Numeric prediction: Linear regression

    • Linear classification: Logistic regression

    • Linear classification using the perceptron

    • Linear classification using Winnow



  • 4.7 Instance-based learning

    • The distance function

    • Finding nearest neighbors efficiently

    • Discussion



  • 4.8 Clustering

    • Iterative distance-based clustering

    • Faster distance calculations

    • Discussion



  • 4.9 Further reading

• 5 Credibility: Evaluating what’s been learned

  • 5.1 Training and testing

  • 5.2 Predicting performance

  • 5.3 Cross-validation

  • 5.4 Other estimates

    • Leave-one-out

    • The bootstrap



  • 5.5 Comparing data mining methods

  • 5.6 Predicting probabilities

    • Quadratic loss function

    • Informational loss function

    • Discussion



  • 5.7 Counting the cost

    • Cost-sensitive classification

    • Cost-sensitive learning

    • Lift charts

    • ROC curves

    • Recall–precision curves

    • Discussion

    • Cost curves



  • 5.8 Evaluating numeric prediction

  • 5.9 The minimum description length principle



  • 5.10 Applying the MDL principle to clustering

  • 5.11 Further reading

• 6 Implementations: Real machine learning schemes

  • 6.1 Decision trees

    • Numeric attributes

    • Missing values

    • Pruning

    • Estimating error rates

    • Complexity of decision tree induction

    • From trees to rules

    • C4.5: Choices and options

    • Discussion



  • 6.2 Classification rules

    • Criteria for choosing tests

    • Missing values, numeric attributes



    • Generating good rules

    • Using global optimization

    • Obtaining rules from partial decision trees

    • Rules with exceptions

    • Discussion



  • 6.3 Extending linear models

    • The maximum margin hyperplane

    • Nonlinear class boundaries

    • Support vector regression

    • The kernel perceptron

    • Multilayer perceptrons

    • Discussion



  • 6.4 Instance-based learning

    • Reducing the number of exemplars

    • Pruning noisy exemplars

    • Weighting attributes

    • Generalizing exemplars

    • Distance functions for generalized exemplars

    • Generalized distance functions

    • Discussion



  • 6.5 Numeric prediction

    • Model trees

    • Building the tree

    • Pruning the tree

    • Nominal attributes

    • Missing values

    • Pseudocode for model tree induction

    • Rules from model trees

    • Locally weighted linear regression

    • Discussion



  • 6.6 Clustering

    • Choosing the number of clusters

    • Incremental clustering

    • Category utility

    • Probability-based clustering

    • The EM algorithm

    • Extending the mixture model

    • Bayesian clustering

    • Discussion



  • 6.7 Bayesian networks

    • Making predictions

    • Learning Bayesian networks

    • Specific algorithms

    • Data structures for fast learning

    • Discussion



• 7 Transformations: Engineering the input and output



  • 7.1 Attribute selection

    • Scheme-independent selection

    • Searching the attribute space

    • Scheme-specific selection



  • 7.2 Discretizing numeric attributes

    • Unsupervised discretization

    • Entropy-based discretization

    • Other discretization methods

    • Entropy-based versus error-based discretization

    • Converting discrete to numeric attributes



  • 7.3 Some useful transformations

    • Principal components analysis

    • Random projections

    • Text to attribute vectors

    • Time series



  • 7.4 Automatic data cleansing

    • Improving decision trees

    • Robust regression

    • Detecting anomalies



  • 7.5 Combining multiple models

    • Bagging

    • Bagging with costs

    • Randomization

    • Boosting

    • Additive regression

    • Additive logistic regression

    • Option trees

    • Logistic model trees

    • Stacking

    • Error-correcting output codes



  • 7.6 Using unlabeled data

    • Clustering for classification

    • Co-training

    • EM and co-training



  • 7.7 Further reading

• 8 Moving on: Extensions and applications

  • 8.1 Learning from massive datasets

  • 8.2 Incorporating domain knowledge

  • 8.3 Text and Web mining

  • 8.4 Adversarial situations

  • 8.5 Ubiquitous data mining

  • 8.6 Further reading



• Part II The Weka machine learning workbench

• 9 Introduction to Weka

  • 9.1 What’s in Weka?

  • 9.2 How do you use it?

  • 9.3 What else can you do?

  • 9.4 How do you get it?

• 10 The Explorer

  • 10.1 Getting started

    • Preparing the data

    • Loading the data into the Explorer

    • Building a decision tree

    • Examining the output

    • Doing it again

    • Working with models

    • When things go wrong



  • 10.2 Exploring the Explorer

    • Loading and filtering files

    • Training and testing learning schemes

    • Do it yourself: The User Classifier

    • Using a metalearner

    • Clustering and association rules

    • Attribute selection

    • Visualization



  • 10.3 Filtering algorithms

    • Unsupervised attribute filters

    • Unsupervised instance filters

    • Supervised filters





  • 10.4 Learning algorithms

    • Bayesian classifiers

    • Trees

    • Rules

    • Functions

    • Lazy classifiers

    • Miscellaneous classifiers



  • 10.5 Metalearning algorithms

    • Bagging and randomization

    • Boosting

    • Combining classifiers

    • Cost-sensitive learning

    • Optimizing performance

    • Retargeting classifiers for different tasks



  • 10.6 Clustering algorithms

  • 10.7 Association-rule learners

  • 10.8 Attribute selection

    • Attribute subset evaluators

    • Single-attribute evaluators

    • Search methods

• 11 The Knowledge Flow interface



  • 11.1 Getting started

  • 11.2 The Knowledge Flow components

  • 11.3 Configuring and connecting the components

  • 11.4 Incremental learning

• 12 The Experimenter



  • 12.1 Getting started

    • Running an experiment

    • Analyzing the results



  • 12.2 Simple setup

  • 12.3 Advanced setup

  • 12.4 The Analyze panel

  • 12.5 Distributing processing over several machines

• 13 The command-line interface



  • 13.1 Getting started

  • 13.2 The structure of Weka

    • Classes, instances, and packages

    • The weka.core package

    • The weka.classifiers package

    • Other packages

    • Javadoc indices



  • 13.3 Command-line options

    • Generic options

    • Scheme-specific options

• 14 Embedded machine learning



  • 14.1 A simple data mining application

  • 14.2 Going through the code

    • main()

    • MessageClassifier()

    • updateData()

    • classifyMessage()

• 15 Writing new learning schemes



  • 15.1 An example classifier

    • buildClassifier()

    • makeTree()

    • computeInfoGain()

    • classifyInstance()

    • main()



  • 15.2 Conventions for implementing classifiers

• References

• Index

• About the authors


