Data Mining: Practical Machine Learning Tools and Techniques, Second Edition
tionships among more people would require a larger table. Relationships in which the maximum number of people is not specified i ...
The real drawbacks of such techniques, however, are that they do not cope well with noisy data, and they tend to be so slow as t ...
instances have different features? If the instances were transportation vehicles, then number of wheels is a feature that applie ...
Notice that the distinction between nominal and ordinal quantities is not always straightforward and obvious. Indeed, the very e ...
which has only two members—often designated as trueand false,or yesand no in the weather data. Such attributes are sometimes cal ...
cleaned up. The idea of company wide database integration is known as data warehousing.Data warehouses provide a single consiste ...
54 CHAPTER 2| INPUT: CONCEPTS, INSTANCES, AND ATTRIBUTES % ARFF file for the weather data with some numeric features % @relation ...
missing values in this dataset). The attribute specifications in ARFF files allow the dataset to be checked to ensure that it co ...
matter how big the shopping expedition, customers never purchase more than a tiny portion of the items a store offers. The marke ...
dividing by the range between the maximum and the minimum values. Another normalization technique is to calculate the statistica ...
which is a more compact, and hence more satisfactory, way of saying the same thing. Missing values Most datasets encountered in ...
a record of which values are “missing” is all that is needed for a complete diagnosis—the actual values can be ignored completel ...
buyer card when the customer does not supply one, either so that the customer can get a discount that would otherwise be unavail ...
Most of the techniques in this book produce easily comprehensible descriptions of the structural patterns in the data. Before lo ...
relations among the values of the attributes of different instances. Special forms of trees can be used for numeric prediction, ...
Alternatively, a three-way split may be used, in which case there are several dif- ferent possibilities. Ifmissing valueis treat ...
64 CHAPTER 3| OUTPUT: KNOWLEDGE REPRESENTATION (a) (b) Figure 3.1Constructing a decision tree interactively: (a) creating a rect ...
nated by only two virginicas); the right-hand one contains predominantly two types (Iris setosa and virginica,contaminated by on ...
Then it is necessary to break the symmetry and choose a single test for the root node. If, for example,ais chosen, the second ru ...
In this example the rules are not notably more compact than the tree. In fact, they are just what you would get by reading rules ...
«
1
2
3
4
5
6
7
8
9
10
»
Free download pdf