Data Mining: Practical Machine Learning Tools and Techniques, Second Edition
CHAPTER 6 IMPLEMENTATIONS: REAL MACHINE LEARNING SCHEMES

where f(x) is the network’s prediction obtained from the output un ...
6.3 EXTENDING LINEAR MODELS

Gradient descent exploits information given by the derivative of the function that is to be mini ...
The learning rate determines the step size and hence how quickly the search converges. If it is too large and the error function ...
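The role the learning rate plays can be seen in a minimal sketch (not from the book) of gradient descent on the one-dimensional function f(w) = w², whose derivative is 2w; the function, starting point, and step counts below are illustrative assumptions.

```python
# Minimal sketch: gradient descent on f(w) = w^2, derivative 2w.
# The learning rate scales each step taken against the derivative.
def gradient_descent(start, learning_rate, steps):
    w = start
    for _ in range(steps):
        w -= learning_rate * 2 * w  # move opposite the gradient
    return w
```

With a moderate rate such as 0.1 the search converges toward the minimum at w = 0; with a rate above 1.0 every step overshoots the minimum and the search diverges, exactly the behavior the text warns about.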
So far so good. But all this assumes that there is no hidden layer. With a hidden layer, things ...
Furthermore, this means that we are finished. Putting everything together yields an equation for the derivative of the error fun ...
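The chain-rule derivation above can be sketched in code. The following is an illustrative implementation, assuming a network of my own choosing: two inputs, one hidden layer of sigmoid units, a single sigmoid output, and the squared error E = ½(f(x) − y)²; the function names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(w_hidden, w_out, x):
    # one hidden layer of sigmoid units feeding a sigmoid output unit
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    f = sigmoid(sum(w * hi for w, hi in zip(w_out, h)))
    return f, h

def gradients(w_hidden, w_out, x, y):
    # chain rule for E = 0.5 * (f(x) - y)^2
    f, h = forward(w_hidden, w_out, x)
    d_out = (f - y) * f * (1 - f)                  # error signal at the output
    g_out = [d_out * hi for hi in h]               # dE/dw for output weights
    g_hidden = [[d_out * w_out[j] * h[j] * (1 - h[j]) * xi for xi in x]
                for j in range(len(h))]            # dE/dw for hidden weights
    return g_hidden, g_out
```

Each hidden-weight derivative multiplies the output error signal by the output weight on that hidden unit, the hidden unit's sigmoid derivative, and the input value, which is the backpropagation chain described in the text.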
minimum. It can be used for online learning, in which new data arrives in a continuous stream an ...
Radial basis function networks

Another popular type of feedforward network is the radial basis function (RBF) network. It has tw ...
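The prediction an RBF network makes can be sketched as follows. This is a hedged illustration, not the book's implementation: it assumes Gaussian basis functions with fixed centers and a single shared width, combined by a learned linear output layer; how the centers are placed and the weights fitted is not shown.

```python
import math

# Sketch of an RBF network's two layers: Gaussian basis functions
# with fixed centers and a shared width, combined linearly.
def rbf_predict(x, centers, width, weights, bias):
    activations = [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c))
                            / (2 * width ** 2))
                   for c in centers]
    return bias + sum(w * a for w, a in zip(weights, activations))
```

At one of the centers the corresponding basis function fires with activation 1 while distant centers contribute almost nothing, so the prediction there is approximately the bias plus that center's weight.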
6.4 INSTANCE-BASED LEARNING

Discussion

Support vector machines originated from research in statistical learning theory (Vapn ...
extreme case, when some attributes are completely irrelevant—because all attributes contribute equally to the distance formula. ...
the exemplar set. If its performance exceeds the upper threshold, it is used for predicting the ...
evant to the outcome. Such domains, however, are the exception rather than the rule. In most domains some attributes are irrelev ...
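The problem of irrelevant attributes swamping the distance calculation can be illustrated with a small sketch (my own, not the book's code): a Euclidean distance that takes per-attribute relevance weights, so an irrelevant attribute can be given weight 0.

```python
import math

# Euclidean distance with per-attribute relevance weights, so an
# irrelevant attribute (weight 0) no longer swamps the comparison.
def weighted_distance(a, b, weights):
    return math.sqrt(sum(w * (x - y) ** 2
                         for w, x, y in zip(weights, a, b)))

def classify_1nn(query, instances, weights):
    # instances: list of (attribute_vector, class_label) pairs
    nearest = min(instances,
                  key=lambda inst: weighted_distance(query, inst[0], weights))
    return nearest[1]
```

With equal weights a noisy second attribute can pull the nearest neighbor to the wrong class; zeroing its weight restores the classification the relevant attribute supports.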
is necessary to modify the distance function as described below to allow the distance to a hyp ...
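One standard way to modify the distance function for generalized exemplars is to measure the distance from an instance to an axis-parallel hyperrectangle, taking it as zero for instances inside the rectangle. The sketch below assumes that convention; the function name is hypothetical.

```python
import math

# Distance from a point to an axis-parallel hyperrectangle given by
# per-attribute lower and upper bounds; zero for points inside it.
def distance_to_rectangle(x, lower, upper):
    total = 0.0
    for xi, lo, hi in zip(x, lower, upper):
        if xi < lo:
            total += (lo - xi) ** 2   # below the rectangle on this axis
        elif xi > hi:
            total += (xi - hi) ** 2   # above the rectangle on this axis
    return math.sqrt(total)
```

Only the attributes on which the point falls outside the rectangle's bounds contribute, so the distance is to the nearest face, edge, or corner.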
Figure 6.14 shows the implicit boundaries that are formed between two rectangular classes if the distance metric is adjusted t ...
third region is where the boundary meets the lower border of the larger rectangle when project ...
boundaries rather than the hard-edged cutoff implied by the k-nearest-neighbor rule, in which any particular example is either “ ...
6.5 NUMERIC PREDICTION

eralization is. Salzberg (1991) suggested that generalization with nested exemplars can achieve a h ...
Following an extensive description of model trees, we briefly explain how to generate rules from model trees, and then describe ...
Building the tree

The splitting criterion is used to determine which attribute is the best to split t ...
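A common splitting criterion for model trees (used, for example, by M5) is standard deviation reduction, SDR = sd(T) − Σᵢ |Tᵢ|/|T| · sd(Tᵢ), where T is the set of instances at the node and the Tᵢ are the subsets a split produces. A sketch under that assumption:

```python
import math

def stdev(values):
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

# Standard deviation reduction: how much a candidate split lowers the
# spread of the target values, weighting each subset by its size.
def sdr(parent, subsets):
    return stdev(parent) - sum(len(s) / len(parent) * stdev(s)
                               for s in subsets)
```

A split that separates the target values into homogeneous groups reduces the weighted subset deviations to zero and scores the full parent deviation, whereas a split that leaves each subset as mixed as the parent scores zero.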
The expected error for test data at a node is calculated as described previously, using the linear model for prediction. Becau ...
where m is the number of instances without missing values for that attribute, and T is the set of insta ...