Understanding Machine Learning: From Theory to Algorithms
17.4 Ranking

In our case, y'_i − y'_j = ⟨w, x_i − x_j⟩. It follows that we can use the hinge loss upper bound as follows: 1[sign(y_i − ...
Multiclass, Ranking, and Complex Prediction Problems

equivalent to solving the problem argmin_{v ∈ V} Σ_{i=1}^{r} (α_i v_i + β_i D(v_i)), where ...
17.5 Bipartite Ranking and Multivariate Performance Measures

The following claim states that every doubly stochastic matrix ...
problem stems from the inadequacy of the zero-one loss for what we are ...
F_β = (1+β²)a / ((1+β²)a + b + β²c). Again, we set θ = 0, and the ...
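The F_β computation from the counts a, b, c can be sketched directly; a minimal example, assuming the chapter's convention that a, b, and c count true positives, false positives, and false negatives respectively:

```python
def f_beta(a, b, c, beta=1.0):
    """F_beta from counts: a = true positives, b = false positives,
    c = false negatives (this reading of a, b, c is an assumption)."""
    denom = (1 + beta**2) * a + b + beta**2 * c
    return (1 + beta**2) * a / denom if denom > 0 else 0.0
```

With beta = 1 this reduces to the usual F1 score, the harmonic mean of precision and recall.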
Once we have defined b as in Equation (17.13), we can easily derive a con ...
17.6 Summary

Solving Equation (17.14)
input: (x_1, ..., x_r), (y_1, ..., y_r), w, V, Δ
assumptions: Δ is a function of a, b, c, d
V co ...
in (Daniely et al. 2011, Daniely, Sabato & Shalev-Shwartz 2012). See also ...
17.8 Exercises

Multiclass Batch Perceptron
Input: A training set (x_1, y_1), ..., (x_m, y_m)
A class-sensitive feature mapping Ψ ...
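The Multiclass Batch Perceptron of the exercise can be sketched as follows: repeatedly find a misclassified example and update w by Ψ(x, y) − Ψ(x, ŷ). The block construction used for Ψ here is one illustrative choice of class-sensitive feature mapping, not necessarily the one the exercise intends:

```python
import numpy as np

def psi(x, y, k):
    """Class-sensitive feature map: place x in the y-th block of a k*d vector."""
    d = len(x)
    v = np.zeros(k * d)
    v[y * d:(y + 1) * d] = x
    return v

def multiclass_batch_perceptron(X, Y, k, max_iter=1000):
    """While some example is misclassified, update w with
    psi(x, y_true) - psi(x, y_pred); stop when the data is fitted."""
    d = X.shape[1]
    w = np.zeros(k * d)
    for _ in range(max_iter):
        mistake = False
        for x, y in zip(X, Y):
            y_hat = max(range(k), key=lambda yy: w @ psi(x, yy, k))
            if y_hat != y:
                w += psi(x, y, k) - psi(x, y_hat, k)
                mistake = True
                break
        if not mistake:
            break
    return w
```

On linearly separable data (in the sense of the class-sensitive mapping) the loop terminates with a vector w that classifies every training example correctly.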
18 Decision Trees

A decision tree is a predictor, h : X → Y, that predicts the label associated with an instance x by traveling from a ...
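The root-to-leaf prediction procedure can be sketched with a hypothetical node representation (the Node class and its fields are illustrative, not from the book):

```python
class Node:
    """Hypothetical tree node: internal nodes hold (feature, threshold),
    leaves hold a label and leave feature/threshold unset."""
    def __init__(self, label=None, feature=None, threshold=None, left=None, right=None):
        self.label, self.feature, self.threshold = label, feature, threshold
        self.left, self.right = left, right

def predict(node, x):
    """Travel from the root to a leaf, following the splitting rule at each
    internal node, and return the leaf's label."""
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label
```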
18.1 Sample Complexity

A popular splitting rule at internal nodes of the tree is based on thresholding ...
Overall, there are d + 3 options, hence we need log₂(d + 3) bits to describe each block. Assuming each internal ...
18.2 Decision Tree Algorithms

and therefore all splitting rules are of the form 1[x_i = 1] for some feature i ∈ [d]. We discuss ...
Information Gain: Another popular gain measure that is used in the ID3 and C4.5 algorithms of Quinlan (1993) ...
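For binary features and binary labels, the information-gain computation can be sketched as the entropy of the labels minus the weighted entropy of the labels on each side of the split (a standard formulation; the exact notation here is an assumption):

```python
import math

def entropy(p):
    """Binary entropy: -p log2 p - (1-p) log2 (1-p), with 0 log 0 = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def information_gain(xs, ys):
    """Gain of splitting on a binary feature xs for binary labels ys:
    label entropy minus the split-weighted entropies."""
    n = len(ys)
    gain = entropy(sum(ys) / n)
    for v in (0, 1):
        side = [y for x, y in zip(xs, ys) if x == v]
        if side:
            gain -= (len(side) / n) * entropy(sum(side) / len(side))
    return gain
```

A perfectly informative feature yields gain 1 bit; a feature independent of the labels yields gain 0.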
18.3 Random Forests

18.2.3 Threshold-Based Splitting Rules for Real-Valued Features

In the previous section we have described ...
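A common way to implement threshold-based splitting for a real-valued feature is to sort the values and test only thresholds between consecutive distinct values; a minimal sketch, scoring each threshold by the training error of majority-label predictions (an illustrative choice of gain, not necessarily the measure used in the text):

```python
def best_threshold(values, labels):
    """Return the threshold (and its error) that best splits a single
    real-valued feature for binary labels, scanning midpoints between
    consecutive distinct sorted values."""
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    candidates = [(xs[i] + xs[i + 1]) / 2
                  for i in range(len(xs) - 1) if xs[i] != xs[i + 1]]
    best, best_err = None, float("inf")
    for theta in candidates:
        left = [y for v, y in pairs if v <= theta]
        right = [y for v, y in pairs if v > theta]
        # errors of predicting the majority label on each side
        err = (min(sum(left), len(left) - sum(left))
               + min(sum(right), len(right) - sum(right)))
        if err < best_err:
            best, best_err = theta, err
    return best, best_err
```

Only m − 1 candidate thresholds need to be examined for m points, since any two thresholds between the same pair of consecutive values induce the same split.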
the algorithm A grows a decision tree (e.g., using the ID3 algorithm) based on the sample S′, where at each spli ...
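The random-forest construction — a bootstrap sample S′ of S plus a random subset of k features at each split, with prediction by majority vote over the trees — can be sketched as follows. For brevity the base learner here is a single decision stump rather than a full ID3 tree, so this is an illustrative sketch, not the book's exact procedure:

```python
import random
from collections import Counter

def train_stump(S, features):
    """Stand-in base learner (the text grows, e.g., an ID3 tree): pick the
    single feature/threshold split over `features` with fewest training errors."""
    best = None
    for j in features:
        for theta in sorted({x[j] for x, _ in S}):
            for left_lab, right_lab in ((0, 1), (1, 0)):
                err = sum((left_lab if x[j] <= theta else right_lab) != y
                          for x, y in S)
                if best is None or err < best[0]:
                    best = (err, j, theta, left_lab, right_lab)
    _, j, theta, ll, rl = best
    return lambda x: ll if x[j] <= theta else rl

def random_forest(S, num_trees=25, k=1, seed=0):
    """Train each tree on a bootstrap sample S' restricted to a random
    subset of k features; predict by majority vote."""
    rng = random.Random(seed)
    d = len(S[0][0])
    trees = []
    for _ in range(num_trees):
        boot = [rng.choice(S) for _ in S]   # sample |S| points with replacement
        feats = rng.sample(range(d), k)     # random feature subset
        trees.append(train_stump(boot, feats))
    return lambda x: Counter(t(x) for t in trees).most_common(1)[0][0]
```

The two sources of randomness (bootstrap resampling and feature subsampling) decorrelate the trees, which is what makes the majority vote more stable than any single tree.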
18.6 Exercises

Suppose we run the ID3 algorithm up to depth 2 (namely, we pick the root node and its children according to ...
19 Nearest Neighbor

Nearest Neighbor algorithms are among the simplest of all machine learning algorithms. The idea is to memorize ...
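The memorize-and-compare idea of the 1-NN rule can be sketched in a few lines, assuming the Euclidean distance (one common choice of metric):

```python
import math

def one_nn(S):
    """1-NN rule: store the training set S of (point, label) pairs; label a
    new point by the label of its nearest neighbor in Euclidean distance."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    def predict(x):
        _, label = min(S, key=lambda p: dist(p[0], x))
        return label
    return predict
```

There is no training beyond storing S; all the work happens at prediction time, which is why the method is often called a lazy learner.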
19.2 Analysis

Figure 19.1: An illustration of the decision boundaries of the 1-NN rule. The points depicted are the sample poi ...
goes to infinity, and the rate of convergence depends on the underlying distribution. As we have argued i ...