Understanding Machine Learning: From Theory to Algorithms

448 Index

forward greedy selection, 360
frequentist, 353
gain, 253
GD, see gradient descent
generalization error, 35
generative models, 342
Gini index, 254
Glivenko-Cantelli, 58
gradient, 158
gradient descent, 185
Gram matrix, 219
growth function, 73
halfspace, 118
    homogenous, 118, 205
    non-separable, 119
    separable, 118
Halving, 289
hidden layers, 270
Hilbert space, 217
Hoeffding’s inequality, 56, 425
hold out, 146
hypothesis, 34
hypothesis class, 36
i.i.d., 38
ID3, 252
improper, see representation independent
inductive bias, see bias
information bottleneck, 317
information gain, 254
instance, 33
instance space, 33
integral image, 143
Johnson-Lindenstrauss lemma, 329
k-means, 311, 313
    soft k-means, 352
k-median, 312
k-medoids, 312
Kendall tau, 239
kernel PCA, 326
kernels, 215
    Gaussian kernel, 220
    kernel trick, 217
    polynomial kernel, 220
    RBF kernel, 220
label, 33
Lasso, 365, 386
    generalization bounds, 386
latent variables, 348
LDA, 347
Ldim, 290, 291
learning curves, 153
least squares, 124
likelihood ratio, 348
linear discriminant analysis, see LDA
linear predictor, 117
    homogenous, 118
linear programming, 119
linear regression, 122
linkage, 310
Lipschitzness, 160, 176, 191
    sub-gradient, 190
Littlestone dimension, see Ldim
local minimum, 158
logistic regression, 126
loss, 35
loss function, 48
    0-1 loss, 48, 167
    absolute value loss, 124, 128, 166
    convex loss, 163
    generalized hinge-loss, 233
    hinge loss, 167
    Lipschitz loss, 166
    log-loss, 345
    logistic loss, 127
    ramp loss, 209
    smooth loss, 166
    square loss, 48
    surrogate loss, 167, 302
margin, 203
Markov’s inequality, 422
Massart lemma, 380
max linkage, 310
maximum a-posteriori, 355
maximum likelihood, 343
McDiarmid’s inequality, 378
MDL, 89, 90, 251
measure concentration, 55, 422
Minimum Description Length, see MDL
mistake bound, 288
mixture of Gaussians, 348
model selection, 144, 147
multiclass, 47, 227, 402
    cost-sensitive, 232
    linear predictors, 230, 405
    multi-vector, 231, 406
    Perceptron, 248
    reductions, 227, 405
    SGD, 235
    SVM, 234
multivariate performance measures, 243
Naive Bayes, 347
Natarajan dimension, 402
NDCG, 239
Nearest Neighbor, 258
    k-NN, 258
neural networks, 268
    feedforward networks, 269
    layered networks, 269
    SGD, 277
no-free-lunch, 61
non-uniform learning, 84
