Understanding Machine Learning: From Theory to Algorithms

Index

Normalized Discounted Cumulative Gain, see NDCG
Occam’s razor, 91
OMP, 360
one-vs-all, 227
one-vs-rest, see one-vs-all
one-vs.-all, 404
online convex optimization, 300
online gradient descent, 300
online learning, 287
optimization error, 168
oracle inequality, 179
orthogonal matching pursuit, see OMP
overfitting, 35, 65, 152
PAC, 43
    agnostic PAC, 45, 46
    agnostic PAC for general loss, 49
PAC-Bayes, 415
parametric density estimation, 342
PCA, 324
Pearson’s correlation coefficient, 359
Perceptron, 120
    kernelized Perceptron, 225
    multiclass, 248
    online, 301
permutation matrix, 242
polynomial regression, 125
precision, 244
predictor, 34
prefix free language, 89
Principal Component Analysis, see PCA
prior knowledge, 63
Probably Approximately Correct, see PAC
projection, 193
projection lemma, 193
proper, 49
pruning, 254
Rademacher complexity, 375
random forests, 255
random projections, 329
ranking, 238
    bipartite, 243
realizability, 37
recall, 244
regression, 47, 122, 172
regularization, 171
    Tikhonov, 172, 174
regularized loss minimization, see RLM
representation independent, 49, 107
representative sample, 54, 375
representer theorem, 218
ridge regression, 172
    kernel ridge regression, 225
RIP, 331
risk, 35, 45, 48
RLM, 171, 199


sample complexity, 44
Sauer’s lemma, 73
self-boundedness, 162
sensitivity, 244
SGD, 190
shattering, 69, 403
single linkage, 310
Singular Value Decomposition, see SVD
Slud’s inequality, 428
smoothness, 162, 177, 198
SOA, 292
sparsity-inducing norms, 363
specificity, 244
spectral clustering, 315
SRM, 85, 145
stability, 173
Stochastic Gradient Descent, see SGD
strong learning, 132
Structural Risk Minimization, see SRM
structured output prediction, 236
sub-gradient, 188
Support Vector Machines, see SVM
SVD, 431
SVM, 202, 383
    duality, 211
    generalization bounds, 208, 383
    hard-SVM, 203, 204
    homogenous, 205
    kernel trick, 217
    soft-SVM, 206
    support vectors, 210
target set, 47
term-frequency, 231
TF-IDF, 231
training error, 35
training set, 33
true error, 35, 45
underfitting, 65, 152
uniform convergence, 54, 55
union bound, 39
unsupervised learning, 308
validation, 144, 146
    cross validation, 149
    train-validation-test split, 150
Vapnik-Chervonenkis dimension, see VC dimension
VC dimension, 67, 70
version space, 289
Viola-Jones, 139
weak learning, 130, 131
Weighted-Majority, 295