Index
1-armed bandit, 9, 68, 116
    Bayesian, 413–417
χ-squared distance, 184, 194
σ-algebra, 20
    restriction of, 41
a.s., 34
Abel sum, 498
absolutely continuous, 38
action space, 476
adapted, 27
AdaUCB, 124, 434
admissible, 394
affine hull, 453
almost surely, 34
anytime, 90
    Exp3, 325
    Exp3-IX, 163
    Exp4, 220
    MOSS, 124
    UCB, 112
arithmetic coding, 178
Assouad’s method, 194
asymptotic optimality
    k-armed lower bounds, 197–198
    k-armed upper bounds, 112, 124, 131
    best arm identification, 378–384
    linear bandits, 280–284
    partial monitoring, 473
    ranking, 374
    Thompson sampling, 442
Bachelier-Lévy formula, 126
bandits with expert advice, 217
Bayesian bandit environment, 403
Bayesian optimal policy, 405
    1-armed bandit, 414
    discounted bandit, 420
Bayesian regret, 59, 405
Bayesian upper confidence bound algorithm, 442
Bellman optimality equation, 481
Bernoulli bandit, 56, 116
Bernoulli distribution, 33
Bernstein’s inequality, 82, 501
    empirical, 106, 109
beta distribution, 400
bias-variance tradeoff, 213
bits, 178
Borel space, 45
Borel-measurable functions, 22
boundary, 298
Bregman divergence, 292–293, 310
Brownian motion, 125, 361
canonical bandit model, 188
    k-armed stochastic, 62
    Bayesian, 408
    contextual, 67
    infinite-armed stochastic, 64
Carathéodory’s extension theorem, 24
cardinal optimization, 389
cascade model, 364
categorical distribution, 83, 457, 491
Catoni’s estimator, 111
cell decomposition, 453
Cesàro sum, 481, 498
Chernoff bound, 128, 372
click model, 363
closed set, 298
complement, 20