Bandit Algorithms

Index

1-armed bandit, 9, 68, 116
  Bayesian, 413–417
χ-squared distance, 184, 194
σ-algebra, 20
  restriction of, 41
a.s., 34
Abel sum, 498
absolutely continuous, 38
action space, 476
adapted, 27
AdaUCB, 124, 434
admissible, 394
affine hull, 453
almost surely, 34
anytime, 90
  Exp3, 325
  Exp3-IX, 163
  Exp4, 220
  MOSS, 124
  UCB, 112
arithmetic coding, 178
Assouad’s method, 194
asymptotic optimality
  k-armed lower bounds, 197–198
  k-armed upper bounds, 112, 124, 131
  best arm identification, 378–384
  linear bandits, 280–284
  partial monitoring, 473
  ranking, 374
  Thompson sampling, 442
Bachelier–Lévy formula, 126
bandits with expert advice, 217
Bayesian bandit environment, 403
Bayesian optimal policy, 405

  1-armed bandit, 414
  discounted bandit, 420
Bayesian regret, 59, 405
Bayesian upper confidence bound algorithm, 442
Bellman optimality equation, 481
Bernoulli bandit, 56, 116
Bernoulli distribution, 33
Bernstein’s inequality, 82, 501
  empirical, 106, 109
beta distribution, 400
bias-variance tradeoff, 213
bits, 178
Borel space, 45
Borel-measurable functions, 22
boundary, 298
Bregman divergence, 292–293, 310
Brownian motion, 125, 361
canonical bandit model, 188
  k-armed stochastic, 62
  Bayesian, 408
  contextual, 67
  infinite-armed stochastic, 64
Carathéodory’s extension theorem, 24
cardinal optimization, 389
cascade model, 364
categorical distribution, 83, 457, 491
Catoni’s estimator, 111
cell decomposition, 453
Cesàro sum, 481, 498
Chernoff bound, 128, 372
click model, 363
closed set, 298
complement, 20
