Bandit Algorithms

BIBLIOGRAPHY 514

Y. Abbasi-Yadkori, P. Bartlett, V. Gabillon, A. Malek, and M. Valko. Best of
both worlds: Stochastic & adversarial best-arm identification. In Conference
on Learning Theory, 2018. [390]
N. Abe and P. M. Long. Associative reinforcement learning using linear
probabilistic concepts. In Proceedings of the 16th International Conference on
Machine Learning, ICML, pages 3–11, San Francisco, CA, USA, 1999. Morgan
Kaufmann Publishers Inc. [235]
M. Abeille and A. Lazaric. Linear Thompson sampling revisited. In A. Singh and
J. Zhu, editors, Proceedings of the 20th International Conference on Artificial
Intelligence and Statistics, volume 54 of Proceedings of Machine Learning
Research, pages 176–184, Fort Lauderdale, FL, USA, 20–22 Apr 2017a. PMLR.
[445]
M. Abeille and A. Lazaric. Thompson sampling for linear-quadratic control
problems. In A. Singh and J. Zhu, editors, Proceedings of the 20th International
Conference on Artificial Intelligence and Statistics, volume 54 of Proceedings
of Machine Learning Research, pages 1246–1254, Fort Lauderdale, FL, USA,
20–22 Apr 2017b. PMLR. [502]
J. D. Abernethy and A. Rakhlin. Beating the adaptive bandit with high probability.
In COLT, 2009. [164, 323]
J. D. Abernethy, E. Hazan, and A. Rakhlin. Competing in the dark: An efficient
algorithm for bandit linear optimization. In Proceedings of the 21st Annual
Conference on Learning Theory, pages 263–274. Omnipress, 2008. [323]
J. D. Abernethy, E. Hazan, and A. Rakhlin. Interior-point methods for
full-information and bandit online learning. IEEE Transactions on Information
Theory, 58(7):4164–4175, 2012. [164, 320]
J. D. Abernethy, C. Lee, A. Sinha, and A. Tewari. Online linear optimization
via smoothing. In M. F. Balcan, V. Feldman, and Cs. Szepesvári, editors,
Proceedings of The 27th Conference on Learning Theory, volume 35 of
Proceedings of Machine Learning Research, pages 807–823, Barcelona, Spain,
13–15 Jun 2014. PMLR. [350]
J. D. Abernethy, C. Lee, and A. Tewari. Fighting bandits with a new kind of
smoothness. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and
R. Garnett, editors, Advances in Neural Information Processing Systems 28,
NIPS, pages 2197–2205. Curran Associates, Inc., 2015. [323, 350]
M. Abramowitz and I. A. Stegun. Handbook of mathematical functions: with
formulas, graphs, and mathematical tables, volume 55. Courier Corporation,

[176, 446]
L. Adelman. Choice theory. In Saul I. Gass and Michael C. Fu, editors,
Encyclopedia of Operations Research and Management Science, pages 164–

Springer US, Boston, MA, 2013. [65]
A. Agarwal, D. P. Foster, D. J. Hsu, S. M. Kakade, and A. Rakhlin. Stochastic
convex optimization with bandit feedback. In J. Shawe-Taylor, R. S. Zemel,
P. L. Bartlett, F. Pereira, and K. Q. Weinberger, editors, Advances in Neural
