Bandit Algorithms

BIBLIOGRAPHY 540

A. Sani, A. Lazaric, and R. Munos. Risk-aversion in multi-armed bandits.
In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors,
Advances in Neural Information Processing Systems 25, pages 3275–3283.
Curran Associates, Inc., 2012. [66]
Y. Seldin and G. Lugosi. An improved parametrization and analysis of the
EXP3++ algorithm for stochastic and adversarial bandits. In S. Kale and
O. Shamir, editors, Proceedings of the 2017 Conference on Learning Theory,
volume 65 of Proceedings of Machine Learning Research, pages 1743–1759,
Amsterdam, Netherlands, 07–10 Jul 2017. PMLR. [153]
Y. Seldin and A. Slivkins. One practical algorithm for both stochastic and
adversarial bandits. In E. P. Xing and T. Jebara, editors, Proceedings of the
31st International Conference on Machine Learning, volume 32 of Proceedings
of Machine Learning Research, pages 1287–1295, Beijing, China, 22–24 Jun
2014. PMLR. [153]
S. Shalev-Shwartz. Online learning: Theory, algorithms, and applications. PhD
thesis, The Hebrew University of Jerusalem, 2007. [322]
S. Shalev-Shwartz and S. Ben-David. Understanding Machine Learning: From
Theory to Algorithms. Cambridge University Press, 2009. [222, 223, 226]
S. Shalev-Shwartz and Y. Singer. A primal-dual perspective of online learning
algorithms. Machine Learning, 69(2-3):115–142, 2007. [322]
O. Shamir. On the complexity of bandit and derivative-free stochastic convex
optimization. In S. Shalev-Shwartz and I. Steinwart, editors, COLT, volume 30
of JMLR Workshop and Conference Proceedings, pages 3–24. JMLR.org, 2013.
[338, 388]
O. Shamir. On the complexity of bandit linear optimization. In P. Grünwald,
E. Hazan, and S. Kale, editors, Proceedings of The 28th Conference on Learning
Theory, volume 40 of Proceedings of Machine Learning Research, pages
1523–1551, Paris, France, 03–06 Jul 2015. PMLR. [278, 334]
T. Sharot. The optimism bias. Current Biology, 21(23):R941–R945, 2011a. [105,
106]
T. Sharot. The optimism bias: A tour of the irrationally positive brain.
Pantheon/Random House, 2011b. [106]
D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche,
J. Schrittwieser, I. Antonoglou, V. Panneershelvam, and M. Lanctot. Mastering
the game of Go with deep neural networks and tree search. Nature, 529(7587):
484–489, 2016. [8]
S. D. Silvey and B. Sibson. Discussion of Dr. Wynn’s and of Dr. Laycock’s papers.
Journal of Royal Statistical Society (B), 34:174–175, 1972. [255]
M. Sion. On general minimax theorems. Pacific Journal of Mathematics, 8(1):
171–176, 1958. [322]
A. Slivkins. Contextual bandits with similarity information. Journal of Machine
Learning Research, 15(1):2533–2568, 2014. [337]
A. Slivkins. Introduction to Multi-Armed Bandits. TBD, 2018. [15, 337]
