Bandit Algorithms

BIBLIOGRAPHY 522

O. Catoni. Challenging the empirical mean and empirical variance: a deviation
study. Annales de l’Institut Henri Poincaré, Probabilités et Statistiques, 48(4):
1148–1185, 2012. [111]
N. Cesa-Bianchi and G. Lugosi. Prediction, learning, and games. Cambridge
University Press, 2006. [15, 152, 322, 360, 472]
N. Cesa-Bianchi and G. Lugosi. Combinatorial bandits. Journal of Computer
and System Sciences, 78(5):1404–1422, 2012. [350]
N. Cesa-Bianchi, G. Lugosi, and G. Stoltz. Regret minimization under partial
monitoring. Mathematics of Operations Research, 31:562–580, 2006. [472]
N. Cesa-Bianchi, C. Gentile, Y. Mansour, and A. Minora. Delay and cooperation
in nonstochastic bandits. In Vitaly Feldman, Alexander Rakhlin, and Ohad
Shamir, editors, 29th Annual Conference on Learning Theory, volume 49 of
Proceedings of Machine Learning Research, pages 605–622, Columbia University,
New York, New York, USA, 23–26 Jun 2016. PMLR. [339]
N. Cesa-Bianchi, C. Gentile, G. Lugosi, and G. Neu. Boltzmann exploration
done right. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus,
S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information
Processing Systems 30, pages 6284–6293. Curran Associates, Inc., 2017. [91]
J. Chakravorty and A. Mahajan. Multi-armed bandits, Gittins index, and its
calculation. Methods and Applications of Statistics in Clinical Trials: Planning,
Analysis, and Inferential Methods, Volume 2, pages 416–435, 2013. [427]
J. Chakravorty and A. Mahajan. Multi-armed bandits, Gittins index, and its
calculation. Methods and Applications of Statistics in Clinical Trials: Planning,
Analysis, and Inferential Methods, Volume 2, pages 416–435, 2014. [427]
H. P. Chan and T. L. Lai. Sequential generalized likelihood ratios and adaptive
treatment allocation for optimal sequential selection. Sequential Analysis, 25:
179–201, 2006. [389]
J. T. Chang and D. Pollard. Conditioning as disintegration. Statistica Neerlandica,
51(3):287–317, 1997. [407]
O. Chapelle and L. Li. An empirical evaluation of Thompson sampling. In
J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira, and K. Q. Weinberger,
editors, Advances in Neural Information Processing Systems 24, NIPS, pages
2249–2257. Curran Associates, Inc., 2011. [444]
Chun-Hung Chen, Jianwu Lin, Enver Yücesan, and Stephen E. Chick. Simulation
budget allocation for further enhancing the efficiency of ordinal optimization.
Discrete Event Dynamic Systems, 10(3):251–270, 2000. [389]
S. Chen, T. Lin, I. King, M. R. Lyu, and W. Chen. Combinatorial pure exploration
of multi-armed bandits. In Z. Ghahramani, M. Welling, C. Cortes, N. D.
Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information
Processing Systems 27, pages 379–387. Curran Associates, Inc., 2014. [388]
W. Chen, Y. Wang, and Y. Yuan. Combinatorial multi-armed bandit: General
framework and applications. In International Conference on Machine Learning,
pages 151–159, 2013. [350]
