Bandit Algorithms


BIBLIOGRAPHY


M. Zoghi, S. Whiteson, R. Munos, and M. Rijke. Relative upper confidence
bound for the k-armed dueling bandit problem. In E. P. Xing and T. Jebara,
editors, Proceedings of the 31st International Conference on Machine Learning,
volume 32 of Proceedings of Machine Learning Research, pages 10–18, Beijing,
China, 22–24 Jun 2014. PMLR. [337]
M. Zoghi, Z. Karnin, S. Whiteson, and M. Rijke. Copeland dueling bandits. In
C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors,
Advances in Neural Information Processing Systems 28, NIPS, pages 307–315.
Curran Associates, Inc., 2015. [337]
M. Zoghi, T. Tunys, M. Ghavamzadeh, B. Kveton, Cs. Szepesvári, and Z. Wen.
Online learning to rank in stochastic click models. In Proceedings of the 34th
International Conference on Machine Learning, volume 70 of PMLR, pages
4199–4208, 2017. [374]
S. Zong, H. Ni, K. Sung, R. N. Ke, Z. Wen, and B. Kveton. Cascading bandits for
large-scale recommendation problems. In Proceedings of the 32nd Conference
on Uncertainty in Artificial Intelligence, UAI, 2016. [372, 374]
