Bandit Algorithms


BIBLIOGRAPHY


S. Gerchinovitz. Sparsity regret bounds for individual sequences in online linear
regression. Journal of Machine Learning Research, 14(Mar):729–769, 2013.
[269]
S. Gerchinovitz and T. Lattimore. Refined lower bounds for adversarial bandits.
In D. D. Lee, M. Sugiyama, U. V. Luxburg, I. Guyon, and R. Garnett, editors,
Advances in Neural Information Processing Systems 29, NIPS, pages 1198–1206.
Curran Associates, Inc., 2016. [193, 209]
S. Ghosal and A. van der Vaart. Fundamentals of nonparametric Bayesian
inference, volume 44. Cambridge University Press, 2017. [407]
J. Gittins. Bandit processes and dynamic allocation indices. Journal of the Royal
Statistical Society. Series B (Methodological), 41(2):148–177, 1979. [360, 426,
427]
J. Gittins, K. Glazebrook, and R. Weber. Multi-armed bandit allocation indices.
John Wiley & Sons, 2011. [15, 360, 426]
P. Glynn and S. Juneja. Ordinal optimization – empirical large deviations
rate estimators, and stochastic multi-armed bandits. arXiv preprint
arXiv:1507.04564, 2015. [390]
D. Goldsman. Ranking and selection in simulation. In 15th conference on Winter
Simulation, pages 387–394, 1983. [389]
A. Gopalan and S. Mannor. Thompson sampling for learning parameterized
Markov decision processes. In P. Grünwald, E. Hazan, and S. Kale, editors,
Proceedings of The 28th Conference on Learning Theory, volume 40 of
Proceedings of Machine Learning Research, pages 861–898, Paris, France, 03–06
Jul 2015. PMLR. [444]
T. Graepel, J. Q. Candela, T. Borchert, and R. Herbrich. Web-scale Bayesian
click-through rate prediction for sponsored search advertising in Microsoft’s
Bing search engine. In Proceedings of the 27th International Conference on
Machine Learning, ICML, pages 13–20, USA, 2010.
Omnipress. [444]
O. Granmo. Solving two-armed Bernoulli bandit problems using a Bayesian
learning automaton. International Journal of Intelligent Computing and
Cybernetics, 3(2):207–234, 2010. [444]
R. M. Gray. Entropy and information theory. Springer Science & Business Media,
2011. [185]
K. Greenewald, A. Tewari, S. Murphy, and P. Klasnja. Action centered contextual
bandits. In I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus,
S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information
Processing Systems 30, pages 5977–5985. Curran Associates, Inc., 2017. [16]
M. Grötschel, L. Lovász, and A. Schrijver. Geometric algorithms and combinatorial
optimization, volume 2. Springer Science & Business Media, 2012. [255, 500]
F. Guo, C. Liu, and Y. M. Wang. Efficient multiple-click models in web search.
In Proceedings of the 2nd ACM International Conference on Web Search and
Data Mining, pages 124–131. ACM, 2009. [374]
