Bandit Algorithms

35.2 Optimal stopping ( ) 412

Theorem35.1.For any priorQ,

lim sup n→∞

BR∗n(Q) n^1 /^2

= 0.

Furthermore, there exists a priorQsuch that for al lε > 0 ,

lim infn→∞

BR∗n(Q) n^1 /^2 −ε

=∞.

The lower bound has a worst-case flavor in the sense it holds for a specific prior. The prior that yields the lower bound is a little unnatural because it assigns the overwhelming majority of its mass to bandits with small suboptimality gaps. In particular,Q({ν∈E: ∆min(ν)≤∆})≥c/log(1/∆) for some constantc >0. For more regular priors the Bayesian optimal regret satisfiesBR∗n(Q) = Θ(log^2 (n)). See the bibliographic remarks for pointers to the literature.

35.2 Optimal stopping ( )

Let (Ut)nt=1be a sequence of random variables adapted to filtrationF= (Ft)nt=1. Optimal stopping is concerned with finding solutions to optimization problems of the following form:

sup τ∈Rn 1

E[Uτ], (35.1)

whereRn 1 is the set ofF-stopping timesτ with 1≤τ≤n. Whennis finite the situation is conceptually straightforward. The idea is to usebackwards inductionto define the expected optimal utility conditioned on the information inFtstarting fromt=nand working backwards tot= 1. TheSnell envelope is a sequence of random variables (Et)nt=1defined by

Et=

{

Un, ift=n; max{Ut,E[Et+1|Ft]}, otherwise.

Theorem35.2. Assume thatnis finite andUtis integrable for allt∈[n]. Then the stopping timeτ=min{t∈[n] :Ut=Et}achieves the supremum in Eq.(35.1).

Backwards induction is not directly applicable when the horizon is infinite. There are several standard ways around this problem. The most convenient for our purposes is to introduce a Markov structure. Let{Px:x∈S}be a Markov kernel from a Borel space (S,G) to itself andu:S →Ris aG/B(R)-measurable function. Let (St,Ut)∞t=1be a sequence of random elements taking values in (S×R,G⊗B(R)) such that the following hold:

(a)P(St+1∈·|S 1 ,...,St) =PSt(·) almost surely. (b)Ut=u(St).

Bandit Algorithms

= 0.

=∞.

35.2 Optimal stopping ( )

{

Get our desktop app

Company

Features

Documentation

Resources