Bandit Algorithms

(Jeff_L) #1
20.1 Martingales and the method of mixtures 243

− 5 0 5

0

1

s= 1

− 5 0 5

0

1

s= 3

Figure 20.1The plots depict Laplace’s approximation withf(x) =cos(x)exp(−x^2 /20),
which is maximized atx 0 = 0 and hasq=−f′′(x 0 ) = 11/10. The solid line is a plot of
exp(sf(x))/exp(sf(x 0 )) and the dotted line is exp(−sq(x−x 0 )^2 ).

whereR(x) =o((x−x 0 )^2 ). Under appropriate technical assumptions,

Is∼exp(sf(x 0 ))

∫b

a

exp

(



sq(x−x 0 )^2
2

)


dx ass→∞.

Furthermore, assgets large
∫b

a

exp

(



sq(x−x 0 )^2
2

)


dx∼

∫∞


−∞

exp

(



sq(x−x 0 )^2
2

)


dx=


2 π
sq
and hence

Is∼exp(sf(x 0 ))


2 π
sq

.


It should also be clear that the fact that we integrate with respect the Lebesgue
measure does not matter much. We could have integrated with respect to any
other measure as long as that measure puts a positive mass on the neighborhood
of the maximizer. The method is illustrated in Fig. 20.1. The take home message is
that if we integrate the exponential of a function that has a pronounced maximum,
then we can expect that the integral will be close to the exponential function of
the maximum.

20.1.2 Method of mixtures


Laplace’s approximation suggests that

maxx Mt(x)≈


Rd

Mt(x)dh(x), (20.7)

wherehis some measure onRdchosen so that the integral can be calculated
in closed form. This is not a requirement of the method, but it does make the
argument shorter. The main benefit of replacing the maximum with an integral
is that we obtain the following lemma, which you will prove in Exercise 20.6.
Free download pdf