9.5 Exercises 125(a)Show that for all 1-subgaussian bandits this new policy suffers regret at most
Rn≤C
∑
i:∆i> 0∆i+1
∆ilog+(n∆^2 i)
,
whereC >0 is a universal constant.
(b)Under the same conditions as the previous part show there exists a universal
constantC >0 such that
Rn≤C√
knlog(k) +∑ki=1∆i.(c) Repeat parts (a) and (b) using the indexμˆi(t−1) +√
4
Ti(t−1)log+(
t
Ti(t−1))
.
9.4(Gaussian noise and the tangent approximation) Letg(t) =at+b
withb >0 and
u(x,t) =1
√
2 πtexp(
−
x^2
2 t)
−
1
√
2 πtexp(
− 2 ab−(x− 2 b)^2
2 t)
(a) Show thatu(x,t)>0 forx∈(−∞,g(t)) andu(x,t) = 0 forx=g(t).
(b) Show thatu(x,t) satisfies the heat equation:
∂tu(x,t) =1
2 ∂
2
xu(x,t).
(c)LetBtbe a standard Brownian motion, which for any fixedthas density
with respect to the Lebesgue measure.p(x,t) =√^1
2 πtexp(
−
x^2
2 t)
.
Defineτ=min{t:Bt=g(t)}be the first time the Brownian motion hits the
boundary. Put on your physicists hat (or work hard) to argue thatP(τ≥t) =∫g(t)−∞u(x,t)dx.(d)Letv(t) be the density of timeτwith respect to the Lebesgue measure so
thatP(τg≤t) =
∫t
0 v(t)dt. Show thatv(t) =√b
2 πt^3exp(
−g(t)2
2 t)
(e)In the last part you established the exact density of the hitting time of a
Brownian motion approaching a linear boundary. We now generalize this
to nonlinear boundaries, but at the cost that now we only have a bound.
Suppose thatf : [0,∞)→[0,∞) is concave and differentiable and let
λ:R→Rbe the intersection of the tangent tofattwith they-axis given