9.5 Exercises 125
(a)Show that for all 1-subgaussian bandits this new policy suffers regret at most
Rn≤C
∑
i:∆i> 0
∆i+
1
∆i
log+(n∆^2 i)
,
whereC >0 is a universal constant.
(b)Under the same conditions as the previous part show there exists a universal
constantC >0 such that
Rn≤C
√
knlog(k) +
∑k
i=1
∆i.
(c) Repeat parts (a) and (b) using the index
μˆi(t−1) +
√
4
Ti(t−1)
log+
(
t
Ti(t−1)
)
.
9.4(Gaussian noise and the tangent approximation) Letg(t) =at+b
withb >0 and
u(x,t) =
1
√
2 πt
exp
(
−
x^2
2 t
)
−
1
√
2 πt
exp
(
− 2 ab−
(x− 2 b)^2
2 t
)
(a) Show thatu(x,t)>0 forx∈(−∞,g(t)) andu(x,t) = 0 forx=g(t).
(b) Show thatu(x,t) satisfies the heat equation:
∂tu(x,t) =
1
2 ∂
2
xu(x,t).
(c)LetBtbe a standard Brownian motion, which for any fixedthas density
with respect to the Lebesgue measure.
p(x,t) =
√^1
2 πt
exp
(
−
x^2
2 t
)
.
Defineτ=min{t:Bt=g(t)}be the first time the Brownian motion hits the
boundary. Put on your physicists hat (or work hard) to argue that
P(τ≥t) =
∫g(t)
−∞
u(x,t)dx.
(d)Letv(t) be the density of timeτwith respect to the Lebesgue measure so
thatP(τg≤t) =
∫t
0 v(t)dt. Show that
v(t) =√b
2 πt^3
exp
(
−g(t)
2
2 t
)
(e)In the last part you established the exact density of the hitting time of a
Brownian motion approaching a linear boundary. We now generalize this
to nonlinear boundaries, but at the cost that now we only have a bound.
Suppose thatf : [0,∞)→[0,∞) is concave and differentiable and let
λ:R→Rbe the intersection of the tangent tofattwith they-axis given