Bandit Algorithms

(Jeff_L) #1
21.4 Exercises 256

21.4 Exercises


21.1(Derivative of log determinant) Prove the correctness of the derivative
in Eq. (21.4).

Hint For square matrixAletadj(A) be the transpose of the cofactor matrix
ofA. Use the facts that the inverse of a matrixAisA−^1 =adj(A)>/det(A) and
that ifA:R→Rd×d, then
d
dt

det(A(t)) = trace

(


adj(A)

d
dt

A(t)

)


.


21.2(Concavity of log determinant) Prove thatH7→log det(H) is concave
whereHis a symmetric, positive definite matrix.

Hint Considert7→log det(H+tZ) forZsymmetric and show that this is a
concave function.
21.3(Kiefer–Wolfowitz for compact sets) Generalize the proof of
Theorem 21.1 to compact action sets.

21.4 Prove the second inequality in Eq. (21.8).

21.5 Letπ∗be aG-optimal design anda∈Supp(π∗). Prove that‖a‖^2 V(π)− 1 =d.

21.6(Implementation) Write a program that accepts as parameters a finite
setA⊂Rdand returns a designπ:A→[0,1] such thatg(π)≤d+εfor some
givenε >0. How robust is your algorithm? Experiment with different choices of
Aanddand report your results.

Hint The easiest pure way to do this is to implement the Frank-Wolfe algorithm
described in Note 3. All quantities can be updated incrementally using rank-one
update formulas and this will lead to a significant speedup. You might like to read
the third chapter of the book by Todd [2016] and experiment with the proposed
variants.
Free download pdf