Bandit Algorithms

19.3 Regret analysis 232

The proof of Theorem 19.2 depends on the following lemma.
Lemma19.4.LetV 0 be positive definite andx 1 ,...,xn∈Rdbe a sequence of
vectors with‖xt‖ 2 ≤L <∞for allt∈[n]. Then

∑n

t=1

(

1 ∧‖xt‖^2 Vt−− (^11)

)

≤2 log

(

detVn detV 0

)

≤ 2 dlog

(

traceV 0 +nL^2 ddet^1 /d(V 0 )

)

.

Proof Using that for anyu∈[0,1],u∧ 1 ≤2 ln(1 +u), we get ∑n

t=1

(

1 ∧‖xt‖^2 Vt−− (^11)

)

≤ 2

∑

t

log

(

1 +‖xt‖^2 Vt−− (^11)

)

.

We now argue that this last expression is log

(

detVn detV 0

)

. Fort≥1 we have

Vt=Vt− 1 +xtx>t =Vt^1 −/ 12 (I+Vt−−^11 /^2 xtx>tVt−−^11 /^2 )Vt^1 −/ 12. Hence det(Vt) = det(Vt− 1 ) det

(

I+Vt−−^11 /^2 xtx>tVt−−^11 /^2

)

= det(Vt− 1 )

(

1 +‖xt‖^2 V− 1 t− 1

)

,

where the second equality follows because the matrixI+yy>has eigenvalues
1 +‖y‖^22 and 1 as well as the fact that the determinant of a matrix is the product
of its eigenvalues. Putting things together we see that

det(Vn) = det(V 0 )

∏n

t=1

(

1 +‖xt‖^2 Vt−− (^11)

)

,

which is equivalent to the first inequality that we wanted to prove. To get the
second inequality note that by the inequality of arithmetic and geometric means,

det(Vn) =

∏d

i=1

λi≤

(

1

d

traceVn

)d ≤

(

traceV 0 +nL^2 d

)d ,

whereλ 1 ,...,λdare the eigenvalues ofVn.

Proof of Theorem 19.2 By part (c) of Assumption 19.1 it suffices to prove the
bound on the event thatθ∗∈Ctfor all roundst∈[n]. LetA∗t=argmaxa∈At〈a,θ∗〉
be an optimal action for roundtandrtbe the instantaneous regret in roundt
defined by
rt=〈A∗t−At,θ∗〉.
Letθ ̃t∈Ctbe the parameter in the confidence set for which〈At,θ ̃t〉=UCBt(At).
Then using the fact thatθ∗∈Ctand the definition of the algorithm leads to

〈A∗t,θ∗〉≤UCBt(A∗t)≤UCBt(At) =〈At,θ ̃t〉. Using Cauchy-Schwarz inequality and the assumption thatθ∗∈Ctand facts that θ ̃t∈CtandCt⊆Etleads to

rt=〈At∗−At,θ∗〉≤〈At,θ ̃t−θ∗〉≤‖At‖Vt−− 11 ‖θ ̃t−θ∗‖Vt− 1 ≤ 2 ‖At‖Vt−− (^11)

√

βt.

Bandit Algorithms

(

)

(

)

(

)

.

(

)

≤ 2

∑

(

)

.

(

)

(

)

(

)

,

(

)

,

(

1

(

√

Get our desktop app

Company

Features

Documentation

Resources