4.10 Exercises
[Plot: expected regret of Follow-the-Leader (vertical axis, ticks from 10 to 50) against the horizon n (horizontal axis, ticks from 200 to 1,000).]
Figure 4.3 The regret for Follow-the-Leader over 1000 trials on a Bernoulli bandit with means $\mu_1 = 0.5$, $\mu_2 = 0.6$ and horizons ranging from $n = 100$ to $n = 1000$.
(c) Explain the plot. Do you think Follow-the-Leader is a good algorithm? Why or why not?
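The experiment behind Figure 4.3 can be approximated with a short simulation. The sketch below is not the authors' code: it assumes that Follow-the-Leader plays each arm once and then always plays the arm with the largest empirical mean, that ties are broken in favour of the lowest index, and that regret is measured as pseudo-regret (the sum of the gaps of the arms played). The function names `ftl_pseudo_regret` and `estimate_expected_regret` are hypothetical.

```python
import random


def ftl_pseudo_regret(means, n, rng):
    """Run Follow-the-Leader for n rounds on a Bernoulli bandit.

    Returns the pseudo-regret: the sum over rounds of
    (best mean - mean of the arm played).
    """
    k = len(means)
    best = max(means)
    counts = [0] * k    # number of times each arm was played
    sums = [0.0] * k    # total reward collected from each arm
    regret = 0.0
    for t in range(n):
        if t < k:
            # Assumption: play every arm once before following the leader.
            arm = t
        else:
            # Follow the leader; max() breaks ties by lowest index (assumption).
            arm = max(range(k), key=lambda i: sums[i] / counts[i])
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret


def estimate_expected_regret(means, n, trials, seed=0):
    """Average the pseudo-regret over independent trials."""
    rng = random.Random(seed)
    total = sum(ftl_pseudo_regret(means, n, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Means and horizons as in the caption of Figure 4.3.
    for n in range(100, 1001, 100):
        print(n, round(estimate_expected_regret([0.5, 0.6], n, trials=1000), 2))
```

Collecting the individual values of `ftl_pseudo_regret` across trials, rather than only their average, may help with part (c), since the spread of the regret is as informative as its mean.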