Understanding Machine Learning: From Theory to Algorithms

10.7 Exercises 143

Show that the error ofhtw.r.t. the distributionD(t+1)is exactly 1/2. That is, show that for everyt∈[T]

∑m

i=1

D(it+1) (^1) [yi 6 =ht(xi)]= 1/ 2.

In this exercise we discuss the VC-dimension of classes of the formL(B,T).
We proved an upper bound ofO(dTlog(dT)), whered= VCdim(B). Here we
wish to prove an almost matching lower bound. However, that will not be the
case for all classesB.

Note that for every classB and every numberT ≥ 1, VCdim(B) ≤
VCdim(L(B,T)). Find a classBfor which VCdim(B) = VCdim(L(B,T))
for everyT≥1.
Hint:TakeXto be a finite set.

LetBdbe the class of decision stumps overRd. Prove that log(d) ≤
VCdim(Bd)≤5 + 2 log(d).
Hints:
- For the upper bound, rely on Exercise 11.
- For the lower bound, assumed= 2k. LetAbe ak×dmatrix whose
  columns are all thedbinary vectors in{± 1 }k. The rows ofAform
  a set ofkvectors inRd. Show that this set is shattered by decision
  stumps overRd.

LetT≥1 be any integer. Prove that VCdim(L(Bd,T))≥ 0. 5 Tlog(d).
Hint:Construct a set ofT 2 kinstances by taking the rows of the matrixA
from the previous question, and the rows of the matrices 2A, 3 A, 4 A,...,T 2 A.
Show that the resulting set is shattered byL(Bd,T).
5.Efficiently Calculating the Viola and Jones Features Using an Inte-
gral Image:LetAbe a 24×24 matrix representing an image. The integral
image ofA, denoted byI(A), is the matrixBsuch thatBi,j=

∑

i′≤i,j′≤jAi,j.

Show thatI(A) can be calculated fromAin time linear in the size ofA.

Show how every Viola and Jones feature can be calculated fromI(A) in a
constant amount of time (that is, the runtime does not depend on the
size of the rectangle defining the feature).

Understanding Machine Learning: From Theory to Algorithms

∑

Get our desktop app

Company

Features

Documentation

Resources