21.1 The Kiefer–Wolfowitz theorem 252In the subfield of statistics called optimal experimental design, the distribution
πis called adesignand the problem of finding a design that minimizesgis
called theG-optimal design problem. So how to use this? Suppose thatπis
a design anda∈Supp(π) andna=⌈
π(a)g(π)
ε^2log(
1
δ)⌉
. (21.3)
Then choosing each actiona∈Supp(π) exactlynatimes ensures that
V=∑
a∈Supp(π)naaa>≥g(π)
ε^2log(
1
δ)
V(π),which by Eq. (21.1) means that for anya∈A, with probability 1−δ,
〈θˆ−θ∗,a〉≤√
2 ‖a‖^2 V− 1 log(
1
δ)
≤ε.By Eq. (21.3), the total number of actions required to ensure a confidence width
of no more thanεis bounded byn=∑
a∈Supp(π)na=∑
a∈Supp(π)⌈
π(a)g(π)
ε^2 log(
1
δ)⌉
≤|Supp(π)|+g(π)
ε^2 log(
1
δ)
.
The set Supp(π) is sometimes called thecore set. The following theorem
characterizes the size of the core set and the minimum ofg.
Theorem 21.1 (Kiefer–Wolfowitz). Assume that A ⊂ Rd is compact and
span(A) =Rd. The fol lowing are equivalent:
(a)π∗is a minimizer ofg.
(b)π∗is a maximizer off(π) = log detV(π).
(c)g(π∗) =d.
Furthermore, there exists a minimizerπ∗ofgsuch that|Supp(π∗)|≤d(d+ 1)/ 2.
A design that maximizesfis known as aD-optimal designand thus the
theorem establishes the equivalence of G-optimal and D-optimal designs.Proof We give the proof for finiteA. The general case follows by passing to the
limit (Exercise 21.3). When it is convenient, distributionsπonAare treated as
vectors inR|A|. You will show in Exercises 21.1 and 21.2 thatfis concave and
that(∇f(π))a=‖a‖^2 V(π)− 1. (21.4)Also notice that
∑a∈Aπ(a)‖a‖^2 V(π)− 1 = trace(
∑
aπ(a)aa>V(π)−^1)
= trace(I) =d. (21.5)