Computational Methods in Systems Biology

A Scheme for Adaptive Selection of Population Sizes 133

The score function $S$ of a scaling factor $b_c$ with density $K_c$ on a
sub-population $\tilde{P} = \{(w_i, \theta_i)\}_i$ is

$$S(K_c, \tilde{P}) = \sum_i w_i \log K_c(\theta_i).$$

The score function is evaluated on the sub-population $\tilde{P}$ of $P$ which
is not used for density estimation. The scaling factor $b_c$, whose
corresponding density $K_c$ yields the highest cumulative score summed over the
five folds, is subsequently selected for the final density estimation on the
complete population $P$.
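
The selection rule above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it assumes a one-dimensional parameter, a weighted Gaussian kernel density estimate, strictly positive weights, and that the scaling factor multiplies a base bandwidth. The names `weighted_kde_logpdf`, `score`, `select_scaling`, and `base_bandwidth` are hypothetical.

```python
import numpy as np


def weighted_kde_logpdf(x, train_theta, train_w, bandwidth):
    """Log-density at points x of a weighted 1-D Gaussian KDE (assumed kernel)."""
    x = np.asarray(x, dtype=float)[:, None]
    z = (x - train_theta[None, :]) / bandwidth
    log_kernel = -0.5 * z**2 - np.log(bandwidth * np.sqrt(2.0 * np.pi))
    log_w = np.log(train_w / train_w.sum())  # assumes strictly positive weights
    a = log_kernel + log_w[None, :]
    m = a.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    return (m + np.log(np.exp(a - m).sum(axis=1, keepdims=True)))[:, 0]


def score(theta_val, w_val, train_theta, train_w, bandwidth):
    """S(K_c, P~) = sum_i w_i log K_c(theta_i), evaluated on the held-out fold."""
    log_dens = weighted_kde_logpdf(theta_val, train_theta, train_w, bandwidth)
    return float(np.sum(w_val * log_dens))


def select_scaling(theta, w, scalings, base_bandwidth, n_folds=5, seed=0):
    """Return the scaling factor with the highest score summed over the folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(theta)), n_folds)
    all_idx = np.arange(len(theta))
    totals = []
    for b in scalings:
        total = 0.0
        for held_out in folds:
            train = np.setdiff1d(all_idx, held_out)
            total += score(theta[held_out], w[held_out],
                           theta[train], w[train], b * base_bandwidth)
        totals.append(total)
    return scalings[int(np.argmax(totals))], totals
```

In this sketch each fold is scored with a density fitted on the remaining folds only, so a bandwidth that is far too small or far too large is penalized by a low held-out log-density.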


2.3 Population Size Adaptation


The quality of an ABC-SMC scheme is determined by the accuracy and efficiency
with which the posterior distribution is approximated. Besides the bandwidth
selection strategy, a key parameter is the population size, i.e. the number of
parameter samples of which a population consists. Not only the size of the last
population but also the sizes of the intermediate populations have substantial
influence. If intermediate populations are chosen too large, unnecessary
computation is performed, rendering ABC-SMC inefficient. If intermediate
populations are chosen too small, information about the posterior may be lost
that cannot be efficiently recovered in the last population, rendering ABC-SMC
inaccurate. For example, if the true posterior is multimodal, a small
intermediate population might lack samples representing one of the posterior
modes. This mode is unlikely to be recovered in the last population unless the
population size is chosen extraordinarily large. However, this would again
render ABC-SMC inefficient and essentially equivalent to rejection sampling.
Similarly, in model selection, a model with a small posterior probability might
go extinct entirely in an intermediate population. Hence, a consistent
approximation quality across all intermediate populations is important for an
accurate and efficient ABC-SMC scheme.
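
To give a rough sense of scale for the multimodal example, the following sketch computes the probability that a population contains no sample from a given mode. It rests on a simplifying assumption that is not part of the scheme itself: the particles are treated as independent draws from the posterior, and the mode carries a fixed posterior mass. The function name `prob_mode_missing` is hypothetical.

```python
def prob_mode_missing(mode_mass: float, n: int) -> float:
    """Probability that none of n independent draws falls into a mode
    carrying posterior mass `mode_mass` (idealized i.i.d. assumption)."""
    return (1.0 - mode_mass) ** n


# A mode with 1% posterior mass under two population sizes.
for n in (100, 1000):
    print(n, prob_mode_missing(0.01, n))
```

Under these assumptions, a mode with 1% posterior mass is absent from a population of 100 particles with probability of about 0.37, whereas for 1000 particles the probability drops below 1e-4.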
We thus developed an ABC-SMC scheme in which the population sizes are
selected adaptively to match a specified target accuracy. We propose to
express this accuracy in terms of the variation associated with kernel density
estimates on a population (smaller variation corresponding to higher accuracy).
To select the population size necessary to achieve the target variation, the
effect of increasing or decreasing the population size (Fig. 1a) on the
variation of the density estimate for the current population is determined with
bootstrapped populations of varying sizes. Using a parametric approximation of
this population-size-dependent variation, a population size for the next
generation is selected by interpolating to smaller population sizes if the
current variation is too small and by extrapolating to larger population sizes
if the current variation is too large.
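
The bootstrap idea can be illustrated with a minimal Python sketch. Several choices here are assumptions made for illustration only and are not taken from the text: a one-dimensional parameter, a Gaussian kernel with Silverman's rule-of-thumb bandwidth, a density-weighted coefficient of variation on a fixed grid as the measure of variation, and a power law in the population size as the parametric approximation. All function names are hypothetical.

```python
import numpy as np


def gaussian_kde_on_grid(sample, grid, bandwidth):
    """Unweighted 1-D Gaussian KDE of `sample`, evaluated at the grid points."""
    z = (grid[:, None] - sample[None, :]) / bandwidth
    return np.exp(-0.5 * z**2).mean(axis=1) / (bandwidth * np.sqrt(2.0 * np.pi))


def bootstrap_variation(theta, w, size, grid, n_boot=20, rng=None):
    """Variation of the density estimate for bootstrapped populations of `size`.

    Returns sum(std) / sum(mean) over the grid, i.e. the pointwise coefficient
    of variation averaged with the mean density as weight (assumed measure).
    """
    rng = np.random.default_rng() if rng is None else rng
    p = w / w.sum()
    dens = np.empty((n_boot, len(grid)))
    for k in range(n_boot):
        # Resample particles with replacement, proportional to their weights.
        sample = rng.choice(theta, size=size, replace=True, p=p)
        h = 1.06 * sample.std() * size ** (-0.2)  # Silverman's rule (assumed)
        dens[k] = gaussian_kde_on_grid(sample, grid, h)
    return float(dens.std(axis=0).sum() / dens.mean(axis=0).sum())


def predict_population_size(sizes, variations, target):
    """Fit variation ~ a * n**slope in log-log space and solve for the target."""
    slope, intercept = np.polyfit(np.log(sizes), np.log(variations), 1)
    return int(np.ceil(np.exp((np.log(target) - intercept) / slope)))
```

A possible usage pattern is to evaluate `bootstrap_variation` for a handful of candidate sizes around the current population size and pass the resulting pairs to `predict_population_size`. The fitted curve then yields a smaller size when the current variation is below the target and a larger one when it is above.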
Denoting by $E_{\mathrm{CV}}$ the desired target density variation, we propose
the following scheme for adaptive population size selection: The number of
particles in the initial population $t = 0$ is set to $n_0$. Given population
$P_t = \{(w_i, \theta_i)\}_{i=1}^{n_t}$, $t \geq 0$, of size $n_t$, tentative
population sizes $n^*_{t,q}$, evenly spaced between $n_t/3$ and $2n_t$ with
step size $n_t/10$, are considered:
