(for instance, choose it randomly), we are no longer guaranteed to end up with the same solution: our solution becomes stochastic, despite having used a deterministic method.
To escape local minima, many heuristic optimization methods make deliberate use of randomness. In TA, for example, new neighbor portfolios are selected randomly. Thus, even for identical starting points (but different seeds for the random number generator that is used), repeated runs of TA will produce different solutions; in fact, this holds even for convex problems. In the following discussion, we will characterize a solution by its associated objective function value, though sometimes it may also be appropriate to look at the parameter values. For a given problem, these solutions can be regarded as realizations of a random variable with an unknown distribution F (Gilli and Winker, 2009).
The distribution F is not symmetric, but bounded to the left (since we minimize), and it degenerates, as computational resources increase, to a single point, namely the global minimum. A proof of this convergence for TA is given in Althöfer and Koschnick (1991). (Of course, if our problem is not bounded, then neither is F.) The particular shape of F will depend on the chosen algorithm, and on the amount of computational resources spent on the optimization. In a serial computing environment, this amount of computational resources is usually roughly proportional to computing time (see Gilli and Schumann (2008) for the case of parallel computing).
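To see how this plays out, the following small sketch (in Python, not taken from the text) runs a bare-bones randomized descent, standing in for TA, repeatedly from the same starting point but with different seeds, and reports quantiles of the resulting objective function values as an empirical picture of F. The toy objective function, the neighborhood, and all parameter settings are assumptions chosen purely for illustration.

import math
import random

def toy_objective(x):
    # an assumed one-dimensional test function with many local minima
    return x * x + 10.0 * math.sin(5.0 * x) ** 2

def random_local_search(n_iterations, seed, x0=5.0, step=0.5):
    # bare-bones randomized descent: accept a random neighbor only if it
    # does not worsen the objective (TA would also accept small increases)
    rng = random.Random(seed)
    x, fx = x0, toy_objective(x0)
    for _ in range(n_iterations):
        x_new = x + rng.uniform(-step, step)
        f_new = toy_objective(x_new)
        if f_new <= fx:
            x, fx = x_new, f_new
    return fx

# identical starting point, different seeds: the solutions differ, and
# their empirical distribution approximates F
values = sorted(random_local_search(1000, seed) for seed in range(100))
for q in (0.1, 0.5, 0.9):
    print("quantile", q, ":", round(values[int(q * len(values))], 4))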
For a single optimization run, computational resources can be measured by the number of objective function evaluations, or by the total number of iterations of the algorithm. For TA, the number of iterations is given by the number of thresholds times the number of steps per threshold, i.e., $n_{\text{Rounds}} \times n_{\text{Steps}}$ (note that we always use “steps” for “steps per threshold,” whereas “iterations” refers to the product of rounds and steps). Thus, one way to increase computational resources is to allow more iterations.
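To make the roles of rounds and steps concrete, here is a minimal Threshold Accepting skeleton in Python; the acceptance rule follows the description above, while the toy objective, the neighborhood function, and the threshold sequence are assumptions chosen only for illustration.

import math
import random

def threshold_accepting(objective, neighbour, x0, thresholds, n_steps, seed=0):
    # number of iterations = len(thresholds) * n_steps, i.e., rounds times steps
    rng = random.Random(seed)
    x, fx = x0, objective(x0)
    x_best, f_best = x, fx
    for tau in thresholds:              # one round per threshold
        for _ in range(n_steps):        # n_steps steps per threshold
            x_new = neighbour(x, rng)   # randomly selected neighbor
            f_new = objective(x_new)
            if f_new - fx <= tau:       # accept unless worse by more than tau
                x, fx = x_new, f_new
                if fx < f_best:
                    x_best, f_best = x, fx
    return x_best, f_best

# assumed toy objective and neighborhood; five thresholds (rounds) with
# 200 steps each give 1000 iterations in total
obj = lambda x: x * x + 10.0 * math.sin(5.0 * x) ** 2
nb = lambda x, rng: x + rng.uniform(-0.5, 0.5)
print(threshold_accepting(obj, nb, 5.0, [1.0, 0.5, 0.2, 0.1, 0.0], n_steps=200))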
There is a second way to increase computational resources: just rerun the
optimization, i.e., conduct so-called restarts. To give an analogy: assume you
have a die, and you wish to roll a 6. The die here is the optimization algorithm,
and the “6” represents a good solution. Now, one approach is to manipulate the die (that would be cheating, but here it is only an analogy) so that the probability of rolling a 6 increases. This is equivalent to increasing the number of iterations. You could also roll the die several times (all you need is one 6, i.e., one good solution); that is, you could conduct restarts.
Let I denote the available computational resources; then we can split these resources between restarts, the number of thresholds, and the number of steps per threshold, i.e.:
\[
I = n_{\text{Restarts}} \times \underbrace{n_{\text{Rounds}} \times n_{\text{Steps}}}_{\text{number of iterations}}
\]
The shape of F will depend on the number of iterations, but also on how
this number is subdivided into rounds and steps. Every restart is a draw from F.
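As a rough illustration of this split, the sketch below (reusing the assumed toy setup of the earlier sketches, not an example from the text) holds the total budget I fixed and varies how it is divided between restarts and iterations per restart, reporting the best solution over all restarts.

import math
import random

def toy_objective(x):
    # the same assumed test function as in the earlier sketches
    return x * x + 10.0 * math.sin(5.0 * x) ** 2

def one_run(n_iterations, seed):
    # one run of a bare-bones randomized descent, standing in for TA
    rng = random.Random(seed)
    x, fx = 5.0, toy_objective(5.0)
    for _ in range(n_iterations):
        x_new = x + rng.uniform(-0.5, 0.5)
        f_new = toy_objective(x_new)
        if f_new <= fx:
            x, fx = x_new, f_new
    return fx

I = 20000  # total budget, split between restarts and iterations per restart
for n_restarts in (1, 4, 20):
    n_iterations = I // n_restarts
    best = min(one_run(n_iterations, seed) for seed in range(n_restarts))
    print(n_restarts, "restarts of", n_iterations, "iterations -> best", round(best, 4))

Which allocation works best will, of course, depend on the problem at hand.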