Genetic_Programming_Theory_and_Practice_XIII


6 M. Kommenda et al.



Fig. 1 Comparison of the development of symbolic expression tree length over generations for
standard and adapted NSGA-II. With the standard implementation of NSGA-II, the population
quickly converges to extremely small trees, which renders this variant ineffective for symbolic
regression. (a) Standard NSGA-II. (b) Adapted NSGA-II.


expression tree length is visualized over the generations of the algorithm. The left
side shows the behavior of the standard NSGA-II: the whole population collapses
to a few distinct solutions within the first ten generations. The right side shows
the behavior of the adapted NSGA-II: although the trees get smaller, more diversity
is preserved and the algorithm is able to learn from the presented data.


3.2 Discrete Objective Functions


Another aspect of symbolic regression is that one of the objective functions
describes the fit of the model’s output to the presented data, which is in general
more important than the simplicity of the models. Frequently, the mean squared
error (or a variation thereof) or a correlation criterion such as Pearson’s R^2 is
used as this objective function. An issue caused by the floating-point
representation of fitness values arises when many individuals of similar quality
(equal up to many decimal places) but varying complexity artificially enlarge the
Pareto front.
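
The effect can be illustrated with a minimal sketch. It is not the authors' implementation: the names `dominates`, `pareto_front`, and `decimals` are hypothetical, and the model values are made up for the example. It assumes two objectives, an accuracy value to be maximized and a tree length to be minimized. The optional `decimals` argument anticipates the rounding discussed in the next paragraph.

```python
from typing import List, Optional, Tuple

# (accuracy such as Pearson's R^2, to maximize; tree length, to minimize)
Model = Tuple[float, int]


def dominates(a: Model, b: Model) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(models: List[Model], decimals: Optional[int] = None) -> List[Model]:
    """Return the non-dominated models, optionally rounding the accuracy first."""
    if decimals is not None:
        models = [(round(acc, decimals), length) for acc, length in models]
    return [m for m in models if not any(dominates(other, m) for other in models)]


# Hypothetical models: accuracy differs only in the sixth decimal place,
# while tree length keeps growing.
models = [(0.912340 + 1e-6 * i, 10 + 5 * i) for i in range(5)]

raw_front = pareto_front(models)                  # compares raw floating-point values
rounded_front = pareto_front(models, decimals=3)  # compares discretized values
print(len(raw_front), len(rounded_front))
```

With the raw values, every model is marginally more accurate than all shorter ones, so none is dominated and all five stay on the front. Once the accuracy is rounded, the values tie and only the shortest model remains.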
One way to avoid this issue is to discretize the objective function by rounding
the objective value to a fixed number of decimal places. The objective function we
used to describe the model accuracy is the Pearson’s R^2 correlation of the observed
values y and the predicted values y′. We round the Pearson’s R^2 to three decimal places
