Genetic_Programming_Theory_and_Practice_XIII


236 S. Silva et al.


remarking that the models provided by M3GP are potentially much easier to
interpret than the ones provided by RF or any of the other two state-of-the-art
methods.


7.3 Results of eM3GP


The results achieved by eM3GP can be found in Tables 3 and 4. Although eM3GP
was not able to match the performance of M3GP in most problems, it appears to
be much more resistant to overfitting, judging by the observed difference between
training and test fitness (Table 3). The clearest case is the M-L problem, where
eM3GP actually achieves significantly better test fitness, but in problems like
WAV, SEG and YST it is also clear that with eM3GP the test fitness follows the training
fitness much more closely than with M3GP. Therefore, even if the final solutions
are not necessarily better with eM3GP, the results suggest that with more
generations the tendency may be inverted, and eM3GP may actually be able to reach
better performance than M3GP.
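
The comparison between training and test fitness described above can be made explicit as a per-problem gap. The following is only a minimal sketch, assuming fitness is an accuracy-like score in [0, 1] where higher is better; the `RunSummary` type, the function names and the numbers are hypothetical placeholders and are not taken from Tables 3 and 4.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    """Hypothetical summary of one method on one problem."""
    problem: str
    method: str
    train_fitness: float  # assumed accuracy-like score in [0, 1]
    test_fitness: float   # same scale as train_fitness


def generalization_gap(run: RunSummary) -> float:
    """Training fitness minus test fitness; a larger gap hints at more overfitting."""
    return run.train_fitness - run.test_fitness


def compare_gaps(runs):
    """Group gaps as {problem: {method: gap}} so methods can be compared per problem."""
    gaps = {}
    for run in runs:
        gaps.setdefault(run.problem, {})[run.method] = generalization_gap(run)
    return gaps


# Placeholder values for illustration only, NOT the results reported in the tables.
runs = [
    RunSummary("P1", "M3GP", train_fitness=1.00, test_fitness=0.80),
    RunSummary("P1", "eM3GP", train_fitness=0.90, test_fitness=0.85),
]

gaps = compare_gaps(runs)
for problem, by_method in gaps.items():
    for method, gap in by_method.items():
        print(f"{problem} {method}: gap = {gap:.2f}")
```

In this made-up example the second method has the lower training fitness but the smaller gap, which is the pattern the text describes for eM3GP on problems such as WAV, SEG and YST.
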
One point where eM3GP clearly wins is the compactness of the
evolved solutions, both in number of nodes and in number
of dimensions (Table 3). The ensemble approach is able to maintain smaller
solutions, preventing bloat at the dimension level and thus performing a kind of
dimensionality reduction, at least when compared to M3GP. The solutions of eM3GP
are also much smaller than those of M2GP, even in the few cases where the
number of dimensions is not.
If we inspect the distribution of dimensions inside the population, we can see
a large difference in evolutionary dynamics between M3GP and eM3GP.
While M3GP tends to produce unimodal distributions that approximate a Gaussian
form (see Fig. 4), eM3GP maintains a higher diversity of dimensions within the
population, with a distribution that is either approximately flat (e.g., VOW) or has a
single tail with a peak in unidimensional transformations (e.g., M-L), as shown in Fig. 5
(compare the VOW dynamics with the one seen in Fig. 4 for M3GP). Such a distribution,
and the effect it has on bloat, seems to correlate nicely with recently proposed bloat
control strategies (Silva 2011).
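
The distribution of dimensions discussed above can be tabulated directly from a population. The sketch below is only an illustration, assuming each individual is summarized by its number of dimensions; the function names and the two example populations are hypothetical and do not reproduce the data behind Figs. 4 and 5.

```python
from collections import Counter


def dimension_histogram(dimension_counts):
    """Return {number_of_dimensions: fraction_of_population}, sorted by dimension."""
    if not dimension_counts:
        raise ValueError("population must not be empty")
    counts = Counter(dimension_counts)
    total = len(dimension_counts)
    return {d: counts[d] / total for d in sorted(counts)}


def mode_share(histogram):
    """Fraction of the population at the most common dimension count.

    A high value indicates a peaked distribution; a low value indicates
    a flatter, more diverse one.
    """
    return max(histogram.values())


# Made-up populations for illustration only.
peaked_population = [4, 5, 5, 5, 5, 6, 6, 5, 4, 5]  # concentrated around 5 dimensions
flat_population = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]    # spread evenly over 1 to 5

peaked_hist = dimension_histogram(peaked_population)
flat_hist = dimension_histogram(flat_population)

print("peaked:", peaked_hist, "mode share:", mode_share(peaked_hist))
print("flat:  ", flat_hist, "mode share:", mode_share(flat_hist))
```

The mode share is just one crude descriptor of diversity; it separates a peaked from a flat population, but it does not test for a Gaussian shape or detect a one-sided tail.
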


8 Conclusions


This work has addressed the problem of multiclass classification with GP, an area
where previous GP approaches tended to yield poor performance. It has presented
three variants of a novel method, respectively called M2GP (Ingalalli et al. 2014),
M3GP (Muñoz et al. 2015) and eM3GP. The novelty of M2GP is mainly its fitness
function, which implicitly drives the evolution into forming multidimensional clusters
that allow an accurate classification of the data. M3GP allows the evolution to
