Genetic_Programming_Theory_and_Practice_XIII


Multiclass Classification Through Multidimensional Clustering


[Figure: two histograms, panel VOW (left) and panel M-L (right); x-axis: Dimensions (0 to 25), y-axis: Frequency (0 to 500)]

Fig. 5 Distribution of the number of dimensions in the final generation for eM3GP. On the left, a
typical run of problem VOW. On the right, a typical run of problem M-L


choose the most appropriate dimensionality for this mapping during the search,
thus improving the adaptability to the particularities and difficulties posed by each
problem. Finally, the newer eM3GP introduces an ensemble approach that allows
the evolution of specialized mappings for different classes, thus providing some
protection against overfitting and the negative effects of class imbalance. As a
welcome side effect, eM3GP also removes the bloating problems that seem to affect
M3GP.
The results have shown that this new approach finally allows GP to be considered
as a viable and competitive option for solving multiclass classification problems,
even when compared to the best and most popular state-of-the-art classifiers, like
Random Forests, Random Subspaces and Multilayer Perceptron.
Future work will focus on the difficulties of real-world problems. The apparent
ability of eM3GP for dealing with overfitting and class imbalance will be thoroughly
tested, and certainly improved. We will also go back to the original fitness function
of M2GP and improve this core element of success, as it is still in its original “raw”
form and its robustness can certainly be improved in order to face the difficulties
of real-world data. Another path of future work is the interpretation of the solutions
returned by this method. Until now there has been no attempt at performing
a symbolic simplification of the mappings returned, or any type of interpretation of
what these mappings may reveal about the data.
For now, it is clear that with this new approach we have a simple, general-purpose
classifier that is well worth testing, improving and using in challenging
classification tasks.


Acknowledgements This work was partially supported by FCT funds (Portugal) under contract
UID/Multi/04046/2013 and projects PTDC/EEI-CTP/2975/2012 (MaSSGP), PTDC/DTP-FTO/1747/2012
(InteleGen) and EXPL/EMS-SIS/1954/2013 (CancerSys). Funding was also
provided by CONACYT (Mexico) Basic Science Research Project No. 178323, DGEST (Mexico)
Research Projects No. 5149.13-P and 5414.11-P, and FP7-Marie Curie-IRSES 2013 project
ACoBSEC. Finally, the second author is supported by scholarship No. 372126 from CONACYT.
