Genetic_Programming_Theory_and_Practice_XIII


Multiclass Classification Through Multidimensional Clustering 229


Table 1 Data sets used for the experimental analysis

Data set           HRT  IM-3  WAV   SEG   IM-10  YST   VOW  M-L
No. of classes     2    3     3     7     10     10    11   15
No. of attributes  13   6     40    19    6      8     13   90
No. of samples     270  322   5000  2310  6798   1484  990  360

6.1 Data Sets


We have used eight different data sets to test the performance of the three methods.
Table 1 summarizes the main characteristics of these data sets, which encompass
both real-world and synthetic data, with integer and real data types and varying
numbers of attributes, classes and samples. The ‘Heart’ (HRT), ‘Segment’ (SEG),
‘Vowel’ (VOW), ‘Yeast’ (YST) and ‘movement-libras’ (M-L) data sets can be
found in the KEEL dataset repository^1 (Alcala-Fdez et al. 2011), whereas the
‘Waveform’ (WAV) data set is available from Bache and Lichman (2013). ‘IM-3’ and
‘IM-10’ are the Landsat satellite data sets that were used in Ingalalli et al. (2014)
and Muñoz et al. (2015), taken from data available at the U.S. Geological Survey
(USGS) Earth Resources Observation Systems (EROS) Data Center (EDC).^2 None
of the eight data sets has missing values. From each of the original data sets we
formed 30 different partitions with a training-to-test ratio of 70:30, one for
each of 30 independent runs.
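
The partitioning scheme described above can be sketched as follows. This is only an illustration: the text does not say whether the partitions were stratified by class or how they were seeded, so the plain random shuffle, the function name `make_partitions` and the per-run seeding below are all assumptions.

```python
import random


def make_partitions(n_samples, n_partitions=30, train_ratio=0.7, base_seed=0):
    """Return a list of (train_indices, test_indices) pairs.

    Each partition is an independent random shuffle of the sample indices,
    cut at the requested training ratio. No stratification is applied
    (an assumption; the source does not specify this).
    """
    n_train = round(n_samples * train_ratio)
    partitions = []
    for run in range(n_partitions):
        # Hypothetical seeding scheme: one reproducible seed per run.
        rng = random.Random(base_seed + run)
        indices = list(range(n_samples))
        rng.shuffle(indices)
        partitions.append((indices[:n_train], indices[n_train:]))
    return partitions


# 270 samples, as listed for the HRT data set in Table 1.
partitions = make_partitions(270)
```

Each run would then train on its own `train_indices` and report test performance on the matching `test_indices`, so that results over the 30 runs reflect variation in both the data split and the evolutionary search.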


6.2 Tools and Parameters


A modified version of GPLAB 3 was used to execute all the runs. GPLAB is a freely
available open-source GP toolbox for MATLAB.^3 Most of the settings adopted
were the GPLAB 3 defaults. The population size was 500 individuals, allowed to
evolve for 100 generations in 30 independent runs per experiment. The function set
included +, −, × and / (protected as in Koza 1992) and the terminal set included
ephemeral random constants (also as in Koza 1992). Due to implementation
particularities and differences between M2GP and M3GP, some relevant settings
were modified accordingly, as already described in Sect. 4. For additional details
on other settings, the reader is referred to Ingalalli et al. (2014) and Muñoz et al.
(2015).
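
As a rough illustration of the protected operators and ephemeral random constants mentioned above, the sketch below follows Koza's (1992) convention of returning 1 when the denominator is zero. It is not the toolbox's code: the function names and the constant range are hypothetical, and the modified GPLAB version used in the experiments may handle these cases differently.

```python
import random


def protected_div(numerator, denominator):
    """Division that returns 1.0 instead of failing on a zero denominator."""
    if denominator == 0:
        return 1.0
    return numerator / denominator


def ephemeral_constant(rng, low=0.0, high=1.0):
    """Draw a random constant once, when a terminal node is created.

    The [low, high] range is an assumption; the source does not state it.
    """
    return rng.uniform(low, high)


rng = random.Random(42)
constant = ephemeral_constant(rng)
```

Protecting division keeps every evolved expression evaluable on every input, which matters because crossover and mutation can place any subtree in the denominator.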


(^1) http://keel.es/datasets.php
(^2) http://glovis.usgs.gov
(^3) http://gplab.sourceforge.net
