Genetic_Programming_Theory_and_Practice_XIII


14 M. Kommenda et al.


Table 5 Size statistics of the best models for Problem-2 per algorithm variant

                             Original model     Simplified model
Problem-2                    Length   Depth     Length   Depth     Equation
GP Length 20                   18       7         10       4       Eq. (6)
GP Length 50                   39      11         20       6       Eq. (8)
GP Length 100                  64      21         54      15       Eq. (9)
NSGA-II Complexity             16       7         10       4       Eq. (6)
NSGA-II Visitation Length      14       6         10       4       Eq. (7)
NSGA-II Tree Size              16       7         10       4       Eq. (6)
NSGA-II Variables              25       9         10       4       Eq. (6)

The length and depth of the symbolic expression trees are displayed for their
original and simplified versions stated in Eqs. (6)–(9).

Therefore, all extracted models explain the relation between the input and output
data accurately, and there is no difference between the models in terms of prediction
quality. GP with a length limit of 20 and NSGA-II with the complexity, tree size,
and variables measures found exactly the data-generating formula $f_1$ (Eq. (6)), whereas
NSGA-II with the visitation length found an alternative formulation $f_2$ (Eq. (7)) in
which $x_4$ is factored out. In contrast, GP with the higher length limits of 50
($f_3$, Eq. (8)) and 100 ($f_4$, Eq. (9)) found models that include additional terms.
These terms depend on input variables and therefore cannot be removed by constant
folding, although their impact on the evaluation is minimal.


\begin{align}
f_1(x) &= x_2 + x_3 x_4 + x_4 x_5 + \sin(x_1) \tag{6} \\
f_2(x) &= x_2 + x_4 (x_3 + x_5) + \sin(x_1) \tag{7} \\
f_3(x) &= x_2 + x_3 x_4 + x_4 x_5 + \sin(x_1) \left[ 5.11 \cdot 10^{-10} \, x_5 / x_1 + 1 \right] \tag{8} \\
f_4(x) &= x_2 + x_3 x_4 + x_4 x_5 + \sin(x_1) + 8.7 \cdot 10^{-7} \cos(\cos(\sin(0.99 x_1) + e^{\sin(x_1)})) \notag \\
&\quad + 8.85 \cdot 10^{-7} \cos(\sin(0.79 \sin(x_1) + \sin(e^{\cos(1.61 + e^{\sin(x_1)})}) \cos(\cos(\sin^2(\cos(\cos(\tan(\sin(0.99 x_1)) + \cos(\sin(0.99 x_1))))))))) \tag{9}
\end{align}
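The relation between Eqs. (6)–(8) can be checked numerically. The following is a minimal sketch, not the authors' evaluation code: the sampling range [1, 5], the sample count, and the random seed are illustrative assumptions (the range is chosen only to keep x_1 away from zero), and reading the coefficient in Eq. (8) as 5.11e-10 is an assumption based on the statement that the extra term has minimal impact.

```python
import math
import random


def f1(x1, x2, x3, x4, x5):
    # Eq. (6): the data-generating formula.
    return x2 + x3 * x4 + x4 * x5 + math.sin(x1)


def f2(x1, x2, x3, x4, x5):
    # Eq. (7): the same function with x4 factored out.
    return x2 + x4 * (x3 + x5) + math.sin(x1)


def f3(x1, x2, x3, x4, x5):
    # Eq. (8): assumed coefficient 5.11e-10 on the additional term.
    return x2 + x3 * x4 + x4 * x5 + math.sin(x1) * (5.11e-10 * x5 / x1 + 1.0)


def max_abs_diff(f, g, samples):
    # Largest absolute deviation between two models over all samples.
    return max(abs(f(*s) - g(*s)) for s in samples)


# Hypothetical test data: 1000 points drawn uniformly from [1, 5]^5.
rng = random.Random(0)
samples = [tuple(rng.uniform(1.0, 5.0) for _ in range(5)) for _ in range(1000)]

diff_12 = max_abs_diff(f1, f2, samples)  # rounding error only
diff_13 = max_abs_diff(f1, f3, samples)  # bounded by 5.11e-10 * max(x5 / x1)

print(f"max |f1 - f2| = {diff_12:.3e}")
print(f"max |f1 - f3| = {diff_13:.3e}")
```

Under these assumptions, f_1 and f_2 differ only by floating-point rounding, while f_3 deviates by a term that depends on x_1 and x_5. That dependence on the inputs is why no constant-folding step can eliminate it.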

The size statistics of the extracted models in their original and simplified versions
are displayed in Table 5. All models become considerably smaller during the constant
folding and simplification operations performed. GP with a length limit of 20 and
NSGA-II found the data-generating formula directly (except NSGA-II with the
variables complexity measure), so for these models the size reduction during
simplification is caused by the transformation of binary trees to n-ary trees. The best
model created by NSGA-II variables contained, in its original form, one additional
subtree expressing a constant numerical value that is removed by constant folding.
The two GP variants with larger length limits failed to find the data-generating
formula because their models include complex subtrees with almost no effect on the
evaluation.
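The two simplification steps named above, constant folding and the transformation of binary trees to n-ary trees, can be illustrated with a small sketch. It is not the simplifier used by the authors: the tuple-based tree encoding, the helper names `fold`, `flatten`, `length`, and `depth`, and the example tree are all hypothetical, and the counting convention (every operator, function, variable, and constant is one node) may differ from the one behind Table 5.

```python
import math

BINARY_OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}
FUNCTIONS = {"sin": math.sin, "cos": math.cos}


def fold(node):
    # Constant folding: replace subtrees made only of constants by their value.
    if not isinstance(node, tuple):
        return node
    op, *args = node
    args = [fold(a) for a in args]
    if all(isinstance(a, float) for a in args):
        if op in FUNCTIONS:
            return FUNCTIONS[op](args[0])
        return BINARY_OPS[op](args[0], args[1])
    return (op, *args)


def flatten(node):
    # Merge nested '+' / '*' nodes into n-ary nodes and drop neutral elements.
    if not isinstance(node, tuple):
        return node
    op, *args = node
    args = [flatten(a) for a in args]
    if op not in BINARY_OPS:
        return (op, *args)
    merged = []
    for a in args:
        if isinstance(a, tuple) and a[0] == op:
            merged.extend(a[1:])  # pull the child's operands up one level
        else:
            merged.append(a)
    neutral = 0.0 if op == "+" else 1.0
    merged = [a for a in merged if a != neutral]
    if not merged:
        return neutral
    return merged[0] if len(merged) == 1 else (op, *merged)


def length(node):
    # Number of nodes in the tree.
    return 1 if not isinstance(node, tuple) else 1 + sum(length(a) for a in node[1:])


def depth(node):
    # Number of levels in the tree; a single leaf has depth 1.
    return 1 if not isinstance(node, tuple) else 1 + max(depth(a) for a in node[1:])


# Hypothetical binary tree: Eq. (6) plus a constant-only subtree (0.25 + 0.75).
original = ("+",
            ("+", ("+", "x2", ("*", "x3", "x4")), ("*", "x4", "x5")),
            ("*", ("sin", "x1"), ("+", 0.25, 0.75)))

simplified = flatten(fold(original))
print(simplified)
print("original:  ", length(original), depth(original))
print("simplified:", length(simplified), depth(simplified))
```

In this sketch the constant subtree folds to 1.0 and is dropped as the neutral element of the product, and the nested additions collapse into a single n-ary sum. The tree gets shorter and shallower without any change to the function it computes.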
