Genetic_Programming_Theory_and_Practice_XIII


4 M. Kommenda et al.


containing lots of constants. As a result, the constant symbol could gain prevalence
in the trees of the population, and the algorithm would primarily build constant
expressions of varying size. The same reasoning applies to the +1 term (Eq. (4),
line 4) in the case of multiplication and division. Because 1 is the neutral element
of multiplication, without this term the algorithm would build deeply nested trees
containing many multiplications and divisions with constants, which would impair
its learning ability.
Definitions of complexity measures for symbolic regression:


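$$\mathrm{Length}(T) = \sum_{s \in_s T} 1 \tag{1}$$

$$\mathrm{VisitationLength}(T) = \sum_{s \in_s T} \mathrm{Length}(s) \tag{2}$$

$$\mathrm{Variables}(T) = \sum_{s \in_s T} \begin{cases} 1 & \text{if } \mathrm{sym}(s) = \text{variable} \\ 0 & \text{otherwise} \end{cases} \tag{3}$$

$s \in_s T$ defines the subtree relation and returns all subtrees $s$ of tree $T$.
$\mathrm{sym}(s)$ returns the symbol of the root node of tree $s$.

$$\mathrm{Complexity}(n) = \begin{cases}
1 & \text{if } \mathrm{sym}(n) = \text{constant} \\
2 & \text{if } \mathrm{sym}(n) = \text{variable} \\
\sum_{c \in_c n} \mathrm{Complexity}(c) & \text{if } \mathrm{sym}(n) \in (+, -) \\
\prod_{c \in_c n} \mathrm{Complexity}(c) + 1 & \text{if } \mathrm{sym}(n) \in (\times, /) \\
\mathrm{Complexity}(n_1)^2 & \text{if } \mathrm{sym}(n) = \text{square} \\
\mathrm{Complexity}(n_1)^3 & \text{if } \mathrm{sym}(n) = \text{squareroot} \\
2^{\mathrm{Complexity}(n_1)} & \text{if } \mathrm{sym}(n) \in (\sin, \cos, \tan) \\
2^{\mathrm{Complexity}(n_1)} & \text{if } \mathrm{sym}(n) \in (\exp, \log)
\end{cases} \tag{4}$$

$c \in_c n$ defines the child relation and returns all direct child nodes $c$ of node $n$.
Indexing is used to refer to the i-th child of a node, i.e., $n_1$ refers to the first child node of node $n$.
$\mathrm{sym}(n)$ returns the symbol of node $n$.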
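The four measures in Eqs. (1)-(4) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation. The `Node` class and the symbol strings (`"const"`, `"var"`, `"sqrt"`, and so on) are hypothetical names chosen for this example. Two assumptions are made: a tree counts as one of its own subtrees (needed for Length to count every node), and the +1 in line 4 of Eq. (4) is added once after the product is taken, which is the literal reading of the formula as typeset.

```python
from dataclasses import dataclass, field
from math import prod
from typing import Iterator, List


@dataclass
class Node:
    """Hypothetical expression-tree node; symbol names are illustrative."""
    symbol: str
    children: List["Node"] = field(default_factory=list)


def subtrees(t: Node) -> Iterator[Node]:
    """Yield every subtree of t, including t itself (assumed)."""
    yield t
    for child in t.children:
        yield from subtrees(child)


def length(t: Node) -> int:
    """Eq. (1): number of nodes in the tree."""
    return sum(1 for _ in subtrees(t))


def visitation_length(t: Node) -> int:
    """Eq. (2): sum of the lengths of all subtrees."""
    return sum(length(s) for s in subtrees(t))


def variables(t: Node) -> int:
    """Eq. (3): number of variable nodes in the tree."""
    return sum(1 for s in subtrees(t) if s.symbol == "var")


def complexity(n: Node) -> int:
    """Eq. (4): recursive complexity of the tree rooted at n."""
    if n.symbol == "const":
        return 1
    if n.symbol == "var":
        return 2
    kids = [complexity(c) for c in n.children]
    if n.symbol in ("+", "-"):
        return sum(kids)
    if n.symbol in ("*", "/"):
        # Assumption: the +1 is applied once, after the product.
        return prod(kids) + 1
    if n.symbol == "square":
        return kids[0] ** 2
    if n.symbol == "sqrt":
        return kids[0] ** 3
    if n.symbol in ("sin", "cos", "tan", "exp", "log"):
        return 2 ** kids[0]
    raise ValueError(f"unknown symbol: {n.symbol}")


# Example tree for the expression x * x + c
tree = Node("+", [Node("*", [Node("var"), Node("var")]), Node("const")])
print(length(tree))             # 5
print(visitation_length(tree))  # 11
print(variables(tree))          # 2
print(complexity(tree))         # 6
```

Under these assumptions a bare variable has complexity 2, while `c * (c * x)` has complexity 4: each extra multiplication by a constant raises the value by one, so such nesting is no longer free.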
3 NSGA-II for Symbolic Regression
Multi-objective symbolic regression has previously been studied by Smits and
Kotanchek ( 2005 ) and Vladislavleva et al. ( 2009 ), where a novel algorithm called
ParetoGP has been used. ParetoGP optimizes the accuracy of the models (in terms of
