
nPool: Massively Distributed Simultaneous Evolution and Cross-Validation in EC-Star

Babak Hodjat and Hormoz Shahrzad


Abstract We introduce a cross-validation algorithm called nPool that can be
applied in a distributed fashion. Unlike classic k-fold cross-validation, the data
segments are mutually exclusive, and training takes place on only one segment. This
system is well suited to run in concert with the EC-Star distributed evolutionary
system, cross-validating solution candidates during a run. The system is tested with
different numbers of validation segments on a real-world problem of classifying
ICU blood-pressure time series.


Keywords Evolutionary computation • Distributed processing • Machine learning • Cross-validation


1 Introduction


The age-varying fitness approach is suitable for data problems in which evolved
solutions need to be applied to many fitness samples in order to measure a
candidate’s fitness (see Hodjat and Shahrzad 2013). This is an elitist approach: the
best candidates of each generation are retained and run on more fitness cases to
improve our confidence in their fitness. The number of fitness evaluations a
candidate receives in this method depends on its fitness relative to the other
candidates at any given point.
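The retention loop described above can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the EC-Star implementation: the `Candidate` class, the one-parameter threshold classifier, the mutation step, and all parameter values (`pop_size`, `elite`, `batch`, `generations`) are hypothetical and chosen only to show how retained candidates accumulate age, and with it a more reliable fitness estimate.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    threshold: float          # hypothetical one-parameter "solution"
    total_score: float = 0.0  # sum of per-sample scores so far
    age: int = 0              # number of fitness samples evaluated so far

    @property
    def fitness(self) -> float:
        # Running average over every sample this candidate has seen.
        return self.total_score / self.age if self.age else 0.0


def evaluate(cand, samples):
    """Score cand on additional (x, label) samples, increasing its age."""
    for x, label in samples:
        predicted = x > cand.threshold
        cand.total_score += 1.0 if predicted == label else 0.0
        cand.age += 1


def run(data, pop_size=20, elite=5, batch=10, generations=15, seed=0):
    rng = random.Random(seed)
    pop = [Candidate(rng.random()) for _ in range(pop_size)]
    for _ in range(generations):
        # Every current candidate is evaluated on one more batch of samples.
        for cand in pop:
            evaluate(cand, rng.sample(data, batch))
        pop.sort(key=lambda c: c.fitness, reverse=True)
        # Elitism: the best candidates survive and keep their accumulated age.
        elites = pop[:elite]
        # The rest are replaced by mutated copies of elites, starting at age 0.
        children = []
        for _ in range(pop_size - elite):
            parent = rng.choice(elites)
            t = parent.threshold + rng.gauss(0, 0.05)
            children.append(Candidate(min(1.0, max(0.0, t))))
        pop = elites + children
    return pop


if __name__ == "__main__":
    # Synthetic data, assumed for illustration: label is True when x > 0.6.
    data = [(i / 200, i / 200 > 0.6) for i in range(200)]
    for cand in run(data)[:5]:
        print(f"threshold={cand.threshold:.3f} age={cand.age} "
              f"fitness={cand.fitness:.3f}")
```

In this sketch a candidate's age grows by one batch for every generation it survives, so the number of fitness evaluations it receives is determined by how it ranks against the rest of the population.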
EC-Star (see O’Reilly et al. 2013) is a massively distributed evolutionary
platform that uses age-varying fitness as the basis for distribution, thus allowing
for easier distribution of large data problems through sampling or
hashing/feature-reduction techniques, breaking the data stash into smaller chunks,
each contributing to the overall evaluation of the candidates.
In this system, age is defined as the number of fitness samples a candidate has
been evaluated on. EC-Star uses a hub-and-spoke architecture for distribution,
where the main evolutionary process is moved to the processing nodes (see Fig. 1).
Each node, or Evolution Engine, has its own pool and independently runs through


B. Hodjat • H. Shahrzad
Sentient Technologies, 1 California St. #2300, San Francisco, CA, USA
e-mail: [email protected]; [email protected]


© Springer International Publishing Switzerland 2016
R. Riolo et al. (eds.), Genetic Programming Theory and Practice XIII,
Genetic and Evolutionary Computation, DOI 10.1007/978-3-319-34223-8_5

