Systems Biology (Methods in Molecular Biology)


5 Conclusions


Parameter estimation for biological systems requires continuous
feedback between biological and procedural information: data
analysis can in no way be considered a “separately optimized” set
of procedures to be applied to a set of experimental results. The
focus must be on the underlying (and largely unknown) network
linking the different players of the system at hand (in our example,
the expression levels of different genes). This network, as such, is
the only relevant “causative agent,” with the experimental observables
acting as probes of the coordinated motion of the underlying network.
This peculiar situation (which Warren Weaver, in a famous 1948 paper
[15], named “organized complexity”) calls for a style of reasoning
completely different from the classical approach of biologists, who
are used to a neat discrimination between dependent and independent
variables and to considering the observables as autonomous players in
the game.
Complexity can be a blessing rather than a curse if we learn how to
manage it, resisting the temptation to directly consider
“all the agents involved” in model construction.
The most fruitful way is to let the network suggest to us (e.g.,
through the application of unsupervised techniques like PCA) where to
look, thus avoiding the overfitting/irrelevance traps.
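
As a minimal sketch of what "letting the network suggest where to look" can mean in practice, the Python fragment below extracts the first principal component of a small samples-by-variables table by power iteration on the covariance matrix. Everything in it is an illustrative assumption rather than the chapter's own analysis: the data are synthetic (five hypothetical "genes", four driven by one hidden common factor and one that is pure noise), and the function name `first_principal_component` is invented for this example. In real work one would more likely rely on an established routine such as `numpy.linalg.svd` or scikit-learn's `PCA`.

```python
import math
import random


def first_principal_component(rows, n_iter=200, seed=0):
    """Return (loadings, explained_fraction) of PC1 for a samples x variables table."""
    n, p = len(rows), len(rows[0])
    # Center each variable (column) on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    # Power iteration: repeated multiplication by cov converges to the
    # dominant eigenvector, i.e. the direction of maximal variance.
    rng = random.Random(seed)
    v = [rng.gauss(0, 1) for _ in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # The Rayleigh quotient gives the eigenvalue; the trace is the total variance.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    return v, eigenvalue / total_variance


# Synthetic data (an assumption for illustration only): one hidden factor z
# drives the first four variables; the fifth is unrelated noise.
rng = random.Random(0)
true_loadings = [1.0, 0.8, -0.9, 0.7, 0.0]
samples = []
for _ in range(200):
    z = rng.gauss(0, 1)
    samples.append([a * z + rng.gauss(0, 0.3) for a in true_loadings])

pc1, explained = first_principal_component(samples)
print("PC1 loadings:", [round(x, 2) for x in pc1])
print("fraction of variance explained by PC1:", round(explained, 2))
```

In this toy setting no variable is designated as dependent or independent beforehand. The loadings of the first component single out the variables that move together (with opposite signs for anticorrelated ones) and leave the unrelated variable close to zero, which is the sense in which the correlation structure itself indicates where to look.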

References



  1. Transtrum MK et al (2015) Perspective: sloppiness and emergent
    theories in physics, biology and beyond. J Chem Phys 143:01091

  2. Kullback S, Leibler RA (1951) On information and sufficiency.
    Ann Math Stat 22(1):79–86

  3. Srivastava N et al (2014) Dropout: a simple way to prevent neural
    networks from overfitting. J Mach Learn Res 15(1):1929–1958

  4. Tropsha A (2010) Best practices for QSAR model development,
    validation, and exploitation. Mol Inform 29(6–7):476–488

  5. Pearson K (1901) On lines and planes of closest fit to systems of
    points in space. Lond Edinb Dubl Phil Mag J Sci 2(11):559–572

  6. Giuliani A (2017) The application of principal component analysis
    to drug discovery and biomedical data. Drug Discov Today
    22(7):1069–1076

  7. Soofi E (1994) Capturing the intangible concept of information.
    J Am Stat Assoc 89(428):1243–1254

  8. Pascual M, Levin SA (1999) From individuals to population
    densities: searching for the intermediate scale of nontrivial
    determinism. Ecology 80(7):2225–2236

  9. Broomhead DS, King GP (1986) Extracting qualitative dynamics from
    experimental data. Physica D 20(2–3):217–236

  10. Benigni R, Giuliani A (1994) Quantitative modeling and biology:
    the multivariate approach. Am J Phys Regul Integr Comp Phys
    266(5):R1697–R1704

  11. Marwan N et al (2007) Recurrence plots for the analysis of
    complex systems. Phys Rep 438(5):237–329

  12. Kruskal JB (1964) Multidimensional scaling by optimizing goodness
    of fit to a nonmetric hypothesis. Psychometrika 29(1):1–27

  13. Anderberg MR (2014) Cluster analysis for applications:
    probability and mathematical statistics: a series of monographs
    and textbooks, vol 19. Academic, Cambridge

  14. Simonelli V et al (2016) Crosstalk between mismatch repair and
    base excision repair in human gastric cancer. Oncotarget 5.
    10.18632/oncotarget.10185

  15. Weaver W (1948) Science and complexity. Am Sci 36:536–549


68 Alessandro Giuliani
