Systems Biology (Methods in Molecular Biology)

The material common to all the discrete approaches is the raw
data matrix: each exploration of nature ends up in a matrix having
as rows (statistical units) the objects of the analysis and as columns
(variables) the descriptors of those objects.
This imposes the crucial choices of the “inclusion criteria” and the
“preferred scale” of analysis. These problems are the same
for an epidemiologist who must choose the inclusion criteria for the
individuals entering a case-control or a double-blind trial, and for a
biologist who must choose the cell populations to submit to a
microarray study or the hierarchical level of an ecological study
(species, genera, etc.).
At the same time, scientists must be aware that the success
of their analyses crucially depends upon the choice of the variables
to consider, both in terms of “what to measure” and “at what scale”
[9]. In nonlinear time series analysis methods such as Recurrence
Quantification Analysis (RQA) [11], this translates into the choice
of “embedding dimension,” “windowing,” “metrics,”
“recurrence thresholds,” etc.; in other techniques such as
PCA or Multidimensional Scaling (MDS) [12], these choices correspond
to the definition of the data set, the standardization, and the
metrics. Similar considerations hold for cluster analysis techniques
[13] regarding the number of clusters and/or the choice between a
hierarchical and a non-hierarchical approach.
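
As a minimal sketch of how one such choice (standardization) shapes the
outcome of a PCA, the Python fragment below compares the loadings of the
first principal component computed on raw and on standardized variables.
The data are synthetic and the function name first_pc_loadings is a
hypothetical helper introduced only for this illustration; nothing here
is taken from the studies cited in the text.

```python
import numpy as np

def first_pc_loadings(X, standardize):
    """Return the unit-length loadings of the first principal component.

    X has statistical units as rows and variables as columns.
    """
    Xc = X - X.mean(axis=0)                  # center each variable
    if standardize:
        Xc = Xc / Xc.std(axis=0, ddof=1)     # unit variance per variable
    cov = np.cov(Xc, rowvar=False)           # covariance (or correlation) matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, -1]                    # eigenvector of the largest eigenvalue

# Synthetic data (assumption): two descriptors measured on very different scales.
rng = np.random.default_rng(0)
n_units = 200
small_scale = rng.normal(0.0, 1.0, n_units)
large_scale = rng.normal(0.0, 100.0, n_units)
X = np.column_stack([small_scale, large_scale])

# Absolute values are used because the sign of an eigenvector is arbitrary.
raw = np.abs(first_pc_loadings(X, standardize=False))
std = np.abs(first_pc_loadings(X, standardize=True))

print("raw loadings:         ", raw)
print("standardized loadings:", std)
```

On the raw matrix the first component is driven almost entirely by the
variable with the larger variance, whereas after standardization the two
variables contribute equally. Neither answer is "wrong": which one is
meaningful depends on what the analyst knows about the measured system.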
The need for these subjective choices is normally seen as a
limitation or, in any case, as a lack of rigor. On the contrary, it
is a crucial advantage with respect to more “blind to the content”
methods, because it allows a rich and fruitful relation between the
“analysis tool” and the studied system.

4 A Molecular Biology Example


In order to understand the nature of such a relation between
“subjective judgment” (based on content knowledge) and data
analysis (based on procedural knowledge), we will draw on an
investigation of the molecular mechanism of DNA repair in gastric
cancer patients [14].
The hypothesis under scrutiny is the existence of an inverse
correlation between the mismatch repair (MMR) mode (“marked” by
the expression level of the MLH1 gene) and the base excision repair
(BER) mode (marked by the expression level of the DNA polymerase β
(PolB) gene). To gain insight into possible crosstalk between these
two repair pathways in cancer, we analyzed human gastric
adenocarcinoma AGS cells in the presence of a DNA-damaging agent,
methyl methanesulfonate (MMS).
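
To make the notion of "inverse correlation" concrete, the following
Python sketch computes the Pearson and Spearman coefficients between two
expression profiles. The values are synthetic and generated only for
illustration (they are not data from [14]); the variable names mlh1 and
polb and the helper functions are assumptions made for this example.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])

def spearman(x, y):
    """Spearman coefficient as Pearson on ranks (assumes no tied values)."""
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    return pearson(rank_x, rank_y)

# Synthetic expression levels in arbitrary units (assumption, not real data):
# polb is built to decrease as mlh1 increases, plus measurement noise.
rng = np.random.default_rng(1)
n_samples = 30
mlh1 = rng.normal(10.0, 2.0, n_samples)
polb = 20.0 - 0.8 * mlh1 + rng.normal(0.0, 0.5, n_samples)

r = pearson(mlh1, polb)
rho = spearman(mlh1, polb)
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```

A clearly negative coefficient is what the hypothesis predicts. Whether
a linear (Pearson) or rank-based (Spearman) measure is appropriate is
itself one of the content-driven choices discussed in the text.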
DNA repair is a process involving different enzymes whose
complex relations result in different DNA repair efficiencies
and, consequently, in different mutation loads for the cells.

62 Alessandro Giuliani
