Catalyzing Inquiry at the Interface of Computing and Biology


sists of finding solutions to abstractly formulated problems and then finding real-world problems to
which these solutions are applicable.
The engineering thread of computer science is based on finding useful and realizable solutions to
real-world problems. The space of possible solutions is usually vast and involves different architectures
and approaches to solving a given problem. Problems are generally simplified so that only the most
important aspects are addressed. Economic, human, and organizational factors are at least as important
as technological ones, and trade-offs among alternatives to decide on the “best” approach to solve a
(simplified) problem often involve art as much as science.
Biology—the study of living things—has an intellectual tradition grounded in observation and
experiment. Because biological insight has often been found in apparently insignificant information,
biologists have come to place great value on data collection and analysis. In contrast to the theoretical
computer scientist’s idea of formal proof, biologists and other life scientists rely on empirical work to
test hypotheses.
Because accommodating a large number of independent variables in an experiment is expensive, a
common experimental approach (e.g., in medicine and pharmaceuticals) is to rely on randomized
observations to eliminate or reduce the effect of variables that have not explicitly been represented in
the model underlying the experiment. Subsequent experimental work then seeks to replicate the results
of such experiments.
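As a rough illustration of the randomization described above, the following Python sketch assigns hypothetical subjects to treatment and control groups at random. The subject records, the "age" field, and the function name randomize are assumptions made for this example only; they do not come from the text. The point is that a variable the experimenter never modeled (here, age) tends to end up with a similar average in both groups.

```python
import random
from statistics import mean


def randomize(subjects, seed=0):
    """Randomly split subjects into two equal-sized groups (treatment, control)."""
    rng = random.Random(seed)      # seeded only so the example is repeatable
    shuffled = list(subjects)      # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical subjects with a variable (age) that the experiment does not model.
_rng = random.Random(1)
subjects = [{"id": i, "age": _rng.randint(20, 80)} for i in range(1000)]

treatment, control = randomize(subjects)

# Difference between the groups' average ages; random assignment tends to keep it small.
age_gap = abs(mean(s["age"] for s in treatment) - mean(s["age"] for s in control))
print(f"mean age gap between groups: {age_gap:.2f} years")
```

Random assignment does not remove the unmodeled variable; it only makes a systematic imbalance between the groups unlikely, which is why the balance improves as the number of subjects grows.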
A biological hypothesis is regarded as “proven” or “validated” when multiple experiments indicate
that the result is highly unlikely to be due to random factors. In this context, the term “proven” is
somewhat misleading, as there is always some chance that the effect found is a random event. A
hypothesis “validated” by experimental or empirical work is one that is regarded as sufficiently reliable
as a foundation for most types of subsequent work. Generalization occurs when researchers seek to
extend the study to other conditions, or when investigation is undertaken in a new environment or with
more realism. Under these circumstances, the researcher is investigating whether the original hypothesis
(or some modification thereof) is more broadly applicable.
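One way to make "highly unlikely to be due to random factors" concrete is a permutation test, sketched below in Python. This is an illustrative technique chosen for this example, not a method prescribed by the text, and the function name, the measurements, and the number of shuffles are all assumptions. The sketch asks how often a random relabeling of the observations produces a difference between group means at least as large as the one observed.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_permutations=10_000, seed=0):
    """Estimate the chance that random labeling alone gives a mean difference
    at least as large as the observed one."""
    rng = random.Random(seed)      # seeded only so the example is repeatable
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)        # relabel the observations at random
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical measurements, invented for illustration.
treated = [10.1, 10.4, 9.9, 10.6, 10.2, 10.5]
control = [5.0, 5.3, 4.8, 5.1, 5.2, 4.9]

p = permutation_p_value(treated, control)
print(f"estimated p-value: {p:.4f}")
```

A small value says only that chance is an unlikely explanation, never an impossible one, which is the sense in which "proven" is somewhat misleading. Replication by independent experiments addresses the remaining doubt.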
Within the biological community (indeed, for researchers in any science that relies on experiment),
repetition of an experiment is usually the only way to validate or generalize a finding, and replication
plays a central role in the conduct of biological science. By contrast, reproducing the proof of a theorem
is done by mathematicians and computer scientists mostly when a prior result is suspicious. Although
there is an honored tradition of seeking alternative proofs of theorems even if the original proof is not at
all suspicious, replication of results is not nearly as central to mathematics as it is to biology.
Finally, biology is constrained by nature, which makes rules (even if they are not known a priori to
humans), and models of biological phenomena must be consistent with the constraints that those rules
imply. By contrast, computer science is a science of the artificial—more like a game in which one can
make up one’s own rules—and the only “hard” constraints are those imposed by mathematical logic
and consistency (hence data for most computer scientists have a very different ontological role than for
biologists).


10.3.1.2 Different Approaches to Education and Training


The first introduction to computer science for many individuals involves building a computer
program. The first introduction to biology for many individuals is to watch an organism grow (remember
growing seeds in Dixie cups in grade school?). These differences continue in different training
emphases for practitioners in computer science and biology in their undergraduate and graduate work.
To characterize these different emphases in broad strokes, formal training in computer science
tends to emphasize theory, abstractions, problem solving, and formalism over experimental work (indeed,
computer programming, core to the field, is itself an abstraction). Moreover, as with many
mathematically oriented disciplines, much of the intellectual content of computer science is integrated
and, in that sense, cumulative. By contrast, data and experimental technique play a much more central
