Catalyzing Inquiry at the Interface of Computing and Biology

COMPUTATIONAL TOOLS 97

Finally, in early 2005, National Geographic and IBM announced a collaboration known as the
Genographic Project to probe the migratory history of the human species.^119 The project seeks to collect 100,000
blood samples from indigenous populations, with the intent of analyzing DNA in these samples. Ultimately,
the project will create a global database of human genetic variation and associated anthropological data
(language, social customs, etc.) that provides a snapshot of human genetic variation before the cultural context
of indigenous populations is lost—a context that is needed to make sense of the variations in DNA data.

4.4.7 Analysis of Gene Expression Data


Although almost all cells in an organism contain the same genetic material (the genomic blueprint for
the entire organism), only about one-third of a given cell’s genes are expressed or “switched on”—that is,
are producing proteins—at a given time. Expressed genes account for differences in cell types; for example,
DNA in skin cells produces a different set of proteins than DNA in nerve cells. Similarly, a
developing embryo undergoes rapid changes in the expression of its genes as its body structure unfolds.
Differential expression in the same types of cells can represent different cellular “phenotypes” (e.g.,
normal versus diseased), and modifying a cell’s environment can result in changed levels of expression of
a cell’s genes. In fact, the ability to perturb a cell and observe the consequential changes in expression is a
key to understanding linkages between genes and can be used to model cell signaling pathways.
A powerful technology for monitoring the activity of all the genes in a cell is the DNA microarray
(described in Box 7.5 in Chapter 7). Many different biological questions can be asked with microarrays,
and arrays are now constructed in many varieties. For example, instead of DNA across an entire
genome, the array might be spotted with a specific set of genes from an organism or with fabricated
sequences of DNA (oligonucleotides) that might represent, for example, a particular SNP or a mutated
form of a gene. More recently, protein arrays have been developed as a new tool that extends the reach
of gene expression analysis.
The ability to collect and analyze massive sets of data about the transcriptional states of cells is an
emerging focus of molecular diagnostics as well as drug discovery. Profiling the activation or suppression
of genes within cells and tissues provides telling snapshots of function. Such information is critical
not only to understand disease progression, but also to determine potential routes for disease intervention.
New technologies that are driving the field include the creation of “designer” transcription factors
to modulate expression, use of laser microdissection methods for isolation of specific cell populations,
and technologies for capturing mRNA. Among the questions asked of microarrays (and of the computational
algorithms that decipher the results) is which genes show significant changes in expression
in the presence of a disease, drug regimen, or chemical or hormonal exposure.
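As a rough illustration of how such a question might be posed computationally, the sketch below flags genes whose expression differs between two conditions by combining a log2 fold change with Welch's t statistic. It is a minimal example under stated assumptions, not the method of any study cited here: the gene names, the intensity values, and both thresholds are hypothetical, and the inputs are assumed to be log2-transformed, normalized intensities from replicate arrays.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples with unequal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    # Assumption: with no measurable spread we cannot judge significance,
    # so the gene is treated as not significant (t = 0).
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def differential_genes(control, treated, min_abs_log2fc=1.0, min_abs_t=4.0):
    """Return (gene, log2 fold change, t) for genes passing both thresholds.

    control, treated: dict mapping gene name -> list of log2 intensities,
    one value per replicate array. Thresholds are illustrative only.
    """
    hits = []
    for gene in control:
        c, t = control[gene], treated[gene]
        log2fc = mean(t) - mean(c)  # difference of log2 means
        t_stat = welch_t(t, c)
        if abs(log2fc) >= min_abs_log2fc and abs(t_stat) >= min_abs_t:
            hits.append((gene, round(log2fc, 2), round(t_stat, 2)))
    # Largest absolute change first
    return sorted(hits, key=lambda h: -abs(h[1]))


# Hypothetical data: three genes, three replicate arrays per condition.
control = {
    "geneA": [8.0, 8.2, 7.9],
    "geneB": [6.0, 6.1, 5.9],
    "geneC": [9.0, 9.5, 8.6],
}
treated = {
    "geneA": [10.1, 10.3, 9.9],
    "geneB": [6.1, 5.9, 6.0],
    "geneC": [9.2, 8.7, 9.4],
}

hits = differential_genes(control, treated)
for gene, log2fc, t_stat in hits:
    print(gene, log2fc, t_stat)
```

With thousands of genes tested at once, a fixed cutoff like this would admit many false positives, so a real analysis would normally add a multiple-testing correction on top of the per-gene statistic.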
To illustrate the power of large-scale analysis of gene data, an article in Science by Gaudet and
Mango is instructive.^120 A comparison of microarray data taken from Caenorhabditis elegans embryos
lacking a pharynx with microarray data from embryos having excess pharyngeal tissue identified 240
genes that were preferentially expressed in the pharynx, and further identified a single gene as directly
regulating almost all of the pharynx-specific genes that were examined in detail. These results suggest
the possibility that direct transcriptional regulation of entire gene networks may be a common feature of
organ-specification genes.^121

(^119) More information on the project can be found at http://www5.nationalgeographic.com/genographic/.
(^120) J. Gaudet and S.E. Mango, “Regulation of Organogenesis by the Caenorhabditis elegans FoxA Protein PHA-4,” Science
295(5556):821-825, 2002.
(^121) For example, it is known that a specific gene activates other genes that function at two distinct steps of the regulatory
hierarchy leading to wing formation in Drosophila (K.A. Guss, C.E. Nelson, A. Hudson, M. E. Kraus and S. B. Carroll, “Control of
a Genetic Regulatory Network by a Selector Gene,” Science 292(5519):1164-1167, 2001), and also that the presence of a specific
factor is both necessary and sufficient for specification of eye formation in Drosophila imaginal discs, where it directly activates
the expression of both early- and late-acting genes (W.J. Gehring and K. Ikeo, “Pax 6: Mastering Eye Morphogenesis and Evolution,”
Trends in Genetics 15(9):371-377, 1999).
