ILLUSTRATIVE PROBLEM DOMAINS AT THE INTERFACE OF COMPUTING AND BIOLOGY 319
disorders are highly penetrant, but occur relatively rarely in the population (with approximate inci-
dences of one in several hundred or less).
On the other hand, multifactor genetic causality for disease is almost certainly much more com-
mon than monogenic causality. In principle, knowledge of the range of genetic variations across many
loci in the population will allow researchers to estimate risks arising from the combined effect of such
variations.
Using breast cancer as a case study, Pharoah et al.^48 compared the potential for prediction
of risk based on common genetic variations with the predictions that could be made using known
and established risk factors. They concluded that a typical polygenic approach for analysis would
suggest that the half of the population at highest risk would account for 88 percent of all affected
individuals, if all of the susceptibility genes could be identified. However, using currently known
factors for breast cancer to stratify the population, they estimated that the half of the population at
highest risk would account for only 62 percent of all cases. Pharoah et al. thus suggest that genetic
profiles may provide significant improvement in the ability to differentiate at-risk individuals from
individuals not at risk.
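The link between how widely risk varies across a population and how concentrated the cases become in the higher-risk half can be sketched numerically. The sketch below assumes that individual risk is log-normally distributed and that the disease is rare enough for cases to arise in proportion to risk. Both are modeling assumptions made here for illustration and are not stated in the text above. The function names and the parameter sigma (the standard deviation of log risk) are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and SD 1


def share_of_cases_in_top_fraction(sigma, top_fraction=0.5):
    """Share of all cases found in the highest-risk `top_fraction` of people.

    Assumption (not from the source text): log risk is normal with standard
    deviation `sigma`, and cases arise in proportion to individual risk.
    """
    # Risk threshold for the top fraction, in SD units of population log risk.
    z = STD_NORMAL.inv_cdf(1.0 - top_fraction)
    # Weighting the population by risk shifts the mean log risk among cases
    # upward by sigma**2, which is a shift of `sigma` in SD units.
    return 1.0 - STD_NORMAL.cdf(z - sigma)


def sigma_for_share(share, top_fraction=0.5):
    """Invert the relation above: the sigma that produces a given case share."""
    z = STD_NORMAL.inv_cdf(1.0 - top_fraction)
    return z - STD_NORMAL.inv_cdf(1.0 - share)


if __name__ == "__main__":
    for share in (0.88, 0.62):
        sigma = sigma_for_share(share)
        print(f"top half holds {share:.0%} of cases -> sigma of log risk = {sigma:.2f}")
```

Under these assumptions, the 88 percent figure corresponds to a standard deviation of log risk of roughly 1.18, and the 62 percent figure to roughly 0.31. The gap between the two percentages can therefore be read as a gap in how much variation in risk the respective sets of factors capture.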
Nevertheless, for a variety of reasons, identifying the relevant genetic signatures over multiple
genes that account for disease susceptibility will pose significant intellectual challenges. Probably the
most important point is that the contribution of any given gene involved is likely to be weak; hence
detecting its clinical significance may be problematic. Nongenomic effects, such as posttranslational
modifications, may also be relevant. Zimmern^50 notes that even monogenic conditions can result in
variable expressivity and incomplete penetrance, and that similar disease phenotypes may result from
genetic heterogeneity, whether in the form of allelic heterogeneity (different mutations at the same
locus) or locus heterogeneity (where mutations occur at different loci). Different mutations of the same
gene may also give rise to separate clinical effects. Environmental factors may be difficult to disentangle
from genetic ones. As a consequence of such issues, definitive conclusions about the relationship of a
given polygenic genotype to a specific disease condition may well be difficult to draw.
An extension of the genomic approach to disease susceptibility applies to understanding the impact
of an individual’s genomic composition on that individual’s response to various environmental insults
to the body, such as those caused by exposure to chemicals (e.g., from drinking water or air pollution)
or electromagnetic fields (e.g., from cell phones or ambient radiation). Furthermore, in dealing with
certain environmental insults, stochasticity is likely to play an important role. For example, in considering
the effects of radiation on the genome, macroscopic parameters that characterize radiation, such as
duration and intensity, are insufficient to determine its effect, simply because which part of a genome is
affected is mostly a matter of chance. Thus, a given dose of a certain kind of radiation will not affect
individuals in equal measure and, more to the point, could not be expected to affect even an ensemble
of identical twins similarly.
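The role of chance in this argument can be illustrated with a small Monte Carlo sketch. It assumes a deliberately simplified model that is not taken from the text: every individual has the same genome and receives the same number of radiation hits, each hit lands at a uniformly random position, and an individual counts as affected if at least one hit falls in a critical region. All names and parameter values are hypothetical.

```python
import random


def fraction_affected(n_individuals, hits_per_dose, critical_fraction, seed=0):
    """Simulate identical individuals exposed to an identical dose.

    Assumption (illustrative only): each hit lands uniformly at random, and
    `critical_fraction` of the genome is sensitive to damage.
    """
    rng = random.Random(seed)
    affected = 0
    for _ in range(n_individuals):
        # A draw below `critical_fraction` stands for a hit in the critical region.
        if any(rng.random() < critical_fraction for _ in range(hits_per_dose)):
            affected += 1
    return affected / n_individuals


def analytic_fraction(hits_per_dose, critical_fraction):
    """Probability that at least one of the hits lands in the critical region."""
    return 1.0 - (1.0 - critical_fraction) ** hits_per_dose


if __name__ == "__main__":
    simulated = fraction_affected(2000, hits_per_dose=20, critical_fraction=0.05)
    expected = analytic_fraction(20, 0.05)
    print(f"simulated: {simulated:.3f}   analytic: {expected:.3f}")
```

With 20 hits and a critical region covering 5 percent of the genome, the analytic probability of being affected is about 0.64. Dose and genotype are identical for every simulated individual, yet some are affected and others are not, which is the sense in which duration and intensity alone do not determine the outcome.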
Overall, there is wide variability in individual responses to environmental influences. While exist-
ing diseases, differences in gender, or differences in nutritional status affect such variability, genetic
influences are also important. Genes that affect the human response to environmental exposure (called
environmentally responsive genes by the Environmental Genome Project [EGP] of the National Institute
of Environmental Health Sciences [NIEHS]) tend to fall into several categories:^51 those that affect
the cell cycle, DNA repair, cell division, cell signaling, cell structure, gene expression, apoptosis, and
metabolism. The initial phases of the EGP are focused on identifying single nucleotide polymorphisms
(SNPs) associated with 554 genes identified by the scientific community as environmentally responsive.
Identification of the SNPs associated with environmentally responsive genes would make it possible to
conduct epidemiological studies that classify subjects by SNPs, thus increasing the utility of these
(^50) R.L. Zimmern, “The Human Genome Project: A False Dawn?” British Medical Journal 319(7220):1282, 1999.
(^51) See http://www.niehs.nih.gov/envgenom/egp.htm.