Catalyzing Inquiry at the Interface of Computing and Biology


been a source of qualitative insight.^140 While this is still true, there is growing interest in using image
data more quantitatively.
Consider the following applications:



  • Automated identification of fungal spores in microscopic digital images and automated estimation
    of spore density;^141

  • Automated analysis of liver MRI images from patients with putative hemochromatosis to determine
    the extent of iron overload, avoiding the need for an uncomfortable liver biopsy;^142
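
To make the first application above concrete, the following is a minimal sketch of how a spore count and density might be derived from a grayscale image. It is not the method of the cited work: the fixed brightness threshold, the 4-connected flood fill, and the names `count_blobs` and `spore_density` are all assumptions chosen for illustration, and a real pipeline would also need to separate touching spores and reject debris.

```python
from collections import deque


def count_blobs(image, threshold):
    """Count 4-connected regions of pixels brighter than `threshold`.

    `image` is a list of equal-length rows of numeric intensities.
    Thresholding plus flood fill is an illustrative assumption,
    not the recognition algorithm used in the cited study.
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # New bright region found: flood-fill it so it is counted once.
            blobs += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return blobs


def spore_density(image, threshold, area_mm2):
    """Blobs per square millimeter of imaged area (hypothetical units)."""
    return count_blobs(image, threshold) / area_mm2


if __name__ == "__main__":
    toy = [
        [0, 9, 0, 0],
        [0, 9, 0, 8],
        [0, 0, 0, 8],
        [7, 0, 0, 0],
    ]
    print(count_blobs(toy, threshold=5))
    print(spore_density(toy, threshold=5, area_mm2=1.5))
```

The point of the sketch is only that a qualitative image becomes a number (objects per unit area) once objects can be segmented and counted.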


Box 4.6
An Information-intensive Approach to Cancer Drug Discovery

Given one compound as a “seed,” [an algorithm known as] COMPARE searches the database of screened agents for
those most similar to the seed in their patterns of activity against the panel of 60 cell lines. Similarity in pattern often
indicates similarity in mechanism of action, mode of resistance, and molecular structure....
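
[Editorial illustration, not part of the reprinted text.] The seed-based search described above can be sketched as a ranking of stored activity patterns by their similarity to the seed's pattern. The passage says only "most similar"; the use of the Pearson correlation coefficient here is an assumption, as are the function names and the toy four-cell-line patterns (the real panel has 60).

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence has no variation, since the
    coefficient is undefined in that case.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def rank_by_similarity(seed_pattern, database, top_k=5):
    """Return up to `top_k` (name, score) pairs, most similar first.

    `database` maps a compound name to its activity pattern, one
    value per cell line. The similarity measure is an assumption.
    """
    scores = [(name, pearson(seed_pattern, pattern))
              for name, pattern in database.items()]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_k]


if __name__ == "__main__":
    seed = [1.0, 2.0, 3.0, 4.0]           # hypothetical seed pattern
    screened = {                           # hypothetical screened agents
        "agent_a": [2.0, 4.0, 6.0, 8.0],   # same shape as the seed
        "agent_b": [4.0, 3.0, 2.0, 1.0],   # opposite shape
        "agent_c": [1.0, 1.0, 1.0, 1.0],   # flat, no pattern
    }
    for name, score in rank_by_similarity(seed, screened):
        print(name, round(score, 3))
```

A correlation-type measure compares the shape of a pattern across cell lines rather than its absolute potency, which is one reason it is a natural candidate for this kind of search.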

A formulation of this approach in terms of three databases [includes databases for] the activity patterns [A],...
molecular structural features of the tested compounds [S], and... possible targets or modulators of activity in the
cells [T].... The (S) database can be coded in terms of any set of two-dimensional (2D) or 3D molecular structure
descriptors. The NCI’s Drug Information System (DIS) contains chemical connectivity tables for approximately
460,000 molecules, including the 60,000 tested to date. 3-D structures have been obtained for 97% of the DIS
compounds, and a set of 588 bitwise descriptors has been calculated for each structure by use of the Chem-X
computational chemistry package. This data set provides the basis for pharmacophoric searches; if a tested
compound, or set of compounds, is found to have an interesting pattern of activity, its structure can be used to search for
similar molecules in the DIS database.
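
[Editorial illustration, not part of the reprinted text.] A search for "similar molecules" over bitwise structure descriptors can be sketched as follows. The passage does not say how descriptor sets are compared; the Tanimoto coefficient used here is an assumption, as are the function names, the 0.7 cutoff, and the toy 8-bit descriptors (the passage mentions 588 bits per structure).

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two bit vectors stored as Python ints.

    Shared on-bits divided by on-bits present in either vector.
    Two all-zero vectors are scored 0.0 by convention here.
    """
    shared = bin(bits_a & bits_b).count("1")
    either = bin(bits_a | bits_b).count("1")
    return shared / either if either else 0.0


def similar_structures(query_bits, library, min_score=0.7):
    """Return (name, score) pairs scoring at least `min_score`, best first.

    `library` maps a compound name to its descriptor bits (an int).
    The cutoff value is a hypothetical choice for illustration.
    """
    hits = [(name, tanimoto(query_bits, bits))
            for name, bits in library.items()]
    hits = [hit for hit in hits if hit[1] >= min_score]
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


if __name__ == "__main__":
    query = 0b11110000                     # hypothetical active compound
    library = {                            # hypothetical untested compounds
        "mol_1": 0b11110000,               # identical descriptors
        "mol_2": 0b11100000,               # three of four bits shared
        "mol_3": 0b00001111,               # nothing shared
    }
    for name, score in similar_structures(query, library):
        print(name, round(score, 3))
```

Storing each descriptor set as a single integer keeps the comparison to two bitwise operations per pair, which matters when the library holds hundreds of thousands of structures.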

In the target (T) database, each row defines the pattern (across 60 cell lines) of a measured cell characteristic that may
mediate, modulate, or otherwise correlate with the activity of a tested compound. When the term is used in this general
shorthand sense, a “target” may be the site of action or part of a pathway involved in a cellular response. Among the
potential targets assessed to date are oncogenes, tumor-suppressor genes, drug resistance-mediating transporters, heat
shock proteins, telomerase, cytokine receptors, molecules of the cell cycle and apoptotic pathways, DNA repair
enzymes, components of the cytoarchitecture, intracellular signaling molecules, and metabolic enzymes.

In addition to the targets assessed one at a time, others have been measured en masse as part of a protein expression
database generated for the 60 cell lines by 2D polyacrylamide gel electrophoresis.

Each compound displays a unique “fingerprint” pattern, defined by a point in the 60D space (one dimension for each
cell line) of possible patterns. In information theoretic terms, the transmission capacity of this communication
channel is very large, even after one allows for experimental noise and for biological realities that constrain the
compounds to particular regions of the 60D space. Although the activity data have been accumulated over a 6-year
period, the experiments have been reproducible enough to generate... patterns of coherence.

SOURCE: Reprinted by permission from J.N. Weinstein, T.G. Myers, P.M. O’Connor, S.H. Friend, A.J. Fornace, Jr., K.W. Kohn, T. Fojo, et
al., “An Information-intensive Approach to the Molecular Pharmacology of Cancer,” Science 275(5298):343-349, 1997. Copyright 1997
AAAS.

(^140) Note also that biological imaging itself is a subset of the intersection between biology and visual techniques. In particular, other
biological insight can be found in techniques that consider spectral information, e.g., intensity as a function of frequency and perhaps a
function of time. Processing microarray data (discussed further in Section 7.2.1) ultimately depends on the ability to extract interesting
signals from patterns of fluorescing dots, as does quantitative comparison of patterns obtained in two-dimensional polyacrylamide gel
electrophoresis. (See S. Veeser, M.J. Dunn, and G.Z. Yang, “Multiresolution Image Registration for Two-dimensional Gel
Electrophoresis,” Proteomics 1(7):856-870, 2001, available at http://vip.doc.ic.ac.uk/2d-gel/2D-gel-final-revision.pdf.)
(^141) T. Bernier and J.A. Landry, “Algorithmic Recognition of Biological Objects,” Canadian Agricultural Engineering 42(2):101-109, 2000.
(^142) George Reeke, Rockefeller University, personal communication to John Wooley, October 8, 2004.
