COMPUTATIONAL TOOLS 61
- Microarray data related to a given cell may be taken by multiple investigators in different laboratories.
- Ecological data (e.g., temperature, reflectivity) in a given ecosystem may be taken by different instruments looking at the system.
- Neurological data (e.g., timing and amplitudes of various pulse trains) related to a specific cognitive phenomenon may be taken on different individuals in different laboratories.
The simplest example of the normalization problem is when different instruments are calibrated
differently (e.g., a scale in George’s laboratory may not have been zeroed properly, rendering mass
measurements from George’s laboratory noncomparable to those from Mary’s laboratory). If a large
number of readings have been taken with George’s scale, one possible fix (i.e., one possible normalization)
is to determine the size of the zeroing error and to add that correction to, or subtract it from, the
existing data. Of course, this particular procedure assumes that the necessary zeroing was
constant for each of George’s measurements. The procedure is not valid if the zeroing knob was jiggled
accidentally after half of the measurements had been taken.
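As a minimal sketch of the constant-offset correction described above, the following Python fragment estimates the zeroing error from repeated readings of a reference object of known mass and subtracts it from existing readings. The function names, the reference mass, the tolerance, and all data values are hypothetical and chosen only for illustration. The sketch also assumes that reference readings were taken both early and late in the measurement run, which is what allows a rough check on whether the offset stayed constant.

```python
from statistics import mean


def estimate_zero_offset(reference_readings, true_mass):
    """Estimate a constant zeroing error from readings of a known reference mass."""
    return mean(reference_readings) - true_mass


def normalize(readings, offset):
    """Subtract the estimated constant offset from every existing reading."""
    return [r - offset for r in readings]


def offset_is_constant(early_refs, late_refs, true_mass, tolerance=0.05):
    """Rough check (hypothetical tolerance) that the offset did not shift mid-run."""
    early = estimate_zero_offset(early_refs, true_mass)
    late = estimate_zero_offset(late_refs, true_mass)
    return abs(early - late) <= tolerance


# Hypothetical data: the scale reads a 100.0 g reference mass about 0.5 g high.
reference_readings = [100.4, 100.5, 100.6]
offset = estimate_zero_offset(reference_readings, true_mass=100.0)

# Hypothetical existing measurements taken with the same miscalibrated scale.
georges_data = [12.5, 30.5, 47.0]
corrected = normalize(georges_data, offset)
```

If the early and late reference readings disagree by more than the chosen tolerance, a single correction is not appropriate; the offset was apparently not constant over the run.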
Such biases in the data are systematic. In principle, the steps necessary to deal with systematic bias
are straightforward. The researcher must avoid it as much as possible. Because complete avoidance is
not possible, the researcher must recognize it when it occurs and then take steps to correct for it.
Correcting for bias entails determining the magnitude and effect of the bias on data that have been taken
and identifying the source of the bias so that the data already taken can be modified and corrected
appropriately. In some cases, the bias may be uncorrectable, and the data must be discarded.
However, in practice, dealing with systematic bias is not nearly so straightforward. Ball notes that
in the real world, the process goes something like this:
- Notice something odd with data.
- Try a few methods to determine magnitude.
- Think of many possible sources of bias.
- Wonder what in the world to do next.
There are many sources of systematic bias, and they differ depending on the nature of the data
involved. They may include effects due to instrumentation, sample (e.g., sample preparation, sample
choice), or environment (e.g., ambient vibration, current leakage, temperature). Section 3.3 describes a
number of the systematic biases possible in microarray data, as do several references provided by Ball.^5
There are many ways to correct for systematic bias, depending on the type of data being corrected.
In the case of microarray studies, these ways include use of dye swap strategies, replicates and reference
samples, experimental controls, consistent techniques, and sensible array and experiment design. Yet all
(^5) Ball’s AAAS presentation includes the following sources: T.B. Kepler, L. Crosby, and K.T. Morgan, “Normalization and
Analysis of DNA Microarray Data by Self-consistency and Local Regression,” Genome Biology 3(7):RESEARCH0037.1-RESEARCH0037.12,
2002, available at http://genomebiology.org/2002/3/7/research/0037.1; R. Hoffmann, T. Seidl, and M. Dugas,
“Profound Effect of Normalization on Detection of Differentially Expressed Genes in Oligonucleotide Microarray Data Analysis,”
Genome Biology 3(7):RESEARCH0033.1-RESEARCH0033.11, 2002, available at http://genomebiology.com/2002/3/7/research/0033;
C. Colantuoni, G. Henry, S. Zeger, and J. Pevsner, “Local Mean Normalization of Microarray Element Signal
Intensities Across an Array Surface: Quality Control and Correction of Spatially Systematic Artifacts,” Biotechniques 32(6):1316-1320,
2002; B.P. Durbin, J.S. Hardin, D.M. Hawkins, and D.M. Rocke, “A Variance-Stabilizing Transformation for Gene-Expression
Microarray Data,” Bioinformatics 18(Suppl. 1):S105-S110, 2002; P.H. Tran, D.A. Peiffer, Y. Shin, L.M. Meek, J.P. Brody, and
K.W. Cho, “Microarray Optimizations: Increasing Spot Accuracy and Automated Identification of True Microarray Signals,”
Nucleic Acids Research 30(12):e54, 2002, available at http://nar.oupjournals.org/cgi/content/full/30/12/e54; M. Bilban, L.K.
Buehler, S. Head, G. Desoye, and V. Quaranta, “Normalizing DNA Microarray Data,” Current Issues in Molecular Biology 4(2):57-64,
2002; J. Quackenbush, “Microarray Data Normalization and Transformation,” Nature Genetics Supplement 32:496-501, 2002.