the transformed values. We did something similar in Chapter 9 with the Symptom score in
the study of stress.
Most people find it difficult to accept the idea of transforming data. It somehow seems
dishonest to decide that you do not like the data you have and therefore to change them
into data you like better or, even worse, to throw out some of them and pretend they were
never collected. When you think about it, however, there is really nothing unusual about
transforming data. We frequently transform data. We sometimes measure the time it takes
a rat to run down an alley, but then look for group differences in running speed, which is
the reciprocal of time (a nonlinear transformation). We measure sound in terms of physi-
cal energy, but then report it in terms of decibels, which represents a logarithmic transfor-
mation. We ask a subject to adjust the size of a test stimulus to match the size of a
comparison stimulus, and then take the radius of the test patch setting as our dependent
variable, but the radius is proportional to the square root of the area of the patch, and we
could just as legitimately use area as our dependent variable. On some tests, we calculate
the number of items that a student answered correctly, but then report scores in percentiles—
a decidedly nonlinear transformation. Who is to say that speed is a “better” measure than
time, that decibels are better than energy levels, that radius is better than area, or that a
percentile is better than the number correct? Consider a study by Conti and Musty (1984)
on the effects of THC (the most psychoactive ingredient in marijuana) on locomotor ac-
tivity in rats. Conti and Musty measured activity by reading the motion of the cage from a
transducer that represented that motion in voltage terms. In what way could their electri-
cally transduced measure of test-chamber vibration be called the “natural” measure of ac-
tivity? More important, they took postinjection activity as a percentage of preinjection
activity as their dependent variable, but would you leap out of your chair and cry “Foul!”
because they had used a transformation? Of course you wouldn’t—but it was a transfor-
mation nonetheless.
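The everyday reexpressions just described can be written out in a few lines. The following Python sketch is only an illustration: the running times, the test scores, the reference energy of 1.0, and the function names are all invented for the example and do not come from any study discussed here.

```python
import math

# Hypothetical running times (seconds) for five rats; invented for illustration.
times = [2.0, 4.0, 5.0, 8.0, 10.0]

# Speed as the reciprocal of time (alley lengths per second): a nonlinear transformation.
speeds = [1.0 / t for t in times]


def decibels(energy, reference=1.0):
    """Express an energy ratio on the logarithmic decibel scale."""
    return 10.0 * math.log10(energy / reference)


def radius_from_area(area):
    """Radius of a circular patch with the given area."""
    return math.sqrt(area / math.pi)


def percentile_rank(score, all_scores):
    """Percentage of scores in all_scores that fall at or below score."""
    at_or_below = sum(1 for s in all_scores if s <= score)
    return 100.0 * at_or_below / len(all_scores)


# Hypothetical numbers of items answered correctly by ten students.
number_correct = [12, 15, 15, 18, 20, 22, 25, 25, 27, 30]

print(speeds)
print(decibels(1000.0))
print(radius_from_area(math.pi * 9.0))
print([percentile_rank(s, number_correct) for s in number_correct])
```

Notice that equal steps in time do not correspond to equal steps in speed: going from 2 to 4 seconds halves the speed, whereas going from 8 to 10 seconds changes it only slightly.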
As pointed out earlier in this book, our dependent variables are only convenient and
imperfect indicators of the underlying variables we wish to study. No sensible experi-
menter ever started out with the serious intention of studying, for example, the “number of
stressful life events” that a subject reports. The real purpose of such experiments has
always been to study stress, and the number of reported events is merely a convenient
measure of stress. In fact, stress probably does not vary in a linear fashion with number of
events. It is quite possible that it varies exponentially—you can take a few stressful events
in stride, but once you have a few on your plate, additional ones start having greater and
greater effects. If this is true, the number of events raised to some power, for example
$Y = (\text{number of events})^2$, might be a more appropriate variable.
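As a rough illustration of this point about stress, the sketch below squares a hypothetical set of event counts. The counts are invented, and the exponent of 2 is just one possible power; which power best represents stress is an empirical question.

```python
# Hypothetical numbers of stressful life events reported by six subjects.
events = [0, 1, 2, 3, 6, 10]


def power_transform(x, power=2):
    """Raise a nonnegative count to the given power."""
    return x ** power


transformed = [power_transform(x) for x in events]

# Differences between successive values before and after the transformation.
raw_steps = [b - a for a, b in zip(events, events[1:])]
new_steps = [b - a for a, b in zip(transformed, transformed[1:])]

print(transformed)  # [0, 1, 4, 9, 36, 100]
print(raw_steps)    # [1, 1, 1, 3, 4]
print(new_steps)    # [1, 3, 5, 27, 64]
```

On the squared scale each additional event counts for more than the one before it: the step from 2 to 3 events adds 5 units, whereas the step from 0 to 1 adds only 1.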
The point of this fairly extended, but necessary, digression is to encourage flexibility.
You should not place blind faith in your original numbers; you must be willing to consider
possible transformations. Tukey probably had the right idea when he called these calcula-
tions “reexpressions” rather than “transformations.” You are merely reexpressing what the
data have to say in other terms.
Having said that, it is important to recognize that conclusions drawn from transformed
data do not always transfer neatly to the original measurements. Grissom (2000)
reports that the difference between the means of transformed variables can occasionally
be in the opposite direction from the difference between the means of the original
variables. This is disturbing, and it is important to think
about the meaning of what you are doing, but that is not, in itself, a reason to rule out the
use of transformations.
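A small constructed example shows how such a reversal can arise. The two groups below are invented for illustration and are not taken from Grissom (2000); the base-10 logarithm is used simply as one familiar transformation.

```python
import math

# Two hypothetical groups; group A contains one very large score.
group_a = [1.0, 1.0, 100.0]
group_b = [10.0, 10.0, 10.0]


def mean(values):
    """Arithmetic mean of a nonempty list of numbers."""
    return sum(values) / len(values)


raw_mean_a = mean(group_a)
raw_mean_b = mean(group_b)

# Means of the log-transformed scores.
log_mean_a = mean([math.log10(x) for x in group_a])
log_mean_b = mean([math.log10(x) for x in group_b])

print("raw means:", raw_mean_a, raw_mean_b)
print("log means:", round(log_mean_a, 3), round(log_mean_b, 3))
print("A higher on raw scale:", raw_mean_a > raw_mean_b)
print("A higher on log scale:", log_mean_a > log_mean_b)
```

Group A has the larger mean on the original scale but the smaller mean after the log transformation, because the logarithm pulls in the single extreme score that was driving Group A's raw mean.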
If you are willing to accept that it is permissible to transform one set of measures into
another, for example $Y_i = \log(X_i)$ or $Y_i = \sqrt{X_i}$, then many possibilities become available
for modifying our data to fit more closely the underlying assumptions of our statistical
tests. The nice thing about most of these transformations is that when we transform the data
to meet one assumption, we often come closer to meeting other assumptions as well. Thus,
$Y_i = \log(X_i)$    $Y_i = \sqrt{X_i}$
$Y = (\text{number of events})^2$