Statistical Methods for Psychology

a square root transformation not only may help us equate group variances but, because it
compresses the upper end of a distribution more than it compresses the lower end, it may
also have the effect of making positively skewed distributions more nearly normal in shape.
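
As a rough illustration of that compression, the sketch below takes a small set of positively skewed scores and compares a moment-based skewness coefficient before and after a square root transformation. The scores and the helper name `skewness` are invented for this example; they are not taken from any data set in the chapter.

```python
import math
import statistics


def skewness(values):
    """Moment-based skewness: third central moment over the second to the 1.5 power."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical positively skewed scores (invented for illustration).
raw = [1, 1, 2, 2, 3, 4, 4, 6, 9, 16, 25]
rooted = [math.sqrt(v) for v in raw]

skew_raw = skewness(raw)
skew_rooted = skewness(rooted)

print(f"skewness of raw scores:   {skew_raw:.2f}")
print(f"skewness of square roots: {skew_rooted:.2f}")
```

For these particular scores the square roots are still positively skewed, but less so than the raw values, because the largest scores are pulled in proportionally more than the smallest.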
A word is in order about reporting transformed data. Although it is legitimate and
proper to run a statistical test, such as the analysis of variance, on the transformed values,
we often report means in the units of the untransformed scale. This is especially true when
the original units are intrinsically meaningful. We would, however, need to inform our
reader that the analysis was carried out on transformed data.
One example is the salaries of baseball players from different teams. People who work
with salary figures routinely perform their analyses on log(salary). However, log(salary) is
not a meaningful measure to most of us. A better approach would be to convert all data to
logs (assuming you have chosen to use a logarithmic transformation), find the mean of
those log values, and then take the antilog to convert that mean back to the original units.
This converted mean almost certainly will not equal the mean of the original values, but it
is this converted mean that should be reported. But I would urge you to look at both the
converted and unconverted means and make sure that they are telling the same basic story.
Do not convert standard deviations—you will do serious injustice if you try that. And be
sure to indicate to your readers what you have done.
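
The procedure just described can be sketched in a few lines. The salary figures below are hypothetical, chosen only to show the mechanics, and base-10 logs are an assumption (any base gives the same converted mean). The antilog of the mean of the logs is the geometric mean of the original values, which is why it will generally differ from the ordinary mean.

```python
import math
import statistics

# Hypothetical salaries in dollars (invented for illustration).
salaries = [500_000, 750_000, 1_200_000, 2_000_000, 4_000_000, 15_000_000]

# Step 1: convert every value to logs.
log_salaries = [math.log10(s) for s in salaries]

# Step 2: find the mean of the log values.
mean_log = statistics.fmean(log_salaries)

# Step 3: take the antilog to return to the original units.
back_transformed_mean = 10 ** mean_log

# For comparison, the mean of the untransformed values.
raw_mean = statistics.fmean(salaries)

print(f"mean of raw salaries:       {raw_mean:,.0f}")
print(f"antilog of mean log salary: {back_transformed_mean:,.0f}")
```

With one very large salary in the set, the converted mean falls well below the raw mean, which is exactly the situation in which it is worth checking that the two means tell the same basic story.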
In this chapter we will consider only the most common transformations, because they
are the ones that will be most useful to you. Excellent discussions of the whole approach to
transformations can be found in Tukey (1977), Hoaglin, Mosteller, and Tukey (1985), and
Grissom (2000). Although the first two presentations are framed in the language of
exploratory data analysis, you should not have much difficulty following them if you invest a
modest amount of time in learning the terminology.

Logarithmic Transformation


The logarithmic transformation is useful whenever the standard deviation is proportional to
the mean. It is also useful when the data are markedly positively skewed. The easiest
way to appreciate why both of these statements are true is to recall what logarithms do.
(Remember that a logarithm is a power: $\log_{10}(25)$ is the power to which 10 must be raised
to give 25; therefore, $\log_{10}(25) = 1.39794$ because $10^{1.39794} = 25$.) If we take the numbers
10, 100, and 1000, their logs are 1, 2, and 3. Thus, the distance between 10 and 100, in log
units, is now equivalent to the distance between 100 and 1000. In other words, the right
side of the distribution (more positive values) will be compressed more than will the left
side by taking logarithms. (This is why the salaries of baseball players offer a good example.)
This not only means that positively skewed distributions tend toward symmetry under
logarithmic transformations; it also means that if a set of relatively large numbers has a large
standard deviation whereas a set of small numbers has a small standard deviation, taking
logs will reduce the standard deviation of the sample with large numbers more than it will
reduce the standard deviation of the sample with small numbers.
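
A minimal sketch of both points follows. The two groups are made-up numbers, constructed so that the standard deviation is exactly proportional to the mean (the second group is simply ten times the first); real data will rarely be this tidy.

```python
import math
import statistics

# Equal ratios become equal distances on a log scale.
logs = [math.log10(x) for x in (10, 100, 1000)]
print("logs of 10, 100, 1000:", logs)

# Hypothetical groups (invented): the large group is 10 times the small group,
# so its mean and its standard deviation are both 10 times as large.
small = [8, 10, 12]
large = [80, 100, 120]

sd_small_raw = statistics.stdev(small)
sd_large_raw = statistics.stdev(large)

# Standard deviations of the log-transformed scores.
sd_small_log = statistics.stdev([math.log10(x) for x in small])
sd_large_log = statistics.stdev([math.log10(x) for x in large])

print(f"raw SDs: small = {sd_small_raw:.3f}, large = {sd_large_raw:.3f}")
print(f"log SDs: small = {sd_small_log:.3f}, large = {sd_large_log:.3f}")
```

Multiplying every score by 10 adds a constant of 1 to each base-10 log, and adding a constant leaves the standard deviation unchanged. That is why the tenfold difference in the raw standard deviations disappears after the transformation.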
Table 11.6 contains an example from the study by Conti and Musty (1984) on activity
levels in rats following administration of THC, the active ingredient in marijuana. I have
reported the activity units (on an arbitrary scale) for each animal over the 10-minute
postinjection period, whereas Conti and Musty reported postinjection activity as a percentage of
baseline activity. From the data in Table 11.6a you can see that the variances are unequal:
The largest variance is nearly seven times the smallest. This is partly a function of the
well-established fact that drugs tend to increase variability as well as means. Not only are the
variances unequal, but the standard deviations appear to be proportional to the means. This
is easily seen in Figure 11.4a, where I have plotted the standard deviations on the ordinate
and the means on the abscissa. There is clearly a linear relationship between these two


338 Chapter 11 Simple Analysis of Variance
