Statistical Methods for Psychology

and because that is standard practice in their area of research. A case might be made,
however, that a logarithmic transformation of the original units might be a more appropriate
one for future analyses, especially if problems occur with respect to either the shapes of
the distributions or heterogeneity of variance.
As I noted earlier, it makes no difference what base you use for a logarithmic
transformation, and most statisticians tend to use $\log_e$. Regardless of the base, however,
there are problems when the original values ($X_i$) are negative or near zero, because logs
are only defined for positive numbers. In this case, you should add a constant to make all
X values positive before taking the log. In general, when you have near-zero values, you
should use $\log(X_i + 1)$ instead of $\log(X_i)$. If the numbers themselves are less than –1,
add whatever constant is necessary to make them all greater than zero.
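A minimal Python sketch of the shifting rule described above. The helper names `choose_constant` and `shifted_log`, the cut-off values, and the sample data are all hypothetical. In particular, shifting so that the smallest value becomes exactly 1 is only one reasonable choice; the text requires only that every shifted value be greater than zero.

```python
import math

def choose_constant(values):
    """Pick a constant to add before taking logs.

    Assumed rule (illustrative, not prescribed by the text):
      - smallest value >= 1: no shift is needed
      - smallest value > -1: add 1, i.e. use log(X + 1)
      - otherwise: add just enough that the smallest value becomes 1
    """
    lo = min(values)
    if lo >= 1:
        return 0.0
    if lo > -1:
        return 1.0
    return 1.0 - lo

def shifted_log(values, constant=None):
    """Natural log of (X + constant); the constant is chosen automatically if omitted."""
    c = choose_constant(values) if constant is None else constant
    return [math.log(x + c) for x in values]

# Made-up scores for illustration only.
near_zero = [0.0, 0.2, 3.0, 12.0]
negative = [-4.0, -1.5, 0.0, 6.0]

print(choose_constant(near_zero), shifted_log(near_zero))
print(choose_constant(negative), shifted_log(negative))
```

Natural logs are used here, but because the base makes no difference, `math.log10` could be substituted without changing the shape of the transformed distribution.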

Square-Root Transformation


When the data are in the form of counts (e.g., number of bar presses), the mean is often
proportional to the variance rather than to the standard deviation. In this case,
$Y = \sqrt{X}$ is sometimes useful for stabilizing variances and decreasing skewness. If the
values of X are fairly small (i.e., less than 10), then $Y = \sqrt{X + 0.5}$ or
$Y = \sqrt{X} + \sqrt{X + 1}$ is often better for stabilizing variances. For the Conti and
Musty data, the mean correlates nearly as well with the variance as it does with the standard
deviation. Standard deviations and variances are themselves highly correlated if the range of
values is not large (in this case $r_{s \cdot s^2} = .99$). In practice it is almost
impossible to distinguish by eye a relationship between the mean and the standard deviation
from a relationship between the mean and the variance. Therefore, you might want to
investigate how a square-root transformation affects the data.
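The three square-root variants mentioned above can be compared in a few lines of Python. This is only a sketch: the function names and the two groups of counts are made up for illustration (they are not the Conti and Musty data), and the variance ratio is used here simply as a rough indicator of heterogeneity.

```python
import math
import statistics

def sqrt_plain(x):
    """Y = sqrt(X)."""
    return math.sqrt(x)

def sqrt_half(x):
    """Y = sqrt(X + 0.5), one variant suggested for small counts."""
    return math.sqrt(x + 0.5)

def sqrt_sum(x):
    """Y = sqrt(X) + sqrt(X + 1), the other small-count variant."""
    return math.sqrt(x) + math.sqrt(x + 1)

# Hypothetical count data: the group with the larger mean also has the larger variance.
low_counts = [1, 2, 2, 3, 4, 6]
high_counts = [8, 11, 14, 17, 21, 25]

def variance_ratio(transform):
    """Variance of the high-count group divided by that of the low-count group,
    both computed after applying `transform` to every score."""
    var_low = statistics.variance([transform(x) for x in low_counts])
    var_high = statistics.variance([transform(x) for x in high_counts])
    return var_high / var_low

print("raw", round(variance_ratio(lambda x: x), 2))
for f in (sqrt_plain, sqrt_half, sqrt_sum):
    print(f.__name__, round(variance_ratio(f), 2))
```

For these invented counts the raw variances differ by a factor of 12.5, and the plain square root brings the ratio down to roughly 2.5. Real data may respond quite differently, which is why it is worth checking the effect of the transformation rather than assuming it.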

Reciprocal Transformation


When you have a distribution with very large values in the positive tail, a reciprocal
transformation may dramatically reduce the influence of those extreme values. For example,
animals in a maze or straight alley often seem to forget their job and stop to sniff at all the
photocells and such that they find along the way. Once an animal has been in the apparatus
for 30 seconds, it does not matter to us if it takes another 300 seconds to complete the run.
One approach was referred to in Chapter 2: if there are several trials per day, you might
take the daily median time as your measure. An alternative approach is to use all of the data
but to take the reciprocal of time (i.e., speed), because it has the effect of nearly equating
long times. Suppose that we collected the following times:
[10, 11, 13, 14, 15, 45, 450]
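
To see the effect on these times, the reciprocals can be computed directly. The Python sketch below uses the seven times listed above; the variable names are illustrative only.

```python
# Running times in seconds, as listed above.
times = [10, 11, 13, 14, 15, 45, 450]

# Reciprocal transformation: speed = 1 / time.
speeds = [1 / t for t in times]

for t, s in zip(times, speeds):
    print(f"{t:>4} s  ->  speed {s:.4f}")

# Gap between the two longest runs, before and after the transformation.
gap_times = times[-1] - times[-2]      # difference in seconds
gap_speeds = speeds[-2] - speeds[-1]   # difference in reciprocal units
print(gap_times, round(gap_speeds, 4))
```

On the original scale the two longest runs differ by 405 seconds. As speeds they differ by only 0.02, which is small next to the speed of 0.1 for the fastest run, so the two long times are treated as nearly equivalent.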


Chapter 11 Simple Analysis of Variance


[Figure: two scatterplots of standard deviation (vertical axis) against mean (horizontal
axis), each based on 5 valid cases and 0 missing cases.]

Figure 11.4 The relationship between means and standard deviations for original
and transformed values of the data in Table 11.6

a) Plot for raw data. b) Plot for log-transformed data.