Basic Statistics


increases as P is decreased. That is, taking the logarithm of X reduces the amount
of skewness to the right more than taking the square root transformation (0 is < 1/2);
taking the reciprocal of X or P = -1 reduces it even more.
With a distribution skewed to the left, we should begin with a value of P > 1 and
if necessary try larger values of P until the skewness is sufficiently reduced. After
each attempted transformation, a normal probability plot, histogram or stem and leaf
plot, or a box plot should be examined.
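
As a rough illustration of how skewness changes as P is decreased, the following Python sketch applies several values of P to a small made-up sample and computes a moment-based skewness coefficient. The sample values, the function names power_transform and skewness, and the choice to flip the sign for negative P (so that the order of the observations is preserved) are assumptions made for this illustration; they do not come from the text.

```python
import math


def power_transform(values, p):
    """Return X**p for each value; p = 0 is taken to mean log(X)."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    if p == 0:
        return [math.log(v) for v in values]
    # Assumption of this sketch: for negative p the sign is flipped so that
    # larger X still maps to a larger transformed value.
    sign = 1.0 if p > 0 else -1.0
    return [sign * v ** p for v in values]


def skewness(values):
    """Moment-based skewness coefficient: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical sample that is skewed to the right.
sample = [1, 2, 2, 3, 3, 3, 4, 5, 8, 15]
powers = [1, 0.5, 0, -1]  # original, square root, logarithm, reciprocal
skews = [skewness(power_transform(sample, p)) for p in powers]

for p, s in zip(powers, skews):
    print(f"P = {p:>4}: skewness = {s:.2f}")
```

If the skewness coefficient turns negative for some value of P, the transformation has gone too far and the transformed data are skewed to the left. A single coefficient is only a summary, so the plots would still need to be examined.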

6.5.2 Assessing the Need for a Transformation

An awkward question is whether data that appear to be nonnormal need to be
transformed. Most investigators do not use a transformation for slight departures
from normality.
Several rules of thumb have been suggested to assist in deciding if a transformation
will be useful. Two of these rules should be used only for ratio data. If the
standard deviation divided by the mean is < 1/4, it is considered less necessary to use a
transformation. For example, for the systolic blood pressure data from Problem 5.2,
the standard deviation is 32.4 and the mean is 137.3, so the coefficient of variation
is 32.4/137.3 = .24. This would be an indication that even if the points in Figure
6.12 do not follow a straight line, it is questionable if a transformation is needed.
An alternative criterion is if the ratio of the largest to the smallest number is < 2,
a transformation may not be helpful. Here the ratio is 230/87 = 2.6, so perhaps a
transformation is helpful, but it again seems borderline.
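
The two rules of thumb can be checked with a few lines of Python. This is a minimal sketch that uses the summary values quoted above for the systolic blood pressure data. The function names are hypothetical, and the cutoff of 1/4 for the coefficient of variation is an assumption of this sketch; the cutoff of 2 for the ratio is the one given in the text.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation divided by the mean (ratio data only)."""
    return sd / mean


def largest_to_smallest(largest, smallest):
    """Ratio of the largest to the smallest observation (ratio data only)."""
    return largest / smallest


CV_CUTOFF = 0.25    # assumed cutoff for the coefficient of variation
RATIO_CUTOFF = 2.0  # cutoff for the largest/smallest ratio

cv = coefficient_of_variation(32.4, 137.3)
ratio = largest_to_smallest(230, 87)

print(f"coefficient of variation = {cv:.2f}")  # 0.24
print(f"largest / smallest       = {ratio:.1f}")  # 2.6

# Each rule only suggests whether a transformation is less likely to help.
print("CV rule suggests a transformation is less necessary:", cv < CV_CUTOFF)
print("Ratio rule suggests a transformation may not help:", ratio < RATIO_CUTOFF)
```

Here the two rules point in different directions, which matches the borderline reading above.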
One reason for the reluctance to use a transformation is that for many readers,
it makes the information harder to interpret. Also, users wishing to compare their
information with that of another researcher who has not used a transformation will
find the comparison more difficult to make if they transform their data. For the same
reason, researchers often try to use a transformation that has already been used on a
particular type of data. For example, logarithms of doses are often used in biomedical
studies.
Finding an appropriate transformation can also provide additional information
concerning the data. When, for instance, a logarithmic transformation results in
near-normal data, we know that our original data follow a skewed distribution called a
lognormal distribution.
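
The connection between the logarithmic transformation and the lognormal distribution can be seen in a small simulation. The sketch below draws made-up lognormal data with Python's standard random module and compares a moment-based skewness coefficient before and after taking logarithms. The seed, sample size, and parameter values are arbitrary choices for illustration only.

```python
import math
import random


def skewness(values):
    """Moment-based skewness coefficient: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(1)  # fixed seed so the run is repeatable

# Simulated lognormal data: the logarithm of each value is normal
# with mean 4.0 and standard deviation 0.5 (arbitrary values).
raw = [rng.lognormvariate(4.0, 0.5) for _ in range(2000)]
logged = [math.log(v) for v in raw]

raw_skew = skewness(raw)
log_skew = skewness(logged)

print(f"skewness of original data:    {raw_skew:.2f}")
print(f"skewness of log-transformed:  {log_skew:.2f}")
```

The original values should show clear skewness to the right, while the skewness of their logarithms should be close to zero.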
Often, researchers find the best possible transformation for their data, and then
perform their analyses both with and without the transformation used and see if the
transformation affects the final results in any appreciable fashion. If it does not, the
transformation may not be used in the final report.
Note that statistical programs often include the option of computing transformed
data after the original data are entered into the program. Minitab, SAS,
SPSS, and Stata all include some transformation options.

6.1 If data are normally distributed, what does this tell you about the magnitudes
of the mean, median, and mode?