Statistical Analysis for Education and Psychology Researchers

What Transformations to Use

Positively skewed data, or when the standard deviation is proportional to the mean

There are two possible transformations for positively skewed data: the square root
transformation for moderate positive skewness and the logarithmic transformation for data
with a severe positive skew. Both transformations ‘pull in’ the right tail of a distribution.
Skewness is affected by outliers, so check these first.
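
As a minimal sketch of the square root transformation, assuming a hypothetical data set SCORES with a non-negative variable OLDX (neither the names nor the values come from the original example), a DATA step along these lines could be used:

```sas
/* Hypothetical example: SCORES, OLDX and the data values are illustrative only */
data scores;
   input oldx @@;
   newx = sqrt(oldx);   /* square root pulls in a moderately long right tail */
   datalines;
0 1 4 9 16 25 100
;
run;

/* Compare the skewness of the original and transformed variables */
proc univariate data=scores;
   var oldx newx;
run;
```

The skewness statistic reported by PROC UNIVARIATE for NEWX can then be compared with that for OLDX to judge whether the milder transformation is sufficient.
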
The logarithmic transformation generally uses log10 (log to the base 10). The value
log10(x) is the power to which 10 must be raised to give x, so log10(10)=1 and
log10(1000)=3. Because the logarithm of zero is undefined, when there are zeros in the
data set a constant of 0.5 is added to each data value. The transformation then becomes
log10(xi+0.5), where xi=original data value. In SAS code this would be placed in a DATA
step as NEWX=LOG10(OLDX+0.5);. If there are negative values in the data, then the
absolute value, |a|, of the most negative value a should be taken and 0.5 added to it; this
constant is then added to every data value to make each one positive, i.e.,
log10(xi+(|a|+0.5)), where xi=original data value. Log to the base e (e=2.7182...) can be
used rather than log10 as this has the same transformation effect. Switching from one base
to another only changes the scale, not the shape of a distribution. Figure 5.12 shows a
histogram and normal probability plot for the log transformed variable CORRD
(percentage correct in difficult reading passage). The relevant SAS code that produced this
output is:


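data a;
   /* lrecl= sets the logical record length of the raw data file */
   infile 'a:amanda.dat' lrecl=72;
   /* column input: each variable is read from fixed column positions */
   input id 1-3 corre 57-58 vocab 67-69 corrd 70-72;
   /* add 0.5 before taking logs so that zero scores are defined */
   newlog = log10(corrd + 0.5);
   label corrd = 'Percentage Correct Syntactic Score (difficult)';
run;

proc print;
   var corrd newlog;
run;

/* histogram of the log transformed scores */
proc chart;
   vbar newlog;
   title1 'Distribution of Log Percentage Correct Syntactic Scores (DIFFICULT)';
run;

/* normal probability plot and tests of normality */
proc univariate plot normal;
   var newlog;
run;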
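
The handling of negative values and the effect of changing the base can be sketched in the same way. The example below is illustrative only: the data set RAW, the variable X, the macro variable MINX and the data values are all hypothetical. It assumes the shift constant is |a|+0.5 when the minimum a is negative, and 0.5 otherwise.

```sas
/* Hypothetical example: RAW, X, MINX and the data values are illustrative only */
data raw;
   input x @@;
   datalines;
-3 0 2 7 40
;
run;

/* Store the minimum of X in the macro variable MINX */
proc sql noprint;
   select min(x) into :minx from raw;
quit;

data shifted;
   set raw;
   /* |a| + 0.5 if the minimum a is negative, otherwise just 0.5 */
   shift  = abs(min(&minx, 0)) + 0.5;
   log10x = log10(x + shift);    /* log to the base 10 */
   lnx    = log(x + shift);      /* LOG is the natural logarithm in SAS */
   ratio  = lnx / log10x;        /* same constant on every row: ln(10) */
run;

proc print data=shifted;
run;
```

Because LNX is LOG10X multiplied by a fixed constant, the two transformed variables differ only in scale. A histogram of either would therefore have the same shape.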
