Statistical Analysis for Education and Psychology Researchers

(Jeff_L) #1

A simple sample coefficient of skewness is given by
3×((mean−median)/standard deviation). SAS uses a slightly more complex formula,
but when samples are large the estimates produced by the two procedures are usually
very similar.
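The simple coefficient above can be sketched in a few lines of Python. This is only an illustration: the function name and the data are made up, and the use of the sample standard deviation (n − 1 divisor) is an assumption, since the text does not say which divisor is intended.

```python
import statistics

def pearson_skewness(values):
    """Return 3 * (mean - median) / standard deviation for a sample."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    # Assumption: sample standard deviation (n - 1 divisor).
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return 3 * (mean - median) / sd

# Made-up scores with one large value pulling the mean above the median.
scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 14]
print(pearson_skewness(scores))  # positive, so the sample is right skewed
```

With only ten observations this value should be read cautiously, in line with the warning about small samples.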
No matter which coefficient is used, the interpretation is the same. If a distribution is
symmetrical, skewness is close to zero. A right-skewed distribution has a positive
skewness coefficient and a left-skewed distribution a negative one. Caution is required
when interpreting skewness coefficients, especially when samples are small (fewer than
30 observations). The skewness coefficient does not describe the overall shape of a
curve; at best it indicates, provided the curve is unimodal, how asymmetrical the
distribution is.
The fourth moment about the mean is sometimes used as an index of shape; this is
kurtosis. This shape coefficient reflects the ‘heaviness’ of the tails of a distribution and,
as usually reported (with 3 subtracted from the standardized fourth moment), has a value
close to zero in a normal distribution. Heavier tails are indicated by positive values of
the coefficient and lighter tails by negative values. Like skewness, sample kurtosis is an
unreliable estimator of the corresponding population parameter when samples are small.
In small samples, you should pay attention only to large values of these coefficients.
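As a rough illustration of the moment-based view, the sketch below computes skewness and kurtosis directly from central moments. These are the simplest (uncorrected) moment formulas, not the more complex formulas SAS uses, and the function names and data are hypothetical.

```python
def central_moment(values, k):
    """k-th moment about the mean, using a divisor of n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n

def moment_skewness(values):
    """Third central moment divided by the 1.5 power of the second."""
    m2 = central_moment(values, 2)
    return central_moment(values, 3) / m2 ** 1.5

def excess_kurtosis(values):
    """Fourth central moment over the squared second moment, minus 3."""
    m2 = central_moment(values, 2)
    return central_moment(values, 4) / m2 ** 2 - 3

# Made-up data: mostly zeros with two extreme values (heavy tails).
heavy = [-10, 0, 0, 0, 0, 0, 0, 0, 0, 10]
# Made-up data: evenly spread values (light tails).
flat = [1, 2, 3, 4, 5]

print(excess_kurtosis(heavy))   # positive: heavier tails
print(excess_kurtosis(flat))    # negative: lighter tails
print(moment_skewness(flat))    # zero: symmetrical
```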


Measures of Dispersion

To describe a distribution we need a measure of spread or dispersion of values as well as
measures of central location and shape. Common statistics which indicate the dispersion
of values are the range, inter-quartile range, and the standard deviation. Less
common is the coefficient of variation.
The range (non-inclusive) is the difference between the largest and smallest values in
a distribution. It is simple to calculate and easy to interpret.
A measure of dispersion which conveys more information about the spread of scores is
the inter-quartile range. This is the difference between the third and first quartiles
(Q3−Q1). See the box and whisker plot presented in Figure 3.13 and following text for
interpretation of the inter-quartile range.
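The dispersion statistics named in this section can be sketched with Python's standard library. The function name and data are hypothetical. Note that several conventions exist for locating quartiles; `statistics.quantiles` with its default method is one of them and may give slightly different values from a stem and leaf count or from SAS.

```python
import statistics

def dispersion_summary(values):
    """Return the range, inter-quartile range, SD and coefficient of variation."""
    # quantiles(n=4) returns the three cut points Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sd": sd,
        # Coefficient of variation: SD relative to the mean (undefined if mean is 0).
        "cv": sd / mean if mean != 0 else float("nan"),
    }

# Made-up scores for illustration only.
summary = dispersion_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
for name, value in summary.items():
    print(name, value)
```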
The stem and leaf plot provides a convenient way of finding not only the median but
also the upper and lower quartiles of a distribution. Recall that to find the median you
count up from the lowest value (or down from the highest) until you reach the middle
value in the distribution. This is the median. If there are two values in the middle
because there is an even number of observations, the average of the two centre values is
taken. If we call the number of values counted from the most extreme value the depth,
and n is the total number of observations in the sample, the following general rule can
be used to find the median.
When n is odd, the median is the unique value at depth 1/2×(n+1).
When n is even, the median is the average of the two values at depth 1/2×n (one counted from each end).
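The depth rule can be written out as a short Python sketch. The function name is hypothetical; depth is counted from 1, as in the text, so the code subtracts 1 when indexing the sorted list.

```python
def median_by_depth(values):
    """Find the median by counting in from the extremes of the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty sample is undefined")
    if n % 2 == 1:
        # Odd n: a single value sits at depth (n + 1) / 2.
        depth = (n + 1) // 2
        return ordered[depth - 1]
    # Even n: take the values at depth n / 2 from each end and average them.
    depth = n // 2
    from_lowest = ordered[depth - 1]
    from_highest = ordered[n - depth]
    return (from_lowest + from_highest) / 2

print(median_by_depth([3, 1, 2]))     # odd n: the middle value
print(median_by_depth([4, 1, 3, 2]))  # even n: average of the two centre values
```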
To illustrate the rule, look again at Figure 3.12. Here n is 114. Since this is an even
number, we locate the two values at depth 1/2×(114)=57. Counting 57 values from the
lowest value, we find that the 57th value is 18.9. The 58th value, which is the value
reached by counting 57 values from the highest value, is also 18.9. The two centre
values are therefore 18.9 and 18.9. The

