Interpretation
The youngest student is 16.8 years and is represented by a stem of 16 and a leaf of 8.
There are 114 values in the distribution (the sum of the number of leaves on each stem).
The middle value in the distribution, or median, lies between the 57th and 58th values,
counting up from the lowest value of 16.8. When, as in this case, there is an even number
of data values, the middle value is taken as the average of the two centre values. This
gives an approximate median for the distribution. When counting either up from the
bottom or down from the top be careful to count from the correct end of the leaves. For
example, in order, the three lowest values in the distribution are 16.8, 17.8 and 18.1.
There are more precise methods of estimating the median, but this method is adequate
and its result seldom differs much from that of more precise procedures. In this example, the 57th
value is 18.9 and the 58th value is also 18.9. The median is therefore (18.9+18.9)/2=18.9.
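The counting rule described above can be sketched in Python. This is a minimal illustration only: the function names are invented for the example, and the short list of ages is hypothetical, not the 114 values shown in the plot.

```python
def centre_positions(n):
    """Return the 1-based position(s) of the centre value(s) among n sorted values."""
    if n < 1:
        raise ValueError("need at least one value")
    if n % 2 == 1:
        return ((n + 1) // 2,)
    return (n // 2, n // 2 + 1)


def median_by_counting(values):
    """Median found by counting up to the middle of the sorted values."""
    ordered = sorted(values)
    positions = centre_positions(len(ordered))
    # Convert 1-based positions to 0-based indices and average the centre value(s).
    centre_values = [ordered[p - 1] for p in positions]
    return sum(centre_values) / len(centre_values)


# Hypothetical ages, used only to show the mechanics.
ages = [18.1, 16.8, 19.4, 17.8, 18.9, 20.3]

print(centre_positions(114))     # (57, 58)
print(median_by_counting(ages))  # average of the 3rd and 4th ordered ages
```

With 114 values the two centre positions are the 57th and 58th, which is why those two values are averaged in the example in the text.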
A stem and leaf plot seldom appears in journal articles, probably because it is seen more
as a heuristic device for examining the shape of distributions and arriving at a ‘feel’ for the
data. A particular advantage of this plot is that it allows a quick and easy
calculation of the quartiles of a distribution, that is, the lower quartile or 25th percentile,
the median or 50th percentile, and the upper quartile or 75th percentile. Quartiles and their
use in describing the characteristics of a distribution are referred to in a later section
(3.4).
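How a stem and leaf plot holds the ordered data, and why counting the leaves gives the number of values, can be sketched as follows. This is an illustrative sketch, assuming non-negative values recorded to one decimal place; the function name and the ages are hypothetical and are not the data from the plot discussed above.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group one-decimal values into {stem: [leaves in ascending order]}."""
    plot = defaultdict(list)
    for value in sorted(values):
        tenths = round(value * 10)       # e.g. 16.8 -> 168, avoiding float remainders
        stem, leaf = divmod(tenths, 10)  # 168 -> stem 16, leaf 8
        plot[stem].append(leaf)
    return dict(plot)


# Hypothetical ages, used only to show the layout.
ages = [18.1, 16.8, 19.4, 17.8, 18.9, 18.9, 20.3, 18.4]
plot = stem_and_leaf(ages)

for stem in sorted(plot):
    leaves = "".join(str(leaf) for leaf in plot[stem])
    print(f"{stem:>3} | {leaves}")

# The number of values is the number of leaves summed over all stems.
total = sum(len(leaves) for leaves in plot.values())
print(total)
```

Because the leaves on each stem are already in ascending order, the median and quartiles can be located by counting along the leaves from the lowest stem.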
Box and Whisker Plot
An effective way to compare distributions of continuous (interval or ratio) data is with a
box and whisker plot. Strictly, the plot is appropriate for continuous data only, but it is often
used with count data provided there is a reasonable number of distinct data values. The
main distinguishing features of a box and whisker plot are the length of the box from top
to bottom and the length that the whiskers extend from the ends of the box.
To understand the significance of a box and whisker plot you first have to be familiar
with percentiles. A percentile is a measure of relative standing in a distribution of scores.
Often in psychology and education we are concerned with comparing individual
scores, either for different students or for the same student on different tests. If you want
to make a comparison of a student’s performance on two different tests, you need a
measure of relative standing on each of the tests, that is, the student’s test score relative to
the distribution of scores for all other students who completed the test (or whatever
reference group is appropriate). The percentile rank provides a convenient measure of
relative standing in a group.
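A percentile rank can be sketched in a few lines of Python. The sketch assumes one common convention, the percentage of scores in the reference group that lie at or below the student's score; other conventions exist. The function name and both sets of test scores are hypothetical.

```python
def percentile_rank(score, reference_scores):
    """Percentage of reference scores that lie at or below the given score."""
    if not reference_scores:
        raise ValueError("reference group must contain at least one score")
    at_or_below = sum(1 for s in reference_scores if s <= score)
    return 100 * at_or_below / len(reference_scores)


# Hypothetical scores for two tests taken by the same group of ten students.
maths = [35, 42, 48, 51, 55, 60, 62, 68, 71, 80]
reading = [50, 58, 61, 64, 66, 70, 73, 77, 82, 90]

# One student scored 62 on maths and 64 on reading.
print(percentile_rank(62, maths))    # 70.0
print(percentile_rank(64, reading))  # 40.0
```

Although the raw reading score is the higher of the two, the student's relative standing is higher on the maths test, which is the kind of comparison a raw score alone cannot support.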
To calculate a percentile, data values or scores are arranged in ascending order of
magnitude using, for example, a stem and leaf plot. The required percentage of scores is then counted
up from the smallest score or data value. The 0th percentile is the smallest score in the
distribution. The 25th percentile is a score which is larger than 25 per cent of the total
distribution of scores; put simply, 25 per cent of scores would lie at or below the 25th
percentile. The 50th percentile or median is a score which is larger than 50 per cent of the
scores in the distribution; half the scores are below the median and half are above. The
75th percentile is the score which is larger than 75 per cent of scores, or in other words,
75 per cent of the scores would fall at or below it. The 100th percentile is the largest
score in the distribution.
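The counting-up procedure can be sketched with the nearest-rank convention, in which the pth percentile is the smallest observed score with at least p per cent of scores at or below it. This is only one of several conventions in use, chosen here because it mirrors the description above; the function name and the scores are hypothetical.

```python
import math


def percentile_nearest_rank(values, p):
    """Return the pth percentile by counting up through the sorted values."""
    if not values:
        raise ValueError("need at least one value")
    if not 0 <= p <= 100:
        raise ValueError("p must lie between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    # Rank of the required score (1-based); the 0th percentile is the smallest score.
    rank = max(1, math.ceil(p * n / 100))
    return ordered[rank - 1]


# Twelve hypothetical test scores.
scores = [35, 42, 48, 51, 55, 60, 62, 68, 71, 80, 84, 90]

for p in (0, 25, 50, 75, 100):
    print(p, percentile_nearest_rank(scores, p))
# 0 35
# 25 48
# 50 60
# 75 71
# 100 90
```

Nearest rank always returns an observed score, so with an even number of values its 50th percentile (60 here) can differ slightly from the median obtained by averaging the two centre values (61 here).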
Statistical analysis for education and psychology researchers 56