The lower quartile, Q1, is simply the 25th percentile of a distribution. Similarly, the
median, Q2, is the 50th percentile and the upper quartile, Q3, is the 75th percentile.
The distance between Q1 and Q3 is called the interquartile range; the interval from Q1
to Q3 contains 50 per cent of all values in the distribution.
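As a minimal sketch of these definitions, the Python fragment below computes the three quartiles and the interquartile range for a small set of scores. The scores are made up for illustration and are not taken from the A-level data set. The sketch uses statistics.quantiles from the Python standard library; the "inclusive" interpolation rule chosen here is only one of several in common use, so other packages may give slightly different quartiles for the same data.

```python
from statistics import quantiles

# Hypothetical scores, for illustration only (not from the A-level data set)
scores = [4, 7, 8, 10, 11, 12, 13, 15, 16, 18, 20]

# n=4 divides the distribution into quarters and returns the three cut points:
# Q1 (25th percentile), Q2 (median, 50th percentile) and Q3 (75th percentile)
q1, q2, q3 = quantiles(scores, n=4, method="inclusive")

# The interquartile range is the distance between the lower and upper quartiles
iqr = q3 - q1

print("Q1 =", q1, "Q2 =", q2, "Q3 =", q3, "IQR =", iqr)
```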
Example 3.7
A box and whisker plot next to a stem and leaf plot for the variable ASCORE1, from the
student A-level data set, is shown in Figure 3.13.
The significance of the quartiles when plotting a box and whisker plot now becomes
apparent. The bottom and top of the box in a box and whisker plot correspond to the
lower quartile and the upper quartile respectively. Thus the length of the box, Q3 − Q1,
gives a visual image of the spread of the middle 50 per cent of scores in the distribution.
Here 50 per cent of scores are in the range 13 to 18, and we can say that the interquartile
range is (18 − 13) = 5 A-level points.
The heavy line in the middle of the box, with an asterisk at each end, marks the 50th
percentile or median; here Q2 is 15. The + sign indicates the mean, which in this example
is approximately 15, the same as the median, which is why the + lies on the median line.
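The values quoted in this example can be checked with a short sketch. The Python fragment below rebuilds the individual ASCORE1 scores from the frequency column (#) of the stem and leaf plot in Figure 3.13 and then computes the quartiles, the interquartile range and the mean. It assumes the frequencies in the figure are complete, and it uses the default rule of statistics.quantiles, which is not necessarily the rule SAS applies.

```python
from statistics import mean, quantiles

# Frequencies read from the '#' column of the stem and leaf plot (Figure 3.13)
counts = {20: 17, 19: 10, 18: 8, 17: 3, 16: 4, 15: 18, 14: 11,
          13: 16, 12: 7, 11: 9, 10: 6, 9: 2, 8: 2, 7: 1}

# Expand the frequency table into a sorted list of individual scores
scores = sorted(score for score, k in counts.items() for _ in range(k))

# Three cut points: lower quartile, median, upper quartile.
# This uses Python's default ("exclusive") rule, an assumption of this sketch.
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1

print("n =", len(scores))
print("Q1 =", q1, "Q2 =", q2, "Q3 =", q3, "IQR =", iqr)
print("mean =", round(mean(scores), 2))
```

For these frequencies the sketch gives Q1 = 13, Q2 = 15 and Q3 = 18, an interquartile range of 5, and a mean of about 14.96, in agreement with the values read from the plot.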
Whiskers usually extend from the quartiles up to a distance of 1.5 times the interquartile
range, here up to (1.5×5) = 7.5 points below Q1 or 7.5 points above Q3, that is, from
(Q1 − 7.5) or 5.5 to (Q3 + 7.5) or 25.5, or to the most extreme data values within this
range. The most extreme values in this example are 7 and 20, both of which lie within the
range, so the whiskers end at these values. Data values lying more than 1.5 times the
interquartile range beyond the box would be plotted with either a zero or an asterisk. If
the value lies between 1.5 and 3 times the interquartile range beyond the box, a zero is
used as the plotting symbol. If it lies more than 3 times the interquartile range beyond
the box, an asterisk is used to plot the data value. These are the SAS default values.
Other statistical computing packages may have different options and may calculate the
quartiles in a slightly different way. Any differences are, however, likely to be small.
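The whisker and plotting-symbol rules can be sketched in a few lines of Python. The function name boxplot_symbol is hypothetical, and the treatment of values lying exactly 1.5 or 3 times the interquartile range from the box is an assumption of this sketch, since the text does not say how such ties are handled. The test values 4 and 35 are illustrative only and do not occur in the data.

```python
def boxplot_symbol(value, q1, q3):
    """Return the plotting symbol for a value (hypothetical helper)."""
    iqr = q3 - q1
    # Distance beyond the nearer quartile; zero if the value lies inside the box
    distance = max(q1 - value, value - q3, 0)
    if distance > 3 * iqr:
        return "*"   # more than 3 times the IQR beyond the box
    if distance > 1.5 * iqr:
        return "0"   # between 1.5 and 3 times the IQR beyond the box
    return ""        # inside the box or covered by a whisker

q1, q3 = 13, 18                    # quartiles from Example 3.7
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr       # 13 - 7.5
upper_limit = q3 + 1.5 * iqr       # 18 + 7.5

# Observed ASCORE1 values run from 7 to 20 (stems in Figure 3.13)
observed = range(7, 21)
within = [x for x in observed if lower_limit <= x <= upper_limit]

# Whiskers stop at the most extreme observed values inside the limits
whisker_low, whisker_high = min(within), max(within)

print(lower_limit, upper_limit, whisker_low, whisker_high)
print(repr(boxplot_symbol(4, q1, q3)), repr(boxplot_symbol(35, q1, q3)))
```

Because every observed score lies between the two limits, no value in this example would be plotted with a zero or an asterisk.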
Stem  Leaf                   #  Boxplot
  20  00000000000000000     17
  19  0000000000            10
  18  00000000               8
  17  000                    3
  16  0000                   4
  15  000000000000000000    18
  14  00000000000           11
  13  000000000000000       16
  12  0000000                7
  11  000000000              9
  10  000000                 6
   9  00                     2
   8  00                     2
   7  0                      1