If an interval width of 5 were chosen, this would give 11.4/5=2.28, or 3 class intervals
when rounded up to the nearest integer. This is too few. Suppose instead that an interval
width of 3 is selected; the range of 11.4 will then be covered by 11.4/3=3.8, or 4 class
intervals when rounded up to the nearest integer. Looking at the age distribution in
Figure 3.12, most of the ages are between 18 and 20, and 4 class intervals would only be
acceptable if the data were more evenly distributed. That is not the case here: 4 intervals
would not show enough detail in the middle of the distribution, where values are bunched.
It should, however, be borne in
mind that a class interval frequency table presents a summary of the data distribution and
judgment should be used either in choosing sufficient class intervals to show any
variation or in using an alternative data display method. In this example I would suggest
a stem and leaf plot if a visual impression of the data distribution were required. For
illustrative purposes, a grouped frequency table will be constructed with 6 class intervals.
This is based on an interval width of 2: 11.4/2=5.7, or 6 when rounded up to the nearest
integer.
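The arithmetic for the three candidate widths can be checked with a short Python sketch. The function name n_class_intervals is an illustrative assumption and not part of the original example; only the range of 11.4 and the widths 5, 3 and 2 come from the text.

```python
import math

def n_class_intervals(data_range, width):
    """Number of class intervals needed to cover data_range, rounded up."""
    return math.ceil(data_range / width)

data_range = 11.4  # range of the ages, as quoted in the text

# Candidate interval widths discussed in the text
for width in (5, 3, 2):
    ratio = round(data_range / width, 2)
    print(f"width {width}: {data_range}/{width} = {ratio}, "
          f"so {n_class_intervals(data_range, width)} class intervals")
```

Rounding up rather than to the nearest integer guarantees that the intervals cover the whole range; for a width of 2 the two rules happen to agree.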
The next step is to determine the first interval, and the upper and lower stated limits
for the interval. All subsequent intervals can then be completed and frequencies and
relative frequencies for each class interval evaluated.
The first interval must obviously contain the minimum data value in the range. It is
desirable to ensure that the minimum data value in a distribution is evenly divisible by
the width of the interval. Since in Figure 3.12 the minimum value of 16.8 is not evenly
divisible by 2, we instead take the nearest integer and select the lower stated limit of
the lowest class interval to be 17; the upper stated limit of the first class interval
would then be 18.
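Once the first interval is fixed, the remaining stated limits follow mechanically. The sketch below assumes integer stated limits, a lowest stated limit of 17, a width of 2 and 6 intervals, as in the text; the function name stated_limits is hypothetical, and the resulting intervals are derived here rather than copied from Table 3.4.

```python
def stated_limits(lowest, width, n_intervals):
    """Return (lower, upper) stated limits for consecutive integer class intervals.

    Each interval lists `width` consecutive integers, so the upper stated
    limit is the lower stated limit plus (width - 1).
    """
    return [
        (lowest + i * width, lowest + (i + 1) * width - 1)
        for i in range(n_intervals)
    ]

intervals = stated_limits(lowest=17, width=2, n_intervals=6)
for lower, upper in intervals:
    print(f"{lower}-{upper}")
```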
It may seem odd that the lowest interval begins at the stated limit of 17 when there is
one data value of 16.8, and that the size of the interval 17–18 is 2 and not 1. If, however,
the integer values in the interval are listed, there are 2 of them, namely 17 and 18. The
stated limit of 17 has a lower real limit of 16.5, and the stated limit of 18 has an upper
real limit of 18.5, so the value 16.8 falls within the interval. The width of the class
interval is determined by subtracting its lower real limit from its upper real limit. So the
width of the class interval 17–18 is (18.5−16.5)=2. The stated limits and real limits are
shown diagrammatically in Figure 3.14.
Figure 3.14: Stated limits and real limits for the two lowest score intervals in Table 3.4
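The relationship between stated limits, real limits and interval width can also be sketched in Python. The ages listed below are hypothetical values chosen only to illustrate the tally (they are not the data in Figure 3.12, although the minimum of 16.8 and the implied maximum of 28.2 match the text). Treating each interval as including its lower real limit and excluding its upper real limit is an assumption made here so that no value is counted twice.

```python
def real_limits(lower, upper, unit=1):
    """Real limits extend half a measurement unit beyond the stated limits."""
    half = unit / 2
    return lower - half, upper + half

def interval_width(lower, upper, unit=1):
    """Width = upper real limit minus lower real limit."""
    lo, hi = real_limits(lower, upper, unit)
    return hi - lo

def frequency_table(values, intervals, unit=1):
    """Return (lower, upper, frequency, relative frequency) for each interval."""
    n = len(values)
    rows = []
    for lower, upper in intervals:
        lo, hi = real_limits(lower, upper, unit)
        # Assumed convention: lower real limit inclusive, upper exclusive
        count = sum(1 for v in values if lo <= v < hi)
        rows.append((lower, upper, count, count / n))
    return rows

# Stated limits for 6 intervals of width 2 starting at 17
intervals = [(17, 18), (19, 20), (21, 22), (23, 24), (25, 26), (27, 28)]

# Hypothetical ages for illustration only
ages = [16.8, 18.2, 18.9, 19.4, 19.9, 21.0, 24.3, 28.2]

print(real_limits(17, 18), interval_width(17, 18))
for lower, upper, freq, rel in frequency_table(ages, intervals):
    print(f"{lower}-{upper}: frequency {freq}, relative frequency {rel:.3f}")
```

Note that the value 16.8 is counted in the interval 17–18 because it lies above the lower real limit of 16.5.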
The format procedure in SAS is a convenient way of specifying ranges of variables (see
Example 3.8). The procedure automatically changes overlapping range values to be
Statistical analysis for education and psychology researchers 60