neurosciences and health sciences, techniques like kernel density plots are becoming more
common. There are a number of technical aspects behind such plots, for example the shape
of the bumps and the bandwidth used to create them, but you now have the basic information
that will allow you to understand and work with such plots.
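For readers who want to see what the "bumps" and the bandwidth amount to, here is a minimal Python sketch. It assumes normal-shaped (Gaussian) bumps, which is only one common choice of shape. The reaction times and both bandwidth values are hypothetical and chosen purely for illustration; they are not the data behind Figure 2.6.

```python
import math

def kernel_density(data, bandwidth):
    """Return a function that estimates the density at any point x.

    Each observation contributes one Gaussian bump centered on its value;
    the bandwidth sets how wide each bump is.
    """
    n = len(data)

    def density(x):
        total = 0.0
        for xi in data:
            z = (x - xi) / bandwidth  # distance from the observation, in bandwidth units
            total += math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
        return total / (n * bandwidth)  # scale so the curve has unit area

    return density

# Hypothetical reaction times (illustrative values only)
rx_times = [45, 52, 58, 60, 61, 67, 75, 90]

narrow = kernel_density(rx_times, bandwidth=2.0)   # narrow bumps: a bumpier curve
wide = kernel_density(rx_times, bandwidth=10.0)    # wide bumps: a smoother curve

for x in (50, 60, 70):
    print(x, round(narrow(x), 4), round(wide(x), 4))
```

With the narrow bandwidth the curve rises sharply wherever observations cluster, whereas the wide bandwidth spreads each observation over a broader range and gives a smoother, flatter curve. Choosing between them is a judgment about how much detail to show.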
2.4 Stem-and-Leaf Displays
Although histograms, frequency distributions, and kernel density functions are commonly
used methods of presenting data, each has its drawbacks. Because histograms often portray
observations that have been grouped into intervals, they frequently lose the actual numerical
values of the individual scores in each interval. Frequency distributions, on the other
hand, retain the values of the individual observations, but they can be difficult to use when
they do not summarize the data sufficiently. An alternative approach that avoids both of
these criticisms is the stem-and-leaf display.
John Tukey (1977), as part of his general approach to data analysis, known as
exploratory data analysis (EDA), developed a variety of methods for displaying data in visually
meaningful ways. One of the simplest of these methods is a stem-and-leaf display,
which you will see presented by most major statistical software packages. I can’t start with
the reaction time data here, because that would require a slightly more sophisticated display
due to the large number of observations. Instead, I’ll use a hypothetical set of data in which
we record the amount of time (in minutes per week) that each of 100 students spends playing
electronic games. Some of the raw data are given in Figure 2.7. On the left side of the
figure is a portion of the data (data from students who spend between 40 and 80 minutes per
week playing games) and on the right is the complete stem-and-leaf display that results.
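As a rough sketch of how such a display can be assembled, the Python code below splits each two-digit score into its tens' digit and its units' digit and then lists the units' digits beside each tens' digit. The function name `stem_and_leaf` is my own, not a standard routine. The 14 scores in the 40s are the ones enumerated in the discussion of Figure 2.7; the remaining scores are hypothetical stand-ins, because the full data set is not reproduced here.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Build a simple stem-and-leaf display for non-negative integer scores.

    The tens' digit(s) serve as the stem and the units' digit as the leaf.
    """
    groups = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)  # e.g. 46 -> stem 4, leaf 6
        groups[stem].append(leaf)

    lines = []
    # Include every stem in the range, even one that has no leaves
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

# The 14 scores in the 40s, as enumerated in the text
forties = [40, 41, 41, 42, 43, 43, 44, 46, 46, 46, 47, 48, 49, 49]
# Hypothetical scores for the other stems (illustrative values only)
others = [52, 55, 58, 63, 67, 71, 74]

for line in stem_and_leaf(forties + others):
    print(line)
# 4 | 01123346667899
# 5 | 258
# 6 | 37
# 7 | 14
```

Each row keeps the individual scores (40, 41, 41, 42, and so on can be read straight off the first row), while the lengths of the rows give the same picture of the distribution that a histogram would.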
From the raw data in Figure 2.7, you can see that there are several scores in the 40s, another
bunch in the 50s, two in the 60s, and some in the 70s. We refer to the tens’ digits (here
4, 5, 6, and 7) as the leading digits (sometimes called the most significant digits)
for these scores. These leading digits form the stem, or vertical axis, of our display. Within
the set of 14 scores that were in the 40s, you can see that there was one 40, two 41s, one
42, two 43s, one 44, no 45s, three 46s, one 47, one 48, and two 49s. The units’ digits 0, 1,
24 Chapter 2 Describing and Exploring Data
Figure 2.6 Kernel density plot for data on reaction time (horizontal axis: RxTime; plot title: Histogram of RxTime)