we will put that off until the next chapter. For now it is sufficient to say that we will often
assume that our data are normally distributed, and superimposing a normal distribution on
the histogram will give us some idea of how reasonable that assumption is.^3
Figure 2.4 was produced by SPSS, and you can see that while the data are roughly
described by the normal distribution, the actual distribution is somewhat truncated on the left
and has more than the expected number of observations on the extreme right. The normal
curve is not a terrible fit, but we can do better. An alternative approach would be to create
what is called a kernel density plot.
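To make the comparison between a histogram and a superimposed normal curve concrete, here is a minimal sketch in Python. The reaction times are hypothetical values invented for illustration (they are not the data behind Figure 2.4), and the function name `normal_pdf` and the bin width of 10 are likewise assumptions. The sketch scales the height of the normal curve at each bin's midpoint by the sample size and bin width, which approximates the count the normal distribution would lead us to expect in that bin.

```python
import math
import statistics

# Hypothetical reaction times in hundredths of a second.
# These are illustrative values only, NOT the textbook's data.
rx_times = [36, 42, 45, 47, 50, 52, 53, 55, 56, 58, 59, 60, 61,
            62, 63, 65, 66, 68, 70, 72, 75, 79, 84, 90, 98, 110]

# The normal curve is built from these two numbers and nothing else.
mean = statistics.mean(rx_times)
sd = statistics.stdev(rx_times)  # sample standard deviation (n - 1)


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for a given mean and SD."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


bin_width = 10  # assumed bin width for the histogram
n = len(rx_times)

print(f"mean = {mean:.2f}, sd = {sd:.2f}")
for lo in range(30, 120, bin_width):
    hi = lo + bin_width
    midpoint = lo + bin_width / 2
    observed = sum(lo <= x < hi for x in rx_times)
    # Curve height rescaled from a density to an approximate count.
    expected = n * bin_width * normal_pdf(midpoint, mean, sd)
    print(f"{lo:3d}-{hi:3d}  observed = {observed:2d}  normal curve = {expected:4.1f}")
```

Bins where the observed count and the curve disagree noticeably are the places where the normal curve fits the data poorly.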
Kernel Density Plots
In Figure 2.4 we superimposed a theoretical distribution on the data. This distribution made
use of only two characteristics of the data, their mean and standard deviation, and did not
make any effort to fit the curve to the actual shape of the distribution. To put that a little
more precisely, we can superimpose the normal distribution by calculating only the mean
and standard deviation (to be discussed later in this chapter) from the data. The individual
data points and their distributions play no role in plotting that distribution. Kernel density
plots do almost the opposite. They actually try to fit a smooth curve to the data while at the
same time taking account of the fact that there is a lot of random noise in the observations
that should not be allowed to distort the curve too much. Kernel density plots pay no attention
to the mean and standard deviation of the observations.
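A rough sketch of how such a curve can be computed may help. It assumes a Gaussian kernel, which is one common choice but is not specified in the text so far, and a fixed smoothing width (`bandwidth`) of 5 hundredths of a second chosen purely for illustration. The data are hypothetical, and the function name `kde` is an assumption as well. Notice that the mean and standard deviation of the observations are never calculated; the curve is built directly from the individual data points.

```python
import math

# Hypothetical reaction times in hundredths of a second.
# These are illustrative values only, NOT the textbook's data.
rx_times = [36, 42, 45, 47, 50, 52, 53, 55, 56, 58, 59, 60, 61,
            62, 63, 65, 66, 68, 70, 72, 75, 79, 84, 90, 98, 110]


def kde(x, data, bandwidth):
    """Gaussian kernel density estimate at x.

    Each observation contributes a small normal-shaped bump centered
    on itself; the estimate is the average of those bumps at x.
    """
    total = 0.0
    for xi in data:
        z = (x - xi) / bandwidth
        total += math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    return total / (len(data) * bandwidth)


bandwidth = 5.0  # assumed smoothing width; larger values give a smoother curve

# Print a crude text plot of the estimated density.
for x in range(30, 121, 10):
    density = kde(x, rx_times, bandwidth)
    print(f"{x:3d}  {density:.4f}  {'*' * round(density * 1000)}")
```

Changing `bandwidth` is the way to trade smoothness against fidelity to the data: a very small value lets random noise show through as bumps, and a very large value flattens real features of the distribution.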
The idea behind a kernel density plot is that each observation might have been slightly
different. For example, on a trial where the respondent’s reaction time was 80 hundredths
of a second, the score might reasonably have been 79 or 82 instead. It is even conceivable
22 Chapter 2 Describing and Exploring Data
Figure 2.4 Histogram of reaction time data with normal curve superimposed (title: Reaction Times; x-axis: RxTime, 40 to 120; y-axis: Frequency, 10 to 50)
^3 This is not the best way of evaluating whether or not a distribution is normal, as we will see in the next chapter.
However, it is a common way of proceeding.