that the score could have been 73 or 86, but it is not at all likely that the score would have
been 20 or 100. In other words, there is a distribution of alternative possibilities around any
obtained value, and this is true for all obtained values. We will use this fact to produce an
overall curve that usually fits the data quite well.
Kernel estimates can be illustrated graphically by taking an example from Everitt and
Hothorn (2006). They used a very simple set of data with the following values for the
dependent variable (X).
X: 0.0  1.0  1.1  1.5  1.9  2.8  2.9  3.5

If you plot these points along the X-axis and superimpose small distributions represent-
ing alternative values that might have been obtained instead of the actual values you have,
you obtain the distribution shown in Figure 2.5a. Everitt and Hothorn refer to these small
distributions by a technical name: “bumps.” Notice that these bumps are normal distribu-
tions, but I could have specified some other shape if I thought that a normal distribution
was inappropriate.
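A minimal sketch of what these bumps look like in R, using the Everitt and Hothorn values; the bandwidth (the spread of each bump, here h = 0.4) is my own assumption, chosen only so the picture resembles Figure 2.5a.

```r
# Draw one normal "bump" over each observed score (Everitt & Hothorn data).
x <- c(0.0, 1.0, 1.1, 1.5, 1.9, 2.8, 2.9, 3.5)
h <- 0.4                                   # assumed bandwidth, not from the text
grid <- seq(-1, 4.5, length.out = 200)     # values of X at which to draw the curves

plot(grid, rep(0, length(grid)), type = "n", ylim = c(0, 1.1),
     xlab = "X", ylab = "Y(X)")
for (xi in x) lines(grid, dnorm(grid, mean = xi, sd = h))  # one bump per score
rug(x)                                     # tick marks showing the raw observations
```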
Now we will literally sum these bumps vertically. For example, suppose that we name
each bump by the score over which it is centered. Above a value of 3.8 on the X-axis you
have a small amount of bump_2.8, a little bit more of bump_2.9, and a good bit of
bump_3.5. You can add the heights of these three bumps at X = 3.8 to get the kernel density of
the overall curve at that position. You can do the same for every other value of X. If you do
so you find the distribution plotted in Figure 2.5b. Above the bumps we have a squiggly
distribution (to use another technical term) that represents our best guess of the distribution
underlying the data that we began with.
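The summing step can also be sketched in a few lines of R. This is only an illustration under the same assumed bandwidth of h = 0.4; dividing the summed bump heights by the number of scores rescales the curve so that it integrates to 1, as a density should.

```r
# Sum the bumps vertically at each value of X to get the kernel density estimate.
x <- c(0.0, 1.0, 1.1, 1.5, 1.9, 2.8, 2.9, 3.5)
h <- 0.4                                   # assumed bandwidth
grid <- seq(-1, 4.5, length.out = 200)

bump.sum <- sapply(grid, function(x0) sum(dnorm(x0, mean = x, sd = h)))
kde <- bump.sum / length(x)                # rescale so the curve is a proper density

plot(grid, kde, type = "l", xlab = "X", ylab = "Y(X)")
rug(x)
lines(density(x), lty = 2)                 # R's built-in estimator, for comparison
```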
Now we can go back to the reaction time data and superimpose the kernel density func-
tion on that histogram. (I am leaving off the bumps as there are too many of them to be leg-
ible.) This resulting plot is shown in Figure 2.6. Notice that this curve does a much better
job of representing the data than did the superimposed normal distribution. In particular it
fits the tails of the distribution quite well.
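A sketch of how such an overlay can be produced in R, in the spirit of Figure 2.6; the vector name rt and the generated values are stand-ins for the reaction time data, which are not reproduced here.

```r
# Superimpose a kernel density estimate on a histogram of reaction times.
rt <- rgamma(300, shape = 4, rate = 0.01)  # placeholder data for illustration only
hist(rt, breaks = 30, freq = FALSE,        # freq = FALSE puts the histogram on a density scale
     xlab = "Reaction time (ms)", main = "")
lines(density(rt), lwd = 2)                # kernel density curve, not a fitted normal
```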
Version 16 of SPSS fits kernel density plots using syntax, and you can fit them using
SAS and S-Plus (or its close cousin R). It is fairly easy to find examples for those programs
on the Internet. As psychology expands into more areas, and particularly into the


Figures 2.5a and 2.5b Illustration of the kernel density function for X (both panels plot Y(X) against values of X from about –1 to 4)
