Statistical Methods for Psychology

derived mathematically, but it is easier to understand what they represent if we consider
how they could, in theory, be derived empirically with a simple sampling experiment.
We will take as an illustration the sampling distribution of the differences between
means, because it relates directly to our example of waiting times in parking lots. The
sampling distribution of differences between means is the distribution of differences between
means of an infinite number of random samples drawn under certain specified conditions
(e.g., under the condition that the true means of our populations are equal). Suppose we
have two populations with known means and standard deviations. (Here we will suppose
that both population means are 35 and both population standard deviations are 15, though
what the values are is not critical to the logic of our argument. In the general case we rarely
know the population standard deviation, but for our example suppose that we do.) Further
suppose that we draw a very large number (theoretically an infinite number) of pairs of
random samples from these populations, each sample consisting of 100 scores. For each
sample we will calculate its mean, and then the difference between the two means in
that draw. When we finish drawing all the pairs of samples, we will plot the distribution of
these differences. Such a distribution would be a sampling distribution of the difference
between means. I wrote a nine-line program in R to do the sampling I have described, drawing
10,000 pairs of samples of n = 100 from a population with a mean of 35 and a standard
deviation of 15 and computing the difference between means for each pair. A histogram of
this distribution is shown on the left of Figure 4.1 with a Q-Q plot on the right. I don’t think
that there is much doubt that this distribution is normal. The center of this
distribution is at 0.0, because we expect that, on average, differences between sample means
will be 0.0. (The individual means themselves will be roughly 35.) We can see from this
figure that differences between sample means of approximately −3 to +3, for example,
are quite likely to occur when we sample from identical populations. We also can see that
it is extremely unlikely that we would draw samples from these populations whose means
differ by 10 or more. The fact that we know the kinds of values to expect for the difference of means
of samples drawn from these populations is going to allow us to turn the question around
and ask whether an obtained sample mean difference can be taken as evidence in favor of
the hypothesis that we actually are sampling from identical populations, or at least populations
with the same mean.
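
The sampling experiment described above can be sketched in a few lines of code. What follows is not the R program mentioned in the text; it is a minimal Python sketch using only the standard library, and it assumes that the population is normal (the text specifies only a mean of 35 and a standard deviation of 15). The function names `simulate_mean_differences` and `summarize`, and the seed, are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_mean_differences(n_pairs=10_000, n=100, mu=35.0, sigma=15.0, seed=1):
    """Return n_pairs differences between the means of two random samples.

    Both samples in a pair are drawn from the same population, assumed
    here to be normal with mean mu and standard deviation sigma.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(n_pairs):
        mean_1 = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        mean_2 = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        diffs.append(mean_1 - mean_2)
    return diffs


def summarize(diffs):
    """Center, spread, and two tail proportions of the simulated differences."""
    return {
        "mean": statistics.fmean(diffs),
        "sd": statistics.stdev(diffs),
        "within_3": sum(abs(d) <= 3 for d in diffs) / len(diffs),
        "at_least_10": sum(abs(d) >= 10 for d in diffs) / len(diffs),
    }


if __name__ == "__main__":
    diffs = simulate_mean_differences()
    for name, value in summarize(diffs).items():
        print(f"{name}: {value:.3f}")
    # Theoretical standard deviation of the difference, for comparison:
    # sqrt(15**2 / 100 + 15**2 / 100) = sqrt(4.5), about 2.12
    print(f"theoretical sd: {(15**2 / 100 + 15**2 / 100) ** 0.5:.3f}")
```

Under these assumptions the standard deviation of the differences should come out near 2.12, which is why differences between −3 and +3 are common (they lie within about 1.4 standard deviations of zero) and a difference of 10 (about 4.7 standard deviations) is almost never seen.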

Section 4.2 Sampling Distributions

Figure 4.1 Distribution of difference between means, each based on 25 observations
(Left panel: histogram titled "10,000 samples representing Ruback and Juieng study"; axes labeled "Difference in mean waiting times" and "Frequency"; the obtained mean is marked. Right panel: "Q-Q plot for normal sample"; axes labeled "Expected quantiles" and "Obtained quantiles".)
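
For readers who want to see what a Q-Q plot like the one in Figure 4.1 is built from, here is a rough Python sketch (standard library only). It pairs each obtained quantile of the simulated differences with the quantile expected under a normal distribution. Several details are assumptions rather than anything stated in the text: the population is taken to be normal, the obtained values are standardized, the plotting positions are (i − 0.5)/n, and the helper names `mean_difference`, `qq_points`, and `qq_correlation` are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def mean_difference(rng, n=100, mu=35.0, sigma=15.0):
    """Difference between the means of two samples from one normal population."""
    mean_1 = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    mean_2 = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
    return mean_1 - mean_2


def qq_points(values):
    """Pair each expected normal quantile with the matching obtained quantile."""
    n = len(values)
    center, spread = fmean(values), stdev(values)
    # Obtained quantiles: the standardized values in ascending order.
    obtained = sorted((v - center) / spread for v in values)
    # Expected quantiles: standard normal quantiles at positions (i - 0.5) / n.
    expected = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(expected, obtained))


def qq_correlation(points):
    """Pearson correlation of the Q-Q points; values near 1 suggest a straight line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = fmean(xs), fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    rng = random.Random(1)
    points = qq_points([mean_difference(rng) for _ in range(1000)])
    print(f"Q-Q correlation: {qq_correlation(points):.4f}")
```

If the differences are normally distributed, the points fall close to a straight line, and the correlation is only a crude numeric stand-in for judging that line by eye.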

