represent (1) a point on a distribution, minus (2) the mean of that distribution, all divided by
(3) the standard deviation of the distribution. Now rather than being concerned specifically
with the distribution of X̄, we have re-expressed the sample mean in terms of z scores and
can now answer the question with regard to the standard normal distribution.
From Appendix z we find that the probability of a z as large as 2.32 is .0102. Because we
want a two-tailed test of H₀, we need to double the probability to obtain the probability of a
deviation as large as 2.32 standard errors in either direction from the mean. This is 2(.0102) =
.0204. Thus, with a two-tailed test (that hospitalized children have a mean behavior problem
score that is different in either direction from that of normal children) at the .05 level of
significance, we would reject H₀ because the obtained probability is less than .05. We would
conclude that we have evidence that hospitalized children differ from normal children in
terms of behavior problems. (In the language of Jones and Tukey (2000) discussed earlier, we
have evidence that the mean of stressed children is above that of other children.)
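
The table lookup and doubling described above can be checked numerically. The following is a minimal sketch, not part of the original example: the helper name upper_tail_p is hypothetical, and it uses the standard identity P(Z ≥ z) = 0.5 · erfc(z / √2) in place of Appendix z.

```python
import math

def upper_tail_p(z):
    """Return P(Z >= z) for a standard normal variable (hypothetical helper)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z_obtained = 2.32                       # the z value reported in the text
p_one_tailed = upper_tail_p(z_obtained)
p_two_tailed = 2 * p_one_tailed         # double it for a two-tailed test

print(f"one-tailed p = {p_one_tailed:.4f}")
print(f"two-tailed p = {p_two_tailed:.4f}")
print("reject H0 at alpha = .05:", p_two_tailed < 0.05)
```

The computed probabilities should agree with the tabled values to about three decimal places; any small difference in the last digit comes from doubling the unrounded one-tailed probability rather than the rounded table entry.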
7.3 Testing a Sample Mean When σ Is Unknown: The One-Sample t Test
The preceding example was chosen deliberately from among a fairly limited number of
situations in which the population standard deviation (σ) is known. In the general case, we rarely
know the value of σ and usually have to estimate it by way of the sample standard deviation (s).
When we replace σ with s in the formula, however, the nature of the test changes. We can no
longer declare the answer to be a z score and evaluate it using tables of z. Instead, we will
denote the answer as t and evaluate it using tables of t, which are different from tables of z.
The reasoning behind the switch from z to t is really rather simple. The basic problem that
requires this change to t is related to the sampling distribution of the sample variance.
The Sampling Distribution of s^2
Because the t test uses s^2 as an estimate of σ^2, it is important that we first look at the
sampling distribution of s^2. This sampling distribution gives us some insight into the problems
we are going to encounter. We saw in Chapter 2 that s^2 is an unbiased estimate of σ^2,
meaning that with repeated sampling the average value of s^2 will equal σ^2. Although an unbiased
estimator is a nice thing, it is not everything. The problem is that the shape of the sampling
distribution of s^2 is positively skewed, especially for small samples. I drew 50,000 samples
of n = 5 from a population with μ = 5 and σ^2 = 50. I calculated the variance for each
sample, and have plotted those 50,000 variances in Figure 7.4. Notice that the mean of this
distribution is almost exactly 50, reflecting the unbiased nature of s^2 as an estimate of σ^2.
However, the distribution is very positively skewed. Because of the skewness of this
distribution, an individual value of s^2 is more likely to underestimate σ^2 than to overestimate it,
especially for small samples. Also because of this skewness, the resulting value of t is likely
to be larger than the value of z that we would have obtained had σ been known and used.
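
The sampling experiment just described is easy to approximate. The sketch below is an illustration under stated assumptions, not the code used to produce Figure 7.4: it assumes the population is normal (the shape is not specified here), it uses an arbitrary random seed, and the helper name sample_variance is hypothetical.

```python
import random

def sample_variance(xs):
    """Unbiased sample variance s^2, using the n - 1 denominator."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

random.seed(1)                          # arbitrary seed, for repeatability only
MU, SIGMA2, N, REPS = 5.0, 50.0, 5, 50_000
sigma = SIGMA2 ** 0.5

# Draw REPS samples of size N from an (assumed) normal population
# and record the sample variance of each one.
variances = [
    sample_variance([random.gauss(MU, sigma) for _ in range(N)])
    for _ in range(REPS)
]

mean_s2 = sum(variances) / REPS                           # should be near 50
prop_under = sum(v < SIGMA2 for v in variances) / REPS    # share of s^2 below sigma^2

print(f"mean of s^2 over {REPS} samples: {mean_s2:.2f}")
print(f"proportion of samples with s^2 < sigma^2: {prop_under:.3f}")
```

Under these assumptions the mean of the simulated variances should fall close to 50, while noticeably more than half of the individual variances should fall below 50, which is the pattern of unbiasedness combined with positive skew discussed above.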
The t Statistic
We are going to take the formula that we just developed for z,
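\[
z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}}} = \frac{\bar{X} - \mu}{\sqrt{\dfrac{\sigma^{2}}{n}}}
\]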