important to avoid Type I errors (such as falsely claiming that the average driver is rude),
then you would set a stringent (i.e., small) level of α. If, on the other hand, you want to
avoid Type II errors (patting everyone on the head for being polite when actually they are
not), you might set a fairly high level of α. (Setting α = .20 in this example would reduce
β to .46.) Unfortunately, in practice most people choose an arbitrary level of α, such as .05
or .01, and simply ignore β. In many cases this may be all you can do. (In fact, you will
probably use the alpha level that your instructor recommends.) In other cases, however,
there is much more you can do, as you will see in Chapter 8.
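The trade-off between α and β can be made concrete with a short calculation. The sketch below assumes an upper-tailed z test with normal sampling distributions, and it uses a hypothetical quantity, `delta`, for the distance between the means under H₀ and H₁ measured in standard-error units. The function name and the value delta = 1.0 are illustrative assumptions; they are not the values behind Figure 4.3, so the results will not match the numbers quoted in the text.

```python
from statistics import NormalDist


def beta_one_tailed(alpha: float, delta: float) -> float:
    """Type II error rate for an upper-tailed z test.

    alpha: the Type I error rate chosen by the researcher.
    delta: hypothetical distance between the H0 and H1 means,
           in standard-error units (an assumption for illustration).
    """
    z = NormalDist()                  # standard normal distribution
    critical = z.inv_cdf(1 - alpha)   # cutoff, located under H0
    # beta is the area of the H1 distribution that falls below the cutoff
    return z.cdf(critical - delta)


if __name__ == "__main__":
    for alpha in (0.01, 0.05, 0.20):
        b = beta_one_tailed(alpha, delta=1.0)
        print(f"alpha = {alpha:.2f}  beta = {b:.2f}")
```

Running the loop shows β falling as α is made larger, which is the pattern described above: a more lenient α buys a smaller β, and a more stringent α costs a larger one.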
I should stress again that Figure 4.3 is purely hypothetical. I was able to draw the
figure only because I arbitrarily decided that the population means differed by 2 units, and the
standard deviation of each population was 15. The answers would be different if I had
chosen to draw it with a difference of 2.5 and/or a standard deviation of 10. In most everyday
situations we do not know the mean and the variance of that distribution and can make only
educated guesses, thus providing only crude estimates of β. In practice we can select a
value of μ under H₁ that represents the minimum difference we would like to be able to
detect, since larger differences will have even smaller βs.
From this discussion of Type I and Type II errors we can summarize the decision-making
process with a simple table. Table 4.1 presents the four possible outcomes of an
experiment. The items in this table should be self-explanatory, but there is one concept,
power, that we have not yet discussed. The power of a test is the probability of rejecting
H₀ when it is actually false. Because the probability of failing to reject a false H₀ is β,
power must equal 1 − β. Those who want to know more about power and its calculation
will find it covered in Chapter 8.
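The four outcomes in the table can also be seen by simulation. The sketch below is a minimal illustration, not a procedure from this chapter: it assumes an upper-tailed z test of H₀: μ = 0 with a known standard deviation, and the function name, sample size, number of replications, and the alternative mean of 0.5 are all hypothetical choices. When H₀ is true, the proportion of rejections estimates α; when H₀ is false, it estimates power, and 1 minus that proportion estimates β.

```python
import random
from statistics import NormalDist


def rejection_rate(true_mean, n=25, sigma=1.0, alpha=0.05,
                   reps=10_000, seed=0):
    """Proportion of simulated experiments that reject H0: mu = 0.

    Uses an upper-tailed z test with sigma treated as known.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)                  # seeded for repeatability
    critical = NormalDist().inv_cdf(1 - alpha)
    standard_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z_obs = (sum(sample) / n) / standard_error
        if z_obs > critical:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    type_i_rate = rejection_rate(true_mean=0.0)   # H0 true
    power = rejection_rate(true_mean=0.5)         # H0 false
    print(f"H0 true:  reject {type_i_rate:.3f}, retain {1 - type_i_rate:.3f}")
    print(f"H0 false: reject {power:.3f} (power), retain {1 - power:.3f} (beta)")
```

The first printed line corresponds to the "H₀ True" column of the table and the second to the "H₀ False" column.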
4.8 One- and Two-Tailed Tests
The preceding discussion brings us to a consideration of one- and two-tailed tests. In our
parking lot example we were concerned with whether people took longer when there was someone
waiting, and we decided to reject H₀ only if those drivers took longer. In fact, I chose that
approach simply to make the example clearer. However, suppose our drivers left 16.88
seconds sooner when someone was waiting. Although this is an extremely unlikely event to
observe if the null hypothesis is true, it would not fall in the rejection region, which
consisted solely of long times. As a result we find ourselves in the position of not rejecting
H₀ in the face of a piece of data that is very unlikely, but not in the direction expected.
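The problem can be illustrated with standardized scores. The sketch below assumes the test statistic is a z score, and the observed value of −3.0 is hypothetical; it is not derived from the 16.88-second difference, because the standard error for that example is not given here. The function names are likewise illustrative. The second function shows the standard two-tailed rule, which places half of α in each tail.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def rejects_upper_tail(z_obs: float, alpha: float = 0.05) -> bool:
    """One-tailed test: reject H0 only for large positive z."""
    return z_obs > Z.inv_cdf(1 - alpha)


def rejects_two_tailed(z_obs: float, alpha: float = 0.05) -> bool:
    """Two-tailed test: reject H0 for extreme z in either direction,
    with alpha split equally between the two tails."""
    return abs(z_obs) > Z.inv_cdf(1 - alpha / 2)


if __name__ == "__main__":
    z_obs = -3.0  # hypothetical result far in the unexpected direction
    print("P(z <= z_obs) under H0:", round(Z.cdf(z_obs), 4))
    print("one-tailed rejects H0: ", rejects_upper_tail(z_obs))
    print("two-tailed rejects H0: ", rejects_two_tailed(z_obs))
```

A result this far below the mean is very improbable under H₀, yet the one-tailed rule retains H₀ because its rejection region lies entirely in the upper tail.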
The question then arises as to how we can protect ourselves against this type of
situation (if protection is thought necessary). One answer is to specify before we run the
experiment that we are going to reject a given percentage (say 5%) of the extreme outcomes, both
those that are extremely high and those that are extremely low. But if we reject the lowest
5% and the highest 5%, then we would in fact reject H₀ a total of 10% of the time when it
Table 4.1 Possible outcomes of the decision-making process

                            True State of the World
Decision           H₀ True                        H₀ False
Reject H₀          Type I error, p = α            Correct decision, p = 1 − β = Power
Don’t reject H₀    Correct decision, p = 1 − α    Type II error, p = β