chance level? Put another way, are we likely to have seven out of eight correct choices if
the judge is really operating by blind guessing?
Following the procedure outlined in Chapter 4, we can begin by stating as our research
hypothesis that the judge knows a digit when she sees it (at least that is presumably what
we set out to demonstrate). In other words, the research hypothesis ($H_1$) is that her
performance is at better than chance levels ($p > .50$). (We have chosen a one-tailed test merely
to simplify the example; in general, we would prefer to use a two-tailed test.) The null
hypothesis is that the judge’s behavior does not differ from chance ($H_0: p = .50$). The
sampling distribution of the number of correct choices out of eight trials, given that the null
hypothesis is true, is provided by the binomial distribution with $p = .50$. Rather than
calculate the probability of each of the possible numbers of correct choices (as we did in
Figure 5.5, for example), all we need to do is calculate the probability of seven correct
choices and the probability of eight correct choices, since we want to know the probability
of our judge doing at least as well as she did if she were choosing randomly.
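As a numerical cross-check on this reasoning, the following is a minimal Python sketch that sums the binomial probabilities for seven and eight correct choices under the null hypothesis. The function names `binomial_pmf` and `upper_tail` and the constants are illustrative choices, not part of the text; only the standard library is assumed.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def upper_tail(x_obs: int, n: int, p: float) -> float:
    """Probability of x_obs or more successes in n trials."""
    return sum(binomial_pmf(x, n, p) for x in range(x_obs, n + 1))


# Values taken from the example; the names are hypothetical.
N_TRIALS = 8      # number of trials
P_CHANCE = 0.50   # probability of a correct choice if the judge is guessing
X_OBSERVED = 7    # number of correct choices the judge made

p_at_least_7 = upper_tail(X_OBSERVED, N_TRIALS, P_CHANCE)
print(f"p(7 or more correct | guessing) = {p_at_least_7:.4f}")
```

The exact value is 9/256, or about .035. Because only the outcomes at least as extreme as the observed one enter the sum, the rest of the sampling distribution never has to be computed.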
Letting $N$ represent the number of trials (eight) and $X$ represent the number of correct
trials, the probability of seven correct trials out of eight is given by

$$p(X) = C_X^N \, p^X q^{(N-X)}$$

$$p(7) = C_7^8 \, p^7 q^1 = \frac{8!}{7!\,1!}(.5)^7(.5)^1 = 8(.0078)(.5) = 8(.0039) = .0312$$

Thus, the probability of making seven correct choices out of eight by chance is .0312. But
we know that we test null hypotheses by asking questions of the form, “What is the
probability of at least this many correct choices if $H_0$ is true?” In other words, we need to sum
p(7) and p(8):

$$p(8) = C_8^8 \, p^8 q^0 = 1(.0039)(1) = .0039$$

Then

$$\begin{aligned}
p(7) &= .0312 \\
+\; p(8) &= .0039 \\
p(7 \text{ or } 8) &= .0351
\end{aligned}$$

Here we see that the probability of at least seven correct choices is approximately .035.
Earlier, we said that we will reject $H_0$ whenever the probability of a Type I error ($\alpha$) is less
than or equal to .05. Since we have just determined that the probability of making at least
seven correct choices out of eight is only .035 if $H_0$ is true (i.e., if $p = .50$), we will reject
$H_0$ and conclude that our judge is performing at better than chance levels. In other words,
her performance is better than we would expect if she were just guessing.^4
The Sign Test
Another example of the use of the binomial to test hypotheses is one of the simplest tests
we have: the sign test. Although the sign test is very simple, it is also very useful in a
132 Chapter 5 Basic Concepts of Probability
(^4) One problem with discrete distributions is that there is rarely a set of outcomes with a probability of exactly .05.
In our particular example with 7 correct guesses we rejected the null because $p = .035$. If we had found 6 correct
choices the probability would have been .145, and we would have failed to reject the null. There is no possible
outcome with a tail area of exactly .05. So we are faced with the choice of a case where the critical value is either
too conservative or too liberal. One proposal that has been seriously considered is to use what is called the “mid-p”
value, which takes one half of the probability of the observed outcome, plus all of the probabilities of more extreme
outcomes. For a discussion of this approach see Berger (2005).
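The mid-p definition in this footnote can be illustrated with a short Python sketch for the eight-trial example. The helper names (`binomial_pmf`, `upper_tail`, `mid_p`) are hypothetical and chosen only for this illustration; this is a sketch of the definition as stated here, not a recommended procedure.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def upper_tail(x_obs: int, n: int, p: float) -> float:
    """Probability of x_obs or more successes (0 if x_obs exceeds n)."""
    return sum(binomial_pmf(x, n, p) for x in range(x_obs, n + 1))


def mid_p(x_obs: int, n: int, p: float) -> float:
    """Half the probability of the observed outcome plus the
    probabilities of all more extreme (larger) outcomes."""
    return 0.5 * binomial_pmf(x_obs, n, p) + upper_tail(x_obs + 1, n, p)


N_TRIALS, P_CHANCE = 8, 0.50  # values from the example

for correct in (6, 7):
    tail = upper_tail(correct, N_TRIALS, P_CHANCE)
    mid = mid_p(correct, N_TRIALS, P_CHANCE)
    print(f"{correct} correct: tail area = {tail:.4f}, mid-p = {mid:.4f}")
```

With eight trials the attainable one-tailed areas jump from 9/256 (seven correct) to 37/256 (six correct), so no outcome has a tail area of exactly .05. The mid-p value always falls below the ordinary tail area for the same outcome, since it counts only half of the observed outcome's probability.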