190 PARALLEL ALGORITHMS
If we have the same number of processors as elements in the list (p = N), we
can have each processor compare the target to one of the list items. If there is a
match, we can have the processor that found the match write its location to
some place special in memory. The following algorithm expects that the list
will be in locations M 1 through MN, the target will be in location MN+1, and
the place where it is found will be written into location MN+2.
Parallel Start
for j = 1 to N do
Pj reads X from Mj and target from MN+1
if X = target then
write j to MN+2
end if
end for
Parallel End
Because we assume that the empty memory cells are all zero at the start of
the algorithm, if there is no match, MN+2 will have a value of zero indicating
that the search failed. If the search succeeded, the one processor that found the
match will write its location into MN+2.
This algorithm does one read/process/write cycle for each of the N proces-
sors, giving a run time of O(1) and a cost of O(N). You should recall from
Chapter 2 that our optimal sequential search was a binary search that had a cost
ofO(lgN).
Our alternative search provides us with a scalable algorithm that can vary in
cost and speed based on the number of processors that we can make available
to it. This gives us a clear illustration of the scalability issue discussed in Section
7.1.3.
Using p ≤ N processors
Parallel Start
for j = 1 to p do
Pj performs a sequential binary search on M(j-1)(N/p)+1 through Mj(N/p)
writing the location where X is found to MN+2
end for
Parallel End
If we use one processor, notice that the range of memory cells considered is
from M 1 through MN, which is the entire list. So, with one processor we have
a sequential binary search. We, therefore, have a cost of O(lgN) and a run time
ofO(lgN). If we use N processors, we are back at the previous case where we