7.5 PARALLEL SORTING 191
had a cost of O(N) and a run time of O(1). At points in between, we will have
p lists that each have N / p elements. This gives a run time of O(lg(N / p)) and
a cost of O(p lg(N / p)). A special case to note is when p = lg N, which means
that lg(N / lg N) = lg N lg(lg N) lg N. This gives a run time of O(lgN),
and a cost of O[(lgN)* (lg N)] = O(lg^2 N). Even though the run time is of
the same order as that of a sequential binary search, the parallel version will
have a smaller constant multiplier than that for the sequential version, so even
though it is not of a smaller order, it will run faster. The cost of O(lg^2 N) is
higher than the optimal sequential cost of O(lgN), but not so much higher as
to be unreasonable.
7.4.1
- The two searches discussed in this chapter implicitly assumed that each ele-
ment in the list was unique. If there are duplicate keys for the target, it is
typical for a search to return the location of the first matching key. Given
the restrictions of a CREW PRAM model, what changes would need to be
made to both searches if there could be duplicates in the list? - Give a parallel search algorithm for a list with duplicates using the CRCW
PRAM model. If there are duplicate keys for the target, it is typical for a
search to return the location of the first matching key. Make sure to specify
your write conflict resolution method. How efficient is your algorithm in
terms of both run time and cost?
7.5 Parallel Sorting
There are a number of ways to do a parallel sort. In this section, we will look at
two ways in detail and discuss in general other techniques that can be used that
are beyond the scope of this text.
■ 7.5.1 Linear Network Sort
We begin by considering a sorting method based on a linear network configu-
ration. If we have the same number of processors as we have data values, we
can sort by passing one data value into the network at each cycle. The first pro-
cessor will read this value, compare it to the current value it holds, and then
pass on the larger of these two values to the next processor. As this continues,
■7.4.1 EXERCISES