Analysis of Algorithms : An Active Learning Approach

(Ron) #1
7.3 SIMPLE PARALLEL OPERATIONS 187

This algorithm will first write the data value to location M 1. On the first pass
of the outer loop, P 2 will read the data value and write it to M 2 , and procLoc
becomes 2. The second pass has P 3 and P 4 read from locations M 1 and M 2 and
then write to locations M 3 and M 4 , and procLoc becomes 4. The third pass has
P 5 through P 8 read from locations M 1 through M 4 and then write to locations
M 5 through M 8. You should see that (assuming p is a power of 2) on the second
to last pass, half of the processors will now have the data value and will have
written it to M 1 through Mp/2, which allows the second half of the processors to
read in the data value. Because the read and write can be done in one instruc-
tion cycle as we defined it at the beginning of Section 7.2, the parallel block
does one instruction cycle, and the outer loop executes that block lg p times.
Therefore, this parallel broadcast algorithm does O(lgp) operations.


■ 7.3.3 Finding the Maximum Value in a List
For this and our other operations on lists, we assume that the list has been loaded
into memory locations M 1 through MN. We assume that we have p = N / 2 pro-
cessors. (The case where p < N / 2 will be discussed after the algorithm.)
On the first pass, processor Pi will compare the values in locations M 2 i and
M 2 i+1 and will write the larger of the two into location Mi. On the second pass,
only half of the processors are needed to compare pairs of elements in memory
locationsM 1 through MN/2 and then write the larger of each pair into locations
M 1 through MN/4. This gives the following algorithm:


count = N / 2
for i = 1 to (lg count) + 1 do
Parallel Start
for j = 1 to count do
Pj reads M2j into X and M2j+1 into Y
if X > Y
Pj writes X into Mj
else
Pj writes Y into Mj
end if
end for j
Parallel End
count = count / 2
end for i


We see that each of the passes of this algorithm cuts in half the number of
values that have the potential for being the largest until eventually we are left
with just one value. This is very much like the tournament method used with a

Free download pdf