100 SORTING ALGORITHMS
because the above summation begins at 1, where the summation in Equation
1.18 begins at 0.
The run merge phase is O(N lg R). This makes the entire algorithm
■ 3.8.3 Number of Block Reads
Reading large blocks of data is significantly faster than reading each of the
items in the block one after another. For this reason, a polyphase merge sort
will be most efficient if all data is read as larger blocks. We are still, however,
interested in how many blocks of data will be read in this sort.
In the run construction step, we will read one block of data for each run,
resulting in R block reads. Because we can only fit S records in memory, and we
need records from two runs for the merge step, we will read blocks of size S / 2
in the merge phase. Each pass of the merge step will have to read all of the data as
part of some run, meaning that there will be N / (S / 2) = 2R block reads.
Because there are lg R passes, there are 2R lg R block reads in the merge step.
In the entire algorithm, there are R + 2R lg R = O(R lg R) block reads.
3.8.4
- What would be involved in rewriting the external polyphase merge sort
algorithm so that it only used three files instead of four? What impact would
this change have on the number of comparisons and block reads? (Hint: This
new version can’t alternate back and forth in the merge step.) - What would be involved in rewriting the external polyphase merge sort
algorithm so that it used six files instead of four and the merging of three
lists was done simultaneously? What impact would this change have on the
number of comparisons and block reads? Would there be any additional
change if we used eight files and merged four lists simultaneously?
3.9 Additional Exercises
- Selection sort will scan the list of values looking for the largest (or smallest)
key value. This value is swapped into the last (or first) location of the list.
ON()lgSN+ lgR =ON[]*()lgS+lgR
=ON[]*lg()SR* by Equation 1.5
=ON[]*lg()SNS* ⁄
=ON()lgN
■3.8.4 EXERCISES