Algorithms in a Nutshell

(Tina Meador) #1

(^18) | Chapter 2: The Mathematics of Algorithms
Having an R^2 confidence value so close to 1 declares this is an accurate estimate.
SORT-2 offers the fastest implementation over the given range of points. Its
behavior is characterized by the following trend line equation:
y = 0.05765nlog(n)+7.9653
SORT-2 marginally outperforms SORT-3 initially, and its ultimate behavior is
perhaps 10% faster than SORT-3. SORT-1 shows two distinct behavioral patterns.
For blocks of 39 or fewer strings, the behavior is characterized by:
y = 0.0016n^2 +0.2939n+3.1838
R^2 = 0.9761
However, with 40 or more strings, the behavior is characterized by:
y = 0.0798nlog(n)+142.7818
The numeric coefficients in these equations are entirely dependent upon theplat-
formon which these implementations execute. As described earlier, such
incidental differences are not important. The long-term trend asnincreases domi-
nates the computation of these behaviors. Indeed, Figure 2-2 graphs the behavior
using two different ranges to show that the real behavior for an algorithm may not
be apparent untiln gets large enough.
Algorithm designers seek to understand the behavioral differences that exist
between algorithms. The source code for these algorithms is available from open
source repositories, and it is instructive to see the impact of these designers’
choices on the overall execution. SORT-1 reflects the performance ofqsorton
Linux 2.6.9. When reviewing the source code (which can be found through any of
the available Linux code repositories*), one discovers the following comment:
“Qsort routine from Bentley & McIlroy’s Engineering a Sort Function.” Bentley
and McIlroy (1993) describe how to optimize QUICKSORTby varying the strategy
for problem sizes less than 7, between 8 and 39, and for 40 and higher. It is satis-
fying to see that the empirical results presented here confirm the underlying
implementation.
Analysis in the Best, Average, and Worst Cases
One question to ask is whether the results of the previous section will be true for
all input problem instances. Perhaps SORT-2 is only successful in sorting a small
number of strings. There are many ways the input could change:



  • There could be 1,000,000 strings. How does an algorithm scale to large input?

  • The data could be partially sorted, meaning that almost all elements are not
    that far from where they should be in the final sorted list.

  • The input could contain duplicate values.

  • Regardless of the sizenof the input set, the elements could be drawn from a
    much smaller set and contain a significant number of duplicate values.


*http://lxr.linux.no/linux+v2.6.11/fs/xfs/support/qsort.c

Algorithms in a Nutshell
Algorithms in a Nutshell By Gary Pollice, George T. Heineman, Stanley Selkow ISBN:
9780596516246 Publisher: O'Reilly Media, Inc.


Prepared for Ming Yi, Safari ID: [email protected]
Licensed by Ming Yi
Print Publication Date: 2008/10/21 User number: 594243
© 2009 Safari Books Online, LLC. This PDF is made available for personal use only during the relevant subscription term, subject to the Safari Terms of Service. Any other use

Free download pdf