Algorithms in a Nutshell

(Tina Meador) #1
Experiment if Necessary | 7

Algorithms
Matter

Allocating and deallocating memory was not the issue, so the problem must be in
the bookkeeping code Graham wrote to keep track of the memory.
Gary asked Graham how he kept track of the allocated memory. Graham replied
that he was using a binary tree where each node was a structure that consisted of
pointers to the children nodes (if any), the address of the allocated memory, the
size allocated, and the place in the program where the allocation request was
made. He added that he was using the memory address as the key for the nodes
since there could be no duplicates, and this decision would make it easy to insert
and delete records of allocated memory.
Using a binary tree is often more efficient than simply using an ordered linked list
of items. If an ordered list ofnitems exists—and each item is equally likely to be
sought—then a successful search uses, on average, aboutn/2 comparisons to find
an item. Inserting into and deleting from an ordered list requires one to examine
or move aboutn/2 items on average as well. Computer science textbooks would
describe the performance of these operations (search, insert, and delete) as being
O(n), which roughly means that as the size of the list doubles, the time to perform
these operations also is expected to double.*
Using a binary tree can deliver O(logn) performance for these same operations,
although the code may be a bit more complicated to write and maintain. That is,
as the size of the list doubles, the performance of these operations grows only by a
constant amount. When processing 1,000,000 items, we expect to examine an
average of 20 items, compared to about 500,000 if the items were contained in a
list. Using a binary tree is a great choice—if the keys are distributed evenly in the
tree. When the keys are not distributed evenly, the tree becomes distorted and
loses those properties that make it a good choice for searching.
Knowing a bit about trees and how they behave, Gary asked Graham the $64,
(it is logarithmic, after all) question: “Are you balancing the binary tree?”
Graham’s response was surprising, since he was a very good software developer.
“No, why should I do that? It makes the code a lot more complex.” But the fact
that Graham wasn’t balancing the tree was exactly the problem causing the
horrible performance of his code. Can you figure out why? Themalloc()routine
in C allocates memory (from the heap) in order of increasing memory addresses.
Not only are these addresses not evenly distributed, the order is exactly the one
that leads to right-oriented trees, which behave more like linear lists than binary
trees. To see why, consider the two binary trees in Figure 1-2. The (a) tree was
created by inserting the numbers 1–15 in order. Its root node contains the value 1
and there is a path of 14 nodes to reach the node containing the value 15. The (b)
tree was created by inserting these same numbers in the order <8, 4, 12, 2, 6, 10,
14, 1, 3, 5, 7, 9, 11, 13, 15>. In this case, the root node contains the value 8 but
the paths to all other nodes in the tree are three nodes or less. As we will see in
Chapter 5, the search time is directly affected by the length of the maximum path.

* Chapter 2 contains information about this “big O” notation.

Algorithms in a Nutshell
Algorithms in a Nutshell By Gary Pollice, George T. Heineman, Stanley Selkow ISBN:
9780596516246 Publisher: O'Reilly Media, Inc.


Prepared for Ming Yi, Safari ID: [email protected]
Licensed by Ming Yi
Print Publication Date: 2008/10/21 User number: 594243
© 2009 Safari Books Online, LLC. This PDF is made available for personal use only during the relevant subscription term, subject to the Safari Terms of Service. Any other use

Free download pdf