Algorithms in a Nutshell

(Tina Meador) #1
Hash-based Search | 127

Searching

worst performing timings were discarded and the table contains the average of the
remaining 98 trials. The designers of this class perform extra computation while
the hash table is being constructed to improve the performance of searches (a
common tradeoff to make). In columns 3 and 5 of Table 5-7, there is a noticeable
cost penalty when a rehash occurs. Also note that in the last two rows the hash
tables do not rehash themselves, so the results in columns 3, 5, and 7 are nearly
identical. Rehashing while building the hash table improves the overall perfor-
mance by reducing the average length of the element chains.

Variations


One main variation of HASH-BASEDSEARCHis to modify the way collisions are
handled. Instead of placing into a linked list all elements that hash to a slot in the
hash table, we can use the technique known asopen addressing, which stores the
colliding items directly in the hash tableA. This approach is shown in Figure 5-7.
With open addressing the hash table reduces storage overhead, such as pointers to
the next element in a list for collisions. To use open addressing, we change the
hash function to take two arguments,h(u,j)=i, whereu∈Uandiandjare integers
in the range [0,b), in whichbis the size ofA. In general we leth(u,0)=i=h’(u)
whereh’is a hash function, such as previously described. If the slot atA[i] is occu-
pied and the item does not match the sought-for item, we calculate the value of
h(u,1). If that slot is occupied and does not match the sought-for item, we consider
sloth(u,2) and repeat until the item is found, the slot is empty, or the hash function
produces an index whose slot we have already visited (which indicates a failure).

Table 5-7. Comparable times (in milliseconds) to build hash tables

b

Our hash
table JDK hash table (α=.75f ) JDK hash table (α=4.0f )

JDK
hash table
(α=n/b)
no rehash
Build Time Build Time #Rehash Build Time #Rehash Build Time
4,095 1116.08 165.77 7 166.81 4 644.31
8,191 647.32 162.96 6 165.48 3 364.56
16,383 421.57 164.20 5 162.85 2 230.42
32,767 332.82 164.36 4 149.29 1 164.08
65,535 273.77 155.62 3 131.19 0 131.05
13,1071 256.39 147.17 2 118.47 0 116.07
262,143 280.06 127.57 1 90.20 0 91.09
524,287 264.87 89.93 0 89.77 0 89.61
1,048,575 257.83 92.09 0 93.55 0 92.65

Figure 5-7. Open addressing

Algorithms in a Nutshell
Algorithms in a Nutshell By Gary Pollice, George T. Heineman, Stanley Selkow ISBN:
9780596516246 Publisher: O'Reilly Media, Inc.


Prepared for Ming Yi, Safari ID: [email protected]
Licensed by Ming Yi
Print Publication Date: 2008/10/21 User number: 594243
© 2009 Safari Books Online, LLC. This PDF is made available for personal use only during the relevant subscription term, subject to the Safari Terms of Service. Any other use

Free download pdf