Analysis of Algorithms : An Active Learning Approach

(Ron) #1

130 MATCHING ALGORITHMS


appears “closer” to the front than the repetition of the end characters of the
pattern. If the jump array is larger, it means that the end characters of the pat-
tern that matched appear closer to the front of the pattern than the mis-
matched character. In either of these two cases, we should use the larger shift,
because the smaller shift will definitely fail again because of what we know
from the other value. For example, if the slide array has a value of 4 and the
jump array has a value of 2, if we just move by two characters, we know this
will fail because the mismatched character is still not lined up. But if we move
by the four characters, the mismatched character will line up correctly in the
pattern, and there is a possibility that the end matching characters might match
in their new location as well.
Because we are just taking the larger of the two values, the algorithm is
rather simple:

textLoc = length(pattern)
patternLoc = length(pattern)
while (textLoc ≤ length(text)) and (patternLoc > 0) do
if text[ textLoc ] = pattern[ patternLoc ] then
textLoc = textLoc - 1
patternLoc = patternLoc - 1
else
textLoc = textLoc + MAXIMUM(slide[text[textLoc]], jump[patternLoc])
patternLoc = length(pattern)
end if
end while


if patternLoc = 0 then
return textLoc + 1 // found a match
else
return 0
end if


The Slide Array

We now look at the operation of the slide array. In Fig. 5.6(a), we begin the
match with textLoc = 6 and patternLoc = 6. Because those two charac-
ters match, both textLoc and patternLoc are decremented. The next two
characters also match, so textLoc and patternLoc are both decremented
again and now have a value of 4. There is a mismatch in the next character. We
want to slide the pattern so that the b character of the text lines up with the b
Free download pdf