Analysis of Algorithms : An Active Learning Approach

(Ron) #1

132 MATCHING ALGORITHMS


If you trace this algorithm with the pattern “datadata,” you will find that
slide[d] = 3,slide[a] = 0, and slide[t] = 1, and the slide value is 8
for all other characters.

The Jump Array

We will create a second array called jump (the same size as our pattern) that
will encode information about the pattern relative to itself. This new array will
be able to let us know, for example, that when the h and t in Fig. 5.5 don’t
match, we need to move the pattern completely past where we currently are.
This new array will also know if there are repetitions of the characters at the
end of the pattern that might be good alternates for a match. For example, let’s
say our pattern is “abcdbc” and in the matching process we were able to match
the last two characters of the pattern with the text. If we now fail on the third
to last character, our jump array will tell us how much to shift the pattern so
the next time the text characters that matched the “bc” in positions 5 and 6 of
the pattern are now lined up with the “bc” in positions 2 and 3 of the pattern.
So, the jump array tells us the smallest move necessary to line up characters we
have already matched with the next place they appear in the pattern.
Let’s say that we have a pattern where a mismatch of a character means that
the pattern has to be shifted all the way past where we started. In Fig. 5.7, we
show a piece of text and a pattern. The X symbols in the pattern could be any
character and are used to help illustrate the process.
If we need to move the pattern completely past where we started, we would
want the X symbols to line up with the text from f to j, meaning that the new
starting value for textLoc would be 10. If the mismatch occurs at the e char-
acter when textLoc is 5, we would need to add 5 to textLoc to reposition
the pattern. If this mismatch occurs at the d character (textLoc = 4), we
would need to add 6. With a mismatch at characters c, b, or a, we would need

■FIGURE 5.7
Determining a
jump value

Text: abcdefghijklmn
Pattern: XXXXX
Free download pdf