138 MATCHING ALGORITHMS
sample of this for the substring “trim” and the text “try the trumpet” is given
in Fig. 5.9.
If we look at the bottom row for the character y in the text, we see the
value 2, which represents the fact that to match “trim” so that it ends at the y
would require two differences of the kind discussed before. Those two differ-
ences would represent an m missing from the text after the y and the mismatch
of the y with the i or an i missing from the text before the y and the mismatch
of the y with the m. So, the bottom row gives us the best possible matches of
the substring ending at that point in the text. We see that the closest match of
“trim” in the text, with one difference, would end at the m in trumpet and
represents the mismatch of the i and u.
If this process were used in practice, we would specify not only the substring
and text but also the maximum number of differences for which we were
looking. The algorithm would fill in the matrix column by column until the
bottom value of a column was less than or equal to the number given. This
means that the algorithm does not need to store S*T integers for this matrix
(where S is the number of characters in the substring and T is the number of
characters in the text), but rather it just needs to store 2S integers for the col-
umn being calculated and for the previous column on which it depends.
This style of algorithm is classified as “dynamic programming,” which will
be discussed again in Chapter 9.
■ 5.2.1 Analysis
This process is easy to analyze because of the nature of the matrix. We see that
for each location in the matrix, we do one character comparison. This means
that in the worst case there will be S*T comparisons. Notice that even with
all of the possible differences that could occur, this process operates as effi-
ciently as the straightforward exact string match algorithm.
t
tt th eet
r
rrumy p
i
m
0
1
2
3
4
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
0
1
2
3
0
1
0
1
2
0
1
0
1
2
0
1
1
1
2
0
1
1
1
2
0
1
1
2
3
0
1
2
2
2
0
1
2
2
3
0
1
2
2
1
0
1
2
3
3
0
1
2
3
3
0
1
2
3
2
■FIGURE 5.9
The diffs matrix for
the substring “trim”
and the text “try
the trumpet”