10.1 Text Transformations 229
while (<>) {
    chomp;
    if (/Motif #([0-9]+):/) {
        print "The motif $1 has been found!\n";
    }
}
Program 10.11 Extracting information from a file using pattern matching
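while (<>) {
    chomp;
    if (/Motif #([0-9]+):/) {
        print "Probability distributions for motif $1\n";
    } elsif (/^[0-9]+ /) {
        # Split the line on whitespace; field 0 is the leading number.
        my @fields = split;
        print "A $fields[1] G $fields[2] C $fields[3] T $fields[4]\n";
    }
}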
Program 10.12 Extracting an array of data from a file using pattern matching
we want the motif number, so the number pattern is parenthesized as in
program 10.11.
One can have any number of parenthesized subpatterns. The part that matched
the first parenthesized subpattern is $1, the second is $2, and so on.
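As a minimal sketch of several parenthesized subpatterns in one pattern, the following extracts two numbers from a single line. The helper name parse_motif_header and the sample line "Motif #3: width 12" are invented for illustration and are not taken from an actual BioProspector file.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the text): returns the two captured numbers
# from a line such as "Motif #3: width 12", or an empty list if the line
# does not match. The line format is an assumption made for this sketch.
sub parse_motif_header {
    my ($line) = @_;
    if ($line =~ /Motif #([0-9]+): width ([0-9]+)/) {
        # $1 holds the text matched by the first parenthesized subpattern,
        # $2 the text matched by the second.
        return ($1, $2);
    }
    return;
}

my ($motif, $width) = parse_motif_header("Motif #3: width 12");
print "Motif $motif has width $width\n";
```

Subpatterns are numbered by the position of their opening parenthesis, counting from the left, so adding a third pair of parentheses would make its match available as $3.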
The next step in processing the BioProspector file is to find where the motif
probability distributions are located. Looking at the file, one can see that
the probability distributions are located on lines that begin with a number.
Program 10.12 extracts the array. The ^ character means the “beginning of
the line.” The end of the line is denoted by $.
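The effect of the two anchors can be sketched with a pair of small tests. The helper names starts_with_number and is_all_digits, and the sample lines, are hypothetical and chosen only to illustrate the anchors.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line begins with a number followed by a
# space, the same test used in program 10.12.
sub starts_with_number {
    my ($line) = @_;
    return $line =~ /^[0-9]+ / ? 1 : 0;
}

# Hypothetical helper: true if the line consists of digits only, because the
# pattern is anchored at both the beginning (^) and the end ($).
sub is_all_digits {
    my ($line) = @_;
    return $line =~ /^[0-9]+$/ ? 1 : 0;
}

# Made-up sample lines, not taken from a real data file.
for my $line ("12 0.25 0.25", "row 12 0.25", "12") {
    printf "%-16s starts_with_number=%d is_all_digits=%d\n",
        "'$line'", starts_with_number($line), is_all_digits($line);
}
```

Without the ^ anchor, the pattern [0-9]+ followed by a space would also match a line such as "row 12 0.25", because a pattern may otherwise match anywhere in the string.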
Summary
- Patterns are a powerful mechanism for extracting desired information.
- A pattern specifies the text that a string must have in order to match the
pattern.