[Python编程(第4版)].(Programming.Python.4th.Edition).Mark.Lutz.文字版


The second test’s (.*) groups match and retain any number of characters. The third
and fourth tests show how alternatives can be grouped by both position and name,
and the last test matches C #define lines (more on this pattern in a moment):


C:\...\PP4E\Lang> python re-groups.py
0 1 2
('000', '111', '222')
('A', 'Y', 'C')
{'a': 'A', 'c': 'C', 'b': 'Y'}
('spam', '1 + 2 + 3')
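
As a rough illustration of fetching groups by position and by name, consider the following minimal sketch. The pattern, the group names key and value, and the input string are assumptions made up for this example; they are not taken from re-groups.py.

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
pattern = re.compile(r'(?P<key>\w+)\s*=\s*(?P<value>.*)')
match = pattern.match('spam = 1 + 2 + 3')

by_position = match.groups()     # tuple of all groups, left to right
by_name = match.groupdict()      # dict keyed by the (?P<name>...) labels
first = match.group(1)           # same substring as match.group('key')

print(by_position)
print(by_name)
print(first)
```

A named group is still numbered, so group(1) and group('key') here return the same substring; names mainly make the extraction code easier to read.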

Finally, besides matches and substring extraction, re also includes tools for string
replacement or substitution (see Example 19-5).


Example 19-5. PP4E\Lang\re-subst.py


"substitutions: replace occurrences of pattern in string"


import re
print(re.sub('[ABC]', '*', 'XAXAXBXBXCXC'))
print(re.sub('[ABC]_', '*', 'XA-XA_XB-XB_XC-XC_'))      # alternatives char + _

print(re.sub('(.) spam', 'spam\\1', 'x spam, y spam'))  # group back ref (or r'')

def mapper(matchobj):
    return 'spam' + matchobj.group(1)

print(re.sub('(.) spam', mapper, 'x spam, y spam'))     # mapping function


In the first test, all characters in the set are replaced; in the second, they must be followed
by an underscore. The last two tests illustrate more advanced group back-references
and mapping functions in the replacement. Note the \\1 required to escape \1 for
Python’s string rules; r'spam\1' would work just as well. See also the earlier interactive
tests in this section for additional substitution and splitting examples:


C:\...\PP4E\Lang> python re-subst.py
X*X*X*X*X*X*
XA-X*XB-X*XC-X*
spamx, spamy
spamx, spamy
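
To see why the escape matters, the following sketch compares the doubled-backslash form, the raw-string form, and an unescaped '\1'. The variable names and the upper() call in the mapper are assumptions added for illustration; they are not part of re-subst.py.

```python
import re

text = 'x spam, y spam'

escaped = re.sub('(.) spam', 'spam\\1', text)    # backslash doubled by hand
raw = re.sub('(.) spam', r'spam\1', text)        # raw string, same replacement
unescaped = re.sub('(.) spam', 'spam\1', text)   # '\1' is the character chr(1)

def mapper(matchobj):
    # Called once per match; its return value replaces the matched text.
    return 'spam' + matchobj.group(1).upper()

mapped = re.sub('(.) spam', mapper, text)

print(escaped, raw, mapped, sep='\n')
print(repr(unescaped))
```

Without the escape, Python converts '\1' to a control character before re ever sees it, so no back-reference occurs. A mapping function is the more flexible option when the replacement must be computed from the match rather than merely rearranged.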

Scanning C Header Files for Patterns


To wrap up, let’s turn to a more realistic example: the script in Example 19-6 puts these
pattern operators to more practical use. It uses regular expressions to find #define and
#include lines in C header files and extract their components. The patterns are general
enough to detect a variety of line formats; pattern groups (the parts in parentheses)
are used to extract matched substrings from a line after a match.
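
The general idea can be sketched in a few lines. The two patterns, the scan function, and the sample lines below are assumptions invented for this illustration; they are simpler than, and not copied from, the patterns in Example 19-6.

```python
import re

# Illustrative patterns (assumptions, not the ones in Example 19-6).
define_pat = re.compile(r'^\s*#\s*define\s+(\w+)\s*(.*)')
include_pat = re.compile(r'^\s*#\s*include\s+[<"]([\w./]+)[>"]')

def scan(lines):
    """Return (kind, groups) tuples for each #define or #include line."""
    found = []
    for line in lines:
        match = define_pat.match(line)
        if match:
            name, body = match.groups()
            found.append(('define', (name, body.strip())))
            continue
        match = include_pat.match(line)
        if match:
            found.append(('include', (match.group(1),)))
    return found

sample = [
    '#include <stdio.h>',
    '#  include "local/defs.h"',
    '#define SPAM 1 + 2 + 3',
    '  # define EGGS',
    'int x = 0;',
]
results = scan(sample)
for kind, groups in results:
    print(kind, groups)
```

The \s* terms tolerate optional whitespace around the # character, which is what lets a single pattern cover several line layouts; the parenthesized groups then hand back just the macro name, its body, or the included filename.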

