Table 19-2. re special sequences
Sequence   Interpretation
\number    Matches text of group number (numbered from 1)
\A         Matches only at the start of the string
\b         Empty string at word boundaries
\B         Empty string not at word boundaries
\d         Any decimal digit character ([0-9] for ASCII)
\D         Any nondecimal digit character ([^0-9] for ASCII)
\s         Any whitespace character ([ \t\n\r\f\v] for ASCII)
\S         Any nonwhitespace character ([^ \t\n\r\f\v] for ASCII)
\w         Any alphanumeric character ([a-zA-Z0-9_] for ASCII)
\W         Any nonalphanumeric character ([^a-zA-Z0-9_] for ASCII)
\Z         Matches only at the end of the string
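To make the table a bit more concrete, here is a minimal sketch that exercises several of these special sequences. The sample strings and variable names are invented for illustration only; they are not part of the book's examples.

```python
import re

text = "spam 42 eggs eggs"                         # hypothetical sample text

digits   = re.findall(r"\d+", text)                # runs of decimal digits
words    = re.findall(r"\w+", text)                # runs of alphanumerics or '_'
pieces   = re.split(r"\s+", text)                  # split on runs of whitespace
no_digit = re.sub(r"\D", "", text)                 # delete every nondigit character

whole    = re.search(r"\beggs\b", "leggs eggs")    # 'eggs' as a whole word only
repeat   = re.search(r"(\w+) \1", text)            # \1 = text matched by group 1
at_start = re.search(r"\Aspam", text)              # only at start of string
at_end   = re.search(r"eggs\Z", text)              # only at end of string

print(digits, words, no_digit)
print(whole.start(), repeat.group(0), at_start.start(), at_end.start())
```

Raw strings (r"...") are used here so that Python itself leaves the backslashes alone and passes them through to the re parser unchanged.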
Most of the standard escapes supported by Python string literals are also accepted by
the regular expression parser: \a, \b, \f, \n, \r, \t, \v, \x, and \\. The Python library
manual gives these escapes’ interpretation and additional details on pattern syntax in
general. But to further demonstrate how the re pattern syntax is typically used in scripts,
let’s go back to writing some code.
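Before moving on, here is a small sketch of how a few of these escapes behave when they reach the re parser by way of raw strings. The test strings are made up for this illustration. One detail worth noting: in re patterns, \b means a backspace only inside a character set; elsewhere it is the word-boundary sequence from Table 19-2.

```python
import re

newline   = re.search(r"\n", "a\nb")       # re parser reads \n as a newline
tab       = re.search(r"\t", "a\tb")       # likewise for tab
hexchar   = re.search(r"\x41", "xxA")      # \x41 is the character 'A'
backslash = re.search(r"\\", "a\\b")       # \\ matches one literal backslash

backspace = re.search(r"[\b]", "a\bc")     # inside a set: the backspace character
boundary  = re.search(r"\bc", "a\bc")      # outside a set: a word boundary

print(newline.start(), tab.start(), hexchar.start(), backslash.start())
print(backspace.start(), boundary.start())
```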
More Pattern Examples
For more context, the next few examples present short test files that match simple but
representative pattern forms. Comments in Example 19-3 describe the operations
exercised; check Table 19-1 to see which operators are used in these patterns. If they are
still confusing, try running these tests interactively, and call group(0) instead of
start() to see which strings are being matched by the patterns.
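As a quick sketch of that suggestion, the following compares start() with group(0) for a single match, using the same pattern and string as the first test in the example; span() is included as well because it reports both ends of the match. The variable names here are chosen just for this illustration.

```python
import re

matchobj = re.search("A.C.", "xxABCDxx")
if matchobj:
    offset  = matchobj.start()       # 2: index where the match begins
    matched = matchobj.group(0)      # 'ABCD': the text that was matched
    bounds  = matchobj.span()        # (2, 6): start and end offsets
    print(offset, matched, bounds)
```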
Example 19-3. PP4E\Lang\re-basics.py
"""
literals, sets, ranges, alternatives, and escapes
all tests here print 2: offset where pattern found
"""
import re # the one to use today
pattern, string = "A.C.", "xxABCDxx"       # nonspecial chars match themselves
matchobj = re.search(pattern, string)      # '.' means any one char
if matchobj:                               # search returns match object or None
    print(matchobj.start())                # start is index where matched

pattobj  = re.compile("A.C.")              # 'R*' means zero or more Rs
matchobj = pattobj.search("xxABCDxx")      # compile returns pattern obj
if matchobj: # patt.search returns match obj