Programming Python, 4th Edition (Mark Lutz), text edition


groups()
Returns a tuple of all groups’ substrings of the match (for group numbers 1 and
higher).


groupdict()
Returns a dictionary containing all named groups of the match (see (?P<name>R)
syntax ahead).


start([group]), end([group])
Indices of the start and end of the substring matched by group (or the entire matched
string, if no group is passed).


span([group])
Returns the two-item tuple: (start(group), end(group)).


expand(template)
Performs backslash group substitutions; see the Python library manual.
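
The match object methods listed above can be tried out with a short script. This is a minimal sketch, not an example from the book: the sample string, the pattern, and the group names first and last are all assumptions made up for illustration.

```python
import re

# Hypothetical sample text and pattern, chosen only for illustration.
text = 'name: Monty Python'
match = re.search(r'(?P<first>\w+) (?P<last>\w+)', text)

all_groups = match.groups()           # substrings of groups 1 and higher
named = match.groupdict()             # named groups as a dictionary
whole = (match.start(), match.end())  # indices of the entire match
last_span = match.span('last')        # (start, end) of one group
swapped = match.expand(r'\g<last>, \1')  # backslash group substitution

print(all_groups)   # ('Monty', 'Python')
print(named)        # {'first': 'Monty', 'last': 'Python'}
print(whole)        # (6, 18)
print(last_span)    # (12, 18)
print(swapped)      # Python, Monty
```

Groups may be addressed by number or by name in start, end, and span; with no argument, these methods describe the entire matched substring.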


Regular expression patterns


Regular expression strings are built up by concatenating single-character regular
expression forms, shown in Table 19-1. The longest-matching string is usually matched
by each form, except for the nongreedy operators. In the table, R means any regular
expression form, C is a character, and N denotes a digit.
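
The difference between longest-match (greedy) and nongreedy forms is easiest to see side by side. The following is a small sketch under assumed inputs; the sample strings are hypothetical and chosen only to make the two behaviors diverge.

```python
import re

# Hypothetical sample string, chosen only for illustration.
text = '<b>bold</b> and <i>italic</i>'

# Greedy: .+ consumes as much as possible, so one match spans the whole string.
greedy = re.findall(r'<.+>', text)

# Nongreedy: .+? consumes as little as possible, so each tag matches separately.
nongreedy = re.findall(r'<.+?>', text)

# Forms concatenate: a literal character, a character set with +, then digits.
combined = re.match(r'x[a-z]+[0-9]{2}', 'xyz42')

print(greedy)            # ['<b>bold</b> and <i>italic</i>']
print(nongreedy)         # ['<b>', '</b>', '<i>', '</i>']
print(combined.group())  # xyz42
```

The only change between the first two patterns is the trailing ? after +, which switches the repetition from matching as much as possible to matching as little as possible.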


Table 19-1. re pattern syntax


Operator Interpretation

. Matches any character (including newline if DOTALL flag is specified or (?s) at pattern front)
^ Matches start of the string (of every line in MULTILINE mode)
$ Matches end of the string (of every line in MULTILINE mode)
C Any nonspecial (or backslash-escaped) character matches itself
R* Zero or more of preceding regular expression R (as many as possible)
R+ One or more of preceding regular expression R (as many as possible)
R? Zero or one occurrence of preceding regular expression R (optional)
R{m} Matches exactly m copies of preceding R: a{5} matches 'aaaaa'
R{m,n} Matches from m to n repetitions of preceding regular expression R
R*?, R+?, R??, R{m,n}? Same as *, +, and ? but matches as few characters/times as possible; these are known as nongreedy match operators (unlike others, they match and consume as few characters as possible)
[...] Defines character set: e.g., [a-zA-Z] to match all letters (alternatives, with - for ranges)
[^...] Defines complemented character set: matches if char is not in set
\ Escapes special chars (e.g., *?+|()) and introduces special sequences in Table 19-2
\\ Matches a literal \ (write as \\\\ in pattern, or use r'\\')
\N Matches the contents of the group of the same number N: '(.+) \1' matches “42 42”
