[Python编程(第4版)].(Programming.Python.4th.Edition).Mark.Lutz.文字版

(yzsuai) #1
Operator Interpretation
R|R Alternative: matches left or right R
RR Concatenation: match both Rs
(R) Matches any regular expression inside (), and delimits a group (retains matched substring)
(?:R) Same as (R) but simply delimits part R and does not denote a saved group
(?=R) Look-ahead assertion: matches if R matches next, but doesn’t consume any of the string (e.g.,
X (?=Y) matches X only if followed by Y)
(?!R) Matches if R doesn’t match next; negative of (?=R)
(?P<name>R) Matches any regular expression inside (), and delimits a named group
(?P=name) Matches whatever text was matched by the earlier group named name
(?#...) A comment; ignored
(?letter) Set mode flag; letter is one of aiLmsux (see the library manual)
(?<=R) Look-behind assertion: matches if the current position in the string is preceded by a match
of R that ends at the current position
(?<!R) Matches if the current position in the string is not preceded by a match for R; negative of
(?<= R)
(?(id/
name)yespattern|
nopattern)

Will try to match with yespattern if the group with given id or name exists, else with
optional nopattern

Within patterns, ranges and selections can be combined. For instance, [a-zA-Z0-9_]+
matches the longest possible string of one or more letters, digits, or underscores. Special
characters can be escaped as usual in Python strings: [\t ]* matches zero or more tabs
and spaces (i.e., it skips such whitespace).


The parenthesized grouping construct, (R), lets you extract matched substrings after a
successful match. The portion of the string matched by the expression in parentheses
is retained in a numbered register. It’s available through the group method of a match
object after a successful match.


In addition to the entries in this table, special sequences in Table 19-2 can be used in
patterns, too. Because of Python string rules, you sometimes must double up on back-
slashes (\) or use Python raw strings (r'...') to retain backslashes in the pattern
verbatim. Python ignores backslashes in normal strings if the letter following the back-
slash is not recognized as an escape code. Some of the entries in Table 19-2 are affected
by Unicode when matching str instead of bytes, and an ASCII flag may be set to emulate
the behavior for bytes; see Python’s manuals for more details.


1424 | Chapter 19: Text and Language

Free download pdf