split(pattern, string [, maxsplit, flags])
Split string by occurrences of pattern. If capturing parentheses (()) are used in the
pattern, the text of all groups in the pattern is also returned in the resulting list.
sub(pattern, repl, string [, count, flags])
Return the string obtained by replacing the (first count) leftmost nonoverlapping
occurrences of pattern (a string or a pattern object) in string by repl (which may
be a string with backslash escapes that may back-reference a matched group, or a
function that is passed a single match object and returns the replacement string).
subn(pattern, repl, string [, count, flags])
Same as sub, but returns a tuple: (new-string, number-of-substitutions-made).
escape(string)
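Return string with all nonalphanumeric characters backslashed, so that the result
can be compiled as a pattern that matches string literally (Python 3.7 and later
escape only the characters that are special in patterns).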
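As a brief illustration of the module-level functions just listed, the following sketch exercises split, sub, subn, and escape. The sample strings and the helper function double are made up for this example; they are not taken from the text.

```python
import re

line = "aaa--bbb--ccc"

# split: a capturing group makes the delimiters part of the result list
plain = re.split("--", line)                # ['aaa', 'bbb', 'ccc']
with_groups = re.split("(--)", line)        # ['aaa', '--', 'bbb', '--', 'ccc']
limited = re.split("--", line, maxsplit=1)  # ['aaa', 'bbb--ccc']

text = "spam-1 eggs-22 ham-333"

# sub with a string replacement: \1 and \2 refer back to matched groups
swapped = re.sub(r"(\w+)-(\d+)", r"\2:\1", text)

# sub with a function replacement: it receives a match object and
# returns the replacement string (double is a hypothetical helper)
def double(match):
    return str(int(match.group(0)) * 2)

doubled = re.sub(r"\d+", double, text)

# count limits how many leftmost occurrences are replaced
first_only = re.sub(r"\d+", "N", text, count=1)

# subn: same as sub, but also reports how many substitutions were made
squeezed, howmany = re.subn(r"\s+", " ", "a   b \t c")

# escape: backslash special characters so the text is matched literally
literal = "a.b*c"
escaped = re.escape(literal)

print(plain, with_groups, limited)
print(swapped, doubled, first_only)
print(squeezed, howmany)
print(escaped)
```

Without the escape call, the pattern "a.b*c" would also match text such as "axbbc", because "." and "*" keep their special meanings.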
Compiled pattern objects
At the next level, pattern objects provide similar attributes, but the pattern string is
implied. The re.compile function in the previous section is useful to optimize patterns
that may be matched more than once, because a precompiled pattern does not need to be
translated again on each use. Pattern objects returned by re.compile have methods such
as these:
match(string [, pos [, endpos]])
search(string [, pos [, endpos]])
findall(string [, pos [, endpos]])
finditer(string [, pos [, endpos]])
split(string [, maxsplit])
sub(repl, string [, count])
subn(repl, string [, count])
These are the same as the re module functions, but the pattern is implied, and pos and
endpos give start/end string indexes for the match.
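A minimal sketch of pattern object methods follows, assuming a simple digit pattern and a made-up sample string. It shows how pos and endpos restrict the portion of the string that is examined.

```python
import re

pattern = re.compile(r"\d+")     # compile once, reuse many times
text = "a1 b22 c333"

first = pattern.search(text)           # scans from the start of text
later = pattern.search(text, 3)        # pos=3: scanning begins at index 3
at_start = pattern.match(text)         # None: text does not begin with a digit
at_one = pattern.match(text, 1)        # match is anchored at index 1 instead

everything = pattern.findall(text)     # all matches in the whole string
window = pattern.findall(text, 3, 9)   # only indexes 3 through 8 are examined

pieces = pattern.split("a1b22c")       # split on the implied pattern
masked = pattern.sub("#", text)        # replace each match with '#'

print(first.group(), later.group(), at_start, at_one.group())
print(everything, window)
print(pieces, masked)
```

Note that endpos truncates the string being searched, so the last match in window is cut short at index 9.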
Match objects
Finally, when a match or search function or method is successful, you get back a match
object (None comes back on failed matches). Match objects export a set of attributes of
their own, including:
group(g)
group(g1, g2, ...)
Return the substring that matched a parenthesized group (or groups) in the pattern.
Accept group numbers or names. Group numbers start at 1; group 0 is the entire
string matched by the pattern. Returns a tuple when passed multiple group numbers,
and group number defaults to 0 if omitted.
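The group method's behavior can be sketched as follows. The pattern, the group name key, and the sample string are assumptions chosen only for illustration.

```python
import re

# One named group (key) and one numbered-only group
match = re.search(r"(?P<key>\w+)=(\d+)", "set size=42 now")

whole = match.group()          # group number defaults to 0: the entire match
also_whole = match.group(0)
by_number = match.group(1)     # group numbers start at 1
by_name = match.group("key")   # named groups may be fetched by name
several = match.group(1, 2)    # multiple arguments return a tuple

failed = re.search(r"\d+", "no digits here")   # None on a failed match

print(whole, by_number, by_name, several, failed)
```

Because a failed match returns None, code typically tests the result before calling group on it.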