form (?:) can be used to group nested parts of a pattern without forming a saved
substring group (split treats groups specially):
>>> import re
>>> re.split('--', 'aaa--bbb--ccc')
['aaa', 'bbb', 'ccc']
>>> re.sub('--', '...', 'aaa--bbb--ccc') # single string case
'aaa...bbb...ccc'
>>> re.split('--|==', 'aaa--bbb==ccc') # split on -- or ==
['aaa', 'bbb', 'ccc']
>>> re.sub('--|==', '...', 'aaa--bbb==ccc') # replace -- or ==
'aaa...bbb...ccc'
>>> re.split('[-=]', 'aaa-bbb=ccc') # single char alternative
['aaa', 'bbb', 'ccc']
>>> re.split('(--)|(==)', 'aaa--bbb==ccc') # split includes groups
['aaa', '--', None, 'bbb', None, '==', 'ccc']
>>> re.split('(?:--)|(?:==)', 'aaa--bbb==ccc') # expr part, not group
['aaa', 'bbb', 'ccc']
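As a small supplementary sketch (not part of the session above; the variable names are illustrative), wrapping the whole alternation in a single group changes what split returns. With one capturing group the delimiters are kept in the result without None placeholders, and with a single (?:) group they are dropped:

```python
import re

text = 'aaa--bbb==ccc'

# One capturing group around the whole alternation: the matched
# delimiters are kept in the result list, with no None placeholders.
with_delims = re.split('(--|==)', text)

# The same alternation in a non-capturing group: delimiters are dropped.
without_delims = re.split('(?:--|==)', text)

print(with_delims)      # ['aaa', '--', 'bbb', '==', 'ccc']
print(without_delims)   # ['aaa', 'bbb', 'ccc']
```

The None entries in the earlier two-group result appear because each alternative has its own group, and only one of the two groups takes part in any given match.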
Similarly, splits can extract simple substrings for fixed delimiters, but patterns can also
handle surrounding context like brackets, mark parts as optional, ignore whitespace,
and more. In the next tests \s* means zero or more whitespace characters (a character
class); \s+ means one or more of the same; /? matches an optional slash; [a-z] is any
lowercase letter (a range); (.*?) means a saved substring of zero or more of any character
again, but only as many as needed to match the rest of the pattern (nongreedily); and
the groups method is used to fetch the substrings matched by the parenthesized
subpatterns all at once:
>>> 'spam/ham/eggs'.split('/')
['spam', 'ham', 'eggs']
>>> re.match('(.*)/(.*)/(.*)', 'spam/ham/eggs').groups()
('spam', 'ham', 'eggs')
>>> re.match('<(.*)>/<(.*)>/<(.*)>', '<spam>/<ham>/<eggs>').groups()
('spam', 'ham', 'eggs')
>>> re.match(r'\s*<(.*)>/?<(.*)>/?<(.*)>', ' <spam>/<ham><eggs>').groups()
('spam', 'ham', 'eggs')
>>> 'Hello pattern world!'.split()
['Hello', 'pattern', 'world!']
>>> re.match(r'Hello\s*([a-z]*)\s+(.*?)\s*!', 'Hellopattern world !').groups()
('pattern', 'world')
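To make the greedy versus nongreedy distinction concrete, here is a minimal sketch that applies both forms to the same bracketed string used above. The variable names are illustrative only; the patterns are written as raw strings, a common habit that keeps backslashes and other special characters from being interpreted by Python's string literal rules:

```python
import re

line = '<spam>/<ham>/<eggs>'

# Greedy: .* takes as much as it can, backing off only far enough
# to let the final > match, so it runs to the last > in the string.
greedy = re.match(r'<(.*)>', line).group(1)

# Nongreedy: .*? takes as little as it can, so it stops at the first >.
lazy = re.match(r'<(.*?)>', line).group(1)

print(greedy)   # spam>/<ham>/<eggs
print(lazy)     # spam
```

In the three-group patterns above the greedy form still yields the expected fields, because the literal text between the groups forces each .* to give back characters until the whole pattern fits.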
In fact, there’s more than one way to match. The findall method provides generality
that leaves string objects in the dust—it locates all occurrences of a pattern and returns
all the substrings it matched (or a list of tuples for multiple groups). The search method
is similar but stops at the first match—it’s like match plus an initial forward scan. In the
following, string object finds locate just one specific string, but patterns can be used to