Programming Python, 4th Edition, Mark Lutz (text version)

>>> import string
>>> template = string.Template('---$key1---$key2---')
>>> template.substitute(vals)
'---Spam---shrubbery---'

>>> template.substitute(key1='Brian', key2='Loretta')
'---Brian---Loretta---'

See the library manual for more on this extension. Although the string datatype does
not itself support the pattern-directed text processing that we’ll meet later in this
chapter, its tools are powerful enough for many tasks.
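
As a brief illustration of the template calls above, the following sketch contrasts substitute with safe_substitute. The vals dictionary here is an assumed stand-in, chosen only to be consistent with the output shown in the interaction; the names full, missing, and partial are illustrative:

```python
import string

template = string.Template('---$key1---$key2---')

# Assumed stand-in for the vals mapping used in the interaction above.
vals = {'key1': 'Spam', 'key2': 'shrubbery'}
full = template.substitute(vals)

# substitute raises KeyError when a placeholder has no matching value...
try:
    template.substitute(key1='Brian')
    missing = None
except KeyError as exc:
    missing = exc.args[0]

# ...whereas safe_substitute leaves unmatched placeholders in the result.
partial = template.safe_substitute(key1='Brian')

print(full)       # ---Spam---shrubbery---
print(missing)    # key2
print(partial)    # ---Brian---$key2---
```

safe_substitute is probably the better fit when a template is filled in over several passes, since missing keys are not treated as errors.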


Parsing with Splits and Joins


In terms of this chapter’s main focus, Python’s built-in tools for splitting and joining
strings around tokens turn out to be especially useful when it comes to parsing text:


str.split(delimiter?, maxsplits?)
Splits a string into a list of substrings, using either runs of whitespace (tabs,
spaces, newlines) or an explicitly passed string as a delimiter. maxsplits limits
the number of splits performed, if passed.


delimiter.join(iterable)
Concatenates a sequence or other iterable of substrings (e.g., list, tuple, generator),
adding the subject separator string between each.


These two are among the most powerful of string methods. As we saw in Chapter 2,
split chops a string into a list of substrings and join puts them back together:


>>> 'A B C D'.split()
['A', 'B', 'C', 'D']
>>> 'A+B+C+D'.split('+')
['A', 'B', 'C', 'D']
>>> '--'.join(['a', 'b', 'c'])
'a--b--c'
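
A few details of these two calls are easy to miss, so here is a small sketch of them. The sample strings and variable names are made up for illustration:

```python
line = 'spam  eggs\tham\n'

# No argument: split on runs of whitespace and ignore leading/trailing blanks.
words = line.split()                   # ['spam', 'eggs', 'ham']

# Explicit delimiter: every occurrence splits, so empty strings can appear.
fields = 'a,,b'.split(',')             # ['a', '', 'b']

# A second argument limits the number of splits; the rest stays in one piece.
head = 'A+B+C+D'.split('+', 1)         # ['A', 'B+C+D']

# join accepts any iterable of strings, including a generator expression.
joined = '--'.join(w.upper() for w in words)   # 'SPAM--EGGS--HAM'

print(words, fields, head, joined)
```

Note that join requires every item to be a string already; numbers and other objects must be converted first, for example with str.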

Despite their simplicity, they can handle surprisingly complex text-parsing tasks.
Moreover, string method calls are fast because they are implemented in C.
For instance, to quickly replace all tabs in a file with four periods, pipe the file
into a script that looks like this:


from sys import stdin, stdout          # name the streams instead of importing *
stdout.write(('.' * 4).join(stdin.read().split('\t')))

The split call here divides input around tabs, and the join puts it back together with
periods where tabs had been. In this case, the combination of the two calls is equivalent
to using the simpler global replacement string method call as follows:


stdout.write(stdin.read().replace('\t', '.' * 4))
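
To check the equivalence of the two approaches without piping a file, both can be wrapped in functions and compared on a test string. This is only a sketch; the function names, the filler parameter, and the sample text are hypothetical:

```python
def expand_split_join(text, filler='.' * 4):
    # Split around tabs, then rejoin with the filler where tabs had been.
    return filler.join(text.split('\t'))

def expand_replace(text, filler='.' * 4):
    # Same result in one step with the global replacement method.
    return text.replace('\t', filler)

sample = 'a\tb\t\tc\n'
assert expand_split_join(sample) == expand_replace(sample)
print(expand_replace(sample), end='')   # a....b........c
```

Adjacent tabs yield an empty string between them in the split result, which is why join still inserts one filler per tab.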

As we’ll see in the next section, splitting strings is sufficient for many text-parsing goals.

