[Python编程(第4版)].(Programming.Python.4th.Edition).Mark.Lutz.文字版

CHAPTER 19

Text and Language

“See Jack Hack. Hack, Jack, Hack”


In one form or another, processing text-based information is one of the more common
tasks that applications need to perform. This can include anything from scanning a text
file by columns to analyzing statements in a language defined by a formal grammar.
Such processing is usually called parsing: analyzing the structure of a text string. In
this chapter, we’ll explore ways to handle language and text-based information and
summarize some Python development concepts in sidebars along the way. In the process,
we’ll meet string methods, text pattern matching, XML and HTML parsers, and
other tools.


Some of this material is advanced, but the examples are small to keep this chapter short.
For instance, recursive descent parsing is illustrated with a simple example to show
how it can be implemented in Python. We’ll also see that it’s often unnecessary to write
custom parsers for each language processing task in Python. They can usually be replaced
by exporting APIs for use in Python programs, and sometimes by a single built-in
function call. Finally, this chapter closes by presenting PyCalc, a calculator GUI
written in Python, and the last major Python coding example in this text. As we’ll see,
writing calculators isn’t much more difficult than juggling stacks while scanning text.
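
As a rough illustration of replacing a custom parser with a single call, the sketch below converts text to Python objects without any hand-written scanning code. The text above does not say which calls it has in mind; the use of int, float, and the standard library's ast.literal_eval here is an assumption made for this example, and the sample strings are invented.

```python
import ast

# Built-in converters parse numeric text directly.
total = int("42") + float("1.5")

# ast.literal_eval parses a string holding a Python literal
# (numbers, strings, lists, dicts, and so on) without running arbitrary code.
record = ast.literal_eval("{'name': 'Jack', 'scores': [90, 85.5]}")

print(total)
print(record["name"], record["scores"])
```

Each of these calls stands in for what would otherwise be a small tokenizer and parser written by hand.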


Strategies for Processing Text in Python


In the grand scheme of things, there are a variety of ways to handle text processing and
language analysis in Python:


Expressions
Built-in string object expressions


Methods
Built-in string object method calls


Patterns
Regular expression pattern matching
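
The three approaches listed so far can be compared side by side on one line of text. This is only a minimal sketch: the sample string and variable names are made up for illustration, and it uses nothing beyond built-in string operations and the standard re module.

```python
import re

line = "spam  42 ham"   # hypothetical sample input

# Expressions: slicing, membership tests, and concatenation
first_col = line[0:4]
has_ham = "ham" in line
tagged = first_col + "!"

# Methods: split on runs of whitespace, change case
fields = line.split()
shouted = line.upper()

# Patterns: find the first run of digits with a regular expression
match = re.search(r"(\d+)", line)
number = int(match.group(1))

print(first_col, has_ham, tagged)
print(fields, shouted)
print(number)
```

Expressions and methods are usually enough for text with a fixed layout or simple delimiters, while patterns help when the structure varies from line to line.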

