Programming Python, 4th Edition, Mark Lutz (text version)


might make their way into 3.2, the current sense is that fully addressing the package’s
problems appears to require a full redesign.


To be fair, it’s a substantial problem. Email has historically been oriented toward single-
byte ASCII text, and generalizing it for Unicode is difficult to do well. In fact, the same
holds true for most of the Internet today—as discussed elsewhere in this chapter, FTP,
POP, SMTP, and even webpage bytes fetched over HTTP pose the same sorts of issues.
Interpreting the bytes shipped over networks as text is easy if the mapping is one-to-
one, but allowing for arbitrary Unicode encoding in that text opens a Pandora’s box of
dilemmas. Supporting this extra complexity is necessary today but, as email attests,
doing it well can be a daunting task.
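As a small illustration of why the mapping matters, the following sketch encodes one short string two ways and then decodes the resulting bytes with the wrong encoding. The sample text and the choice of Latin-1 and UTF-8 are assumptions made for this example, not cases taken from the email package itself.

```python
# Illustrative sketch: the sample string and encodings are assumptions.
sample = 'caf\u00e9'                      # four characters, the last one non-ASCII

as_latin1 = sample.encode('latin-1')      # one byte per character
as_utf8 = sample.encode('utf-8')          # the last character takes two bytes

# Decoding with the same encoding that produced the bytes restores the text.
assert as_utf8.decode('utf-8') == sample
assert as_latin1.decode('latin-1') == sample

# Decoding with a different encoding either yields different text silently...
misread = as_utf8.decode('latin-1')

# ...or fails outright, depending on the bytes and the encoding guessed.
try:
    as_latin1.decode('utf-8')
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

print(as_latin1, as_utf8, repr(misread), strict_ok)
```

Neither failure mode can be avoided unless the receiver knows, or can reliably guess, which encoding the sender used.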


Frankly, I considered not releasing this edition of this book until this package’s issues
could be resolved, but I decided to go forward because a new email package may be
years away (two Python releases, by all accounts). Moreover, the issues serve as a case
study of the types of problems you’ll run into in the real world of large-scale software
development. Things change over time, and program code is no exception.


Instead, this book’s examples provide new Unicode and internationalization support
but adopt policies to work around issues where possible. Programs in books are meant
to be educational, after all, not commercially viable. Given the state of the email package
that the examples depend on, though, the solutions used here might not be completely
universal, and there may be additional Unicode issues lurking. To address the future,
watch this book’s website (described in the Preface) for updated notes and code
examples if/when the anticipated new email package appears. Here, we’ll work with what
we have.


The good news is that we’ll be able to make use of email in its current form to build
fairly sophisticated and full-featured email clients in this book anyhow. It still offers an
amazing number of tools, including MIME encoding and decoding, message formatting
and parsing, internationalized header extraction and construction, and more. The bad
news is that doing so will require a handful of obscure workarounds, and the resulting
code may need to be changed in the future, though few software projects are exempt
from such realities.
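To give a rough feel for the tools just listed, here is a minimal sketch that composes a message, formats it as text, parses it back, and decodes an internationalized header. It assumes a recent Python 3 release (behavior under 3.1 may differ, for the reasons this section describes); the addresses, subject, and body are placeholders invented for the illustration.

```python
# Minimal sketch, assuming a recent Python 3; all message content is made up.
import email
from email.header import Header, decode_header
from email.mime.text import MIMEText

# Compose: build a simple text message and an encoded non-ASCII Subject header.
msg = MIMEText('Hello from the email package\n')
msg['From'] = 'sender@example.com'
msg['To'] = 'receiver@example.com'
msg['Subject'] = Header('Caf\u00e9 menu', 'utf-8').encode()

# Format: flatten the message object into its full text form.
full_text = msg.as_string()

# Parse: rebuild a message object from that text (a str, not bytes).
parsed = email.message_from_string(full_text)
payload = parsed.get_payload(decode=True)            # bytes after MIME decoding
body = payload.decode(parsed.get_content_charset())  # back to str

# Extract: decode the internationalized header into readable text.
subject = ''.join(
    part.decode(enc or 'ascii') if isinstance(part, bytes) else part
    for part, enc in decode_header(parsed['Subject']))

print(subject)
print(body, end='')
```

Note that the parse step here is handed a str; that detail is the subject of the first limitation covered next.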


Because email’s limitations have implications for later email code in this book, I’m
going to quickly run through them in this section. Some of this can be safely saved for
later reference, but parts of later examples may be difficult to understand if you don’t
have this background. The upside is that exploring the package’s limitations here also
serves as a vehicle for digging a bit deeper into the email package’s interfaces in general.


Parser decoding requirement


The first Unicode issue in Python 3.1’s email package is nearly a showstopper in some
contexts: the bytes strings of the sort produced by poplib for mail fetches must be
decoded to str prior to parsing with email. Unfortunately, because there may not be
enough information to know how to decode the message bytes per Unicode, some
clients of this package may need to be generalized to detect whole-message encodings

