[Python编程(第4版)].(Programming.Python.4th.Edition).Mark.Lutz.文字版

>>> parts = decode_header(S3)
>>> ' '.join(abytes.decode('raw-unicode-escape' if enc == None else enc)
...          for (abytes, enc) in parts)
'Man where did you get that assistant?'

We’ll use logic similar to the last step here in the mailtools package ahead, but also
retain str substrings intact without attempting to decode.
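As a rough illustration of that idea, the following sketch decodes only the bytes parts that decode_header returns and passes any str parts through unchanged. The function name decode_header_text and its fallback argument are hypothetical names invented for this example; this is not the mailtools code itself, and it assumes that any charset named in the header is one Python knows how to decode.

```python
from email.header import decode_header

def decode_header_text(raw, fallback='raw-unicode-escape'):
    """
    Decode a complete header string to text (hypothetical helper).
    Parts that are already str are kept as is; bytes parts are
    decoded with their own charset, or with fallback if none is given.
    """
    decoded = []
    for (part, enc) in decode_header(raw):
        if isinstance(part, str):
            decoded.append(part)                          # already text: keep intact
        else:
            decoded.append(part.decode(enc or fallback))  # bytes: decode to str
    return ' '.join(decoded)

if __name__ == '__main__':
    print(decode_header_text('Hello world'))
    print(decode_header_text('=?UTF-8?Q?assistant=3F?='))
```

The exact whitespace between joined parts can differ across Python releases, because decode_header has not always treated the spaces around encoded substrings the same way; normalize with split and join if that matters in your code.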

Late-breaking news: As I write this in mid-2010, it seems possible that this mixed-type, nonpolymorphic, and frankly, non-Pythonic API behavior may be addressed in a future Python release. In response to a rant posted on the Python developers list by a book author whose work you might be familiar with, there is presently a vigorous discussion of the topic there. Among other ideas is a proposal for a bytes-like type which carries with it an explicit Unicode encoding; this may make it possible to treat some text cases in a more generic fashion. While it’s impossible to foresee the outcome of such proposals, it’s good to see that the issues are being actively explored. Stay tuned to this book’s website for further developments in the Python 3.X library API and Unicode stories.

Message address header encodings and parsing, and header creation

One wrinkle pertaining to the prior section: for message headers that contain email
addresses (e.g., From), the name component of the name/address pair might be encoded
this way as well. Because the email package’s header parser expects encoded substrings
to be followed by whitespace or the end of string, we cannot ask it to decode a complete
address-related header—quotes around name components will fail.

To support such Internationalized address headers, we must also parse out the first
part of the email address and then decode. First of all, we need to extract the name and
address parts of an email address using email package tools:

>>> from email.utils import parseaddr, formataddr
>>> p = parseaddr('"Smith, Bob" <[email protected]>')    # split into name/addr pair
>>> p                                                  # unencoded addr
('Smith, Bob', '[email protected]')
>>> formataddr(p)
'"Smith, Bob" <[email protected]>'

>>> parseaddr('Bob Smith <[email protected]>')           # unquoted name part
('Bob Smith', '[email protected]')
>>> formataddr(parseaddr('Bob Smith <[email protected]>'))
'Bob Smith <[email protected]>'

>>> parseaddr('[email protected]')                       # simple, no name
('', '[email protected]')
>>> formataddr(parseaddr('[email protected]'))
'[email protected]'
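To see how these tools might combine with header decoding, here is a minimal sketch that first splits an address into its name and address parts, and then decodes only the name. The function name decode_addr_header, its fallback argument, and the bob@example.com address are made up for this example; treat it as one possible approach under those assumptions, not as the code used later in this book.

```python
from email.header import decode_header
from email.utils import parseaddr, formataddr

def decode_addr_header(raw, fallback='raw-unicode-escape'):
    """
    Split one address into (name, addr) and decode the name part only
    (hypothetical helper). The addr part is returned unchanged.
    """
    name, addr = parseaddr(raw)                  # quotes around name are dropped here
    if name:
        pieces = []
        for (part, enc) in decode_header(name):
            if isinstance(part, bytes):          # decode bytes, keep str as is
                part = part.decode(enc or fallback)
            pieces.append(part)
        name = ' '.join(pieces)
    return (name, addr)

if __name__ == '__main__':
    pair = decode_addr_header('"=?UTF-8?Q?Smith=2C_Bob?=" <bob@example.com>')
    print(pair)
    print(formataddr(pair))                      # re-join for display
```

Because parseaddr strips the quotes before the name reaches decode_header, the encoded substring is no longer followed by a quote character, which sidesteps the failure described earlier.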

Fields with multiple addresses (e.g., To) separate individual addresses by commas.
Since email names might embed commas, too, blindly splitting on commas to run each

