Programming Python, 4th Edition, by Mark Lutz (text version)

spam
Spam
SPAM!

This works for text, but watch what happens when we try to render a message part
with truly binary data, such as an image that could not be decoded as Unicode text:


>>> from email.message import Message # generic Message object
>>> m = Message()
>>> m['From'] = '[email protected]'
>>> bytes = open('monkeys.jpg', 'rb').read() # read binary bytes (not Unicode)
>>> m.set_payload(bytes) # we set the payload to bytes
>>> print(m)
Traceback (most recent call last):
  ...lines omitted...
  File "C:\Python31\lib\email\generator.py", line 155, in _handle_text
    raise TypeError('string payload expected: %s' % type(payload))
TypeError: string payload expected: <class 'bytes'>

>>> m.get_payload()[:20]
b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x01\x00x\x00x\x00\x00'

The problem here is that the email package’s text generator assumes that, by generation
time, the message’s payload is a str text string that has already been Base64 (or
similarly) encoded, not bytes. Really, the error is probably our fault in this case,
because we set the payload to raw bytes manually. We should use the MIMEImage MIME
subclass tailored for images; if we do, the email package internally performs Base64
MIME email encoding on the data when the message object is created. Unfortunately, it
still leaves the result as bytes, not str, despite the fact that the whole point of
Base64 is to change binary data to text (though the exact Unicode flavor this text
should take may be unclear). This leads to additional failures in Python 3.1:


>>> from email.mime.image import MIMEImage # Message subclass with hdrs+base64
>>> bytes = open('monkeys.jpg', 'rb').read() # read binary bytes again
>>> m = MIMEImage(bytes) # MIME class does Base64 on data
>>> print(m)
Traceback (most recent call last):
  ...lines omitted...
  File "C:\Python31\lib\email\generator.py", line 155, in _handle_text
    raise TypeError('string payload expected: %s' % type(payload))
TypeError: string payload expected: <class 'bytes'>

>>> m.get_payload()[:40] # this is already Base64 text
b'/9j/4AAQSkZJRgABAQEAeAB4AAD/2wBDAAIBAQIB'

>>> m.get_payload()[:40].decode('ascii') # but it's still bytes internally!
'/9j/4AAQSkZJRgABAQEAeAB4AAD/2wBDAAIBAQIB'

In other words, not only does the Python 3.1 email package not fully support the Python
3.X Unicode/bytes dichotomy, it was actually broken by it. Luckily, there’s a
workaround for this case.


To address this specific issue, I opted to create a custom encoding function for binary
MIME attachments, and pass it in to the email package’s MIME message object

