>>> s = b'A\xe4B'
>>> s.decode('latin1')
'AäB'
>>> from email.message import Message
>>> m = Message()
>>> m.set_payload(b'A\xe4B', charset='latin1') # or 'latin-1': see ahead
>>> t = m.as_string()
>>> print(t)
MIME-Version: 1.0
Content-Type: text/plain; charset="latin1"
Content-Transfer-Encoding: base64

QeRC
>>> m.get_content_charset()
'latin1'
Notice how email automatically applies Base64 MIME encoding to non-ASCII text
parts on generation, to conform to email standards. The same is true for the more
specific MIME text subclass of Message:
>>> from email.mime.text import MIMEText
>>> m = MIMEText(b'A\xe4B', _charset='latin1')
>>> t = m.as_string()
>>> print(t)
Content-Type: text/plain; charset="latin1"
MIME-Version: 1.0
Content-Transfer-Encoding: base64

QeRC
>>> m.get_content_charset()
'latin1'
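The body line in both outputs above is simply the Base64 form of the three raw bytes we
attached. As a minimal sketch, assuming only the standard library's base64 module (which
the sessions above do not use directly), we can reproduce that body text by hand and
reverse it:

```python
import base64

raw = b'A\xe4B'                        # the Latin-1 bytes attached in the sessions above
encoded = base64.b64encode(raw)        # Base64 maps each 3 bytes to 4 ASCII characters
print(encoded)                         # b'QeRC'

decoded = base64.b64decode(encoded)    # reversing the encoding restores the raw bytes
print(decoded == raw)                  # True
```

This only illustrates the encoding step. The email package also adds the headers that
tell a receiver which transfer encoding and charset were used.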
Now, if we parse this message’s text string with email, we get back a new Message whose
text payload is the Base64 MIME-encoded text used to represent the non-ASCII Unicode
string. Requesting MIME decoding for the payload with decode=1 returns the byte
string we originally attached:
>>> from email.parser import Parser
>>> q = Parser().parsestr(t)
>>> q
<email.message.Message object at 0x019ECA50>
>>> q.get_content_type()
'text/plain'
>>> q._payload
'QeRC\n'
>>> q.get_payload()
'QeRC\n'
>>> q.get_payload(decode=1)
b'A\xe4B'
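The same round trip can be written as a short script rather than an interactive session.
This is only a sketch of the steps shown so far. The variable names are illustrative, and
decode=True is used as the Boolean spelling of decode=1:

```python
from email.message import Message
from email.parser import Parser

original = b'A\xe4B'                          # non-ASCII Latin-1 bytes

msg = Message()
msg.set_payload(original, charset='latin1')   # adds MIME headers, Base64-encodes the body
text = msg.as_string()                        # full message text: headers, blank line, body

parsed = Parser().parsestr(text)              # parse the text back into a new Message
encoded_body = parsed.get_payload()           # still the Base64 text
raw_bytes = parsed.get_payload(decode=True)   # MIME decoding yields bytes, not str

print(repr(encoded_body))
print(raw_bytes == original)
```

Note that MIME decoding stops at bytes. Converting those bytes to a str is a separate
Unicode decoding step.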
However, running Unicode decoding on this byte string to convert to text fails if we
rely on the default encoding, UTF-8, which a bytes object's decode method uses in
Python 3 on every platform, including Windows. To be more accurate, and