Chapter 12 ■ Building and parsing e-Mail
Parsing Dates
Standards-compliant dates were produced in the scripts above by the formatdate() function in email.utils, which
uses the current date and time by default. It can also be given a low-level Unix timestamp. If you are
doing higher-level date manipulation and have generated a datetime object, simply use the format_datetime()
function instead to do the same kind of formatting.
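As a minimal sketch of both calls, the following formats a fixed Unix timestamp and a time zone-aware datetime. The particular timestamp, the usegmt=True flag, and the variable names are choices made for this illustration rather than anything taken from the earlier listings.

```python
import datetime
from email import utils

# formatdate() accepts an optional Unix timestamp; 0 is the Unix epoch.
# usegmt=True asks for the "GMT" spelling instead of "-0000".
epoch_header = utils.formatdate(0, usegmt=True)

# format_datetime() takes a datetime; an aware one keeps its UTC offset.
eastern = datetime.timezone(datetime.timedelta(hours=-4))
moment = datetime.datetime(2014, 3, 25, 17, 14, 1, tzinfo=eastern)
moment_header = utils.format_datetime(moment)

print(epoch_header)   # Thu, 01 Jan 1970 00:00:00 GMT
print(moment_header)  # Tue, 25 Mar 2014 17:14:01 -0400
```

Either string can be assigned directly to a message's Date header.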
When parsing an e-mail, you can perform the inverse operation through three other functions inside
email.utils.
• Both parsedate() and parsedate_tz() return time tuples of the sort that Python supports at
a low level through its time module following the old C-language conventions for doing date
arithmetic and representation.
• The modern parsedate_to_datetime() function instead returns a full datetime object, and it
is probably the call you will want to make in most production code.
Note that many e-mail programs fail to follow the relevant standards exactly when writing Date headers. Although
these routines try to be forgiving, there may be circumstances in which they cannot produce a valid date value:
parsedate() and parsedate_tz() return None instead, while parsedate_to_datetime() raises an exception (ValueError
in recent Python versions, TypeError in older ones). You will want to handle both outcomes before assuming that
you have been given back a date. A few example calls follow.
>>> from email import utils
>>> utils.parsedate('Tue, 25 Mar 2014 17:14:01 -0400')
(2014, 3, 25, 17, 14, 1, 0, 1, -1)
>>> utils.parsedate_tz('Tue, 25 Mar 2014 17:14:01 -0400')
(2014, 3, 25, 17, 14, 1, 0, 1, -1, -14400)
>>> utils.parsedate_to_datetime('Tue, 25 Mar 2014 17:14:01 -0400')
datetime.datetime(2014, 3, 25, 17, 14, 1,
                  tzinfo=datetime.timezone(datetime.timedelta(-1, 72000)))
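Because a malformed Date header is common in practice, it can help to wrap the parsing call defensively. The helper below is only a sketch: parse_date_header() is a hypothetical name, not part of email.utils, and it assumes that parsedate_to_datetime() may raise either TypeError or ValueError on bad input, depending on the Python version.

```python
from email import utils

def parse_date_header(value):
    """Return a datetime for a Date header value, or None if unparseable.

    Hypothetical helper for illustration; not part of email.utils.
    """
    # parsedate_tz() signals failure by returning None, so test it first.
    if utils.parsedate_tz(value) is None:
        return None
    try:
        return utils.parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Fields that split cleanly but are out of range can still raise.
        return None

good = parse_date_header('Tue, 25 Mar 2014 17:14:01 -0400')
bad = parse_date_header('garbage')

print(good.isoformat())  # 2014-03-25T17:14:01-04:00
print(bad)               # None
```

Callers can then test the result against None in a single place instead of repeating the error handling wherever a date is read.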
If you are going to do any arithmetic with dates, I strongly suggest that you investigate the third-party pytz
package, which has become a community best practice for date manipulation.
Summary
The powerful email.message.EmailMessage class introduced into Python 3.4 by R. David Murray makes both the
generation and consumption of MIME messages much more convenient than in previous versions of Python. As
usual, the only caution is to pay close attention to the distinction between bytes and strings. Try to do all of your
socket and file I/O as bytes, and let the email module do all of its own encoding so that every step is done correctly.
An e-mail is typically generated by instantiating EmailMessage and then specifying headers and content. Headers
are set by treating the message as a dictionary with case-insensitive string keys, whose string values will be
properly encoded upon output if any of their characters are non-ASCII. Content is set through a cascade of four
methods (set_content(), add_related(), add_alternative(), and add_attachment()) that handle both text and
bytes payloads correctly in all cases.
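The generation steps just summarized can be sketched in a few lines. The addresses, subject, body text, and attachment bytes here are made up for the illustration, and the block is a compact reminder rather than a replacement for the fuller listings earlier in the chapter.

```python
from email.message import EmailMessage

message = EmailMessage()

# Headers are set dictionary-style; keys are case-insensitive.
message['To'] = 'recipient@example.com'
message['From'] = 'Test Sender <sender@example.com>'
message['Subject'] = 'Böðvarr says hello'  # non-ASCII, encoded on output

# Content is layered on: plain text, an HTML alternative, an attachment.
message.set_content('Plain-text body.\n')
message.add_alternative('<p>HTML body.</p>', subtype='html')
message.add_attachment(b'\x00\x01\x02', maintype='application',
                       subtype='octet-stream', filename='data.bin')

# Let the email module do the encoding, then do I/O with the bytes.
wire_bytes = message.as_bytes()

print(message.get_content_type())  # multipart/mixed
print(message['subject'])          # Böðvarr says hello
```

The bytes returned by as_bytes() are what should be written to a file or socket, which keeps the encoding decisions inside the email module.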
An e-mail message can be read back in and examined as an EmailMessage object by running any of the email
module’s parsing functions (message_from_binary_file() is the approach used in the listings in this chapter) with a
policy argument turning on all of the modern features of the EmailMessage class. Each resulting object will either be a
multipart with further subparts inside of it or a bare piece of content that Python returns as a string or as bytes data.
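Reading a message back can be sketched the same way. The raw message below is invented for the example, io.BytesIO stands in for a real binary file, and extract_text() is a hypothetical helper name. The choice of policy.default is an assumption made for this sketch; any policy that enables the modern features behaves similarly.

```python
import io
from email import message_from_binary_file, policy

raw = (b'From: sender@example.com\r\n'
       b'To: recipient@example.com\r\n'
       b'Subject: Greetings\r\n'
       b'MIME-Version: 1.0\r\n'
       b'Content-Type: text/plain; charset="utf-8"\r\n'
       b'\r\n'
       b'Hello, world.\r\n')

# The policy argument turns on the modern EmailMessage behavior.
message = message_from_binary_file(io.BytesIO(raw), policy=policy.default)

def extract_text(message):
    """Hypothetical helper: list subpart types, or return bare content."""
    if message.is_multipart():
        # A multipart holds further subparts that can be visited in turn.
        return [part.get_content_type() for part in message.iter_parts()]
    # A bare text part comes back as str; binary parts come back as bytes.
    return message.get_content()

body = extract_text(message)
print(body.strip())  # Hello, world.
```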
Headers are automatically internationalized on output and decoded on input. The special Date header’s
format is supported by functions in email.utils that let your code both read and write its value using instances of the
modern Python datetime class.
The next chapter will look specifically at the use of the SMTP protocol for e-mail transmission.