Foundations of Python Network Programming

Chapter 13 ■ SMTP


This mechanism also helps support mailing lists, so that an e-mail whose To line says [email protected]
can actually be delivered, without rewritten headers, to the dozens or hundreds of people who subscribe to that list,
without exposing all of their e-mail addresses to every reader of the list.
So, as you read the following descriptions of SMTP, keep reminding yourself that the headers-plus-body that
make up the e-mail message itself are separate from the “envelope sender” and “envelope recipient” that will be
mentioned in the protocol descriptions. Yes, it is true that your e-mail client, whether you are using /usr/sbin/sendmail
or Thunderbird or Google Mail, probably asked you for the recipient’s e-mail address only once; but it
then proceeded to use it in two different places: once in the To header at the top of the message itself and then again
“outside” of the message when it spoke SMTP in order to send the e-mail on its way.
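
The split between envelope and headers can be made concrete with a minimal sketch built on the standard library's email package. The addresses and the build_submission helper are hypothetical, chosen only for illustration, and nothing is sent over the network.

```python
from email.message import EmailMessage


def build_submission(sender, list_address, subscribers, body):
    """Return (envelope_sender, envelope_recipients, message_text)."""
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = list_address        # what every reader sees in the header
    msg['Subject'] = 'Envelope versus headers'
    msg.set_content(body)
    # The envelope travels outside the message; it is what SMTP acts on.
    return sender, list(subscribers), msg.as_string()


envelope_from, envelope_to, text = build_submission(
    'author@example.com',                       # hypothetical sender
    'list@example.org',                         # hypothetical list address
    ['alice@example.net', 'bob@example.net'],   # hypothetical subscribers
    'Hello, list!',
)

# With a reachable server (hypothetical host name), the envelope would be
# handed to SMTP separately from the message text:
#     import smtplib
#     with smtplib.SMTP('mail.example.com') as server:
#         server.sendmail(envelope_from, envelope_to, text)
```

The subscriber addresses appear only in the envelope list, never in the message text, which is how a list can deliver to many people while the To header names only the list.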


Multiple Hops

Once upon a time, e-mail often traveled over only one SMTP “hop” from the mainframe on which it was
composed to the machine on whose disk the recipient’s inbox was stored. These days, messages often travel through
a half-dozen servers or more before reaching their destination. This means that the SMTP envelope recipient,
described in the previous section, repeatedly changes as the message nears its destination.
An example should make this clear. Several of the following details are fictitious, but they should give you a good
idea of how messages actually traverse the Internet.
Imagine a worker in the central IT organization at Georgia Tech who tells his friend that his e-mail address is
[email protected]. When the friend later sends him a message, the friend’s e-mail provider will look up the domain
gatech.edu in the Domain Name System (DNS; see Chapter 4), receive a series of MX records in reply, and connect
to one of the mail servers that those records name to deliver the message. Simple enough, right?
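
How a sender might choose among several MX records can be sketched as follows. Each MX record carries a preference number, and lower numbers are tried first, with ties broken at random. The records below are hypothetical stand-ins for a DNS reply; a real lookup would need a resolver library, which this sketch deliberately leaves out.

```python
import random


def order_mx(records, rng=random):
    """Order (preference, hostname) pairs: lowest preference first, ties shuffled."""
    shuffled = list(records)
    rng.shuffle(shuffled)   # randomize first, so that equal preferences end up in random order
    # sorted() is stable, so hosts sharing a preference keep their shuffled order.
    return [host for preference, host in sorted(shuffled, key=lambda r: r[0])]


# Hypothetical MX reply for an imaginary domain.
mx_records = [
    (20, 'mx2.example.edu'),
    (10, 'mx1.example.edu'),
    (20, 'mx3.example.edu'),
]
candidates = order_mx(mx_records)
```

The sender would then try each host in candidates in turn until one accepts the connection.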
But the server for gatech.edu serves an entire campus! To find out where brandon is, it consults a table, finds his
department, and learns that his official e-mail address is actually:


[email protected]


So the gatech.edu server in turn does a DNS lookup of oit.gatech.edu and then uses SMTP—the message’s
second SMTP hop, if you are counting—to send the message to the e-mail server for OIT, the Office of Information
Technology.
But OIT long ago abandoned the single Unix server that used to hold all of their e-mail.
Instead, they now run a sophisticated e-mail solution that users can access through webmail, POP, and IMAP.
Incoming e-mail arriving at oit.gatech.edu is first sent at random to one of several spam-filtering servers (third hop),
say the server named spam3.oit.gatech.edu. Then, if it survives the spam check and is not discarded, it is handed off
at random to one of eight redundant e-mail servers, and so after the fourth hop, the message is in the queue on
mail7.oit.gatech.edu.
The routing servers, like mail7, can then query a central directory service to determine which back-end mail
stores, connected to large disk arrays, host which users’ mailboxes. So mail7 does an LDAP lookup for
brandon.rhodes, concludes that his e-mail lives on the anvil.oit.gatech.edu server, and in a fifth and final SMTP hop, the
e-mail is delivered to anvil and is written to its redundant disk array.
That is why e-mail often takes at least a few seconds to traverse the Internet: large organizations and big ISPs tend
to have several levels of servers that a message must negotiate before its delivery.
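
The hop-by-hop forwarding described above can be modeled with a toy simulation. This is only a sketch: the host names, addresses, and the ROUTES table are hypothetical, and real servers consult DNS, directory services, and local configuration rather than a single dictionary.

```python
# Hypothetical forwarding tables. Each server maps an envelope recipient to
# (next server, envelope recipient to use on the next hop).
ROUTES = {
    'mx.example.edu': {
        'pat@example.edu': ('mx.dept.example.edu', 'pat.lee@dept.example.edu'),
    },
    'mx.dept.example.edu': {
        'pat.lee@dept.example.edu': ('store1.dept.example.edu', 'pat.lee@dept.example.edu'),
    },
    'store1.dept.example.edu': {},   # final mail store: no further forwarding
}


def trace(server, recipient, max_hops=10):
    """Return the (server, envelope recipient) pairs a message would visit."""
    hops = []
    for _ in range(max_hops):        # guard against accidental routing loops
        hops.append((server, recipient))
        next_hop = ROUTES.get(server, {}).get(recipient)
        if next_hop is None:         # no forwarding rule: deliver locally
            break
        server, recipient = next_hop
    return hops


hops = trace('mx.example.edu', 'pat@example.edu')
```

Each entry in hops pairs a server with the envelope recipient it was given, which makes it easy to see the address change between the first and second hop while the message itself stays untouched.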
How can you investigate an e-mail’s route? As emphasized previously, SMTP does not read
e-mail headers but has its own idea about where a message should be going, which, as you have just seen, can
change with every hop that a message takes toward its destination. But it turns out that e-mail servers are encouraged
to add new headers, precisely to keep track of a message’s circuitous route from its origin to its destination.
These headers are called Received headers, and they are a gold mine for confused system administrators trying
to debug problems with their e-mail systems. Take a look at any e-mail message and ask your e-mail client to display
all of the headers. You should be able to see every step that the message took toward its destination. (Spammers often
write several fictitious Received headers at the top of their messages to make it look like the message originated from
a reputable organization.) Finally, there is probably a Delivered-to header that is written when the last server in the
chain is finally able to write the message triumphantly to physical storage in someone’s mailbox.
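
Reading the Received headers programmatically takes only a few lines with the standard library's email package. The message below is a made-up example with hypothetical host names, and the received_chain helper is an illustrative name rather than part of any library.

```python
from email import message_from_string

# A made-up message; real Received headers carry more detail than this.
RAW = """\
Delivered-To: pat@example.edu
Received: from mx.dept.example.edu by store1.dept.example.edu; Tue, 1 Jan 2030 10:00:03 +0000
Received: from mx.example.edu by mx.dept.example.edu; Tue, 1 Jan 2030 10:00:02 +0000
Received: from sender.example.com by mx.example.edu; Tue, 1 Jan 2030 10:00:01 +0000
From: friend@example.com
To: pat@example.edu
Subject: Hello

Hi!
"""


def received_chain(raw_message):
    """Return the Received headers ordered from first hop to last."""
    msg = message_from_string(raw_message)
    # Each server adds its header at the top, so the oldest hop comes last
    # in the message; reverse the list to read the route in travel order.
    return list(reversed(msg.get_all('Received', [])))


chain = received_chain(RAW)
for number, header in enumerate(chain, start=1):
    print(number, header)
```

Because anyone can write headers into a message before sending it, only the entries added by servers you trust should be treated as reliable evidence of the route.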
