Some protocols may define the contents of messages sent over sockets; others may
specify the sequence of control messages exchanged during conversations. By defining
regular patterns of communication, protocols make communication more robust. They
can also minimize deadlock conditions—machines waiting for messages that never
arrive.
For example, FTP prevents deadlock by conversing over two sockets: one
for control messages only and one to transfer file data. An FTP server listens for control
messages (e.g., “send me a file”) on one port, and transfers file data over another. FTP
clients open socket connections to the server machine’s control port, send requests,
and send or receive file data over a socket connected to a data port on the server
machine. FTP also defines standard message structures passed between client and server.
The control message used to request a file, for instance, must follow a standard format.
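To make the idea of a standard message format concrete, here is a minimal sketch of how FTP control messages are laid out on the wire: a command verb, an optional argument, and a carriage-return/line-feed terminator, with the server answering in lines that begin with a three-digit code. The helper names format_command and parse_reply are hypothetical, invented for this illustration, and the reply text shown is only a typical example; real servers word their replies differently.

```python
# Sketch of FTP control-channel message formats.
# format_command and parse_reply are hypothetical helper names,
# not part of any library.

def format_command(verb, argument=None):
    """Build one control message: verb, optional argument, CRLF terminator."""
    line = verb if argument is None else f"{verb} {argument}"
    return line.encode("ascii") + b"\r\n"

def parse_reply(line):
    """Split a single-line server reply into (numeric code, text)."""
    text = line.decode("ascii").rstrip("\r\n")
    return int(text[:3]), text[4:]

# The "send me a file" request is the RETR command
request = format_command("RETR", "readme.txt")

# An illustrative reply; the wording after the code varies by server
reply = parse_reply(b"226 Transfer complete\r\n")
```

Messages like these travel only over the control socket; the file's bytes themselves move over the separate data socket.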
Python’s Internet Library Modules
If all of this sounds horribly complex, cheer up: Python’s standard protocol modules
handle all the details. For example, the Python library’s ftplib module manages all the
socket and message-level handshaking implied by the FTP protocol. Scripts that import
ftplib have access to a much higher-level interface for FTPing files and can be largely
ignorant of both the underlying FTP protocol and the sockets over which it runs.‡
In fact, each supported protocol is represented in Python’s standard library either by a
module package of the same name as the protocol or by a module file with a name of
the form xxxlib.py, where xxx is replaced by the protocol’s name. The last column in
Table 12-1 gives the module name for some standard protocol modules. For instance,
FTP is supported by the module file ftplib.py and HTTP by package http.*. Moreover,
within the protocol modules, the top-level interface object is often the name of the
protocol. So, for instance, to start an FTP session in a Python script, you run import
ftplib and pass appropriate parameters in a call to ftplib.FTP; for Telnet, create a
telnetlib.Telnet instance.
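As a rough illustration of the higher-level interface, the following sketch downloads a single file with ftplib.FTP. The function name fetch_file, its parameters, and the server name in the comment are assumptions made for this example, and error handling is omitted for brevity.

```python
from ftplib import FTP

def fetch_file(host, remote_name, local_path, user="anonymous", passwd=""):
    """Download one file in binary mode over FTP (hypothetical helper)."""
    with FTP(host) as connection:            # connects to the server's control port
        connection.login(user, passwd)
        with open(local_path, "wb") as local_file:
            # RETR is the control message that requests a file; ftplib
            # handles the data connection and calls local_file.write
            # with each block of bytes received
            connection.retrbinary("RETR " + remote_name, local_file.write)

# Hypothetical usage; the host name below is a placeholder, not a real server:
# fetch_file("ftp.example.com", "readme.txt", "readme.txt")
```

Notice that the script never creates a socket itself: both the control and data connections are managed inside the FTP object.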
In addition to the protocol implementation modules in Table 12-1, Python’s standard
library also contains modules for fetching replies from web servers for a web page
request (urllib.request), parsing and handling data once it has been transferred over
sockets or protocols (html.parser, the email.* and xml.* packages), and more.
Table 12-2 lists some of the more commonly used modules in this category.
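To give a flavor of the data-handling modules, here is a small sketch that uses html.parser to pull hyperlink targets out of a page. The LinkCollector class name and the sample page text are invented for this illustration; in a live script the text might instead come from a call such as urllib.request.urlopen(url).read().decode(), which is left out here so the example runs without a network connection.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)

# Stand-in for text fetched from a web server
page = ('<html><body><a href="https://www.python.org/">Python</a> '
        '<a href="/docs">Docs</a></body></html>')

collector = LinkCollector()
collector.feed(page)
collector.close()
links = collector.links
```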
‡ Since Python is an open source system, you can read the source code of the ftplib module if you are curious
about how the underlying protocol actually works. See the ftplib.py file in the standard library source directory
on your machine. Its code is complex (since it must format messages and manage two sockets), but with the
other standard Internet protocol modules, it is a good example of low-level socket programming.