structure to discussions carried out over sockets. Let’s briefly look at each of these layers
in the abstract before jumping into programming details.
The Socket Layer
In simple terms, sockets are a programmable interface to connections between programs,
possibly running on different computers on a network. They allow data formatted
as byte strings to be passed between processes and machines. Sockets also form the
basis and low-level “plumbing” of the Internet itself: all of the familiar higher-level Net
protocols and services, like FTP, web pages, and email, ultimately run over sockets. Sockets are
also sometimes called communications endpoints because they are the portals through
which programs send and receive bytes during a conversation.
Although often used for network conversations, sockets may also be used to communicate
between programs running on the same computer, taking the
form of a general Inter-Process Communication (IPC) mechanism. We saw this socket
usage mode briefly in Chapter 5. Unlike some IPC devices, sockets are bidirectional
data streams: programs may both send and receive data through them.
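As a minimal sketch of this bidirectional behavior on a single machine, the following uses the standard library's socket.socketpair call to create two already-connected sockets inside one process. The variable names and message bytes are invented for illustration only; real programs would normally hold each end in a different process.

```python
import socket

# socketpair() returns two sockets that are already connected to each other
left, right = socket.socketpair()

# Data flows in one direction...
left.sendall(b'ping')
request = right.recv(1024)

# ...and back through the same pair of endpoints in the other direction
right.sendall(b'pong')
reply = left.recv(1024)

left.close()
right.close()
print(request, reply)
```

Each end both sends and receives here, which is what distinguishes sockets from one-way devices such as simple pipes.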
To programmers, sockets take the form of a handful of calls available in a library. These
socket calls know how to send bytes between machines, using lower-level operations
such as TCP, the Transmission Control Protocol. At the bottom, TCP knows
how to transfer bytes, but it doesn’t care what those bytes mean. For the purposes of
this text, we will generally ignore how bytes sent to sockets are physically transferred.
To understand sockets fully, though, we need to know a bit about how computers are
named.
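To make the "handful of calls" concrete, here is a rough sketch of one TCP conversation using Python's standard socket module. It is not a listing from later in this text. The helper name serve_once, the loopback address 127.0.0.1, the use of port 0 to let the operating system pick a free port, and the message contents are all assumptions made for this illustration. A thread stands in for what would normally be a separate server program.

```python
import socket
import threading

def serve_once(listener):
    """Hypothetical helper: accept one client and echo its bytes back."""
    conn, _addr = listener.accept()
    with conn:
        data = conn.recv(1024)           # bytes arrive with no inherent meaning
        conn.sendall(b'Echo: ' + data)   # the programs decide what they mean

# Server side: create, bind, and listen
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))          # port 0: let the OS choose a free port
listener.listen(1)
host, port = listener.getsockname()      # the pair a client must supply

worker = threading.Thread(target=serve_once, args=(listener,))
worker.start()

# Client side: create, connect, send, and receive
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect((host, port))
client.sendall(b'hello sockets')
reply = client.recv(1024)
client.close()

worker.join()
listener.close()
print(reply)
```

Notice that the client reaches the server by naming a machine and a port number; how those identifiers work is the next topic.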
Machine identifiers
Suppose for just a moment that you wish to have a telephone conversation with someone
halfway across the world. In the real world, you would probably need either that
person’s telephone number or a directory that you could use to look up the number
from her name (e.g., a telephone book). The same is true on the Internet: before a script
can have a conversation with another computer somewhere in cyberspace, it must first
know that other computer’s number or name.
Luckily, the Internet defines standard ways to name both a remote machine and a
service provided by that machine. Within a script, the computer program to be contacted
through a socket is identified by supplying a pair of values, the machine name
and a specific port number on that machine:
Machine names
A machine name may take the form of either a string of numbers separated by dots,
called an IP address (e.g., 166.93.218.100), or a more legible form known as a
domain name (e.g., starship.python.net). Domain names are automatically mapped
into their dotted numeric address equivalent when used, by something called a