Chapter 3 ■ TCP
One question to ask, though, is whether a client might want to open a TCP connection and then use it over
several minutes or hours to make many separate requests to the same server. Once the connection is going and the
cost of the handshake has been paid, each actual request and response will require only a single packet in each
direction, which will benefit from all of TCP’s intelligence about retransmission, exponential backoff, and flow control.
Where UDP really shines, then, is where a long-term relationship will not exist between client and server,
especially where there are so many clients that a typical TCP implementation would run out of memory if it had to
keep up with a separate data stream for each active client.
The second situation where TCP is inappropriate is when an application can do something much smarter than
simply retransmit data when a packet has been lost. Imagine an audio chat conversation, for example. If a second’s
worth of data is lost because of a dropped packet, then it will do little good simply to resend that same second of
audio, over and over, until it finally arrives. Instead, the client should just fill that awkward second with whatever
audio it can piece together from the packets that did arrive (a clever audio protocol will begin and end each packet
with a bit of heavily compressed audio from the preceding and following moments of time to cover exactly this
situation) and then keep going after the interruption as though it did not occur. This is impossible with TCP, which
will keep stubbornly retransmitting the lost information even when it is far too old to be of any use. UDP datagrams
are often the foundation of live-streaming multimedia over the Internet.
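To make the idea concrete, here is a minimal sketch of a receiver that papers over lost audio rather than waiting for it. The function name conceal_gaps(), the sequence-numbered frames, and the fixed frame size are all assumptions invented for this illustration. It also uses a much cruder strategy than the overlapping compressed audio described above: a missing frame is simply replaced by a repeat of the previous one, or by silence if nothing has arrived yet.

```python
def conceal_gaps(frames, total, frame_size):
    """Rebuild an audio stream from the frames that actually arrived.

    frames maps a sequence number to the audio bytes carried by that
    datagram; sequence numbers missing from the dict were lost in transit.
    (Hypothetical helper, not part of any real audio protocol.)
    """
    last = bytes(frame_size)          # silence, until a real frame arrives
    output = []
    for seq in range(total):
        frame = frames.get(seq)
        if frame is None:
            frame = last              # lost: repeat the previous frame
        output.append(frame)
        last = frame
    return b''.join(output)


# Frame 2 never arrived, so playback repeats frame 1 and then carries on.
arrived = {0: b'AA', 1: b'BB', 3: b'DD'}
playback = conceal_gaps(arrived, total=4, frame_size=2)
```

The point is only that the application, not the transport, decides what to do about the gap. A TCP stream would never hand frame 3 to the program until frame 2 had been retransmitted successfully.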
What TCP Sockets Mean
As was the case with UDP in Chapter 2, TCP uses port numbers to distinguish different applications running at the
same IP address, and it follows exactly the same conventions regarding well-known and ephemeral port numbers.
Reread the section “Port Numbers” in that chapter if you want to review the details.
As you saw in the previous chapter, it takes only a single socket to speak UDP: a server can open a UDP port and
then receive datagrams from thousands of different clients. While it is certainly possible to connect() a datagram
socket to a particular peer so that the socket will always send() to only that peer and recv() packets sent back
from that peer, the idea of a connection is just a convenience. The effect of connect() is exactly the same as your
application simply deciding, on its own, to send to only one address with sendto() calls and then ignore responses
from any but that same address.
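The two styles can be compared side by side in a short sketch that runs entirely over the loopback interface. The payloads and variable names here are made up for the illustration, and the timeouts are only there so that the script cannot hang if a datagram goes missing.

```python
import socket

# A throwaway UDP "server" on loopback; port 0 lets the OS choose a port.
server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(('127.0.0.1', 0))
server.settimeout(2.0)
server_address = server.getsockname()

# Style 1: an unconnected socket names its destination on every call.
plain = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
plain.sendto(b'hello via sendto()', server_address)

# Style 2: connect() merely records the peer address locally. No packet
# crosses the network, so there is nothing for the server to notice.
connected = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
connected.settimeout(2.0)
connected.connect(server_address)
remembered_peer = connected.getpeername()
connected.send(b'hello via send()')

# From the server's point of view, both are ordinary datagrams.
received = sorted(server.recvfrom(1024)[0] for _ in range(2))

# A reply from the remembered peer can be read with a plain recv().
server.sendto(b'reply', connected.getsockname())
reply = connected.recv(1024)

for sock in (server, plain, connected):
    sock.close()
```

Note that connect() succeeds here without the server doing anything at all, which is exactly why it cannot tell you whether anyone is listening at the other end.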
But with a stateful stream protocol like TCP, the connect() call becomes the opening step upon which all further
network communication hinges. It is the moment when your operating system’s network stack kicks off the handshake
protocol described in the previous section that, if successful, will make both ends of the TCP stream ready for use.
This means that a TCP connect(), unlike the same call on a UDP socket, can fail. The remote host might not
answer, or it might refuse the connection. Or more obscure protocol errors might occur, like the immediate receipt of
a RST (“reset”) packet. Because a stream connection involves setting up a persistent connection between two hosts,
the other host needs to be listening and ready to accept your connection.
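A small experiment shows what such a failure looks like from Python. In this sketch, the helper try_connect() and its return strings are invented for the illustration. To get an address where nobody is listening, the script binds a TCP socket to a loopback port but deliberately never calls listen() on it, so the operating system should turn away any connection attempt.

```python
import socket

# Claim a loopback port but never call listen(), so nothing is ready
# to accept connections there.
not_listening = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
not_listening.bind(('127.0.0.1', 0))
address = not_listening.getsockname()


def try_connect(address, timeout=2.0):
    """Attempt a TCP connection and describe how it went (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect(address)
    except ConnectionRefusedError:
        return 'refused'              # the remote end answered with a reset
    except socket.timeout:
        return 'no answer'            # nothing came back before the timeout
    except OSError as e:
        return 'failed: {}'.format(e) # unreachable network, and so on
    else:
        return 'connected'
    finally:
        sock.close()


result = try_connect(address)
not_listening.close()
```

Production code would normally let these exceptions propagate or log them; collapsing them into strings simply makes the possible outcomes easy to see.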
On the “server side” (which, by definition, is the conversation partner not doing the connect() call but receiving
the SYN packet that the connect() call initiates), an incoming connection generates an even more momentous event
for a Python application: the creation of a new socket! This is because the standard POSIX interface to TCP actually
involves two completely different kinds of sockets: “passive” listening sockets and active “connected” ones.
• The passive socket or listening socket maintains the “socket name”—the address and port
number—at which the server is ready to receive connections. No data can ever be received or
sent by this kind of socket. It does not represent any actual network conversation. Instead, it is
how the server alerts the operating system to its willingness to receive incoming connections
at a given TCP port number in the first place.
• An active, connected socket is bound to one particular remote conversation partner with a
particular IP address and port number. It can be used only for talking back and forth with that
one partner, and it can be read and written to without worrying about how the resulting data
will be split up into packets. The stream looks so much like a pipe or file that, on Unix systems,
a connected TCP socket can be passed to another program that expects to read from a normal
file, and that program will never even know that it is talking over the network.
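The two kinds of socket can be seen together in a brief loopback sketch. The variable names and the two lines of payload are assumptions made for the illustration. Both ends live in one script only for brevity, which works because the operating system completes the handshake and queues the connection before accept() is called.

```python
import socket

# Passive socket: it holds the server's name and queues incoming connections.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))       # port 0 lets the OS pick a free port
listener.listen(1)
server_name = listener.getsockname()

# The client's connect() triggers the handshake.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(server_name)
client_name = client.getsockname()

# accept() hands back a brand-new active socket plus the peer's address.
conn, peer = listener.accept()
conn_local = conn.getsockname()       # same local name as the listener
conn_peer = conn.getpeername()        # the one partner it can talk to

client.sendall(b'first line\nsecond line\n')
client.close()                        # end of stream for the reader

# The connected socket can be wrapped and read like an ordinary file.
with conn.makefile('rb') as stream:
    lines = stream.readlines()

conn.close()
listener.close()
```

The listening socket never carries any of the data itself. It could go on to accept() further clients, and each of them would receive a connected socket of its own.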