Foundations of Python Network Programming

Chapter 3 ■ TCP


But you will see a big difference if you bring up the server, run the client against it, and then try killing and
rerunning the server. When the server starts back up, you will get an error:


$ python tcp_sixteen.py server
Traceback (most recent call last):
...
OSError: [Errno 98] Address already in use


How mysterious! Why would a bind() that can be repeated over and over again suddenly become
impossible merely because a client has connected? If you keep trying to run the server without the SO_REUSEADDR
option, you will find that the address does not become available again until several minutes after your last client
connection.
The reason for this restriction is extreme caution on the part of your operating system’s network stack. A server
socket that is merely listening can immediately be shut down and forgotten. But a connected TCP socket, which is
actually talking to a client, cannot immediately disappear even though both client and server may have closed their
connection and sent FIN packets in each direction. Why? Because even after the network stack sends the last packet
shutting the socket down, it can never be sure that the packet was received. If that packet happens to have been
dropped by the network, then the remote end might at any moment wonder what is taking the last packet so long and
retransmit its FIN packet in the hope of finally receiving an answer.
A reliable protocol like TCP obviously has to have some point like this where it stops talking; some final packet
must, logically, be left hanging with no acknowledgment, or systems would have to commit to an endless exchange
of “Okay, we both agree that we are all done, right?” messages until the machines were finally powered off. Yet even
the final packet might get lost and need to be retransmitted a few times before the other end finally receives it. What
is the solution?
The answer is that once a TCP connection is finally closed from the point of view of your application,
the operating system’s network stack actually keeps a record of it around for up to four minutes in a waiting state. The
RFC names this state TIME-WAIT. While the closed socket is still in this state, any
final FIN packets can be properly replied to. If the TCP implementation were just to forget about the connection, then
it could not reply to the FIN with a proper ACK.
So, a server that tries claiming a port on which a live connection was running within the last few minutes
is really trying to claim a port that is in some sense still in use. That is why you get an error if you try to
bind() to that address. By specifying the socket option SO_REUSEADDR, you are indicating that your application is
okay with owning a port whose old connections might still be shutting down out on some client on the network.
In practice, I always use SO_REUSEADDR when writing server code and have never had any problems.
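
The experiment described above can be approximated in a single process. The following is only a minimal sketch and not the tcp_sixteen.py listing itself: the helper names make_listener() and rebind_after_connection() are hypothetical, the client and server both live on the loopback interface, and the server deliberately closes its end first so that its side of the connection is the one left lingering. Whether the second bind() fails without SO_REUSEADDR depends on the operating system, so the script simply reports what happened instead of assuming a result.

```python
import errno
import socket

def make_listener(host, port, reuse):
    """Return a listening TCP socket bound to (host, port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        if reuse:
            # The option has to be set before bind() to have any effect.
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind((host, port))
        sock.listen(1)
    except OSError:
        sock.close()
        raise
    return sock

def rebind_after_connection(reuse):
    """Serve one loopback client, then try to claim the same port again.

    Returns None if the second bind() succeeds, otherwise the errno raised.
    """
    listener = make_listener('127.0.0.1', 0, reuse)  # port 0: the OS picks one
    port = listener.getsockname()[1]
    client = socket.create_connection(('127.0.0.1', port))
    conn, _ = listener.accept()
    conn.close()      # the server closes first, so its end is the one that lingers
    client.recv(1)    # returns b'' once the server's FIN has arrived
    client.close()
    listener.close()
    try:
        # A "restarted server" trying to claim the port it just gave up.
        make_listener('127.0.0.1', port, reuse).close()
    except OSError as e:
        return e.errno
    return None

print('with SO_REUSEADDR:   ', rebind_after_connection(True))
result = rebind_after_connection(False)
print('without SO_REUSEADDR:', errno.errorcode.get(result, result))
```

Note that the option is set on the first listener as well as on the second. Setting it on every run of a server, as a real program would, is the safest assumption, because some network stacks only allow the rebind when the lingering connection's socket had the option too.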


Binding to Interfaces


As was explained in Chapter 2 when I discussed UDP, the IP address that you pair with a port number when you
perform a bind() operation tells the operating system which network interfaces you are willing to
receive connections from. The example invocations of Listing 3-1 used the local IP address 127.0.0.1, which protects your
code from connections originating on other machines.
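
To see what address a bind() actually claims, you can ask the socket itself with getsockname(). The sketch below is illustrative only: bound_address() is a hypothetical helper, port 0 is used so that the operating system picks any free port, and the empty-string host is an addition for contrast that does not appear in the passage above. Python's socket module treats the empty string as the IPv4 wildcard address, meaning every interface on the machine.

```python
import socket

def bound_address(host):
    """Bind a TCP socket to (host, 0) and return the address the OS reports."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))   # port 0 asks the OS for any free port
        sock.listen(1)
        return sock.getsockname()

loopback_ip, _ = bound_address('127.0.0.1')  # reachable from this machine only
wildcard_ip, _ = bound_address('')           # reachable through every interface
print('loopback only :', loopback_ip)
print('all interfaces:', wildcard_ip)
```

A listener bound the first way can be reached only by clients running on the same machine, which is why it is a reasonable default while experimenting.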
