Foundations of Python Network Programming

Chapter 5 ■ Network Data and Network Errors

Note one nicety: because data flows over this socket in only one direction, the client and server each go ahead and
shut down communication in the direction they do not plan on using. This prevents any accidental use of the socket
in the other direction, use that could eventually queue up enough unread data to produce a deadlock, as you saw in
Listing 3-2 in Chapter 3. Strictly speaking, only one of the two, the client or the server, needs to call shutdown() on
the socket, but doing so from both ends provides both symmetry and redundancy.
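
As a rough illustration of this one-way pattern, the following sketch uses socket.socketpair() to stand in for a real client and server connection. The function name stream_one_way() is hypothetical and is not one of the book's listings. Because nothing reads until the sender has finished, it is suitable only for payloads small enough to fit in the socket buffer.

```python
import socket

def stream_one_way(chunks):
    """Send chunks from one end of a socket pair; return what the other end read."""
    sender, receiver = socket.socketpair()
    with sender, receiver:
        sender.shutdown(socket.SHUT_RD)    # the sender will never read
        receiver.shutdown(socket.SHUT_WR)  # the receiver will never write
        for chunk in chunks:
            sender.sendall(chunk)          # small payloads only: nobody is reading yet
        sender.close()                     # closing delivers end-of-file to the receiver
        received = b''
        while True:
            more = receiver.recv(4096)
            if not more:                   # empty bytes means the sender is done
                break
            received += more
    return received
```

In a real program the two ends would live in separate processes, and the receiver would be reading while the sender is still writing.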
A second pattern is a variant on the first: streaming in both directions. The socket is initially left open in both
directions. First, data is streamed in one direction—exactly as shown in Listing 5-1—and then that one direction is
shut down. Second, data is then streamed in the other direction, and the socket is finally closed. Again, Listing 3-2
from Chapter 3 illustrates an important warning: always finish the data transfer in one direction before turning around
to stream data back in the other, or you could produce a client and server that are deadlocked.
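
The two phases can be sketched as follows, again assuming a socket.socketpair() and a thread in place of a real network server. The names read_until_eof(), upper_server(), and exchange() are made up for this example, and the upper-casing "service" is only a placeholder for real work.

```python
import socket
import threading

def read_until_eof(sock):
    """Collect everything the peer sends until it shuts down its sending side."""
    chunks = []
    while True:
        more = sock.recv(4096)
        if not more:
            break
        chunks.append(more)
    return b''.join(chunks)

def upper_server(sock):
    """Hypothetical server: read the whole request, then stream back a reply."""
    with sock:
        request = read_until_eof(sock)   # phase 1: client to server
        sock.sendall(request.upper())    # phase 2: server to client
        # leaving the with block closes the socket, which ends the reply

def exchange(message):
    """Send message, half-close, then read the complete reply."""
    client, server = socket.socketpair()
    worker = threading.Thread(target=upper_server, args=(server,))
    worker.start()
    with client:
        client.sendall(message)
        client.shutdown(socket.SHUT_WR)  # finished sending; reading is still allowed
        reply = read_until_eof(client)
    worker.join()
    return reply
```

The call to shutdown(socket.SHUT_WR) is what lets the server see end-of-file on the request while the connection stays open for the reply.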
A third pattern, also illustrated in Chapter 3 (Listing 3-1), is to use fixed-length messages. You can use the
Python sendall() method to transmit your byte string and then use a recv() loop of your own devising, such as the
following, to make sure that you receive the whole message.


def recvall(sock, length):
    data = b''  # recv() returns bytes, so accumulate into a bytes object
    while len(data) < length:
        # Ask only for the bytes still missing, never more.
        more = sock.recv(length - len(data))
        if not more:
            raise EOFError('socket closed {} bytes into a {}-byte'
                           ' message'.format(len(data), length))
        data += more
    return data


Fixed-length messages are a bit rare since so little data these days seems to fit within static boundaries. However,
when transmitting binary data in particular (think of a struct format that always produces data blocks of the same
length, for example), fixed-length framing can be a good fit.
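
To make the struct case concrete, here is a minimal sketch. The record layout (an integer ID, a short code, and a float) and the names RECORD, send_record(), and recv_record() are assumptions chosen for illustration; the recvall() helper simply repeats the loop shown above so that the example runs on its own.

```python
import socket
import struct

# Hypothetical record layout: unsigned 32-bit ID, unsigned 16-bit code, and a
# 64-bit float, in network byte order. Every packed record is exactly 14 bytes.
RECORD = struct.Struct('!IHd')

def recvall(sock, length):
    """Receive exactly length bytes or raise EOFError."""
    data = b''
    while len(data) < length:
        more = sock.recv(length - len(data))
        if not more:
            raise EOFError('socket closed {} bytes into a {}-byte'
                           ' message'.format(len(data), length))
        data += more
    return data

def send_record(sock, ident, code, value):
    sock.sendall(RECORD.pack(ident, code, value))

def recv_record(sock):
    # The receiver knows the size in advance, so no delimiter is needed.
    return RECORD.unpack(recvall(sock, RECORD.size))
```

Because both ends agree on RECORD.size ahead of time, back-to-back records can be pulled off the stream one after another without any further framing.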
A fourth pattern is to delimit your messages with special characters. The receiver would wait in a
recv() loop like the one just shown but not exit the loop until the reply string it was accumulating finally contained
the delimiter indicating the end-of-message. If the bytes or characters in the message are guaranteed to fall within
some limited range, then the obvious choice is to end each message with a symbol chosen from outside that range. If
you were sending ASCII strings, for example, you might choose the null character '\0' as the delimiter or a character
entirely outside the range of ASCII like '\xff'.
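
A delimiter-driven receive loop might look something like the following sketch, which assumes messages that never contain the null byte. The generator name recv_messages() is hypothetical. Note that one recv() call may return several messages at once or only part of one, so any bytes left over after a delimiter are kept for the next message.

```python
import socket

DELIMITER = b'\0'  # assumed never to occur inside a message

def recv_messages(sock, delimiter=DELIMITER):
    """Yield each delimited message from sock until the peer closes."""
    buffer = b''
    while True:
        more = sock.recv(4096)
        if not more:
            if buffer:
                raise EOFError('socket closed in the middle of a message')
            return
        buffer += more
        # Hand back every complete message that has arrived so far.
        while delimiter in buffer:
            message, buffer = buffer.split(delimiter, 1)
            yield message
```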
If instead the message can include arbitrary data, then using a delimiter is a problem: what if the character you are
trying to use as the delimiter turns up as part of the data? The answer, of course, is quoting—just like having to represent
a single-quote character as \' in the middle of a Python string that is itself delimited by single-quote characters.


'All\'s well that ends well.'


Nevertheless, I recommend using a delimiter scheme only where your message alphabet is constrained; it is
usually too much trouble to implement correct quoting and unquoting if you have to handle arbitrary data. For one
thing, your test for whether the delimiter has arrived now has to make sure that you are not confusing a quoted delimiter
for a real one that actually ends the message. A second complexity is that you then have to make a pass over the message
to remove the quote characters that were protecting literal occurrences of the delimiter. Finally, it means that message
length cannot be measured until you have performed decoding; a message of length 400 could be 400 symbols long, or
it could be 200 instances of the delimiter accompanied by the quoting character, or anything in between.
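
The sketch below shows one possible quoting scheme so that you can judge the extra work for yourself. It assumes a null-byte delimiter and a backslash as the quoting character; the names quote(), frame(), and split_frame() are invented for this example and do not come from any standard library.

```python
DELIMITER = b'\0'
ESCAPE = b'\\'

def quote(payload):
    """Protect literal escape and delimiter bytes inside payload."""
    payload = payload.replace(ESCAPE, ESCAPE + ESCAPE)   # escapes first
    return payload.replace(DELIMITER, ESCAPE + DELIMITER)

def frame(payload):
    """Return the bytes to put on the wire for one message."""
    return quote(payload) + DELIMITER

def split_frame(buffer):
    """Return (payload, rest) for the first complete message, or (None, buffer)."""
    payload = bytearray()
    i = 0
    while i < len(buffer):
        byte = buffer[i:i + 1]
        if byte == ESCAPE:
            if i + 1 >= len(buffer):
                break                       # the quoted byte has not arrived yet
            payload += buffer[i + 1:i + 2]  # take the next byte literally
            i += 2
        elif byte == DELIMITER:
            return bytes(payload), buffer[i + 1:]  # an unquoted, real delimiter
        else:
            payload += byte
            i += 1
    return None, buffer                     # no complete message yet
```

Under this scheme, quote(b'\0' * 200) and quote(b'x' * 400) both produce 400 bytes, which is exactly the ambiguity about message length described above.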
A fifth pattern is to prefix each message with its length. This is a popular choice for high-performance protocols
since blocks of binary data can be sent verbatim without having to be analyzed, quoted, or interpolated. Of course,
the length itself has to be framed using one of the techniques given previously—often the length is given as a simple
fixed-width binary integer or else a variable-length decimal string followed by a textual delimiter. Either way, once the
length has been read and decoded, the receiver can enter a loop and call recv() repeatedly until the whole message
has arrived. The loop can look exactly like the one in Listing 3-1, but with a length variable in place of the number 16.
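
As a minimal sketch of the fixed-width variant, the following code assumes a 4-byte unsigned length in network byte order. The names HEADER, send_block(), and recv_block() are hypothetical, and recvall() repeats the loop shown earlier so that the example is self-contained.

```python
import socket
import struct

HEADER = struct.Struct('!I')  # assumed prefix: 4-byte unsigned length

def recvall(sock, length):
    """Receive exactly length bytes or raise EOFError."""
    data = b''
    while len(data) < length:
        more = sock.recv(length - len(data))
        if not more:
            raise EOFError('socket closed {} bytes into a {}-byte'
                           ' message'.format(len(data), length))
        data += more
    return data

def send_block(sock, payload):
    # The payload goes out verbatim; only its length is encoded.
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_block(sock):
    (length,) = HEADER.unpack(recvall(sock, HEADER.size))
    return recvall(sock, length)
```

A 4-byte prefix limits each block to just under 4 GiB. A real protocol would probably also want to reject absurdly large lengths before trying to read them.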
