Chapter 5 ■ Network Data and Network Errors
In summary, here is my advice for preparing binary data for transmission across a network socket:
• Use the struct module to produce binary data for transmission on the network and to unpack
it upon arrival.
• Select network byte order with the '!' prefix if you control the data format.
• If someone else has designed the protocol and specified little-endian, then you will have to
use '<' instead.
Always test struct to see how it lays out your data compared to the specification for the protocol you are
speaking; note that 'x' characters in the packing format string can be used to insert padding bytes.
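As a quick illustration of this advice, here is a minimal sketch that packs the same two values three ways. The record layout (a 16-bit message type followed by a 32-bit length) and the variable names are hypothetical, chosen only for the example; they do not come from any real protocol.

```python
import struct

# Hypothetical record: a 16-bit message type followed by a 32-bit length.
msg_type, length = 7, 4253

big = struct.pack('!HI', msg_type, length)       # network (big-endian) order
little = struct.pack('<HI', msg_type, length)    # little-endian order
padded = struct.pack('!HxxI', msg_type, length)  # two pad bytes after the type

print(big.hex())     # 00070000109d
print(little.hex())  # 07009d100000
print(padded.hex())  # 000700000000109d

# Unpacking with the same format skips the pad bytes again.
print(struct.unpack('!HxxI', padded))  # (7, 4253)
```

Printing the hex form of each result like this makes it easy to compare the layout byte for byte against a protocol specification.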
You might see older Python code use a cadre of awkwardly named functions from the socket module in order
to turn integers into byte strings in network order. These functions have names like ntohl() and htons(), and
they correspond to functions of the same name in the POSIX networking library, which also supplies calls such as
socket() and bind(). I suggest you ignore these awkward functions and use the struct module instead; it is more
flexible, it is more general, and it produces more readable code.
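To see how the two styles relate, the sketch below encodes one 16-bit value both ways. The port number is an arbitrary example value, and the "older style" line is only my approximation of how such code might look, not a quotation from any particular program.

```python
import socket
import struct

port = 8080  # arbitrary example value (0x1F90)

# Older style: swap the integer into network order, then pack it using
# the machine's native byte order.
old_style = struct.pack('=H', socket.htons(port))

# struct style: ask for network order directly in the format string.
new_style = struct.pack('!H', port)

print(old_style == new_style)  # True
print(new_style)               # b'\x1f\x90'
```

Both approaches yield the same two bytes on either kind of machine, but the struct version states the byte order in one place and extends naturally to formats with several fields.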
Framing and Quoting
If you are using UDP datagrams for communication, then the protocol itself will deliver your data in discrete and
identifiable chunks. However, you will have to reorder and retransmit those chunks yourself if anything goes wrong on
the network, as outlined in Chapter 2.
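The following sketch shows that datagram boundaries survive the trip. It assumes that both sockets live in the same process and talk over the loopback interface, where loss and reordering are unlikely but still not ruled out, so the receiver sets a timeout rather than waiting forever.

```python
import socket

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(('127.0.0.1', 0))   # port 0: let the OS choose a free port
receiver.settimeout(2.0)          # do not hang forever if a datagram is lost
address = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b'first message', address)
sender.sendto(b'second message', address)

# Each recv() returns exactly one datagram, never a fragment or a merger.
received = [receiver.recv(65535) for _ in range(2)]

sender.close()
receiver.close()
print(received)
```

Each element of the list is one complete message, exactly as it was handed to sendto(), so no further framing is needed.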
If, on the other hand, you have chosen the far more common option of using a TCP stream for communication, then
you will face the issue of framing—of how to delimit your messages so that the receiver can tell where one message
ends and the next one begins. Since the data you supply to sendall() might be broken up into several packets for
actual transmission on the network, the program that receives your message might have to make several recv() calls
before your whole message has been read—or it might not, if all the packets arrive by the time the operating system
has the chance to schedule the process again!
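The sketch below makes the problem concrete. As an assumption for the sake of a self-contained example, it uses socket.socketpair() to create two connected stream sockets in a single process, standing in for a real TCP connection, and it uses a deliberately tiny buffer size so that many recv() calls are needed. The helper name read_until_closed() is my own invention.

```python
import socket

def read_until_closed(sock, bufsize=4):
    """Collect every chunk that recv() returns until the peer closes."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:      # empty bytes: the other end has closed
            break
        chunks.append(chunk)
    return chunks

# Two connected stream sockets, standing in for a TCP connection.
sender, receiver = socket.socketpair()
sender.sendall(b'first message')
sender.sendall(b'second message')
sender.close()

chunks = read_until_closed(receiver)
receiver.close()

print(len(chunks), 'recv() calls returned data')
print(b''.join(chunks))
```

The chunks that arrive bear no relation to the two sendall() calls that produced them. Joined together, they form a single run of bytes with nothing to mark where the first message stops and the second begins.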
The issue of framing asks the question: when is it safe for the receiver finally to stop calling recv() because an
entire message or datum has arrived intact and complete, and it can now be interpreted or acted upon as a whole?
As you might imagine, there are several approaches.
First, there is a pattern that can be used by extremely simple network protocols that involve only the delivery
of data—no response is expected, so there never has to come a time when the receiver decides “Enough!” and
turns around to send a response. In this case, the sender can loop until all of the outgoing data has been passed to
sendall() and then close() the socket. The receiver need only call recv() repeatedly until the call finally returns an
empty byte string, indicating that the sender has closed the socket. You can see this pattern in Listing 5-1.
Listing 5-1. Simply Send All Data and Then Close the Connection
#!/usr/bin/env python3
# Foundations of Python Network Programming, Third Edition
# https://github.com/brandon-rhodes/fopnp/blob/m/py3/chapter05/streamer.py
# Client that sends data then closes the socket, not expecting a reply.

import socket
from argparse import ArgumentParser

def server(address):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(address)
    sock.listen(1)
    print('Run this script in another window with "-c" to connect')