Foundations of Python Network Programming


Chapter 5 ■ Network Data and Network Errors




Pickles and Self-delimiting Formats


Note that some kinds of data you might send across the network already include some form of built-in delimiting.
If you are transmitting such data, then you might not have to impose your own framing atop what the data is
already doing.
Consider Python “pickles,” for example, the native form of serialization that comes with the Standard Library.
Using a quirky mix of text commands and data, a pickle stores the contents of a Python data structure so that you can
reconstruct it later or on a different machine.





>>> import pickle
>>> pickle.dumps([5, 6, 7])
b'\x80\x03]q\x00(K\x05K\x06K\x07e.'





The interesting thing about this output data is the '.' character that you see at the end of the foregoing string.
It is the format’s way of marking the end of a pickle. Upon encountering it, the loader can stop and return the value
without reading any further. Thus, you can take the foregoing pickle, stick some ugly data on the end, and see that
loads() will completely ignore the extra data and give you the original list back.





>>> pickle.loads(b'\x80\x03]q\x00(K\x05K\x06K\x07e.blahblahblah')
[5, 6, 7]





Of course, using loads() this way is not useful for network data since it does not tell you how many bytes it
processed in order to reload the pickle; you still do not know how much of the string is pickle data. But if you switch
to reading from a file and use the pickle load() function, then the file pointer will remain right at the end of the pickle
data, and you can start reading from there if you want to read what came after the pickle.





>>> from io import BytesIO
>>> f = BytesIO(b'\x80\x03]q\x00(K\x05K\x06K\x07e.blahblahblah')
>>> pickle.load(f)
[5, 6, 7]
>>> f.tell()
14
>>> f.read()
b'blahblahblah'
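
Because load() leaves the file pointer at the end of each pickle, several pickles written back to back can be read one after another from the same file object. The following is a minimal sketch of that idea rather than code from this chapter; the helper name iter_pickles() is hypothetical, and it assumes the stream ends cleanly on a pickle boundary, in which case load() raises EOFError.

```python
import pickle
from io import BytesIO

def iter_pickles(f):
    """Yield each pickled object in file object f (hypothetical helper)."""
    while True:
        try:
            # load() stops at the end-of-pickle marker and leaves the
            # file pointer just past it, ready for the next pickle.
            yield pickle.load(f)
        except EOFError:
            # Raised when no bytes are left at a pickle boundary.
            return

# Three pickles concatenated with no extra framing between them.
stream = BytesIO(
    pickle.dumps([5, 6, 7]) + pickle.dumps({'a': 1}) + pickle.dumps('end')
)
objects = list(iter_pickles(stream))
print(objects)
```

No length prefix or delimiter is added between the three values; the format's own end marker is what lets each call to load() stop in the right place.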





Alternatively, you could create a protocol that consists simply of sending pickles back and forth between two Python
programs. Note that you would not need the kind of loop that you put into the recvall() function in Listing 5-2
because the pickle library knows all about reading from files and how it might have to do repeated reads until an
entire pickle has been read. Use the makefile() socket method (discussed in Chapter 3) if you want to wrap a socket
in a Python file object for consumption by a routine like the pickle load() function.
Note that there are many subtleties involved in pickling large data structures, especially if they contain Python
objects beyond simple built-in types such as integers, strings, lists, and dictionaries. See the pickle module
documentation for more details.
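
The pickle-only protocol described above might be sketched as follows. This is an illustration under stated assumptions, not a listing from this chapter: socket.socketpair() stands in for a real client and server connection so that the example runs in a single process, and the two messages are arbitrary sample values.

```python
import pickle
import socket

# Assumption: socketpair() returns two already-connected sockets,
# standing in for the two ends of a real network connection.
a, b = socket.socketpair()

# Wrap each socket in a binary file object so pickle can use it directly.
writer = a.makefile('wb')
reader = b.makefile('rb')

# Send two pickles back to back with no framing of our own.
for message in ([5, 6, 7], {'status': 'ok'}):
    pickle.dump(message, writer)
writer.flush()  # push the buffered bytes out onto the socket

# Each load() performs whatever reads it needs and stops at the end
# of one pickle, so no recvall()-style loop is written here.
first = pickle.load(reader)
second = pickle.load(reader)
print(first, second)

writer.close()
reader.close()
a.close()
b.close()
```

Keep in mind that loading a pickle can run arbitrary code chosen by whoever produced it, so a protocol like this is only suitable between programs that trust each other.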


XML and JSON


If your protocol needs to be usable from other programming languages or if you simply prefer universal standards
instead of formats specific to Python, then the JSON and XML data formats are each a popular choice. Note that
neither format provides framing of its own, so you must first work out how to extract a complete string of text
from the network before you can process it.
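
One way to supply the missing framing is to prefix each JSON document with its length. The sketch below is only one possible approach, not a method prescribed by the text: the 4-byte big-endian length header is an assumed convention, send_json() and recv_json() are hypothetical helper names, and the recvall() shown here is written for this example and may differ from the one in Listing 5-2. As before, socket.socketpair() stands in for a real connection.

```python
import json
import socket
import struct

# Assumed convention: a 4-byte big-endian unsigned length before each message.
header = struct.Struct('!I')

def recvall(sock, length):
    """Receive exactly `length` bytes, looping over recv() as needed."""
    blocks = []
    while length:
        block = sock.recv(length)
        if not block:
            raise EOFError('socket closed with {} bytes left'.format(length))
        length -= len(block)
        blocks.append(block)
    return b''.join(blocks)

def send_json(sock, obj):
    """Encode obj as UTF-8 JSON and send it with a length prefix."""
    data = json.dumps(obj).encode('utf-8')
    sock.sendall(header.pack(len(data)) + data)

def recv_json(sock):
    """Read one length-prefixed JSON document and decode it."""
    (length,) = header.unpack(recvall(sock, header.size))
    return json.loads(recvall(sock, length).decode('utf-8'))

a, b = socket.socketpair()
send_json(a, {'temperature': 21.5, 'units': 'C'})
send_json(a, [1, 2, 3])
first = recv_json(b)
second = recv_json(b)
print(first, second)
a.close()
b.close()
```

The receiver never has to inspect the JSON text to find where one document ends; it reads the length, then reads exactly that many bytes and hands the complete string to json.loads().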
