Pickle Details: Protocols, Binary Modes, and _pickle
In later Python releases, the pickler introduced the notion of protocols—storage formats
for pickled data. Specify the desired protocol by passing an extra parameter to the
pickling calls (but not to unpickling calls: the protocol is automatically determined
from the pickled data):
pickle.dump(object, file, protocol) # or protocol=N keyword argument
Pickled data may be created in either text or binary protocols; the binary protocols’
format is more efficient, but it cannot be readily understood if inspected. By default,
the storage protocol in Python 3.0 through 3.7 is a 3.X-only binary bytes format (also known as
protocol 3; later releases default to newer binary protocols). In text mode (protocol 0), the pickled data is printable ASCII text, which
can be read by humans (it’s essentially instructions for a stack machine), but it is still
a bytes object in Python 3.X. The alternative protocols (protocols 1 and 2) create the
pickled data in binary format as well.
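As a rough illustration of the points above, the following sketch pickles one list under every protocol the running interpreter supports and loads each result back without naming a protocol. The names data, pickles, is_ascii_text, and starts_with_proto are invented for this example, and the exact bytes and sizes printed will vary by Python release.

```python
import pickle
import pickletools

data = [1, 2, 3]

# Pickle the same object under every protocol this interpreter supports.
pickles = {proto: pickle.dumps(data, protocol=proto)
           for proto in range(pickle.HIGHEST_PROTOCOL + 1)}

for proto, blob in pickles.items():
    # loads() takes no protocol argument: the format is detected from the data.
    assert pickle.loads(blob) == data
    print(proto, len(blob), blob)

# Protocol 0 output is printable ASCII (plus newlines), yet still a bytes object.
is_ascii_text = all(32 <= b < 127 or b == 10 for b in pickles[0])

# Protocols 2 and later begin with a PROTO opcode (0x80) and the protocol number.
starts_with_proto = {p: pickles[p][:2] == bytes([0x80, p])
                     for p in pickles if p >= 2}

# Disassemble the protocol 0 pickle to see its stack-machine instructions.
pickletools.dis(pickles[0])
```

The pickletools.dis call is optional; it simply prints the opcodes that the unpickler would execute, which makes the "stack machine" description concrete.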
For all protocols, pickled data is a bytes object in 3.X, not a str, and therefore implies
binary-mode reads and writes when stored in flat files (see Chapter 4 if you’ve forgotten
why). Similarly, we must use a bytes-oriented object when forging the file object’s
interface:
>>> import io, pickle
>>> pickle.dumps([1, 2, 3]) # default=binary protocol
b'\x80\x03]q\x00(K\x01K\x02K\x03e.'
>>> pickle.dumps([1, 2, 3], protocol=0) # ASCII format protocol
b'(lp0\nL1L\naL2L\naL3L\na.'

>>> pickle.dump([1, 2, 3], open('temp','wb')) # same if protocol=0, ASCII
>>> pickle.dump([1, 2, 3], open('temp','w')) # must use 'rb' to read too
TypeError: must be str, not bytes
>>> pickle.dump([1, 2, 3], open('temp','w'), protocol=0)
TypeError: must be str, not bytes

>>> B = io.BytesIO() # use bytes streams/buffers
>>> pickle.dump([1, 2, 3], B)
>>> B.getvalue()
b'\x80\x03]q\x00(K\x01K\x02K\x03e.'

>>> B = io.BytesIO() # also bytes for ASCII
>>> pickle.dump([1, 2, 3], B, protocol=0)
>>> B.getvalue()
b'(lp0\nL1L\naL2L\naL3L\na.'

>>> S = io.StringIO() # it's not a str anymore
>>> pickle.dump([1, 2, 3], S) # same if protocol=0, ASCII
TypeError: string argument expected, got 'bytes'
>>> pickle.dump([1, 2, 3], S, protocol=0)
TypeError: string argument expected, got 'bytes'
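The session above shows the failures. For contrast, here is a minimal sketch of the working round trips, assuming a scratch file in a temporary directory. The path and the variable names are hypothetical, chosen only for this illustration.

```python
import io
import os
import pickle
import tempfile

data = [1, 2, 3]

# Round trip through a real file: both ends must be opened in binary mode.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'temp')      # hypothetical scratch file
    with open(path, 'wb') as f:
        pickle.dump(data, f, protocol=0)     # even the ASCII protocol writes bytes
    with open(path, 'rb') as f:
        from_file = pickle.load(f)

# The same round trip through an in-memory bytes buffer.
buf = io.BytesIO()
pickle.dump(data, buf)
buf.seek(0)                                  # rewind before reading back
from_buffer = pickle.load(buf)

# A text-mode stream rejects the bytes that the pickler writes.
try:
    pickle.dump(data, io.StringIO())
    text_mode_failed = False
except TypeError:
    text_mode_failed = True

print(from_file, from_buffer, text_mode_failed)
```

Using with statements here closes the files promptly, which the one-line open() calls in the interactive session leave to the interpreter.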
