Programming Python, 4th Edition, by Mark Lutz (text edition)

Recent changes: Be sure to also pass the string 'c' as a second argument
when calling dbm.open, to force Python to create the file if it does not yet
exist and to simply open it for reads and writes otherwise. This used to
be the default behavior but is no longer. You do not need the 'c' argument
when opening shelves, discussed ahead; they still use an “open or
create” 'c' mode by default if passed no open mode argument. Other
open mode strings can be passed to dbm, including 'n' to always create a
new, empty file, and 'r' for read-only access to an existing file, which is
the new default. See the Python library manual for more details.
In addition, Python 3.X stores both key and value strings as bytes instead
of str, as we’ve seen (which turns out to be convenient for pickled
data in shelves, discussed ahead), and no longer ships with bsddb as a
standard component. It’s available independently on the Web as a
third-party extension, but in its absence Python falls back on its own
DBM file implementation. Since the underlying DBM implementation
rules are prone to change with time, you should always consult Python’s
library manuals as well as the dbm module’s standard library source code
for more information.
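
To make the open modes in this note concrete, here is a minimal sketch rather than a definitive recipe. The file name demo_dbm and the temporary directory are assumptions made only for illustration, and which underlying DBM implementation you get depends on your Python install:

```python
import dbm
import os
import tempfile

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'demo_dbm')    # hypothetical file name

    # With no mode argument the default is 'r', so a missing file is an error
    try:
        dbm.open(path)
        opened_without_c = True
    except dbm.error:
        opened_without_c = False

    # 'c' creates the file if needed, else opens it for reads and writes
    with dbm.open(path, 'c') as db:
        db[b'spam'] = b'eggs'

    # Reopen read-only: keys and values come back as bytes, not str
    with dbm.open(path, 'r') as db:
        value = db[b'spam']
        keys = list(db.keys())

print(opened_without_c, value, keys)
```

Storing bytes explicitly, as done here, sidesteps any question of how a given DBM implementation encodes str keys and values.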

Pickled Objects


Probably the biggest limitation of DBM keyed files is in what they can store: data stored
under a key must be a simple string. If you want to store Python objects in a DBM file,
you can sometimes manually convert them to and from strings on writes and reads
(e.g., with str and eval calls), but this takes you only so far. For arbitrarily complex
Python objects such as class instances and nested data structures, you need something
more. Class instance objects, for example, cannot usually be later re-created from their
standard string representations. Moreover, custom to-string conversions and
from-string parsers are error-prone and not general.
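
As a rough illustration of how far manual conversion goes, the following sketch round-trips a built-in data structure through str and eval, then tries the same with an instance of a class. The Person class and the record contents are hypothetical, invented here for demonstration only:

```python
# Built-in types print as valid Python literals, so eval can rebuild them
record = {'name': 'Bob', 'age': 42, 'jobs': ['dev', 'mgr']}
text = str(record)
restored = eval(text)                  # only safe for text you created yourself
print(restored == record)

class Person:                          # hypothetical class for illustration
    def __init__(self, name):
        self.name = name

bob = Person('Bob')
text = str(bob)                        # default form: '<...Person object at 0x...>'

# The default instance display is not Python code, so eval cannot parse it
try:
    eval(text)
    rebuilt = True
except SyntaxError:
    rebuilt = False
print(rebuilt)
```

A class could define its own __repr__ to print re-creatable code, but that is exactly the sort of custom conversion that has to be written and maintained per class.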


The Python pickle module, a standard part of the Python system, provides the
conversion step needed. It’s a sort of super-general data formatting and de-formatting
tool: pickle converts nearly arbitrary Python in-memory objects to and from a single
linear string format, suitable for storing in flat files, shipping across network sockets
between trusted sources, and so on. This conversion from object to string is often
called serialization, because arbitrary data structures in memory are mapped to a serial
string form.
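
The conversion just described can be sketched in a few lines. This is a minimal example under stated assumptions: the record contents and the file name record.pkl are made up for illustration, and in Python 3.X the serialized form is a bytes object, so files are opened in binary mode:

```python
import os
import pickle
import tempfile

record = {'name': 'Bob', 'jobs': ['dev', 'mgr'], 'pay': (40000, 1.5)}

# Object to serial string and back again, entirely in memory
data = pickle.dumps(record)            # a bytes object in Python 3.X
restored = pickle.loads(data)

# The same conversion, routed through a flat file
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, 'record.pkl')    # hypothetical file name
    with open(path, 'wb') as f:                  # binary mode: pickles are bytes
        pickle.dump(record, f)
    with open(path, 'rb') as f:
        from_file = pickle.load(f)

print(type(data), restored == record, from_file == record)
```

As the text notes, data should be unpickled only when it comes from a trusted source, since loading a pickle can run code.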


The string representation used for objects is also sometimes referred to as a byte stream,
due to its linear format. It retains all the content and reference structure of the original
in-memory object. When the object is later re-created from its byte string, it will be a
new in-memory object identical in structure and value to the original, though located
at a different memory address.
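
A short sketch may help show what retaining reference structure means in practice. The object below is invented for illustration: it embeds the same list twice and also refers to itself, and the pickled copy preserves both relationships while living at a new address:

```python
import pickle

shared = [1, 2, 3]
original = {'a': shared, 'b': shared}  # one list, referenced twice
original['self'] = original            # a cyclic reference back to the dict

copy = pickle.loads(pickle.dumps(original))

same_value = copy['a'] == original['a']      # equal content
different_object = copy is not original      # but a distinct object in memory
sharing_kept = copy['a'] is copy['b']        # shared reference survives
cycle_kept = copy['self'] is copy            # and so does the cycle

print(same_value, different_object, sharing_kept, cycle_kept)
```

The comparison is made on one of the nested lists rather than on the whole dictionary, because comparing two cyclic structures with == can recurse without end.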


The net effect is that the re-created object is effectively a copy of the original; in
Python-speak, the two will be == but not is. Since the re-creation typically happens in an entirely

