>>> open('temp.bin', 'w').write(data.decode())
8
>>> open('temp.bin', 'rb').read() # text mode write: added \r!
b'a\x00b\rc\r\r\nd'
>>> open('temp.bin', 'r').read() # again drops, alters \r on input
'a\x00b\nc\n\nd'
The short story to remember here is that you should generally use \n to refer to end-
line in all your text file content, and you should always open binary data in binary file
modes to suppress both end-of-line translations and any Unicode encodings. A file’s
content generally determines its open mode, and file open modes usually process file
content exactly as we want.
Keep in mind, though, that you might also need to use binary file modes for text in
special contexts. For instance, in Chapter 6’s examples, we’ll sometimes open text files
in binary mode to avoid possible Unicode decoding errors, for files generated on arbi-
trary platforms that may have been encoded in arbitrary ways. Doing so avoids encod-
ing errors, but also can mean that some text might not work as expected—searches
might not always be accurate when applied to such raw text, since the search key must
be in bytes string formatted and encoded according to a specific and possibly incom-
patible encoding scheme.
In Chapter 11’s PyEdit, we’ll also need to catch Unicode exceptions in a “grep” direc-
tory file search utility, and we’ll go further to allow Unicode encodings to be specified
for file content across entire trees. Moreover, a script that attempts to translate between
different platforms’ end-of-line character conventions explicitly may need to read text
in binary mode to retain the original line-end representation truly present in the file; in
text mode, they would already be translated to \n by the time they reached the script.
It’s also possible to disable or further tailor end-of-line translations in text mode with
additional open arguments we will finesse here. See the newline argument in open ref-
erence documentation for details; in short, passing an empty string to this argument
also prevents line-end translation but retains other text-mode behavior. For this chap-
ter, let’s turn next to two common use cases for binary data files: packed binary data
and random access.
Parsing packed binary data with the struct module
By using the letter b in the open call, you can open binary datafiles in a platform-neutral
way and read and write their content with normal file object methods. But how do you
process binary data once it has been read? It will be returned to your script as a simple
string of bytes, most of which are probably not printable characters.
If you just need to pass binary data along to another file or program, your work is
done—for instance, simply pass the byte string to another file opened in binary mode.
And if you just need to extract a number of bytes from a specific position, string slicing
will do the job; you can even follow up with bitwise operations if you need to. To get
File Tools | 151