Programming Python, 4th Edition, by Mark Lutz (text edition)

and the xml package in Unicode str. You must even be aware of the 3.X text/binary
distinction when using system tools like pipe descriptors and sockets, because they
transfer data as byte strings today (though their content can be decoded and encoded
as Unicode text if needed).
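To make the byte-string nature of such system tools concrete, here is a minimal sketch using a pipe created with os.pipe. The sample text and the choice of UTF-8 are assumptions made for illustration only; any encoding both ends agree on would serve.

```python
import os

# os.pipe returns a pair of low-level descriptors: (read end, write end)
read_fd, write_fd = os.pipe()

text = 'spam \u00c4'                       # a str with one non-ASCII character
os.write(write_fd, text.encode('utf-8'))   # descriptors accept bytes, not str
os.close(write_fd)

raw = os.read(read_fd, 1024)               # data arrives as a bytes object
os.close(read_fd)

decoded = raw.decode('utf-8')              # decode manually to get text back
print(type(raw).__name__, raw)             # bytes b'spam \xc3\x84'
print(decoded == text)                     # True
```

Passing the str itself to os.write would raise a TypeError, which is why the encode and decode steps appear explicitly on both sides of the transfer.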


Moreover, because text-mode files require that content be decodable per a Unicode
encoding scheme, you must read undecodable file content in binary mode, as byte
strings (or catch Unicode exceptions in try statements and skip the file altogether).
This may include both truly binary files and text files that use encodings that are
nondefault and unknown. As we’ll see later in this chapter, because str strings are
always Unicode in 3.X, it’s sometimes also necessary to select byte string mode for the
names of files in directory tools such as os.listdir, glob.glob, and os.walk if they
cannot be decoded (passing in byte strings essentially suppresses decoding).
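The two fallbacks just described can be sketched as follows. The helper name read_text_or_bytes, the temporary file names, and the UTF-8 encoding are all hypothetical choices made for this illustration, not part of any library.

```python
import os
import shutil
import tempfile

def read_text_or_bytes(path, encoding='utf-8'):
    """Return content as str if it decodes, else fall back to raw bytes."""
    try:
        with open(path, 'r', encoding=encoding) as f:
            return f.read()
    except UnicodeDecodeError:
        # Content is not valid text in this encoding: reread in binary mode
        with open(path, 'rb') as f:
            return f.read()

tmpdir = tempfile.mkdtemp()
good = os.path.join(tmpdir, 'good.txt')
bad = os.path.join(tmpdir, 'bad.bin')
with open(good, 'w', encoding='utf-8') as f:
    f.write('spam\n')
with open(bad, 'wb') as f:
    f.write(b'\xff\xfe\x00spam')       # byte 0xff never occurs in valid UTF-8

text_result = read_text_or_bytes(good)     # decodes normally, returns str
binary_result = read_text_or_bytes(bad)    # decoding fails, returns bytes

# Passing a bytes directory name makes os.listdir return undecoded bytes names
byte_names = sorted(os.listdir(os.fsencode(tmpdir)))

shutil.rmtree(tmpdir)                      # remove the temporary files
print(type(text_result).__name__, type(binary_result).__name__)
print(byte_names)
```

A script that would rather skip undecodable files could simply return None in the except clause instead of rereading in binary mode.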


In fact, we’ll see examples where the Python 3.X distinction between str text and
bytes binary pops up in tools beyond basic files throughout this book—in Chapters
5 and 12 when we explore sockets; in Chapters 6 and 11 when we’ll need to ignore
Unicode errors in file and directory searches; in Chapter 12, where we’ll see how client-
side Internet protocol modules such as FTP and email, which run atop sockets, imply
file modes and encoding requirements; and more.


But just as for string types, although we will see some of these concepts in action in this
chapter, we’re going to take much of this story as a given here. File and string objects
are core language material and are prerequisite to this text. As mentioned earlier,
because they are addressed by a 45-page chapter in the book Learning Python, Fourth
Edition, I won’t repeat their coverage in full in this book. If you find yourself confused
by the Unicode and binary file and string concepts in the following sections, I encourage
you to refer to that text or other resources for more background information in this
domain.


Using Built-in File Objects


Despite the text/binary dichotomy in Python 3.X, files are still very straightforward to
use. For most purposes, in fact, the open built-in function and its file objects are all
you need to remember to process files in your scripts. The file object returned by
open has methods for reading data (read, readline, readlines); writing data (write,
writelines); freeing system resources (close); moving to arbitrary positions in the file
(seek); forcing data in output buffers to be transferred to disk (flush); fetching the
underlying file handle (fileno); and more. Since the built-in file object is so easy to use,
let’s jump right into a few interactive examples.
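As a quick preview of the methods just listed, the following sketch exercises most of them in one pass. The temporary file path and the sample lines are assumptions made for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'data.txt')   # hypothetical scratch file

out = open(path, 'w')                    # create the file for text output
out.write('hello\n')                     # write one string
out.writelines(['file\n', 'world\n'])    # write a list of strings as-is
out.flush()                              # push buffered output to the file
fd = out.fileno()                        # integer descriptor of the open file
out.close()                              # release system resources

inp = open(path)                         # mode defaults to 'r' (text input)
first = inp.readline()                   # read through the first line end
rest = inp.readlines()                   # remaining lines, as a list
inp.seek(0)                              # jump back to the start of the file
whole = inp.read()                       # entire content as one string
inp.close()

print(repr(first))                       # 'hello\n'
print(rest)                              # ['file\n', 'world\n']
print(repr(whole))                       # 'hello\nfile\nworld\n'
```

Note that writelines adds no line ends of its own, so each string in the list carries its own '\n'.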


Output files


To make a new file, call open with two arguments: the external name of the file to be
created and a mode string 'w' (short for write). To store data in the file, call the file object’s
write method with a string containing the data to store, and then call the close method

