[Python编程(第4版)].(Programming.Python.4th.Edition).Mark.Lutz.文字版

(yzsuai) #1
the listing’s text into a list of filenames. We can pass it a remote directory to be
listed; by default it lists the current server directory. A related FTP method, dir,
returns the list of line strings produced by an FTP LIST command; its result is like
typing a dir command in an FTP session, and its lines contain complete file infor-
mation, unlike nlst. If you need to know more about all the remote files, parse the
result of a dir method call (we’ll see how in a later example).
Notice how we skip “.” and “..” current and parent directory indicators if present
in remote directory listings; unlike os.listdir, some (but not all) servers include
these, so we need to either skip these or catch the exceptions they may trigger (more
on this later when we start using dir, too).

Selecting transfer modes with mimetypes
We discussed output file modes for FTP earlier, but now that we’ve started trans-
ferring text, too, I can fill in the rest of this story. To handle Unicode encodings
and to keep line-ends in sync with the machines that my web files live on, this script
distinguishes between binary and text file transfers. It uses the Python mimetypes
module to choose between text and binary transfer modes for each file.
We met mimetypes in Chapter 6 near Example 6-23, where we used it to play media
files (see the examples and description there for an introduction). Here, mime
types is used to decide whether a file is text or binary by guessing from its filename
extension. For instance, HTML web pages and simple text files are transferred as
text with automatic line-end mappings, and images and tar archives are transferred
in raw binary mode.


Downloading: text versus binary
For binary files data is pulled down with the retrbinary method we met earlier,
and stored in a local file with binary open mode of wb. This file open mode is
required to allow for the bytes strings passed to the write method by retrbinary,
but it also suppresses line-end byte mapping and Unicode encodings in the process.
Again, text mode requires encodable text in Python 3.X, and this fails for binary
data like images. This script may also be run on Windows or Unix-like platforms,
and we don’t want a \n byte embedded in an image to get expanded to \r\n on
Windows. We don’t use a chunk-size third argument for binary transfers here,
though—it defaults to a reasonable size if omitted.
For text files, the script instead uses the retrlines method, passing in a function
to be called for each line in the text file downloaded. The text line handler function
receives lines in str string form, and mostly just writes the line to a local text file.
But notice that the handler function created by the lambda here also adds a \n line-
end character to the end of the line it is passed. Python’s retrlines method strips
all line-feed characters from lines to sidestep platform differences. By adding a
\n, the script ensures the proper line-end marker character sequence for the local
platform on which this script runs when written to the file (\n or \r\n).
For this auto-mapping of the \n in the script to work, of course, we must also open
text output files in w text mode, not in wb—the mapping from \n to \r\n on


Transferring Directories with ftplib | 877
Free download pdf