Programming Python, 4th Edition, by Mark Lutz (text version)

>>> for (i, (ac, bc)) in enumerate(zip(a, b)):
...     if ac != bc:
...         print(i, repr(ac), repr(bc))
...         print(repr(a[i-20:i+20]))
...         print(repr(b[i-20:i+20]))
...         break
...
37966 '\r' '\n'
're>\r\ndef min(*args):\r\n    tmp = list(arg'
're>\r\ndef min(*args):\n    tmp = list(args'

Apparently, I wound up with a Unix line end at one point in the local copy and a DOS
line end in the version I downloaded—the combined effect of the text mode used by
the download script itself (which translated \n to \r\n) and years of edits on both Linux
and Windows PDAs and laptops (I probably coded this change on Linux and copied
it to my local Windows copy in binary mode). Code such as this could be integrated
into the diffall script to make it more intelligent about text files and difference
reporting.
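As a rough illustration of that idea, the interactive loop above can be wrapped in a function, alongside a comparison that treats DOS and Unix line ends as equivalent. This is only a sketch: the names first_difference and same_text are hypothetical, and nothing here is part of the diffall script itself.

```python
def first_difference(a, b):
    """Return the offset of the first mismatch between a and b, or None."""
    for i, (ac, bc) in enumerate(zip(a, b)):
        if ac != bc:
            return i
    # zip stops at the shorter string, so one may be a prefix of the other
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def same_text(a, b):
    """Compare two strings, ignoring DOS versus Unix line-end differences."""
    # Assumption: only the two-character DOS sequence needs normalizing
    return a.replace('\r\n', '\n') == b.replace('\r\n', '\n')
```

A tree comparison could call something like same_text only after a byte-for-byte comparison fails, so that files differing purely in line ends are reported separately from files whose content really changed.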


Because Python excels at processing files and strings, it’s even possible to go one step
further and code a Python equivalent of the fc and diff commands. In fact, much of
the work has already been done; the standard library module difflib could make this
task simple. See the Python library manual for details and usage examples.
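For a taste of what that might look like, here is a minimal sketch built on difflib.unified_diff, which produces output in the style of the Unix diff -u command. The function name diff_text and its default labels are assumptions made for this illustration only.

```python
import difflib


def diff_text(old, new, oldname='old', newname='new'):
    """Return a list of unified-diff lines describing how old differs from new."""
    # unified_diff compares sequences of lines; keep the line ends intact
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=oldname, tofile=newname))


if __name__ == '__main__':
    # Print the differences between two small in-memory texts
    for line in diff_text('spam\neggs\n', 'spam\nham\n'):
        print(line, end='')
```

Identical inputs yield an empty list, so the result can double as a simple same-or-different test before any report is printed.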


We could also be smarter by avoiding the load and compare steps for files that differ
in size, and we might use a smaller block size to reduce the script’s memory
requirements. For most trees, such optimizations are unnecessary; reading multimegabyte
files into strings is very fast in Python, and garbage collection reclaims the space as
you go.
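Both optimizations might be sketched as follows. The function name same_file and the 64 KB default block size are assumptions chosen for illustration, not values taken from the diffall script.

```python
import os


def same_file(path1, path2, blocksize=64 * 1024):
    """Return True if two files hold identical bytes, reading in small blocks."""
    # Files of different sizes cannot match, so skip reading them entirely
    if os.path.getsize(path1) != os.path.getsize(path2):
        return False
    with open(path1, 'rb') as file1, open(path2, 'rb') as file2:
        while True:
            block1 = file1.read(blocksize)
            block2 = file2.read(blocksize)
            if block1 != block2:
                return False          # first mismatching block ends the scan
            if not block1:
                return True           # both files exhausted without a mismatch
```

Because only one block per file is held at a time, memory use stays bounded by the block size no matter how large the files are.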


Since such extensions are beyond both this script’s scope and this chapter’s size limits,
though, they will have to await the attention of a curious reader (this book doesn’t have
formal exercises, but that almost sounds like one, doesn’t it?). For now, let’s move on
to explore ways to code one more common directory task: search.


Searching Directory Trees


Engineers love to change things. As I was writing this book, I found it almost
irresistible to move and rename directories, variables, and shared modules in the book
examples tree whenever I thought I’d stumbled onto a more coherent structure. That was
fine early on, but as the tree became more intertwined, this became a maintenance
nightmare. Things such as program directory paths and module names were hardcoded
all over the place—in package import statements, program startup calls, text notes,
configuration files, and more.


One way to repair these references, of course, is to edit every file in the directory by
hand, searching each for information that has changed. That’s so tedious as to be utterly
impossible in this book’s examples tree, though; the examples of the prior edition
contained 186 directories and 1,429 files! Clearly, I needed a way to automate updates after

