Programming Python, 4th Edition, by Mark Lutz (text version)


Again, change this script’s trace variable if you want to track its progress through the
tree. As you can see, the results for largest files differ when viewed by size and lines—
a disparity which we’ll probably have to hash out in our next requirements meeting.


Scanning the Entire Machine


Finally, although searching trees rooted in the module import path normally includes
every Python source file you can import on your computer, it’s still not complete.
Technically, this approach checks only modules; Python source files that are run
directly as top-level scripts need not reside on the module path. Moreover, some
scripts change the module search path dynamically at runtime (for example, by
direct sys.path updates in scripts that run on web servers) to include
additional directories that Example 6-3 won’t catch.
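
To make the runtime-path point concrete, here is a minimal sketch of how a direct sys.path update adds a directory that a scan of the import path taken earlier would not have covered. The directory name is a hypothetical stand-in chosen for illustration, not one used in the book.

```python
import sys

# Hypothetical directory a script might add at runtime (an assumption,
# not a path taken from the book's examples).
extra_dir = '/opt/webapp/lib'

before = set(sys.path)               # what a path-based scan would see now
if extra_dir not in sys.path:
    sys.path.append(extra_dir)       # direct update, as some server scripts do
added = set(sys.path) - before       # directories an earlier scan would miss

print(added)
```

Any source files under such a directory become importable only after the update runs, so a scanner started beforehand, or in a different process, never visits them.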


Ultimately, finding the largest source file on your computer requires searching your
entire drive—a feat which our tree searcher in Example 6-2 almost supports, if we
generalize it to accept the root directory name as an argument and add some of the bells
and whistles of the path searcher version (we really want to avoid visiting the same
directory twice if we’re scanning an entire machine, and we might as well skip errors
and check line-based sizes if we’re investing the time). Example 6-4 implements such
general tree scans, outfitted for the heavier lifting required for scanning drives.
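
Before the full listing, the three additions named above (a visited set, error skipping, and line-based sizes) can be sketched in isolation. This is a simplified illustration under stated assumptions, not the book's Example 6-4: the names scan_tree and count_lines are hypothetical, and the use of os.path.normcase to build the visited key is an assumption of this sketch.

```python
import os

def count_lines(path):
    """Count lines by iterating, so the file is never loaded whole."""
    with open(path, 'rb') as f:          # bytes mode avoids decoding errors
        return sum(1 for _ in f)

def scan_tree(root, ext='.py'):
    """Return (bytes, lines, path) tuples for files under root ending in ext.

    Hypothetical helper for illustration; not the book's code.
    """
    visited = set()
    sizes = []
    for thisdir, subs, files in os.walk(root):
        # Normalize so the same directory spelled differently is seen once.
        key = os.path.normcase(os.path.normpath(thisdir))
        if key in visited:
            subs[:] = []                 # prune: do not descend again
            continue
        visited.add(key)
        for name in files:
            if name.endswith(ext):
                path = os.path.join(thisdir, name)
                try:
                    sizes.append((os.path.getsize(path),
                                  count_lines(path), path))
                except OSError:
                    pass                 # skip unreadable files, keep going
    return sizes
```

Calling max(scan_tree(somedir)) would then give the largest file by bytes, and max(scan_tree(somedir), key=lambda t: t[1]) the largest by line count, assuming the tree contains at least one matching file.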


Example 6-4. PP4E\System\Filetools\bigext-tree.py


"""
Find the largest file of a given type in an arbitrary directory tree.
Avoid repeat paths, catch errors, add tracing and line count size.
Also uses sets, file iterators and generator to avoid loading entire
file, and attempts to work around undecodable dir/file name prints.
"""


import os, pprint
from sys import argv, exc_info


trace = 1 # 0=off, 1=dirs, 2=+files
dirname, extname = os.curdir, '.py' # default is .py files in cwd
if len(argv) > 1: dirname = argv[1] # ex: C:\, C:\Python31\Lib
if len(argv) > 2: extname = argv[2] # ex: .pyw, .txt
if len(argv) > 3: trace = int(argv[3]) # ex: ". .py 2"


def tryprint(arg):
    try:
        print(arg)                   # unprintable filename?
    except UnicodeEncodeError:
        print(arg.encode())          # try raw byte string


visited = set()
allsizes = []
for (thisDir, subsHere, filesHere) in os.walk(dirname):
    if trace: tryprint(thisDir)
    thisDir = os.path.normpath(thisDir)


