Programming Python, 4th Edition, by Mark Lutz


the letter q or x. Note how find returns full directory paths that begin with the start
directory specification:


C:\...\PP4E\Tools> find.py [qx]*.py C:\temp\PP3E
C:\temp\PP3E\Examples\PP3E\Database\SQLscripts\querydb.py
C:\temp\PP3E\Examples\PP3E\Gui\Tools\queuetest-gui-class.py
C:\temp\PP3E\Examples\PP3E\Gui\Tools\queuetest-gui.py
C:\temp\PP3E\Examples\PP3E\Gui\Tour\quitter.py
C:\temp\PP3E\Examples\PP3E\Internet\Other\Grail\Question.py
C:\temp\PP3E\Examples\PP3E\Internet\Other\XML\xmlrpc.py
C:\temp\PP3E\Examples\PP3E\System\Threads\queuetest.py

And here’s some Python code that does the same find but also extracts base names and
file sizes for each file found:


C:\...\PP4E\Tools> python
>>> import os
>>> from find import find
>>> for name in find('[qx]*.py', r'C:\temp\PP3E'):
...     print(os.path.basename(name), os.path.getsize(name))
...
querydb.py 635
queuetest-gui-class.py 1152
queuetest-gui.py 963
quitter.py 801
Question.py 817
xmlrpc.py 705
queuetest.py 1273

The fnmatch module


To achieve such code economy, the find module calls os.walk to walk the tree and
simply yields matching filenames along the way. New here, though, is the fnmatch
module—yet another Python standard library module that performs Unix-like pattern
matching against filenames. This module supports common operators in name pattern
strings: * to match any number of characters, ? to match any single character,
[...] to match any one character listed inside the brackets, and [!...] to match any
one character not listed there; other characters match themselves. Unlike the re
module, fnmatch supports only common Unix shell matching operators, not full-blown
regular expression patterns; we’ll see why this distinction matters in Chapter 19.
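
To make the description above concrete, here is a minimal sketch of how a find-style generator might combine os.walk and fnmatch. The name find_sketch and the throwaway temporary directory tree are assumptions made for illustration only; this is not the book's find.py listing. Note that fnmatch.fnmatch normalizes case according to the platform, which is why Question.py matched a lowercase pattern in the Windows run shown earlier.

```python
import fnmatch
import os
import tempfile

def find_sketch(pattern, startdir=os.curdir):
    """Hypothetical stand-in for find: yield full paths under startdir
    whose base names match the shell-style pattern."""
    for dirpath, dirnames, filenames in os.walk(startdir):
        for name in dirnames + filenames:
            if fnmatch.fnmatch(name, pattern):
                yield os.path.join(dirpath, name)

# The pattern operators on their own (lowercase names, so the results
# are the same on case-sensitive and case-insensitive platforms)
assert fnmatch.fnmatch('querydb.py', '[qx]*.py')       # [...] then *
assert not fnmatch.fnmatch('find.py', '[qx]*.py')      # f is not q or x
assert fnmatch.fnmatch('find.py', '[!qx]*.py')         # [!...] negates the set
assert fnmatch.fnmatch('a1.txt', 'a?.txt')             # ? is exactly one character

# Build a small temporary tree (made-up filenames) and search it
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'sub'))
    for rel in ('querydb.py', 'notes.txt', os.path.join('sub', 'xmlrpc.py')):
        open(os.path.join(root, rel), 'w').close()
    found = sorted(os.path.basename(p) for p in find_sketch('[qx]*.py', root))

print(found)    # ['querydb.py', 'xmlrpc.py']
```

Because the function is a generator, callers can stop iterating early without walking the rest of the tree; use fnmatch.fnmatchcase instead if matching must be case-sensitive on every platform.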


Interestingly, Python’s glob.glob function also uses the fnmatch module to match
names: it combines os.listdir and fnmatch to match in directories in much the same
way our find.find combines os.walk and fnmatch to match in trees (though os.walk
ultimately uses os.listdir as well). One ramification of all this is that you can pass
byte strings for both pattern and start-directory to find.find if you need to suppress
Unicode filename decoding, just as you can for os.walk and glob.glob; you’ll receive
byte strings for filenames in the result. See Chapter 4 for more details on Unicode
filenames.
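
The single-directory case and the str versus bytes behavior can be sketched as follows. The helper name match_in_dir and the filenames are hypothetical, and the code is only an approximation of the idea described above, not the implementation of glob.glob. The point to notice is that the pattern and the directory must be the same string type, and the results come back in that type.

```python
import fnmatch
import os
import tempfile

def match_in_dir(pattern, dirname):
    """Hypothetical helper: names in one directory that match pattern.
    pattern and dirname must both be str or both be bytes."""
    return sorted(fnmatch.filter(os.listdir(dirname), pattern))

with tempfile.TemporaryDirectory() as root:
    for name in ('querydb.py', 'xmlrpc.py', 'notes.txt'):
        open(os.path.join(root, name), 'w').close()

    # str in, str out: filenames are decoded to Unicode text
    str_names = match_in_dir('[qx]*.py', root)

    # bytes in, bytes out: filename decoding is skipped entirely
    bytes_names = match_in_dir(b'[qx]*.py', os.fsencode(root))

print(str_names)      # ['querydb.py', 'xmlrpc.py']
print(bytes_names)    # [b'querydb.py', b'xmlrpc.py']
```

The same rule carries over to tree walks: os.walk yields bytes directory paths and names when given a bytes start directory, so a bytes pattern must accompany it.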

