[Python编程(第4版)].(Programming.Python.4th.Edition).Mark.Lutz.文字版

    Skipping .\__init__.pyc
    7 => .\Preview\attachgui.py
    8 => .\Preview\bob.pkl
    Skipping .\Preview\bob.pkl
    ...more lines omitted: pauses for Enter key press at matches...
    Found in 2 files, visited 184

The script lists each file it checks as it goes, tells you which files it is skipping (names
whose extensions are not listed in the variable textexts, which it takes to imply binary data), and
pauses for an Enter key press each time it announces a file containing the search string.
The search_all script works the same way when it is imported rather than run, but
there is no final statistics output line (fcount and vcount live in the module and so would
have to be imported to be inspected here):

    C:\...\PP4E\dev\Examples\PP4E> python
    >>> from Tools import search_all
    >>> search_all.searcher(r'C:\temp\PP3E\Examples', 'mimetypes')
    ...more lines omitted: 8 pauses for Enter key press along the way...
    >>> search_all.fcount, search_all.vcount      # matches, files
    (8, 1429)

However launched, this script tracks down all references to a string in an entire directory
tree: the name of a changed book example file, object, or directory, for instance. It’s
exactly what I was looking for, or at least I thought so, until further deliberation drove
me to seek more complete and better structured solutions, the topic of the next section.
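For orientation, the behavior described above (walk a tree, skip files whose extensions are not on a text list, and count matches and visits) can be approximated with a short sketch. This is not Example 6-17 itself: the names search_tree and TEXT_EXTS and the particular extensions listed are assumptions made for illustration, and the sketch returns its results instead of pausing for Enter or keeping module-level counters. It also assumes that every file seen counts as visited, whether or not it is searched.

```python
import os

# Hypothetical stand-in for the script's textexts list (values are assumed)
TEXT_EXTS = {'.py', '.pyw', '.txt', '.c', '.h'}

def search_tree(root, text, text_exts=TEXT_EXTS, encoding=None):
    """Return (matching_paths, visited_count) for the tree rooted at root."""
    matches, visited = [], 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            visited += 1                      # assumption: skipped files count too
            ext = os.path.splitext(name)[1].lower()
            if ext not in text_exts:
                continue                      # presumed binary: skip
            try:
                # encoding=None means the platform's default encoding
                with open(path, encoding=encoding) as f:
                    if text in f.read():      # simple string membership test
                        matches.append(path)
            except (UnicodeDecodeError, OSError):
                continue                      # unreadable or undecodable: ignore
    return matches, visited
```

A call such as search_tree('.', 'mimetypes') returns the list of matching paths along with the number of files seen, roughly the two figures the original script reports on its final line.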

Be sure to also see the coverage of regular expressions in Chapter 19. The search_all script here searches for a simple string in each file with the in string membership expression, but it would be trivial to extend it to search for a regular expression pattern match instead (roughly, just replace in with a call to a regular expression object’s search method). Of course, such a mutation will be much more trivial after we’ve learned how.

Also notice the textexts list in Example 6-17, which names the extensions of files to search as text, so that every other file is skipped as presumed binary. It would be more general and robust to use the mimetypes logic we will meet near the end of this chapter to guess a file’s content type from its name, but an explicit list provides more control and sufficed for the trees I used this script against.

Finally, note that for simplicity many of the directory searches in this chapter assume that text is encoded per the underlying platform’s Unicode default. They could open files in binary mode to avoid decoding errors, but searches might then be inaccurate because the same text can be stored as different raw bytes under different encoding schemes. To see how to do better, watch for the “grep” utility in Chapter 11’s PyEdit GUI, which will apply an encoding name to all the files in a searched tree and ignore those text or binary files that fail to decode.
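The three ideas in this note can be combined in one hedged sketch: match with a compiled regular expression's search method, guess text files by name with the standard mimetypes module, and open each file with an explicit encoding while ignoring those that fail to decode. The function names looks_like_text and regex_search_tree and the utf-8 default are assumptions for illustration. This is neither the book's search_all script nor PyEdit's grep.

```python
import mimetypes
import os
import re

def looks_like_text(path):
    """Guess from the filename alone whether a file holds text."""
    mimetype, compression = mimetypes.guess_type(path)
    # The second item names a compression wrapper (e.g. gzip), not a Unicode encoding
    return (mimetype is not None and compression is None
            and mimetype.startswith('text/'))

def regex_search_tree(root, pattern, encoding='utf-8'):
    """Return paths of text files under root whose content matches pattern."""
    regex = re.compile(pattern)
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not looks_like_text(path):
                continue                      # unknown or non-text type: skip
            try:
                with open(path, encoding=encoding) as f:
                    content = f.read()
            except (UnicodeDecodeError, OSError):
                continue                      # fails to decode or read: ignore
            if regex.search(content):         # replaces the "in" membership test
                matches.append(path)
    return matches
```

Guessing by name trades control for generality: a file with an unrecognized extension is skipped even if it holds text, which is the kind of case where an explicit extension list remains the more predictable choice.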

