This code loops through all the files at each level, looking for files whose names end in
.py and whose contents include the search string. When a match is found, its full
name is appended to the results list object. Alternatively, we could simply build a
list of all .py files and search each in a for loop after the walk. Since we’re going to code
a much more general solution to this type of problem in Chapter 6, though, we’ll let this
stand for now.
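For comparison, here is a minimal sketch of the two-step alternative just mentioned: collect every .py file during the walk, then search the collected files in a separate loop. The function names find_py_files and search_files are hypothetical, chosen only for this illustration, and the errors='ignore' setting is an assumption that undecodable bytes should simply be skipped.

```python
import os

def find_py_files(root):
    """Collect the full paths of all .py files at and below root."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith('.py'):
                matches.append(os.path.join(dirpath, name))
    return matches

def search_files(paths, text):
    """Return the paths whose contents include text (hypothetical helper)."""
    found = []
    for path in paths:
        # errors='ignore' is an assumption: skip bytes that fail to decode
        with open(path, errors='ignore') as file:
            if text in file.read():
                found.append(path)
    return found
```

Splitting the work this way costs an extra list, but the file list can be reused for several different searches without walking the tree again.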
If you want to see what’s really going on in the os.walk generator, call its __next__
method (or equivalently, pass it to the next built-in function) manually a few times,
just as the for loop does automatically; each time, you advance to the next subdirectory
in the tree:
>>> gen = os.walk(r'C:\temp\test')
>>> gen.__next__()
('C:\\temp\\test', ['parts'], ['random.bin', 'spam.txt', 'temp.bin', 'temp.txt'])
>>> gen.__next__()
('C:\\temp\\test\\parts', [], ['part0001', 'part0002', 'part0003', 'part0004'])
>>> gen.__next__()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
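The same stepping can be done with the next built-in function instead of the __next__ method. The following is a small sketch, not part of the original example set; the name first_levels is hypothetical. Passing a default as the second argument to next makes it return that value instead of raising StopIteration when the walk is exhausted.

```python
import os

def first_levels(root, count=2):
    """Advance os.walk by hand, returning up to count yielded tuples."""
    gen = os.walk(root)
    levels = []
    for _ in range(count):
        level = next(gen, None)    # same step as gen.__next__(), but no exception
        if level is None:          # None here means the walk has finished
            break
        levels.append(level)
    return levels
```

Each tuple in the result has the same (dirname, subdirs, files) form as the ones displayed in the interactive session above.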
The library manual documents os.walk further than we will here. For instance, it
supports bottom-up instead of top-down walks with its optional topdown=False argument,
and callers may prune tree branches by deleting names in the subdirectories lists of the
yielded tuples.
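To see the effect of the topdown argument, a short sketch like the following records the order in which directories are visited. The helper name walk_order is hypothetical and used only for illustration.

```python
import os

def walk_order(root, topdown=True):
    """Return the directory paths in the order that os.walk yields them."""
    return [dirpath
            for (dirpath, dirnames, filenames) in os.walk(root, topdown=topdown)]
```

With the default topdown=True, a directory is yielded before any of its subdirectories; with topdown=False, it is yielded only after all of them. Bottom-up order is useful for tasks such as deleting a tree, where a directory must be emptied before it can be removed.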
Internally, the os.walk call generates filename lists at each level with the os.listdir call
we met earlier, which collects both file and directory names in no particular order and
returns them without their directory paths (Python 3.5 and later use the related
os.scandir call for this step); os.walk segregates this list into subdirectories and files
(technically, nondirectories) before yielding a result. Also note that
walk uses the very same subdirectories list it yields to callers in order to later descend
into subdirectories. Because lists are mutable objects that can be changed in place, if
your code modifies the yielded subdirectory names list, it will impact what walk does
next. For example, deleting directory names will prune traversal branches, and sorting
the list will order the walk.
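As a rough sketch of this in-place technique, the following generator both prunes and orders a walk. The name pruned_walk and the default skip value '__pycache__' are assumptions made for this illustration only. Note that the subdirectories list must be changed in place (here, by slice assignment), and that this affects the traversal only in the default top-down mode.

```python
import os

def pruned_walk(root, skip=('__pycache__',)):
    """Walk root in sorted order, skipping directories named in skip."""
    for (dirpath, dirnames, filenames) in os.walk(root):
        # Slice assignment changes the list object that walk itself holds;
        # rebinding the name (dirnames = [...]) would not affect the walk.
        dirnames[:] = sorted(d for d in dirnames if d not in skip)
        yield dirpath, sorted(filenames)
```

Because walk descends using the same list object it yielded, removing a name keeps the walk out of that branch entirely, and sorting the list fixes the order in which the remaining subdirectories are visited.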
Recursive os.listdir traversals
The os.walk tool does the work of tree traversals for us; we simply provide loop code
with task-specific logic. However, it’s sometimes more flexible and hardly any more
work to do the walking ourselves. The following script recodes the directory listing
script with a manual recursive traversal function (a function that calls itself to repeat
its actions). The mylister function in Example 4-5 is almost the same as lister in
Example 4-4 but calls os.listdir to generate file paths manually and calls itself
recursively to descend into subdirectories.