A Quick Game of “Find the Biggest Python File”
Quick: what’s the biggest Python source file on your computer? This was the query
innocently posed by a student in one of my Python classes. Because I didn’t know either,
it became an official exercise in subsequent classes, and it provides a good example of
ways to apply Python system tools for a realistic purpose in this book. Really, the query
is a bit vague, because its scope is unclear. Do we mean the largest Python file in a
directory, in a full directory tree, in the standard library, on the module import search
path, or on your entire hard drive? Different scopes imply different solutions.
Scanning the Standard Library Directory
For instance, Example 6-1 is a first-cut solution that looks for the biggest Python file
in one directory—a limited scope, but enough to get started.
Example 6-1. PP4E\System\Filetools\bigpy-dir.py
"""
Find the largest Python source file in a single directory.
Search Windows Python source lib, unless dir command-line arg.
"""
import os, glob, sys
dirname = r'C:\Python31\Lib' if len(sys.argv) == 1 else sys.argv[1]
allsizes = []
allpy = glob.glob(dirname + os.sep + '*.py')
for filename in allpy:
    filesize = os.path.getsize(filename)
    allsizes.append((filesize, filename))
allsizes.sort()
print(allsizes[:2])
print(allsizes[-2:])
This script uses the glob module to run through a directory’s files and detects the largest
by storing sizes and names in a list that is sorted at the end. Because size appears first
in the list’s tuples, it dominates the ascending sort, and the largest file percolates
to the end of the list. We could instead keep track of the currently largest as we go, but
the list scheme is more flexible. When run, this script scans the Python standard
library’s source directory on Windows, unless you pass a different directory on the
command line, and it prints both the two smallest and the two largest files it finds:
C:\...\PP4E\System\Filetools> bigpy-dir.py
[(0, 'C:\\Python31\\Lib\\build_class.py'), (56, 'C:\\Python31\\Lib\\struct.py')]
[(147086, 'C:\\Python31\\Lib\\turtle.py'), (211238, 'C:\\Python31\\Lib\\decimal.py')]
C:\...\PP4E\System\Filetools> bigpy-dir.py .
[(21, '.\\__init__.py'), (461, '.\\bigpy-dir.py')]
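The alternative mentioned earlier, keeping track of the currently largest file as we go instead of sorting a list, might look something like the following sketch. The function name largest_py and the throwaway demo files are illustrative assumptions, not part of the book's examples. It relies on the same tuple ordering as the script above: size comes first, so it decides the comparison.

```python
import os
import tempfile

def largest_py(dirname):
    """Return (size, path) for the largest *.py file in dirname, or None."""
    best = None
    for name in os.listdir(dirname):
        path = os.path.join(dirname, name)
        if name.endswith('.py') and os.path.isfile(path):
            pair = (os.path.getsize(path), path)
            # Tuples compare item by item, so size is compared first
            if best is None or pair > best:
                best = pair
    return best

# Demo on a temporary directory holding hypothetical files
with tempfile.TemporaryDirectory() as tmp:
    samples = [('small.py', b'x = 1\n'),
               ('big.py', b'x = 1\n' * 100),
               ('notes.txt', b'z' * 5000)]      # not a .py file: ignored
    for name, data in samples:
        with open(os.path.join(tmp, name), 'wb') as f:
            f.write(data)
    size, path = largest_py(tmp)
    print(size, os.path.basename(path))
```

This version needs no list, but it can report only the single largest file. The sorted list in the script above also yields the smallest files and the runners-up from the same data.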