performed by the Python printer, not of Unicode file names in general. Because we have
no room for further exploration here, though, we’ll have to be satisfied with the fact
that our exception handler sidesteps the printing problem altogether. You should still
be aware of the implications of Unicode filename decoding, though; on some platforms
you may need to pass byte strings to os.walk in this script to prevent decoding errors
as filenames are created.*
Since Unicode is still relatively new in 3.1, be sure to test for such errors on your com-
puter and your Python. Also see also Python’s manuals for more on the treatment of
Unicode filenames, and the text Learning Python for more on Unicode in general. As
noted earlier, our scripts also had to open text files in binary mode because some might
contain undecodable content too. It might seem surprising that Unicode issues can crop
up in basic printing like this too, but such is life in the brave new Unicode world. Many
real-world scripts don’t need to care much about Unicode, of course—including those
we’ll explore in the next section.
Splitting and Joining Files
Like most kids, mine spent a lot of time on the Internet when they were growing up.
As far as I could tell, it was the thing to do. Among their generation, computer geeks
and gurus seem to have been held in the same sort of esteem that my generation once
held rock stars. When kids disappeared into their rooms, chances were good that they
were hacking on computers, not mastering guitar riffs (well, real ones, at least). It may
or may not be healthier than some of the diversions of my own misspent youth, but
that’s a topic for another kind of book.
Despite the rhetoric of techno-pundits about the Web’s potential to empower an up-
coming generation in ways unimaginable by their predecessors, my kids seemed to
spend most of their time playing games. To fetch new ones in my house at the time,
they had to download to a shared computer which had Internet access and transfer
those games to their own computers to install. (Their own machines did not have
Internet access until later, for reasons that most parents in the crowd could probably
expand upon.)
The problem with this scheme is that game files are not small. They were usually much
too big to fit on a floppy or memory stick of the time, and burning a CD or DVD took
away valuable game-playing time. If all the machines in my house ran Linux, this would
have been a nonissue. There are standard command-line programs on Unix for chop-
ping a file into pieces small enough to fit on a transfer device (split), and others for
- For a related print issue, see Chapter 14’s workaround for program aborts when printing stack tracebacks
to standard output from spawned programs. Unlike the problem described here, that issue does not appear
to be related to Unicode characters that may be unprintable in shell windows but reflects another regression
for standard output prints in general in Python 3.1, which may or may not be repaired by the time you read
this text. See also the Python environment variable PYTHONIOENCODING, which can override the default
encoding used for standard streams.
282 | Chapter 6: Complete System Programs