You can use the copy function’s verbose argument to trace the process if you wish. At
the time I wrote this edition in 2010, this test run copied a tree of 1,430 files and 185
directories in 10 seconds on my woefully underpowered netbook machine (the built-in
time.clock call is used to query the system time in seconds); it may run arbitrarily
faster or slower for you. Still, this is at least as fast as the best drag-and-drop I’ve timed
on this machine.
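If you'd like to time a copy of your own, the measurement can be sketched as follows. One caveat: time.clock was deprecated in Python 3.3 and removed in 3.8, so this sketch assumes a newer Python and swaps in time.perf_counter instead. The timed helper is a hypothetical name used for illustration only; it is not part of cpall.

```python
import time

def timed(func, *args, **kwargs):
    """Run func(*args, **kwargs); return (result, elapsed seconds)."""
    start = time.perf_counter()          # high-resolution timer
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

if __name__ == '__main__':
    # stand-in workload; a real run would time the tree copy call instead
    total, secs = timed(sum, range(1000000))
    print('Result', total, 'computed in', secs, 'seconds')
```

Wrapping the call this way keeps the timing logic out of the copy code itself, so the same helper can time any other function you care to measure.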
So how does this script work around bad files on a CD backup? The secret is that it
catches and ignores file exceptions, and it keeps walking. To copy all the files that are
good on a CD, I simply run a command line such as this one:
C:\...\PP4E\System\Filetools> python cpall.py G:\Examples C:\PP3E\Examples
Because the CD is addressed as “G:” on my Windows machine, this is the command-line
equivalent of drag-and-drop copying from an item in the CD’s top-level folder,
except that the Python script will recover from errors on the CD and get the rest. On
copy errors, it prints a message to standard output and continues; for big copies, you’ll
probably want to redirect the script’s output to a file for later inspection.
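The catch-and-continue idea described above can be sketched roughly like this. It is a minimal illustration rather than the cpall script itself: the copy_tree_tolerant name, its return value, and the message format are all assumptions made for the example.

```python
import os
import shutil

def copy_tree_tolerant(dir_from, dir_to):
    """Copy dir_from to dir_to, skipping items that fail to copy.
    Returns (number of files copied, list of paths that failed)."""
    copied, errors = 0, []
    os.makedirs(dir_to, exist_ok=True)
    for name in os.listdir(dir_from):
        path_from = os.path.join(dir_from, name)
        path_to = os.path.join(dir_to, name)
        try:
            if os.path.isdir(path_from):
                # recur into subdirectories and merge their counts
                sub_copied, sub_errors = copy_tree_tolerant(path_from, path_to)
                copied += sub_copied
                errors.extend(sub_errors)
            else:
                shutil.copyfile(path_from, path_to)
                copied += 1
        except OSError as exc:
            # report the bad item on standard output, then keep walking
            print('Error copying', path_from, '--skipped:', exc)
            errors.append(path_from)
    return copied, errors
```

Because each item is copied inside its own try statement, one unreadable file costs only that file; the loop simply moves on to the next name in the directory.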
In general, cpall can be passed any absolute directory path on your machine, even those
that indicate devices such as CDs. To make this go on Linux, try the directory where
your CD is mounted, such as /mnt/cdrom or something similar. Once you’ve copied a
tree this way, you still might want to verify; to see how, let’s move on to the next
example.
Comparing Directory Trees
Engineers can be a paranoid sort (but you didn’t hear that from me). At least I am. It
comes from decades of seeing things go terribly wrong, I suppose. When I create a CD
backup of my hard drive, for instance, there’s still something a bit too magical about
the process to trust the CD writer program to do the right thing. Maybe I should, but
it’s tough to have a lot of faith in tools that occasionally trash files and seem to crash
my Windows machine every third Tuesday of the month. When push comes to shove,
it’s nice to be able to verify that data copied to a backup CD is the same as the original—
or at least to spot deviations from the original—as soon as possible. If a backup is ever
needed, it will be really needed.
Because data CDs are accessible as simple directory trees in the file system, we are once
again in the realm of tree walkers—to verify a backup CD, we simply need to walk its
top-level directory. If our script is general enough, we will also be able to use it to verify
other copy operations (e.g., downloaded tar files, hard-drive backups, and so
on). In fact, the combination of the cpall script of the prior section and a general tree
comparison would provide a portable and scriptable way to copy and verify data sets.
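As a rough preview of what such a verification might report, here is a sketch that leans on the standard library's filecmp module. It is not the comparison script developed in this section, and report_differences is a hypothetical name chosen for the example.

```python
import filecmp
import os

def report_differences(dir1, dir2):
    """Recursively compare two trees; return a list of difference notes."""
    notes = []
    # ignore=[] so that no names are skipped by dircmp's default ignore list
    comparison = filecmp.dircmp(dir1, dir2, ignore=[])
    for name in comparison.left_only:
        notes.append('only in %s: %s' % (dir1, name))
    for name in comparison.right_only:
        notes.append('only in %s: %s' % (dir2, name))
    # shallow=False forces a byte-by-byte check instead of trusting os.stat data
    match, mismatch, errors = filecmp.cmpfiles(
        dir1, dir2, comparison.common_files, shallow=False)
    for name in mismatch + errors:
        notes.append('differs or unreadable: %s' % os.path.join(dir1, name))
    # descend into subdirectories present on both sides
    for name in comparison.common_dirs:
        notes.extend(report_differences(os.path.join(dir1, name),
                                        os.path.join(dir2, name)))
    return notes
```

An empty result list means the two trees matched as far as this sketch can tell; anything else names the items worth a closer look.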
We’ve already studied generic directory tree walkers, but they won’t help us here directly:
we need to walk two directories in parallel and inspect common files along the
way. Moreover, walking either one of the two directories won’t allow us to spot files