97 Things Every Programmer Should Know


Collective Wisdom from the Experts


Unix tools were developed in an age when a multiuser computer had 128KB
of RAM. The ingenuity that went into their design means that nowadays they
can handle huge data sets extremely efficiently. Most tools work like filters,
processing just a single line at a time, meaning that there is no upper limit on
the amount of data they can handle. You want to search for the number of edits
stored in the half-terabyte English Wikipedia dump? A simple invocation of


grep '<revision>' | wc -l

will give you the answer without breaking a sweat. If you find a command sequence
generally useful, you can easily package it into a shell script, using some uniquely
powerful programming constructs, such as piping data into loops and conditionals.
Even more impressively, Unix commands executing as pipelines, like
the preceding one, will naturally distribute their load among the many
processing units of modern multicore CPUs.
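
As a rough sketch of what such packaging might look like, the functions below wrap the counting pipeline and then pipe a list of file names into a loop with a conditional. The names count_revisions and report_counts are invented for this illustration, and the input is assumed to be XML with one tag per line.

```shell
# count_revisions: the pipeline from the text, packaged for reuse.
# Reads text on standard input and prints the number of lines
# that contain the <revision> tag.
count_revisions() {
    # tr strips the padding that some wc implementations add
    grep '<revision>' | wc -l | tr -d ' '
}

# report_counts: reads file names, one per line, from standard input.
# The loop consumes the piped data; the conditional skips anything
# that is not a readable file and complains on standard error.
report_counts() {
    while IFS= read -r file; do
        if [ -r "$file" ]; then
            printf '%s %s\n' "$file" "$(count_revisions < "$file")"
        else
            printf '%s unreadable\n' "$file" >&2
        fi
    done
}
```

With these definitions loaded, something like ls *.xml | report_counts would print one name and count per line, ready to be piped into sort or another filter.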


The small-is-beautiful provenance and open source implementations of the
Unix tools make them ubiquitously available, even on resource-constrained
platforms, like my set-top media player or DSL router. Such devices are
unlikely to offer a powerful graphical user interface, but they often include the
BusyBox application, which provides the most commonly used tools. And if
you are developing on Windows, the Cygwin environment offers you all imaginable
Unix tools, both as executables and in source code form.


Finally, if none of the available tools matches your needs, it’s very easy to extend
the world of the Unix tools. Just write a program (in any language you fancy)
that plays by a few simple rules: your program should perform just a single
task; it should read data as text lines from its standard input; and it should
display its results unadorned by headers and other noise on its standard output.
Parameters affecting the tool’s operation are given on the command line.
Follow these rules, and “yours is the Earth and everything that’s in it.”
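
The rules above can be sketched in a few lines of shell. The filter below, longer_than, is a hypothetical tool made up for this illustration, not an existing command: it performs one task (keeping lines longer than a given length), reads text lines from standard input, writes bare results to standard output, and takes its single parameter on the command line.

```shell
# longer_than N: print only the input lines longer than N characters.
# One task, lines in on standard input, unadorned lines out on
# standard output, and the parameter N given on the command line.
longer_than() {
    min=${1:?usage: longer_than N}
    # The extra test keeps a final line that lacks a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        if [ "${#line}" -gt "$min" ]; then
            printf '%s\n' "$line"
        fi
    done
}
```

Because it prints nothing but the selected lines, it should combine with the standard tools without any glue; longer_than 80 < file | wc -l, for instance, would count the lines wider than 80 characters.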
