
192 Programming


a_function()
{
    char c,buff[80];
    int i = 0;

    while ((c = getchar()) != '\n')
        buff[i++] = c;
    buff[i] = '\000';
    do_it(buff);
}

Code like this litters Unix. Note how the stack buffer is 80 characters
long, because most Unix files only have lines that are 80 characters long.
Note also how there is no bounds check before a new character is stored in
the character array and no test for an end-of-file condition. The bounds
check is probably missing because the programmer likes how the assignment
statement (c = getchar()) is embedded in the loop conditional of the
while statement. There is no room to check for end-of-file because that line
of code is already testing for the end of a line. Believe it or not, some
people actually praise C for just this kind of terseness, understandability and
maintainability be damned! Finally, do_it is called, and the character array
suddenly becomes a pointer, which is passed as the first function argument.
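
For contrast, here is a minimal sketch of what the two missing checks might look like. The helper name read_line, its signature, and its return convention are assumptions made for illustration; they do not come from the original function. The sketch keeps the result of getc in an int so that EOF can be told apart from real characters, and it tests the index against the buffer size before every store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not from the original text. Reads one line from
 * `in` into `buf`, keeping at most size - 1 characters plus a NUL.
 * Characters that do not fit are read and discarded, so the next call
 * starts on a fresh line. Returns the number of characters kept, or -1
 * if end-of-file (or a read error) arrives before anything is read. */
long read_line(FILE *in, char *buf, size_t size)
{
    int c;                  /* int, not char, so EOF stays distinct */
    size_t stored = 0;      /* characters kept in buf */
    size_t consumed = 0;    /* characters read from the stream */

    assert(in != NULL && buf != NULL && size > 0);

    while ((c = getc(in)) != EOF && c != '\n') {
        consumed++;
        if (stored < size - 1)      /* bounds check before each store */
            buf[stored++] = (char)c;
    }
    buf[stored] = '\0';

    if (c == EOF && consumed == 0)
        return -1;                  /* nothing left to read */
    return (long)stored;
}
```

Silently truncating an over-long line is only one possible policy; a caller might prefer an error return or a buffer that grows. The point is that both tests fit comfortably in the same loop.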

Exercise for the reader: What happens to this function when an end-of-file
condition occurs in the middle of a line of input?

When Unix users discover these built-in limits, they tend not to think that
the bugs should be fixed. Instead, users develop ways to cope with the
situation. For example, tar, the Unix “tape archiver,” can’t deal with path
names longer than 100 characters (including directories). Solution: don’t
use tar to archive directories to tape; use dump. Better solution: Don’t use
deep subdirectories, so that a file’s absolute path name is never longer than
100 characters. The ultimate example of careless Unix programming will
probably occur at 10:14:07 p.m. Eastern Standard Time on January 18, 2038
(03:14:07 UTC on January 19), when Unix’s signed 32-bit timeval field
overflows...
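
The date can be checked with a few lines of C. This is a sketch under stated assumptions: the helper name last_32bit_second is hypothetical, and it only asks the standard library which UTC calendar time corresponds to the largest value a signed 32-bit seconds counter can hold.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper, not from the original text. Fills `out` with the
 * UTC calendar time of the last second a signed 32-bit count of seconds
 * since January 1, 1970 can represent. Returns 0 on success, -1 if the
 * library cannot convert the value. */
int last_32bit_second(struct tm *out)
{
    time_t t = (time_t)INT32_MAX;   /* 2147483647 seconds */
    struct tm *utc = gmtime(&t);

    if (utc == NULL)
        return -1;
    *out = *utc;                    /* copy out of gmtime's static storage */
    return 0;
}
```

On return, the fields describe January 19, 2038, 03:14:07 UTC (tm_year counts from 1900 and tm_mon from 0). Five hours west of that, in Eastern Standard Time, it is still the evening of January 18.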

To continue with our example, let’s imagine that our function is called
upon to read a line of input that is 85 characters long. The function will
read the 85 characters with no problem, but where do the last 5 characters
end up? The answer is that they end up scribbling over whatever happened
to be in the 5 bytes right after the character array. What was there before?
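
The arithmetic can be modeled without actually corrupting anything. The sketch below is an illustration under stated assumptions, not a picture of any real stack layout: the name overrun_bytes is hypothetical, and a single flat array stands in for the 80-byte buffer plus whatever happens to sit next to it. The copy loop mirrors the unchecked loop in the function; the only guard is one that keeps the model itself inside its own array.

```c
#include <stddef.h>

#define BUFF_SIZE 80    /* size of buff in the function under discussion */

/* Hypothetical model, not from the original text. `mem` (at least one
 * byte long) plays the role of buff followed by its neighbors. Copies
 * `line` up to its newline with no check against BUFF_SIZE, then writes
 * the terminating NUL. Returns how many bytes landed at or beyond
 * offset BUFF_SIZE, that is, outside the pretend buffer. */
size_t overrun_bytes(unsigned char *mem, size_t memsize, const char *line)
{
    size_t i = 0;

    /* i + 1 < memsize protects the model only; the original has no guard */
    while (line[i] != '\n' && line[i] != '\0' && i + 1 < memsize) {
        mem[i] = (unsigned char)line[i];
        i++;
    }
    mem[i] = '\0';                  /* the terminator is written too */

    size_t written = i + 1;
    return written > BUFF_SIZE ? written - BUFF_SIZE : 0;
}
```

For an 85-character line the model reports 6 bytes outside the buffer: the last 5 characters, plus the terminating NUL that the function stores after them.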

The two variables, c and i, might be allocated right after the character array
and therefore might be corrupted by the 85-character input line. What
about an 850-character input line? It would probably overwrite important