B.D. McCullough 1297
represented as 0000, one is represented as 0001, two is represented as 0010, and
three is represented as 0011. The biggest number that can be represented is 15,
which is represented as 1111 = 1000 (8) + 0100 (4) + 0010 (2) + 0001 (1).
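The four-bit scheme can be checked directly. The following is only an illustrative sketch in Python; the helper name `to_four_bits` is hypothetical and is not part of the original discussion.

```python
def to_four_bits(n: int) -> str:
    """Return the 4-bit binary string for an integer between 0 and 15."""
    if not 0 <= n <= 15:
        raise ValueError("a 4-bit word can only hold the integers 0 through 15")
    return format(n, "04b")

# Zero through three, as in the text.
for n in range(4):
    print(n, to_four_bits(n))

# The largest representable value: all four bits set.
largest = 0b1000 + 0b0100 + 0b0010 + 0b0001
print(largest, to_four_bits(largest))  # 15 1111
```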
With a 32-bit word, rather than devote all 32 bits to representing powers of
two, we use some for a mantissa and others for an exponent (which can be either
positive or negative) to which the number two can be raised. This two raised to the
exponent is then multiplied by the mantissa. Such a scheme provides for a wider
range of representable numbers. This single-precision scheme has one sign bit,
eight bits for the exponent (which can range from −126 to 127) and 23 bits for the
mantissa. The smallest mantissa is 22 zeros followed by a one, which equals one,
and the smallest exponent is −126, so the smallest positive number that can be represented
is $1 \times 2^{-126} \approx 1.2 \times 10^{-38}$. Similarly, the largest number that can be represented
is $(2 - 2^{-23}) \times 2^{127} \approx 3.4 \times 10^{38}$. If two words are chained together to permit a
larger exponent and larger mantissa, then this is called “double precision.”
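These two limits can be reproduced by assembling the bit patterns by hand. The sketch below assumes the IEEE 754 single-precision layout (one sign bit, eight exponent bits stored with a bias of 127, and 23 fraction bits with an implied leading one), which the text does not name explicitly. The helper `float32_from_bits` is a hypothetical name introduced for illustration.

```python
import struct

def float32_from_bits(bits: int) -> float:
    """Interpret a 32-bit unsigned integer as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

# Exponent field 1 (that is, 1 - 127 = -126) with all fraction bits zero:
# the value is 1 x 2^-126.
smallest_normal = float32_from_bits(0x00800000)

# Exponent field 254 (that is, 254 - 127 = 127) with all fraction bits one:
# the value is (2 - 2^-23) x 2^127.
largest_finite = float32_from_bits(0x7F7FFFFF)

print(f"{smallest_normal:.1e}")  # 1.2e-38
print(f"{largest_finite:.1e}")   # 3.4e+38
```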
Because a computer represents numbers in base-2, it cannot accurately represent
all the real numbers. For example, the number 0.5 can be represented exactly,
since it equals 1/2, which is $2^{-1}$. The number 0.1 cannot be represented exactly.
The binary representation of the real number 0.1 is given by 0.0001100110011...
where the 0011 repeats infinitely. With a finite word length, this infinite sequence
must be truncated, and when it is truncated and converted back to base-10, we get
0.099999994. For a quick overview of this topic, see McCullough and Vinod (1999,
sec. 2.1). For a much more detailed, yet still very accessible discussion of computer
arithmetic, see Goldberg (1991).
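The figure 0.099999994 can be recovered with exact rational arithmetic. The sketch below assumes 24 significant binary digits (the 23 stored mantissa bits plus an implied leading one), so that for 0.1, which equals 1.6 × 2^-4, the last retained bit has weight 2^-27. The variable names are illustrative only.

```python
from fractions import Fraction
import math

tenth = Fraction(1, 10)   # the exact real number 0.1
scale = 2 ** 27           # weight of the last retained bit is 2^-27

# Chop off every binary digit beyond the 24th significant bit.
truncated = Fraction(math.floor(tenth * scale), scale)

# For comparison: round to the nearest representable value instead.
rounded = Fraction(round(tenth * scale), scale)

print(bin(math.floor(tenth * scale)))  # the retained bits: 1100 repeated
print(f"{float(truncated):.9f}")       # 0.099999994
print(f"{float(rounded):.9f}")         # 0.100000001
```

Truncation gives a value slightly below 0.1, while rounding to nearest gives one slightly above it. Neither equals 0.1.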
This small difference between 0.1 and 0.099999994 is an example of rounding
error. A similar type of inaccuracy is called truncation error, an example of which is
the calculation of sin(x) by infinite series. In a computer, the series cannot be
cumulated infinitely, and must be terminated at some point. The difference between
the terminated sum and the infinite sum is truncation error. Like a rounding error,
it can be very small. Some calculations, e.g., matrix inversion, can require
millions of operations, and these small rounding and/or truncation errors can add up.
Eventually, they can swamp all the accurate digits, producing a final answer that
is completely inaccurate. This is very probably what happened in Longley’s paper.
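Both kinds of error are easy to exhibit. The sketch below is an illustration under stated assumptions, not a reconstruction of any particular package's arithmetic. Truncation error is shown with a Taylor series for sin(x) cut off after a fixed number of terms. Accumulated rounding error is shown by adding 0.1 one million times while rounding every intermediate sum to single precision. The helpers `sin_series` and `to_float32` are hypothetical names.

```python
import math
import struct

def sin_series(x: float, n_terms: int) -> float:
    """Partial sum of the Taylor series for sin(x) using n_terms terms."""
    total = 0.0
    for k in range(n_terms):
        total += (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
    return total

def to_float32(value: float) -> float:
    """Round a double-precision value to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", value))[0]

# Truncation error: stopping the series early leaves a gap that shrinks
# as more terms are included.
trunc_err_3 = abs(sin_series(1.0, 3) - math.sin(1.0))
trunc_err_10 = abs(sin_series(1.0, 10) - math.sin(1.0))
print("truncation error, 3 terms: ", trunc_err_3)
print("truncation error, 10 terms:", trunc_err_10)

# Accumulated rounding error: each addition is rounded to single precision,
# and the individually tiny errors pile up over a million operations.
n = 1_000_000
step = to_float32(0.1)
total32 = 0.0
for _ in range(n):
    total32 = to_float32(total32 + step)
accum_err = abs(total32 - 100000.0)
print("single-precision sum of 0.1, one million times:", total32)
print("accumulated error:", accum_err)
```

The error in a single addition is far below anything visible in printed output, yet the final sum differs from 100,000 well before the decimal point's last few places.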
28.3 Introductory tests
In 1985, Leland Wilkinson, developer of the SYSTAT statistical software package,
produced a pamphlet describing some simple tests of software accuracy. The
primary documents for understanding and applying Wilkinson tests are Wilkinson
(1985) and McCullough (2004a). The tests are all based on a dataset that he called
the “Nasty” dataset, which is reproduced in Table 28.3.
The Wilkinson tests are fully described in Wilkinson (1985). They have been
applied by Sawitzki (1994a, 1994b), Bankhofer and Hilbert (1997a, 1997b),
McCullough (2004a) and Choi and Kiefer (2005). These papers usually report errors in
packages. For example, some packages could not accurately compute the sample
standard deviation for either BIG or LITTLE. What this reveals is that the packages