Programming and Problem Solving with Java

Multidimensional Arrays and Numeric Computation
Here is the result of adding x to the sum of y and z:

(y)       1325000 × 10^0
(z)          5424 × 10^0
          1330424 × 10^0 = 1330 × 10^3 (truncated to four digits)

(y + z)      1330 × 10^3
(x)         -1324 × 10^3
                6 × 10^3 = 6000 × 10^0 ← x + (y + z)
These two answers are the same in the thousands place but are different
thereafter. This discrepancy results from a representational error.
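
The effect of grouping can be reproduced with a short sketch. It assumes a hypothetical decimal machine that keeps four significant digits and truncates the rest, which we simulate with BigDecimal and a MathContext. The class name, method names, and constant names are ours, chosen only for illustration. The values of x, y, and z are taken from the example above.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.math.RoundingMode;

class FourDigitArithmetic {
    // Assumption: four significant decimal digits, extra digits truncated.
    static final MathContext FOUR_DIGITS = new MathContext(4, RoundingMode.DOWN);

    static final BigDecimal X = new BigDecimal("-1324000"); // -1324 x 10^3
    static final BigDecimal Y = new BigDecimal("1325000");  //  1325 x 10^3
    static final BigDecimal Z = new BigDecimal("5424");     //  5424 x 10^0

    // (x + y) + z, truncating to four digits after each addition
    static BigDecimal leftGrouping() {
        return X.add(Y, FOUR_DIGITS).add(Z, FOUR_DIGITS);
    }

    // x + (y + z), truncating to four digits after each addition
    static BigDecimal rightGrouping() {
        return X.add(Y.add(Z, FOUR_DIGITS), FOUR_DIGITS);
    }

    public static void main(String[] args) {
        System.out.println("(x + y) + z = " + leftGrouping().toPlainString());
        System.out.println("x + (y + z) = " + rightGrouping().toPlainString());
    }
}
```

Under these assumptions the first grouping keeps all of z and yields 6424, while the second loses the low-order digits of z when y + z is truncated and yields 6000.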
Representational error makes it unwise to use a floating-point variable as a
loop control variable. Because precision may be lost in calculations involving
floating-point numbers, it is difficult to predict when (or even if) a loop
control variable of type float (or double) will equal the termination value.
As a consequence, a count-controlled loop with a floating-point control
variable can behave unpredictably.
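
As a rough illustration, the sketch below compares a loop controlled by a double with one controlled by an int. The class and method names are ours, and the maxSteps parameter is a safety cap we added so that the demonstration always terminates. It is not part of the technique itself.

```java
class LoopControlDemo {
    // Loop controlled by a double. Because 0.1 has no exact binary
    // representation, x never equals 1.0 exactly, so only the
    // (hypothetical) safety cap maxSteps stops the loop.
    static int floatingControlledSteps(int maxSteps) {
        int steps = 0;
        for (double x = 0.0; x != 1.0 && steps < maxSteps; x += 0.1) {
            steps++;
        }
        return steps;
    }

    // Loop controlled by an int. The floating-point value is derived
    // from the counter, so the iteration count is always exactly 10.
    static int integerControlledSteps() {
        int steps = 0;
        for (int i = 0; i < 10; i++) {
            double x = i * 0.1; // value to work with in the loop body
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println("double-controlled: " + floatingControlledSteps(1000));
        System.out.println("int-controlled:    " + integerControlledSteps());
    }
}
```

With a cap of 1000, the double-controlled loop runs until it reaches the cap, whereas the int-controlled loop stops after 10 iterations.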
Also because of representational errors, you should never compare
floating-point numbers for exact equality. Rarely are two floating-point
numbers exactly equal, and thus you should compare them only for near
equality. If the difference between the two numbers is less than some
acceptable small value, you can consider them equal for the purposes of the
given problem.
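
A comparison for near equality might be sketched as follows. The method name nearlyEqual and the tolerance of 1e-9 are our own choices for illustration. An appropriate tolerance depends on the problem at hand.

```java
class NearEquality {
    // Returns true when a and b differ by less than the given tolerance.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) < tolerance;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2;

        // Exact comparison: false, because the sum carries a small
        // representational error.
        System.out.println(sum == 0.3);

        // Near-equality comparison: true for this tolerance.
        System.out.println(nearlyEqual(sum, 0.3, 1e-9));
    }
}
```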

Implementation of Floating-Point Numbers in the Computer

All computers limit the precision of floating-point numbers, although modern machines use binary rather than decimal arithmetic. In our representation, we used only five digits to simplify the examples. In fact, some computers really are limited to only four or five digits of precision. Other systems provide 6, 15, and 19 significant digits, respectively, for three sizes of floating-point types. We have shown only a single-digit exponent, but most systems allow two digits for the smaller floating-point type and up to four-digit exponents for a longer type. Some languages leave the range and precision of floating-point types to each individual compiler. Java, however, states the range and precision in the language specification in the following formula:

s × m × 2^e

where s is +1 or -1, m is a positive integer less than 2^24, and e is between -126 and 127, inclusive, for values of type float. For values of type double, m is less than 2^53 and e is between -1,022 and 1,023. No, we don’t expect you to calculate this value. Each Java numeric class (such as Integer or Double) provides two constants, MAX_VALUE and MIN_VALUE, that contain those values. When you declare a floating-point variable, part of the memory location contains the exponent, and the number itself (called the mantissa) is assumed to be in the balance of the

Representational error: An arithmetic error that occurs when the precision of the true result of an arithmetic operation is greater than the precision of the machine
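
The MAX_VALUE and MIN_VALUE constants mentioned in this section can be inspected with a sketch like the one below. The class and method names are ours. One point worth checking for yourself: for the floating-point classes Float and Double, MIN_VALUE is the smallest positive nonzero value, not the most negative one, whereas for Integer it is the most negative value.

```java
class RangeConstants {
    // For Double, MIN_VALUE is the smallest positive nonzero value.
    static boolean doubleMinValueIsPositive() {
        return Double.MIN_VALUE > 0.0;
    }

    // For Integer, MIN_VALUE is the most negative value.
    static boolean integerMinValueIsNegative() {
        return Integer.MIN_VALUE < 0;
    }

    public static void main(String[] args) {
        System.out.println("Float.MAX_VALUE   = " + Float.MAX_VALUE);
        System.out.println("Float.MIN_VALUE   = " + Float.MIN_VALUE);
        System.out.println("Double.MAX_VALUE  = " + Double.MAX_VALUE);
        System.out.println("Double.MIN_VALUE  = " + Double.MIN_VALUE);
        System.out.println("Integer.MAX_VALUE = " + Integer.MAX_VALUE);
        System.out.println("Integer.MIN_VALUE = " + Integer.MIN_VALUE);
    }
}
```

The most negative finite double is therefore -Double.MAX_VALUE.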
