CHAPTER 22. UNIONS
ARM64 has no instruction that can add a number to an FPU D-register, so the input value (which arrives in D0) is first copied into a
GPR, incremented, and copied to the FPU register D1; only then does the subtraction occur.
Listing 22.6: Optimizing GCC 4.9 ARM64
calculate_machine_epsilon:
fmov x0, d0 ; load input value of double type into X0
add x0, x0, 1 ; X0++
fmov d1, x0 ; move it to FPU register
fsub d0, d1, d0 ; subtract
ret
See also this example compiled for x64 with SIMD instructions: 27.4 on page 421.
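The four instructions in the ARM64 listing can be read back into C roughly as follows. This is a minimal sketch, assuming that double is IEEE 754 binary64 and that the original function is built around a union of this kind; the names union_u64_double and epsilon_via_gpr are hypothetical and are used here only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical union: the same 8 bytes viewed as an integer or as a double. */
union union_u64_double
{
    uint64_t i;
    double d;
};

/* Hypothetical helper mirroring the ARM64 listing step by step.
   Assumes double is IEEE 754 binary64. */
double epsilon_via_gpr(double start)
{
    union union_u64_double v;

    assert(sizeof v.i == sizeof v.d);
    v.d = start;          /* fmov x0, d0: the bits of the double now sit in an integer */
    v.i++;                /* add x0, x0, 1: step to the next representable value */
    return v.d - start;   /* fmov d1, x0 and fsub d0, d1, d0 */
}
```

For an input of 1.0 the increment changes only the lowest bit of the mantissa, so the function should return 2^-52, which is the value of DBL_EPSILON on IEEE 754 platforms.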
22.2.3 MIPS.
The new instruction here is MTC1 (“Move To Coprocessor 1”); it just transfers data from a GPR to the FPU’s registers.
Listing 22.7: Optimizing GCC 4.4.5 (IDA)
calculate_machine_epsilon:
mfc1 $v0, $f12
or $at, $zero ; NOP
addiu $v1, $v0, 1
mtc1 $v1, $f2
jr $ra
sub.s $f0, $f2, $f12 ; branch delay slot
22.2.4 Conclusion.
It’s hard to say whether anyone needs this trickery in real-world code, but, as has been mentioned many times in this book,
this example serves well for explaining the IEEE 754 format and unions in C/C++.
22.3 Fast square root calculation.
Another well-known algorithm where a float is interpreted as an integer is the fast calculation of a square root.
Listing 22.8: The source code is taken from Wikipedia: http://go.yurichev.com/17364
/* Assumes that float is in the IEEE 754 single precision floating point format
 * and that int is 32 bits. */
float sqrt_approx(float z)
{
    int val_int = *(int*)&z; /* Same bits, but as an int */
    /*
     * To justify the following code, prove that
     *
     * ((((val_int / 2^m) - b) / 2) + b) * 2^m = ((val_int - 2^m) / 2) + ((b + 1) / 2) * 2^m
     *
     * where
     *
     * b = exponent bias
     * m = number of mantissa bits
     */
    val_int -= 1 << 23; /* Subtract 2^m. */
    val_int >>= 1; /* Divide by 2. */
    val_int += 1 << 29; /* Add ((b + 1) / 2) * 2^m. */
    return *(float*)&val_int; /* Interpret again as float */
}
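The constants in the listing follow from the single precision layout: with m = 23 mantissa bits and an exponent bias of b = 127, the term ((b + 1) / 2) * 2^m is 64 * 2^23 = 2^29, hence 1 << 29. The variant below is a sketch for experimenting with the function, not the Wikipedia code itself: it assumes IEEE 754 binary32 floats and positive normal inputs, and it copies the bits with memcpy() instead of a pointer cast. The name sqrt_approx_memcpy is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of sqrt_approx().
   Assumes float is IEEE 754 binary32 and z is a positive normal number. */
float sqrt_approx_memcpy(float z)
{
    int32_t val_int;
    float result;

    assert(sizeof val_int == sizeof z);
    memcpy(&val_int, &z, sizeof val_int);      /* same bits, but as an integer */

    val_int -= 1 << 23;   /* subtract 2^m, m = 23 */
    val_int >>= 1;        /* divide by 2 */
    val_int += 1 << 29;   /* add ((b + 1) / 2) * 2^m = 64 * 2^23, b = 127 */

    memcpy(&result, &val_int, sizeof result);  /* interpret again as float */
    return result;
}
```

Even powers of two come out exact, for example 4.0f gives 2.0f and 16.0f gives 4.0f. For 2.0f the result is 1.5f, while the true square root is about 1.4142, so the error there is roughly 6%.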