1.19. FLOATING-POINT UNIT
Thus, the conditional jump instructions listed here can be used after an FNSTSW/SAHF instruction pair.
Apparently, the FPU C3/C2/C0 status bits were placed there intentionally, so that they map easily to the base CPU
flags without additional permutations?
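The mapping can be sketched in C. This is only a rough model, not real FPU code: the names fnstsw_sahf and cpu_flags are made up for illustration, and the bit positions are assumed from Intel's documentation of the FPU status word (C0 is bit 8, C2 is bit 10, C3 is bit 14) and of SAHF (CF is bit 0, PF is bit 2, ZF is bit 6 of AH).

```c
#include <stdbool.h>
#include <stdint.h>

/* Condition code bits in the 16-bit FPU status word (assumed layout). */
#define FPU_SW_C0 (1u << 8)
#define FPU_SW_C2 (1u << 10)
#define FPU_SW_C3 (1u << 14)

/* Flag bits that SAHF takes from AH (assumed layout). */
#define FLAG_CF (1u << 0)
#define FLAG_PF (1u << 2)
#define FLAG_ZF (1u << 6)

/* Hypothetical container for the three flags of interest. */
struct cpu_flags {
    bool cf, pf, zf;
};

/* Model of "fnstsw ax" followed by "sahf":
   FNSTSW puts the status word into AX, so AH is its high byte,
   and SAHF copies bits of AH into the CPU flags. */
static struct cpu_flags fnstsw_sahf(uint16_t status_word)
{
    uint8_t ah = (uint8_t)(status_word >> 8);
    struct cpu_flags f;

    f.cf = (ah & FLAG_CF) != 0; /* C0 lands in CF */
    f.pf = (ah & FLAG_PF) != 0; /* C2 lands in PF */
    f.zf = (ah & FLAG_ZF) != 0; /* C3 lands in ZF */
    return f;
}
```

Under these assumptions no shifting or masking beyond taking the high byte is needed, which is what makes the conditional jumps usable right after SAHF.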
GCC 4.8.1 with -O3 optimization turned on
Some new FPU instructions were added in the P6 Intel family^127. These are FUCOMI (compare operands
and set flags of the main CPU) and FCMOVcc (works like CMOVcc, but on FPU registers).
Apparently, the maintainers of GCC decided to drop support for pre-P6 Intel CPUs (early Pentiums, 80486,
etc.).
Also, the FPU is no longer a separate unit in the P6 Intel family, so now it is possible to modify/check flags
of the main CPU from the FPU.
So what we get is:
Listing 1.214: Optimizing GCC 4.8.1
fld QWORD PTR [esp+4] ; load "a"
fld QWORD PTR [esp+12] ; load "b"
; ST0=b, ST1=a
fxch st(1)
; ST0=a, ST1=b
; compare "a" and "b"
fucomi st, st(1)
; copy ST1 ("b" here) to ST0 if a<=b
; leave "a" in ST0 otherwise
fcmovbe st, st(1)
; discard value in ST1
fstp st(1)
ret
It is hard to guess why FXCH (swap operands) is here.
It’s possible to get rid of it easily by swapping the first two FLD instructions or by replacing FCMOVBE (below
or equal) with FCMOVA (above). Probably it’s a compiler inaccuracy.
So FUCOMI compares ST(0) (a) and ST(1) (b) and then sets some flags in the main CPU. FCMOVBE checks
the flags and copies ST(1) (b here at the moment) to ST(0) (a here) if ST0 (a) <= ST1 (b). Otherwise (a>b),
it leaves a in ST(0).
The last FSTP leaves ST(0) on top of the stack, discarding the contents of ST(1).
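To follow the sequence step by step, it can be modeled in C with a two-element array standing in for the FPU stack. This is a rough sketch, not compiler output: the names cpu_flags, fucomi and d_max_model are made up for illustration, and the flag results of FUCOMI (all clear for "greater", CF for "less", ZF for "equal", all three set for unordered operands) are assumed from Intel's documentation.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical container for the flags FUCOMI sets in the main CPU. */
struct cpu_flags {
    bool zf, pf, cf;
};

/* Model of FUCOMI: compare ST(0) with ST(i) (assumed flag encoding). */
static struct cpu_flags fucomi(double st0, double sti)
{
    struct cpu_flags f = { false, false, false }; /* ST(0) > ST(i) */

    if (isnan(st0) || isnan(sti)) {
        f.zf = f.pf = f.cf = true;                /* unordered */
    } else if (st0 < sti) {
        f.cf = true;                              /* ST(0) < ST(i) */
    } else if (st0 == sti) {
        f.zf = true;                              /* ST(0) == ST(i) */
    }
    return f;
}

/* Model of the listing; st[0] plays the role of ST(0), st[1] of ST(1). */
static double d_max_model(double a, double b)
{
    double st[2];
    double tmp;
    struct cpu_flags f;

    st[0] = a;                 /* fld "a": ST0=a */
    st[1] = st[0];             /* fld "b" pushes: old ST0 becomes ST1 */
    st[0] = b;                 /* ST0=b, ST1=a */

    tmp = st[0];               /* fxch st(1): swap ST0 and ST1 */
    st[0] = st[1];
    st[1] = tmp;               /* ST0=a, ST1=b */

    f = fucomi(st[0], st[1]);  /* fucomi st, st(1) */

    if (f.cf || f.zf)          /* fcmovbe st, st(1): "below or equal" */
        st[0] = st[1];

    st[1] = st[0];             /* fstp st(1): store ST0 into ST1 ... */
    st[0] = st[1];             /* ... then pop, so that value is the new ST0 */
    return st[0];
}
```

For example, d_max_model(1.2, 3.4) takes the FCMOVBE branch and returns 3.4, while d_max_model(5.6, -4) skips it and returns 5.6. In this model an unordered comparison also takes the FCMOVBE branch, so "b" is returned when either operand is NaN.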
Let’s trace this function in GDB:
Listing 1.215: Optimizing GCC 4.8.1 and GDB
1 dennis@ubuntuvm:~/polygon$ gcc -O3 d_max.c -o d_max -fno-inline
2 dennis@ubuntuvm:~/polygon$ gdb d_max
3 GNU gdb (GDB) 7.6.1-ubuntu
4 ...
5 Reading symbols from /home/dennis/polygon/d_max...(no debugging symbols found)...done.
6 (gdb) b d_max
7 Breakpoint 1 at 0x80484a0
8 (gdb) run
9 Starting program: /home/dennis/polygon/d_max
10
11 Breakpoint 1, 0x080484a0 in d_max ()
12 (gdb) ni
13 0x080484a4 in d_max ()
14 (gdb) disas $eip
15 Dump of assembler code for function d_max:
16 0x080484a0 <+0>: fldl 0x4(%esp)
17 => 0x080484a4 <+4>: fldl 0xc(%esp)
18 0x080484a8 <+8>: fxch %st(1)
19 0x080484aa <+10>: fucomi %st(1),%st
20 0x080484ac <+12>: fcmovbe %st(1),%st
21 0x080484ae <+14>: fstp %st(1)
(^127) Starting at Pentium Pro, Pentium-II, etc.