1.19. FLOATING-POINT UNIT
Listing 1.210: Non-optimizing MSVC 2010
PUBLIC  _d_max
_TEXT   SEGMENT
_a$ = 8                 ; size = 8
_b$ = 16                ; size = 8
_d_max  PROC
    push    ebp
    mov     ebp, esp
    fld     QWORD PTR _b$[ebp]

; current stack state: ST(0) = _b
; compare _b (ST(0)) and _a, and pop register

    fcomp   QWORD PTR _a$[ebp]

; stack is empty here

    fnstsw  ax
    test    ah, 5
    jp      SHORT $LN1@d_max

; we are here only if a>b

    fld     QWORD PTR _a$[ebp]
    jmp     SHORT $LN2@d_max
$LN1@d_max:
    fld     QWORD PTR _b$[ebp]
$LN2@d_max:
    pop     ebp
    ret     0
_d_max  ENDP
So, FLD loads _b into ST(0).
FCOMP compares the value in ST(0) with what is in _a and sets the C3/C2/C0 bits in the FPU status word register
accordingly. This is a 16-bit register that reflects the current state of the FPU.
After the bits are set, the FCOMP instruction also pops one variable from the stack. This is what distinguishes
it from FCOM, which just compares the values, leaving the stack in the same state.
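The difference between FCOM and FCOMP can be sketched with a toy model in C. This is only an illustration: the fpu_stack type and the fld/fcom/fcomp helpers are hypothetical names, the real FPU tracks a TOP pointer rather than a depth counter, and NaN operands are ignored here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the x87 register stack (assumption: depth counter
   instead of the real TOP-pointer mechanics). */
typedef struct {
    double st[8];   /* st[depth - 1] plays the role of ST(0) */
    size_t depth;   /* number of occupied registers */
} fpu_stack;

/* FLD: push a value; it becomes the new ST(0). */
static void fld(fpu_stack *s, double v) {
    assert(s->depth < 8);       /* the x87 stack has 8 registers */
    s->st[s->depth++] = v;
}

/* FCOM: compare ST(0) with src and leave the stack untouched.
   Returns 1, 0 or -1 for ST(0) >, ==, < src. */
static int fcom(const fpu_stack *s, double src) {
    assert(s->depth > 0);
    double st0 = s->st[s->depth - 1];
    return (st0 > src) - (st0 < src);
}

/* FCOMP: the same comparison, then pop ST(0). */
static int fcomp(fpu_stack *s, double src) {
    int r = fcom(s, src);
    s->depth--;
    return r;
}

/* Usage: mirrors "fld _b" followed by "fcomp _a" with b = 2.0, a = 5.0. */
static void demo_fcomp(void) {
    fpu_stack s = { {0}, 0 };
    fld(&s, 2.0);
    assert(fcom(&s, 5.0) == -1 && s.depth == 1);   /* FCOM keeps ST(0) */
    assert(fcomp(&s, 5.0) == -1 && s.depth == 0);  /* FCOMP empties the stack */
}
```

After the fcomp call the model's stack is empty, which matches the "stack is empty here" comment in the listing.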
Unfortunately, CPUs before Intel P6^121 don't have any conditional jump instructions that check the
C3/C2/C0 bits. Perhaps it is a matter of history (recall: the FPU was a separate chip in the past).
Modern CPUs, starting with the Intel P6, have the FCOMI/FCOMIP/FUCOMI/FUCOMIP instructions, which do the same
but modify the ZF/PF/CF CPU flags.
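To illustrate why setting CPU flags directly is convenient, here is a rough C model of an FCOMI-style comparison. The cpu_flags type and the function names are hypothetical, and d_max_via_fcomi only shows the idea of branching on the flags right away; it is not a claim about what any particular compiler emits.

```c
#include <assert.h>
#include <math.h>

/* The three CPU flags that FCOMI and its relatives modify. */
typedef struct { int zf, pf, cf; } cpu_flags;

/* Flags after comparing ST(0) with ST(i), FCOMI-style. */
static cpu_flags fcomi_flags(double st0, double sti) {
    cpu_flags f = { 0, 0, 0 };              /* ST(0) > ST(i): all clear */
    if (isnan(st0) || isnan(sti))
        f.zf = f.pf = f.cf = 1;             /* unordered */
    else if (st0 < sti)
        f.cf = 1;
    else if (st0 == sti)
        f.zf = 1;
    return f;
}

/* Condition tested by a JA-style jump: CF=0 and ZF=0. */
static int above(cpu_flags f) {
    return !f.cf && !f.zf;
}

/* d_max decided directly from the CPU flags, with no FNSTSW step. */
static double d_max_via_fcomi(double a, double b) {
    return above(fcomi_flags(a, b)) ? a : b;
}

/* Usage example. */
static void demo_fcomi(void) {
    assert(d_max_via_fcomi(5.5, 1.2) == 5.5);
    assert(d_max_via_fcomi(1.2, 5.5) == 5.5);
    assert(fcomi_flags(NAN, 1.0).pf == 1);
}
```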
The FNSTSW instruction copies the FPU status word register to AX. The C3/C2/C0 bits are placed at positions
14/10/8; they are at the same positions in the AX register, and all of them are placed in the high part of
AX, which is AH.
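The bit positions can be written down as masks. The following C fragment is a small sketch with hypothetical constant and function names; it only models taking the high byte of a 16-bit status word value, not the FNSTSW instruction itself.

```c
#include <assert.h>
#include <stdint.h>

/* Condition-code masks inside the 16-bit status word (AX after FNSTSW). */
enum {
    SW_C0 = 1u << 8,
    SW_C2 = 1u << 10,
    SW_C3 = 1u << 14
};

/* The same bits as seen in AH, the high byte of AX. */
enum {
    AH_C0 = 1u << 0,
    AH_C2 = 1u << 2,
    AH_C3 = 1u << 6
};

/* AH is simply the upper 8 bits of the status word copied to AX. */
static uint8_t ah_from_status_word(uint16_t sw) {
    return (uint8_t)(sw >> 8);
}

/* Sanity check: shifting a status-word mask right by 8 gives the AH mask. */
static void check_masks(void) {
    assert((SW_C0 >> 8) == AH_C0);
    assert((SW_C2 >> 8) == AH_C2);
    assert((SW_C3 >> 8) == AH_C3);
}
```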
- If b>a in our example, then the C3/C2/C0 bits are set as follows: 0, 0, 0.
- If a>b, then the bits are: 0, 0, 1.
- If a=b, then the bits are: 1, 0, 0.
- If the result is unordered (in case of error, e.g., when an operand is NaN), then the bits are: 1, 1, 1.
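The four cases above can be summarized in a few lines of C. This is a sketch under assumptions: the function name is hypothetical, and the result is packed the way the bits appear in AH (C3 at bit 6, C2 at bit 2, C0 at bit 0). Note that in the listing ST(0) holds b and the memory operand is a.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* C3/C2/C0 after comparing ST(0) with src, packed as in AH:
   C3 = bit 6 (0x40), C2 = bit 2 (0x04), C0 = bit 0 (0x01). */
static uint8_t fcom_condition_bits(double st0, double src) {
    if (isnan(st0) || isnan(src))
        return 0x45;            /* unordered: C3=1, C2=1, C0=1 */
    if (st0 > src)
        return 0x00;            /* C3=0, C2=0, C0=0 */
    if (st0 < src)
        return 0x01;            /* C3=0, C2=0, C0=1 */
    return 0x40;                /* equal: C3=1, C2=0, C0=0 */
}

/* Usage: first argument is b (ST(0)), second is a (memory operand). */
static void demo_condition_bits(void) {
    assert(fcom_condition_bits(2.0, 1.0) == 0x00);  /* b>a */
    assert(fcom_condition_bits(1.0, 2.0) == 0x01);  /* a>b */
    assert(fcom_condition_bits(1.0, 1.0) == 0x40);  /* a=b */
    assert(fcom_condition_bits(NAN, 1.0) == 0x45);  /* unordered */
}
```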
This is how the C3/C2/C0 bits are located in the AX register:

    bit:   14   10    9    8
    flag:  C3   C2   C1   C0

This is how the C3/C2/C0 bits are located in the AH register:

    bit:    6    2    1    0
    flag:  C3   C2   C1   C0
After the execution of test ah, 5^122, only the C0 and C2 bits (at positions 0 and 2) are considered; all other
bits are ignored.
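The masking step and the JP that follows it in the listing can be modeled in C. This is a sketch with hypothetical function names; it relies on the fact that JP is taken when the parity flag is set, that is, when the low byte of the TEST result contains an even number of 1 bits.

```c
#include <assert.h>
#include <stdint.h>

/* PF as the CPU computes it: 1 when the low byte of the result
   has an even number of 1 bits, 0 otherwise. */
static int parity_flag(uint8_t result) {
    int ones = 0;
    for (int i = 0; i < 8; i++)
        ones += (result >> i) & 1;
    return (ones % 2) == 0;
}

/* Models "test ah, 5" followed by "jp": returns 1 when the jump to
   $LN1@d_max (load _b) is taken, 0 when execution falls through (load _a). */
static int jp_taken_after_test_ah_5(uint8_t ah) {
    uint8_t masked = ah & 5;    /* keep only C0 (bit 0) and C2 (bit 2) */
    return parity_flag(masked);
}

/* Usage: the four possible AH patterns after the comparison. */
static void demo_test_ah_5(void) {
    assert(jp_taken_after_test_ah_5(0x00) == 1);  /* b>a: jump, b is loaded */
    assert(jp_taken_after_test_ah_5(0x01) == 0);  /* a>b: fall through, a is loaded */
    assert(jp_taken_after_test_ah_5(0x40) == 1);  /* a=b: C3 is masked out, b is loaded */
    assert(jp_taken_after_test_ah_5(0x45) == 1);  /* unordered: two bits set, b is loaded */
}
```

Only the a>b case leaves exactly one bit set after masking, so it is the only case in which the jump is not taken, in line with the "we are here only if a>b" comment.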
(^121) Intel P6 is Pentium Pro, Pentium II, etc.
(^122) 5=101b