Assembly Language for Beginners

1.31. WORKING WITH FLOATING POINT NUMBERS USING SIMD


Listing 1.395: MSVC 2012 x64

__real@4010666666666666 DQ 04010666666666666r ; 4.1
__real@40091eb851eb851f DQ 040091eb851eb851fr ; 3.14


a$ = 8
b$ = 16
f PROC
movsdx QWORD PTR [rsp+16], xmm1
movsdx QWORD PTR [rsp+8], xmm0
movsdx xmm0, QWORD PTR a$[rsp]
divsd xmm0, QWORD PTR __real@40091eb851eb851f
movsdx xmm1, QWORD PTR b$[rsp]
mulsd xmm1, QWORD PTR __real@4010666666666666
addsd xmm0, xmm1
ret 0
f ENDP


This code is slightly redundant: the input arguments are saved in the “shadow space” (1.10.2 on page 100), but only
their lower register halves, i.e., only the 64-bit values of type double. GCC produces the same code.
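For orientation, here is a minimal C sketch of what the listing appears to compute, reconstructed from the instruction sequence (divsd by 3.14, mulsd by 4.1, then addsd). The exact source text is an assumption, and double_bits is a hypothetical helper added only to show that the two DQ constants are the IEEE 754 bit patterns of 3.14 and 4.1.

```c
#include <stdint.h>
#include <string.h>

/* Presumed source of f, reconstructed from the listing (an assumption):
   a is divided by 3.14, b is multiplied by 4.1, and the results are added. */
double f(double a, double b)
{
    return a / 3.14 + b * 4.1;
}

/* Hypothetical helper: return the raw 64-bit pattern of a double.
   memcpy is used to avoid undefined behavior from pointer casts. */
uint64_t double_bits(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}
```

Comparing double_bits(3.14) and double_bits(4.1) with the hexadecimal digits in the constant names shows where the compiler-generated labels come from.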


x86


Let’s also compile this example for x86. Even though it is generating code for x86, MSVC 2012 uses SSE2
instructions:


Listing 1.396: Non-optimizing MSVC 2012 x86

tv70 = -8 ; size = 8
_a$ = 8 ; size = 8
_b$ = 16 ; size = 8
_f PROC
push ebp
mov ebp, esp
sub esp, 8
movsd xmm0, QWORD PTR _a$[ebp]
divsd xmm0, QWORD PTR __real@40091eb851eb851f
movsd xmm1, QWORD PTR _b$[ebp]
mulsd xmm1, QWORD PTR __real@4010666666666666
addsd xmm0, xmm1
movsd QWORD PTR tv70[ebp], xmm0
fld QWORD PTR tv70[ebp]
mov esp, ebp
pop ebp
ret 0
_f ENDP


Listing 1.397: Optimizing MSVC 2012 x86

tv67 = 8 ; size = 8
_a$ = 8 ; size = 8
_b$ = 16 ; size = 8
_f PROC
movsd xmm1, QWORD PTR _a$[esp-4]
divsd xmm1, QWORD PTR __real@40091eb851eb851f
movsd xmm0, QWORD PTR _b$[esp-4]
mulsd xmm0, QWORD PTR __real@4010666666666666
addsd xmm1, xmm0
movsd QWORD PTR tv67[esp-4], xmm1
fld QWORD PTR tv67[esp-4]
ret 0
_f ENDP


It’s almost the same code; however, there are some differences related to calling conventions: 1) the
arguments are passed not in XMM registers, but on the stack, like in the FPU examples (1.19 on page 218);
2) the result of the function is returned in ST(0): in order to do so, it’s copied (through the local variable tv)
from one of the XMM registers to ST(0).
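The offset arithmetic in the optimizing listing can be modeled in plain C. This is only a sketch: the byte array stands in for memory at ESP on function entry, and the names f_stack_model and call_f are hypothetical. The offsets 8, 16 and 8 are taken from the listing (_a$, _b$ and tv67), and each access subtracts 4, as the listing does. Since tv67 equals _a$ in the listing, the temporary reuses the slot of the first argument.

```c
#include <string.h>

/* Offsets as declared in the optimizing x86 listing. */
enum { A_OFF = 8, B_OFF = 16, TV67_OFF = 8 };

/* Model of the function body. stack[0..3] is the return address,
   stack[4..11] holds a, stack[12..19] holds b. */
double f_stack_model(unsigned char *stack)
{
    double a, b, result, st0;

    memcpy(&a, stack + A_OFF - 4, sizeof a);  /* movsd xmm1, _a$[esp-4] */
    memcpy(&b, stack + B_OFF - 4, sizeof b);  /* movsd xmm0, _b$[esp-4] */
    result = a / 3.14 + b * 4.1;              /* divsd, mulsd, addsd */

    /* movsd tv67[esp-4], xmm1: store the result over the slot of a */
    memcpy(stack + TV67_OFF - 4, &result, sizeof result);
    /* fld tv67[esp-4]: reload it, standing in for the load into ST(0) */
    memcpy(&st0, stack + TV67_OFF - 4, sizeof st0);
    return st0;
}

/* Hypothetical caller: lays out the arguments the way the model expects. */
double call_f(double a, double b)
{
    unsigned char stack[20] = {0};
    memcpy(stack + 4, &a, sizeof a);
    memcpy(stack + 12, &b, sizeof b);
    return f_stack_model(stack);
}
```

The round trip through memory is the point of the model: there is no instruction that moves a value directly from an XMM register to ST(0), so the compiler stores it and reloads it with fld.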
