Assembly Language for Beginners

1.31. WORKING WITH FLOATING POINT NUMBERS USING SIMD

Listing 1.395: MSVC 2012 x64

real@4010666666666666 DQ 04010666666666666r ; 4.1
real@40091eb851eb851f DQ 040091eb851eb851fr ; 3.14

a$ = 8
b$ = 16
f PROC
movsdx QWORD PTR [rsp+16], xmm1
movsdx QWORD PTR [rsp+8], xmm0
movsdx xmm0, QWORD PTR a$[rsp]
divsd xmm0, QWORD PTR real@40091eb851eb851f
movsdx xmm1, QWORD PTR b$[rsp]
mulsd xmm1, QWORD PTR real@4010666666666666
addsd xmm0, xmm1
ret 0
f ENDP

The code is slightly redundant. The input arguments are saved in the “shadow space” (1.10.2 on page 100), but only
the lower halves of the registers, i.e., only the 64-bit values of type double. GCC produces the same code.
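The C source is not repeated in this section. Judging from the two constants and the divsd/mulsd/addsd sequence in the listing, it is probably close to the sketch below. This is a reconstruction under that assumption, and check_f is a hypothetical helper added only to exercise it:

```c
#include <assert.h>

/* Assumed reconstruction of the source behind the listing:
   divsd by 3.14, mulsd by 4.1, then addsd of the two results. */
double f(double a, double b)
{
    return a / 3.14 + b * 4.1;
}

/* Hypothetical self-check; inputs are chosen so the results are exact. */
void check_f(void)
{
    assert(f(3.14, 0.0) == 1.0); /* 3.14 / 3.14 is exactly 1.0 */
    assert(f(0.0, 1.0) == 4.1);  /* 0.0 / 3.14 + 1.0 * 4.1 */
}
```

On Win64, a and b would arrive in XMM0 and XMM1, which matches the two stores to the shadow space at the start of the listing.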

x86

Let’s also compile this example for x86. Even though it is generating code for x86, MSVC 2012 uses SSE2
instructions:

Listing 1.396: Non-optimizing MSVC 2012 x86

tv70 = -8 ; size = 8
_a$ = 8 ; size = 8
_b$ = 16 ; size = 8
_f PROC
push ebp
mov ebp, esp
sub esp, 8
movsd xmm0, QWORD PTR _a$[ebp]
divsd xmm0, QWORD PTR __real@40091eb851eb851f
movsd xmm1, QWORD PTR _b$[ebp]
mulsd xmm1, QWORD PTR __real@4010666666666666
addsd xmm0, xmm1
movsd QWORD PTR tv70[ebp], xmm0
fld QWORD PTR tv70[ebp]
mov esp, ebp
pop ebp
ret 0
_f ENDP

Listing 1.397: Optimizing MSVC 2012 x86

tv67 = 8 ; size = 8
_a$ = 8 ; size = 8
_b$ = 16 ; size = 8
_f PROC
movsd xmm1, QWORD PTR _a$[esp-4]
divsd xmm1, QWORD PTR __real@40091eb851eb851f
movsd xmm0, QWORD PTR _b$[esp-4]
mulsd xmm0, QWORD PTR __real@4010666666666666
addsd xmm1, xmm0
movsd QWORD PTR tv67[esp-4], xmm1
fld QWORD PTR tv67[esp-4]
ret 0
_f ENDP

It’s almost the same code; however, there are some differences related to calling conventions: 1) the
arguments are passed not in XMM registers but on the stack, as in the FPU examples (1.19 on page 218);
2) the result of the function is returned in ST(0): to do so, it’s copied (through the local variable tv)
from one of the XMM registers to ST(0).
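The constant names in these listings (real@4010666666666666, __real@40091eb851eb851f) embed the 64-bit patterns of the values themselves. A minimal sketch for checking that is shown below; it assumes the platform stores double in IEEE 754 binary64 format, and bits_to_double is a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as a double.
   Assumes double is an 8-byte IEEE 754 binary64 value. */
double bits_to_double(uint64_t bits)
{
    double d;
    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d); /* copy the bytes, no numeric conversion */
    return d;
}
```

Under that assumption, bits_to_double(0x4010666666666666ULL) should compare equal to 4.1, and bits_to_double(0x40091eb851eb851fULL) to 3.14, matching the comments next to the DQ directives.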
