Reverse Engineering for Beginners

CHAPTER 27. WORKING WITH FLOATING POINT NUMBERS USING SIMD CHAPTER 27. WORKING WITH FLOATING POINT NUMBERS USING SIMD

The constants are encoded by compiler in IEEE 754 format.

MULSDandADDSDwork just as the same, but do multiplication and addition.

The result of the function’s execution in typedoubleis left in the inXMM0register.

That is how non-optimizing MSVC works:

Listing 27.2: MSVC 2012 x64

real@4010666666666666 DQ 04010666666666666r ; 4.1
real@40091eb851eb851f DQ 040091eb851eb851fr ; 3.14

a$ = 8
b$ = 16
f PROC
movsdx QWORD PTR [rsp+16], xmm1
movsdx QWORD PTR [rsp+8], xmm0
movsdx xmm0, QWORD PTR a$[rsp]
divsd xmm0, QWORD PTR real@40091eb851eb851f
movsdx xmm1, QWORD PTR b$[rsp]
mulsd xmm1, QWORD PTR real@4010666666666666
addsd xmm0, xmm1
ret 0
f ENDP

Slightly redundant. The input arguments are saved in the “shadow space” (8.2.1 on page 91), but only their lower register
halves, i.e., only 64-bit values of typedouble. GCC produces the same code.

27.1.2 x86

Let’s also compile this example for x86. Despite the fact it’s generating for x86, MSVC 2012 uses SSE2 instructions:

Listing 27.3: Non-optimizing MSVC 2012 x86

tv70 = -8 ; size = 8
_a$ = 8 ; size = 8
_b$ = 16 ; size = 8
_f PROC
push ebp
mov ebp, esp
sub esp, 8
movsd xmm0, QWORD PTR _a$[ebp]
divsd xmm0, QWORD PTR __real@40091eb851eb851f
movsd xmm1, QWORD PTR _b$[ebp]
mulsd xmm1, QWORD PTR __real@4010666666666666
addsd xmm0, xmm1
movsd QWORD PTR tv70[ebp], xmm0
fld QWORD PTR tv70[ebp]
mov esp, ebp
pop ebp
ret 0
_f ENDP

Listing 27.4: Optimizing MSVC 2012 x86

tv67 = 8 ; size = 8
_a$ = 8 ; size = 8
_b$ = 16 ; size = 8
_f PROC
movsd xmm1, QWORD PTR _a$[esp-4]
divsd xmm1, QWORD PTR __real@40091eb851eb851f
movsd xmm0, QWORD PTR _b$[esp-4]
mulsd xmm0, QWORD PTR __real@4010666666666666
addsd xmm1, xmm0
movsd QWORD PTR tv67[esp-4], xmm1
fld QWORD PTR tv67[esp-4]
ret 0
_f ENDP

Reverse Engineering for Beginners

27.1.2 x86

Get our desktop app

Company

Features

Documentation

Resources