Assembly Language for Beginners


1.28. 64-BIT VALUES IN 32-BIT ENVIRONMENT


var_10 = -0x10
var_4 = -4


lui $gp, (__gnu_local_gp >> 16)
addiu $sp, -0x20
la $gp, (__gnu_local_gp & 0xFFFF)
sw $ra, 0x20+var_4($sp)
sw $gp, 0x20+var_10($sp)
lw $t9, (__umoddi3 & 0xFFFF)($gp)
or $at, $zero
jalr $t9
or $at, $zero
lw $ra, 0x20+var_4($sp)
or $at, $zero
jr $ra
addiu $sp, 0x20

There are a lot of NOPs, probably delay slots filled after the multiplication instruction (it’s slower than
other instructions, after all).


1.28.4 Shifting right


#include <stdint.h>


uint64_t f (uint64_t a)
{
    return a>>7;
};


x86


Listing 1.382: Optimizing MSVC 2012 /Ob1

_a$ = 8 ; size = 8
_f PROC
mov eax, DWORD PTR _a$[esp-4]
mov edx, DWORD PTR _a$[esp]
shrd eax, edx, 7
shr edx, 7
ret 0
_f ENDP


Listing 1.383: Optimizing GCC 4.8.1 -fno-inline

_f:
mov edx, DWORD PTR [esp+8]
mov eax, DWORD PTR [esp+4]
shrd eax, edx, 7
shr edx, 7
ret


Shifting also occurs in two passes: first the lower part is shifted, then the higher part. But the lower part
is shifted with the help of the SHRD instruction: it shifts the value of EAX by 7 bits, but pulls new bits from
EDX, i.e., from the higher part. In other words, the 64-bit value in the EDX:EAX register pair, as a whole, is
shifted by 7 bits and the lowest 32 bits of the result are placed into EAX. The higher part is shifted using the much
more popular SHR instruction: indeed, the freed bits in the higher part must be filled with zeros.
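
The two passes can be modeled in C on a pair of 32-bit halves. This is only a sketch for illustration, not compiler output: the names shrd32 and shr64_by_halves are hypothetical, and the model assumes a shift count from 1 to 31 (the listings above use 7).

```c
#include <assert.h>
#include <stdint.h>

/* Model of SHRD dst, src, count for counts 1..31 (an assumption of this
   sketch): dst is shifted right, and the vacated top bits are filled
   with the low bits of src. */
uint32_t shrd32(uint32_t dst, uint32_t src, unsigned count)
{
    assert(count >= 1 && count <= 31);
    return (dst >> count) | (uint32_t)(src << (32 - count));
}

/* Hypothetical helper: shift a 64-bit value right using only 32-bit
   operations, mirroring the SHRD + SHR pair. */
uint64_t shr64_by_halves(uint64_t a, unsigned count)
{
    uint32_t lo = (uint32_t)a;          /* plays the role of EAX */
    uint32_t hi = (uint32_t)(a >> 32);  /* plays the role of EDX */

    lo = shrd32(lo, hi, count);         /* SHRD EAX, EDX, count */
    hi = hi >> count;                   /* SHR EDX, count: zeros come in from the top */

    return ((uint64_t)hi << 32) | lo;   /* EDX:EAX as one 64-bit value */
}
```

For any 64-bit a, shr64_by_halves(a, 7) should give the same result as a>>7 computed directly.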


ARM


ARM doesn’t have an instruction like SHRD in x86, so the Keil compiler has to do this using simple shifts
and OR operations:
