Assembly Language for Beginners

(Jeff_L) #1


If we compile the example with optimizations, the result is almost the same, but the “scratch space” is not
used, because it is not needed.

Also take a look at how MSVC 2012 optimizes the loading of primitive values into registers by using LEA
(.1.6 on page 1028). MOV would be 1 byte longer here (5 instead of 4).

Another example of this is: 8.1.1 on page 797.

Windows x64: Passing this (C/C++)

The this pointer is passed in RCX, the first argument of the method is in RDX, etc. For an example see: 3.18.1
on page 544.
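To make the rule above concrete, here is a minimal C sketch of what a method call amounts to at the calling-convention level. The Counter type and the Counter_add function are hypothetical names invented for this illustration. The object pointer is simply an extra leading argument, so on Win64 it would land in RCX and the first declared argument in RDX (EDX for an int).

```c
/* Hypothetical type standing in for a C++ class with one data member. */
struct Counter {
    int value;
};

/* Rough C equivalent of a C++ method `int Counter::add(int n)`.
   `self` plays the role of `this` (RCX on Win64);
   `n` is the first declared argument (RDX/EDX on Win64). */
int Counter_add(struct Counter *self, int n)
{
    self->value += n;
    return self->value;
}
```

Calling Counter_add(&c, 5) on a counter holding 10 leaves 15 in it and returns 15, just as c.add(5) would in C++.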

Linux x64

The way arguments are passed in Linux for x86-64 is almost the same as in Windows, but 6 registers are
used instead of 4 (RDI, RSI, RDX, RCX, R8, R9) and there is no “scratch space”, although the callee may
save the register values on the stack, if it needs/wants to.
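The difference between the two conventions can be summarized as a small lookup, sketched below in C. The function names linux_x64_arg, win64_arg and same_location are hypothetical helpers made up for this illustration, and the tables cover only integer and pointer arguments (floating-point arguments go to XMM registers and are not modeled here).

```c
#include <stddef.h>
#include <string.h>

/* Integer/pointer argument registers, in passing order. */
static const char *const LINUX_X64_REGS[] = { "RDI", "RSI", "RDX", "RCX", "R8", "R9" };
static const char *const WIN64_REGS[]     = { "RCX", "RDX", "R8", "R9" };

/* Location of the integer argument with the given zero-based index
   on Linux x64: one of 6 registers, then the stack. */
const char *linux_x64_arg(size_t index)
{
    size_t n = sizeof LINUX_X64_REGS / sizeof LINUX_X64_REGS[0];
    return index < n ? LINUX_X64_REGS[index] : "stack";
}

/* Same for Windows x64: one of 4 registers, then the stack. */
const char *win64_arg(size_t index)
{
    size_t n = sizeof WIN64_REGS / sizeof WIN64_REGS[0];
    return index < n ? WIN64_REGS[index] : "stack";
}

/* Returns 1 if both conventions place this argument in the same location. */
int same_location(size_t index)
{
    return strcmp(linux_x64_arg(index), win64_arg(index)) == 0;
}
```

For example, the fifth argument (index 4) is in R8 on Linux x64 but already on the stack on Windows x64; from the seventh argument on, both conventions use the stack.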

Listing 6.8: Optimizing GCC 4.7.3

.LC0:
        .string "%d %d %d %d %d %d %d\n"
f1:
        sub     rsp, 40
        mov     eax, DWORD PTR [rsp+48]     ; 7th argument, passed on the stack by the caller
        mov     DWORD PTR [rsp+8], r9d
        mov     r9d, ecx
        mov     DWORD PTR [rsp], r8d
        mov     ecx, esi
        mov     r8d, edx
        mov     esi, OFFSET FLAT:.LC0       ; format string
        mov     edx, edi
        mov     edi, 1
        mov     DWORD PTR [rsp+16], eax
        xor     eax, eax                    ; no vector registers used by this variadic call
        call    __printf_chk
        add     rsp, 40
        ret
main:                                       ; caller of f1
        sub     rsp, 24
        mov     r9d, 6
        mov     r8d, 5
        mov     DWORD PTR [rsp], 7          ; 7th argument goes onto the stack
        mov     ecx, 4
        mov     edx, 3
        mov     esi, 2
        mov     edi, 1
        call    f1
        add     rsp, 24
        ret
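The C source for this listing is not shown in this section. A program along the following lines would plausibly compile to such code: a function with seven int parameters that forwards them to printf, called with the constants 1 to 7. This is a hedged reconstruction, not the author's exact source. In particular, f1 returns the result of printf here only to make the sketch easy to check, and call_f1 is a hypothetical stand-in for the caller.

```c
#include <stdio.h>

/* Seven int parameters: on Linux x64 the first six arrive in
   EDI, ESI, EDX, ECX, R8D, R9D and the seventh on the stack. */
int f1(int a, int b, int c, int d, int e, int f, int g)
{
    /* Returns the number of characters printed (an addition made
       for this sketch so that the behaviour can be verified). */
    return printf("%d %d %d %d %d %d %d\n", a, b, c, d, e, f, g);
}

/* Mirrors the caller in the listing: the constants 1..7 are passed to f1. */
int call_f1(void)
{
    return f1(1, 2, 3, 4, 5, 6, 7);
}
```

The call to __printf_chk instead of printf in the listing is typical of toolchains that enable source fortification; it takes an extra leading flag argument, which is why 1 is loaded into EDI and all other arguments are shifted by one position.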

N.B.: here the values are written into the 32-bit parts of the registers (e.g., EAX) rather than into the whole
64-bit registers (RAX). This is because each write to the low 32-bit part of a register automatically clears
the high 32 bits. Supposedly, AMD decided on this behavior to simplify porting code to x86-64.
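The rule can be modeled in portable C. This is only a model of the register semantics, not code that touches real registers, and the function names write32 and write16 are hypothetical. It contrasts a 32-bit write, which zero-extends into the full 64-bit register, with a 16-bit write, which leaves the upper bits untouched.

```c
#include <stdint.h>

/* Model of "mov eax, v": a 32-bit write replaces the whole 64-bit
   register with the zero-extended value; the old contents are lost. */
uint64_t write32(uint64_t reg, uint32_t v)
{
    (void)reg;              /* previous value does not survive */
    return (uint64_t)v;
}

/* Model of "mov ax, v": a 16-bit write changes only bits 0..15
   and preserves bits 16..63 of the register. */
uint64_t write16(uint64_t reg, uint16_t v)
{
    return (reg & ~(uint64_t)0xFFFF) | v;
}
```

Starting from a register full of one bits, write32 with the value 1 yields 0x0000000000000001, whereas write16 with the same value yields 0xFFFFFFFFFFFF0001.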

6.1.6 Return values of float and double type

In the 32-bit x86 conventions, values of type float or double are returned via the FPU register ST(0).

In Win64, values of float and double types are returned in the low 32 or 64 bits of the XMM0 register. Linux x64 (the System V AMD64 ABI) returns them in XMM0 as well.
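As a small illustration, the two C functions below return a double and a float. The names average and halve are hypothetical. Compiled for Win64, the results would be expected in the low 64 and low 32 bits of XMM0 respectively; compiled for 32-bit x86 with the x87 FPU, in ST(0). The C code is identical in both cases, only the generated epilogue differs.

```c
/* Returns a double: low 64 bits of XMM0 on Win64, ST(0) on 32-bit x86. */
double average(double a, double b)
{
    return (a + b) / 2.0;
}

/* Returns a float: low 32 bits of XMM0 on Win64, ST(0) on 32-bit x86. */
float halve(float x)
{
    return x * 0.5f;
}
```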
