Assembly Language for Beginners


3.11. INLINE FUNCTIONS


Listing 3.43: memcpy() example

void memcpy_7(char *inbuf, char *outbuf)
{
    memcpy(outbuf+10, inbuf, 7);
}


Listing 3.44: Optimizing MSVC 2010

_inbuf$ = 8 ; size = 4
_outbuf$ = 12 ; size = 4
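_memcpy_7 PROC
    mov ecx, DWORD PTR _inbuf$[esp-4]   ; ECX = inbuf
    mov edx, DWORD PTR [ecx]            ; load bytes 0..3
    mov eax, DWORD PTR _outbuf$[esp-4]  ; EAX = outbuf
    mov DWORD PTR [eax+10], edx         ; store them at outbuf+10
    mov dx, WORD PTR [ecx+4]            ; load bytes 4..5
    mov WORD PTR [eax+14], dx           ; store them at outbuf+14
    mov cl, BYTE PTR [ecx+6]            ; load byte 6
    mov BYTE PTR [eax+16], cl           ; store it at outbuf+16
    ret 0
_memcpy_7 ENDP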


Listing 3.45: Optimizing GCC 4.8.1

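memcpy_7:
    push ebx
    mov eax, DWORD PTR [esp+8]      ; EAX = inbuf
    mov ecx, DWORD PTR [esp+12]     ; ECX = outbuf
    mov ebx, DWORD PTR [eax]        ; load bytes 0..3
    lea edx, [ecx+10]               ; EDX = outbuf+10
    mov DWORD PTR [ecx+10], ebx     ; store bytes 0..3
    movzx ecx, WORD PTR [eax+4]     ; load bytes 4..5
    mov WORD PTR [edx+4], cx        ; store them at outbuf+14
    movzx eax, BYTE PTR [eax+6]     ; load byte 6
    mov BYTE PTR [edx+6], al        ; store it at outbuf+16
    pop ebx
    ret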


Short copies like this are usually done as follows: 4-byte blocks are copied first, then a 16-bit word (if needed), then the last
byte (if needed).
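
The same order of operations can be sketched in C. This is only an illustration, not what either compiler actually emits: the name copy_by_width is hypothetical, and the fixed-size memcpy() calls inside it are used simply as a portable way to express unaligned 4-byte and 2-byte loads and stores.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the book): copies n bytes using the
   widest pieces first, in the same order as the listings above. */
static void copy_by_width(char *dst, const char *src, size_t n)
{
    size_t i = 0;
    uint32_t d;
    uint16_t w;

    /* 4-byte blocks are copied first */
    for (; i + 4 <= n; i += 4) {
        memcpy(&d, src + i, 4);   /* portable unaligned 32-bit load  */
        memcpy(dst + i, &d, 4);   /* portable unaligned 32-bit store */
    }

    /* then one 16-bit word, if at least two bytes remain */
    if (n - i >= 2) {
        memcpy(&w, src + i, 2);
        memcpy(dst + i, &w, 2);
        i += 2;
    }

    /* then the last byte, if one remains */
    if (i < n)
        dst[i] = src[i];
}
```

Calling copy_by_width(outbuf + 10, inbuf, 7) performs one 4-byte copy, one 2-byte copy and one 1-byte copy, which corresponds to the three load/store pairs in the two listings above.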


Structures are also copied using MOV: 1.24.4 on page 361.


Long blocks


The compilers behave differently in this case.


Listing 3.46: memcpy() example

void memcpy_128(char *inbuf, char *outbuf)
{
    memcpy(outbuf+10, inbuf, 128);
}


void memcpy_123(char *inbuf, char *outbuf)
{
    memcpy(outbuf+10, inbuf, 123);
}
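
For the two sizes used in these functions, it may help to see how a byte count splits into 4-byte, 2-byte and 1-byte pieces. The sketch below is plain arithmetic under that assumption; the names copy_plan and plan_copy are hypothetical, and the code says nothing about which instructions a particular compiler chooses.

```c
#include <stddef.h>

/* Hypothetical helper (not from the book): number of pieces of each
   width needed to cover n bytes, widest pieces first. */
struct copy_plan {
    size_t dwords;  /* 4-byte pieces */
    size_t words;   /* 2-byte pieces: 0 or 1 */
    size_t bytes;   /* 1-byte pieces: 0 or 1 */
};

static struct copy_plan plan_copy(size_t n)
{
    struct copy_plan p;
    p.dwords = n / 4;        /* whole 4-byte blocks            */
    p.words  = (n % 4) / 2;  /* one 16-bit word if 2 or 3 left */
    p.bytes  = n % 2;        /* one trailing byte if n is odd  */
    return p;
}
```

For 128 this gives 32 four-byte pieces and nothing else, since 128 = 32 * 4. For 123 it gives 30 four-byte pieces, one 16-bit word and one byte, since 123 = 30 * 4 + 2 + 1.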


For copying 128 bytes, MSVC uses a single MOVSD instruction (because 128 divides evenly by 4):


Listing 3.47: Optimizing MSVC 2010

_inbuf$ = 8 ; size = 4
_outbuf$ = 12 ; size = 4
_memcpy_128 PROC
    push esi
    mov esi, DWORD PTR _inbuf$[esp]
