3.11. INLINE FUNCTIONS
test edi, 2
mov BYTE PTR [edx+10], al
mov eax, 122
je .L7
.L25:
movzx edx, WORD PTR [esi]
add edi, 2
add esi, 2
sub eax, 2
mov WORD PTR [edi-2], dx
jmp .L7
.LFE3:
Universal memory copy functions usually work as follows: calculate how many 32-bit words can be copied,
then copy them using MOVSD, then copy the remaining bytes.
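The scheme can be sketched in C roughly as follows. This is only an illustration of the idea, not the code of any particular compiler or C library. The function name copy_words_then_bytes is made up for this example, and the two buffers are assumed not to overlap.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not a library function): copy n bytes from src to dst.
   Assumption: the two regions do not overlap. */
void *copy_words_then_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t words = n / 4;   /* how many 32-bit words can be copied */
    size_t tail  = n % 4;   /* 0..3 remaining bytes */

    /* Word phase: this is the part that may become REP MOVSD on x86. */
    while (words > 0) {
        uint32_t w;
        memcpy(&w, s, sizeof w);   /* fixed-size copy, safe for unaligned pointers */
        memcpy(d, &w, sizeof w);
        d += 4;
        s += 4;
        words--;
    }

    /* Byte phase: copy whatever is left. */
    while (tail > 0) {
        *d++ = *s++;
        tail--;
    }
    return dst;
}
```

For a block of 11 bytes this performs two word copies and then three byte copies.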
More advanced and complex copy functions use SIMD instructions and also take memory alignment
into consideration.
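A minimal sketch of what "SIMD plus alignment" may look like, using SSE2 intrinsics from emmintrin.h. It is x86/x86-64 only and is not taken from any real library. The name copy_sse2_aligned is hypothetical, and the buffers are assumed not to overlap. The destination is first brought to a 16-byte boundary byte by byte, then 16 bytes are moved per iteration, then the tail is copied.

```c
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>   /* SSE2 intrinsics, x86/x86-64 only */

/* Hypothetical helper: copy n bytes, regions must not overlap. */
void *copy_sse2_aligned(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Head: single bytes until the destination is 16-byte aligned. */
    while (n > 0 && ((uintptr_t)d & 15) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: 16 bytes per iteration. The store is aligned;
       the source may still be unaligned, so an unaligned load is used. */
    while (n >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_store_si128((__m128i *)d, v);
        d += 16;
        s += 16;
        n -= 16;
    }

    /* Tail: remaining 0..15 bytes. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
    return dst;
}
```

Real implementations usually go further (wider registers, non-temporal stores for large blocks, special cases for short blocks), but the head/body/tail structure is the same.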
For an example of a SIMD strlen() function, see 1.29.2 on page 416.
memcmp()
Listing 3.50: memcmp() example
#include <string.h>

int memcmp_1235(char *buf1, char *buf2)
{
    return memcmp(buf1, buf2, 1235);
};
For any block size, MSVC 2013 inserts the same universal function:
Listing 3.51: Optimizing MSVC 2010
_buf1$ = 8 ; size = 4
_buf2$ = 12 ; size = 4
_memcmp_1235 PROC
mov ecx, DWORD PTR _buf1$[esp-4]
mov edx, DWORD PTR _buf2$[esp-4]
push esi
mov esi, 1231
npad 2
$LL5@memcmp_123:
mov eax, DWORD PTR [ecx]
cmp eax, DWORD PTR [edx]
jne SHORT $LN4@memcmp_123
add ecx, 4
add edx, 4
sub esi, 4
jae SHORT $LL5@memcmp_123
$LN4@memcmp_123:
mov al, BYTE PTR [ecx]
cmp al, BYTE PTR [edx]
jne SHORT $LN6@memcmp_123
mov al, BYTE PTR [ecx+1]
cmp al, BYTE PTR [edx+1]
jne SHORT $LN6@memcmp_123
mov al, BYTE PTR [ecx+2]
cmp al, BYTE PTR [edx+2]
jne SHORT $LN6@memcmp_123
cmp esi, -1
je SHORT $LN3@memcmp_123
mov al, BYTE PTR [ecx+3]
cmp al, BYTE PTR [edx+3]
jne SHORT $LN6@memcmp_123
$LN3@memcmp_123:
xor eax, eax
pop esi
ret 0
