Reverse Engineering for Beginners

CHAPTER 43. INLINE FUNCTIONS


Listing 43.10: 32 bytes

#include <string.h>

void f(char *out)
{
    memset(out, 0, 32);
}


For short blocks, many compilers do not generate a call to memset() at all, but insert a series of MOVs instead:


Listing 43.11: Optimizing GCC 4.9.1 x64

f:
        mov     QWORD PTR [rdi], 0
        mov     QWORD PTR [rdi+8], 0
        mov     QWORD PTR [rdi+16], 0
        mov     QWORD PTR [rdi+24], 0
        ret


By the way, this is reminiscent of unrolled loops: 14.1.4 on page 180.
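The effect of the four MOVs can be restated in C. The following is only a sketch: the names zero32_unrolled() and unrolled_matches_memset() are hypothetical and appear nowhere in the compiler output. Each fixed-size 8-byte memcpy() stands in for one MOV QWORD PTR store.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: zero 32 bytes with four 8-byte stores,
   one per MOV QWORD PTR instruction in the listing above. */
void zero32_unrolled(char *out)
{
    const uint64_t zero = 0;

    assert(out != NULL);
    /* A fixed-size 8-byte memcpy is the portable way to express
       a 64-bit store to a possibly unaligned address. */
    memcpy(out + 0,  &zero, 8);   /* [rdi]    */
    memcpy(out + 8,  &zero, 8);   /* [rdi+8]  */
    memcpy(out + 16, &zero, 8);   /* [rdi+16] */
    memcpy(out + 24, &zero, 8);   /* [rdi+24] */
}

/* Returns 1 if the four stores leave a buffer in exactly the same
   state as memset(out, 0, 32), including the untouched tail. */
int unrolled_matches_memset(void)
{
    char a[40], b[40];

    memset(a, 0x55, sizeof a);
    memset(b, 0x55, sizeof b);
    zero32_unrolled(a);
    memset(b, 0, 32);
    return memcmp(a, b, sizeof a) == 0;
}
```

Calling unrolled_matches_memset() from a test program should return 1, since 4 stores of 8 bytes cover exactly 32 bytes.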


Example #2


Listing 43.12: 67 bytes

#include <string.h>

void f(char *out)
{
    memset(out, 0, 67);
}


When the block size is not a multiple of 4 or 8, compilers can behave differently.


For instance, MSVC 2012 continues to insert MOVs:


Listing 43.13: Optimizing MSVC 2012 x64

out$ = 8
f       PROC
        xor     eax, eax
        mov     QWORD PTR [rcx], rax
        mov     QWORD PTR [rcx+8], rax
        mov     QWORD PTR [rcx+16], rax
        mov     QWORD PTR [rcx+24], rax
        mov     QWORD PTR [rcx+32], rax
        mov     QWORD PTR [rcx+40], rax
        mov     QWORD PTR [rcx+48], rax
        mov     QWORD PTR [rcx+56], rax
        mov     WORD PTR [rcx+64], ax
        mov     BYTE PTR [rcx+66], al
        ret     0
f       ENDP
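The MSVC listing splits 67 bytes as 8 × 8 + 2 + 1: eight 64-bit stores, one 16-bit store at offset 64 and one 8-bit store at offset 66. A rough C restatement, with the hypothetical names zero67_msvc_style() and msvc_style_matches_memset(), might look like this:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper that mirrors the MSVC store sequence. */
void zero67_msvc_style(unsigned char *out)
{
    const uint64_t q = 0;
    const uint16_t w = 0;

    assert(out != NULL);
    for (int i = 0; i < 8; i++)       /* offsets 0, 8, ..., 56 (64 bytes) */
        memcpy(out + 8 * i, &q, 8);
    memcpy(out + 64, &w, 2);          /* mov WORD PTR [rcx+64], ax */
    out[66] = 0;                      /* mov BYTE PTR [rcx+66], al */
}

/* Returns 1 if exactly bytes 0..66 are zeroed and the rest is untouched. */
int msvc_style_matches_memset(void)
{
    unsigned char a[80], b[80];

    memset(a, 0xAA, sizeof a);
    memset(b, 0xAA, sizeof b);
    zero67_msvc_style(a);
    memset(b, 0, 67);
    return memcmp(a, b, sizeof a) == 0;
}
```

The check compares against memset() over a buffer longer than 67 bytes, so a store that ran past offset 66 would be detected as well.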


...while GCC uses REP STOSQ, concluding that this would be shorter than a pack of MOVs:


Listing 43.14: Optimizing GCC 4.9.1 x64

f:
        mov     QWORD PTR [rdi], 0
        mov     QWORD PTR [rdi+59], 0
        mov     rcx, rdi
        lea     rdi, [rdi+8]
        xor     eax, eax
        and     rdi, -8
        sub     rcx, rdi
        add     ecx, 67
        shr     ecx, 3
        rep stosq
        ret
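The GCC sequence is less obvious. It first zeroes the first 8 bytes (offsets 0..7) and the last 8 bytes (offsets 59..66) with two possibly unaligned stores, then rounds the pointer up to the next 8-byte boundary and lets REP STOSQ fill the middle with aligned stores. The C sketch below emulates that arithmetic; stosq_count(), zero67_gcc_style() and gcc_style_zeroes_exactly_67() are hypothetical names introduced only for illustration, and the check assumes that a uint64_t array is 8-byte aligned, as it is on common x86-64 platforms.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Number of 8-byte stores REP STOSQ performs for a buffer at address start. */
unsigned stosq_count(uintptr_t start)
{
    /* lea rdi, [rdi+8] / and rdi, -8: first 8-byte boundary above start */
    uintptr_t aligned = (start + 8) & ~(uintptr_t)7;
    /* sub rcx, rdi / add ecx, 67 / shr ecx, 3 (ECX holds the low 32 bits) */
    return (unsigned)(start - aligned + 67) >> 3;
}

/* Hypothetical emulation of the whole GCC sequence. */
void zero67_gcc_style(unsigned char *out)
{
    const uint64_t zero = 0;
    uintptr_t start, aligned;
    unsigned char *p;

    assert(out != NULL);
    memcpy(out, &zero, 8);            /* mov QWORD PTR [rdi], 0    */
    memcpy(out + 59, &zero, 8);       /* mov QWORD PTR [rdi+59], 0 */

    start   = (uintptr_t)out;
    aligned = (start + 8) & ~(uintptr_t)7;
    p       = out + (aligned - start);
    for (unsigned i = stosq_count(start); i > 0; i--, p += 8)
        memcpy(p, &zero, 8);          /* one iteration of rep stosq */
}

/* Returns 1 if, for every misalignment 0..7, exactly 67 bytes are
   zeroed and every surrounding byte keeps its 0xFF fill. */
int gcc_style_zeroes_exactly_67(void)
{
    uint64_t storage[16];             /* 128 bytes, assumed 8-byte aligned */
    unsigned char *base = (unsigned char *)storage;

    for (unsigned off = 0; off < 8; off++) {
        memset(base, 0xFF, sizeof storage);
        zero67_gcc_style(base + 8 + off);
        for (unsigned i = 0; i < sizeof storage; i++) {
            int inside = (i >= 8 + off) && (i < 8 + off + 67);
            if (base[i] != (inside ? 0x00 : 0xFF))
                return 0;
        }
    }
    return 1;
}
```

Depending on the alignment of the pointer, the aligned fill performs 7 or 8 stores. It may overlap the head and tail stores, but it never writes past offset 66, because the start offset plus 8 times the count is at most 67.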
