Reverse Engineering for Beginners


CHAPTER 14. LOOPS


What happens here is that space for the i variable is no longer allocated on the local stack; an individual register, ESI, is used
for it instead. This is possible in such small functions, where there aren’t many local variables.


One very important thing is that the f() function must not change the value in ESI. Our compiler is sure of that here. And if the
compiler decided to use the ESI register in f() too, its value would have to be saved in the function’s prologue and restored
in the function’s epilogue, almost like in our listing: note the PUSH ESI/POP ESI at the function’s start and end.
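The contract described above can be modeled in plain C. This is only an illustrative sketch, not what the compiler emits: the global variable esi stands in for the real register, and f_preserving(), f_clobbering() and count_iterations() are hypothetical names introduced here. It shows why a callee that touches the register must save and restore it when the caller keeps its loop counter there.

```c
#include <stdio.h>

/* Models the ESI register. Purely illustrative: real registers are
   not C variables, and these function names are hypothetical. */
static int esi;

/* A callee that uses "ESI" as scratch space but honors the contract. */
static int f_preserving(int x)
{
    int saved = esi;        /* like PUSH ESI in the prologue */
    int result;

    esi = x * 2;            /* the register is now free to use */
    result = esi + 1;
    esi = saved;            /* like POP ESI in the epilogue */
    return result;
}

/* A callee that overwrites "ESI" and never restores it. */
static int f_clobbering(int x)
{
    esi = x * 2;
    return esi + 1;
}

/* The caller keeps its loop counter in "ESI" and calls f on each pass.
   'limit' is only a safety guard against a runaway loop. */
static int count_iterations(int (*f)(int), int limit)
{
    int iterations = 0;

    for (esi = 2; esi < 10 && iterations < limit; esi++) {
        f(esi);
        iterations++;
    }
    return iterations;
}
```

Calling count_iterations(f_preserving, 100) returns 8, one pass for each value from 2 to 9. Calling count_iterations(f_clobbering, 100) returns 2, because the callee silently changes the caller's counter and the loop ends early.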


Let’s try GCC 4.4.1 with maximal optimization turned on (-O3 option):


Listing 14.4: Optimizing GCC 4.4.1

main proc near


var_10 = dword ptr -10h


push ebp
mov ebp, esp
and esp, 0FFFFFFF0h
sub esp, 10h
mov [esp+10h+var_10], 2
call printing_function
mov [esp+10h+var_10], 3
call printing_function
mov [esp+10h+var_10], 4
call printing_function
mov [esp+10h+var_10], 5
call printing_function
mov [esp+10h+var_10], 6
call printing_function
mov [esp+10h+var_10], 7
call printing_function
mov [esp+10h+var_10], 8
call printing_function
mov [esp+10h+var_10], 9
call printing_function
xor eax, eax
leave
retn
main endp


Huh, GCC just unwound our loop.


Loop unwinding has an advantage in cases where there aren’t many iterations: we can cut some execution time by
removing all the loop support instructions. On the other hand, the resulting code is obviously larger.
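Expressed back in C, the transformation looks roughly like the sketch below. The loop bounds (2 to 9) are inferred from the constants in the listing. The call_count and arg_sum counters and the names run_rolled() and run_unrolled() are hypothetical additions that only make the two variants easy to compare.

```c
#include <stdio.h>

/* Hypothetical bookkeeping, not part of the original program. */
static int call_count = 0;
static int arg_sum = 0;

/* Stand-in for printing_function() from the listing. */
static void printing_function(int i)
{
    call_count++;
    arg_sum += i;
    printf("f(%d)\n", i);
}

/* The loop as it would be written in the source: every pass needs
   a counter increment, a comparison and a conditional jump. */
static void run_rolled(void)
{
    int i;

    for (i = 2; i < 10; i++)
        printing_function(i);
}

/* The unwound form: eight straight-line calls with constant
   arguments and no loop support instructions at all. */
static void run_unrolled(void)
{
    printing_function(2);
    printing_function(3);
    printing_function(4);
    printing_function(5);
    printing_function(6);
    printing_function(7);
    printing_function(8);
    printing_function(9);
}
```

Both functions make the same eight calls in the same order. The unrolled one trades the counter, comparison and jump for a body that grows linearly with the iteration count.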


Big unrolled loops are not recommended in modern times, because bigger functions may require a bigger cache footprint^1.


OK, let’s increase the maximum value of the i variable to 100 and try again. GCC does:


Listing 14.5: GCC
public main
main proc near


var_20 = dword ptr -20h


push ebp
mov ebp, esp
and esp, 0FFFFFFF0h
push ebx
mov ebx, 2 ; i=2
sub esp, 1Ch

; aligning label loc_80484D0 (loop body begin) on a 16-byte boundary:
nop


loc_80484D0:
; pass (i) as first argument to printing_function():
mov [esp+20h+var_20], ebx
add ebx, 1 ; i++
call printing_function


(^1) A very good article about it: [Dre07]. Further recommendations from Intel about loop unrolling are here: [Int14, p. 3.4.1.7].
