1.5. HELLO, WORLD!
After the function prologue we see the call to the printf() function:
CALL _printf. Before the call, the address of the string containing our greeting (that is, a pointer to it) is placed on
the stack with the help of the PUSH instruction.
When the printf() function returns control to the main() function, the string address (or a pointer
to it) is still on the stack. Since we do not need it anymore, the stack pointer (the ESP register) needs to
be corrected.
ADD ESP, 4 means add 4 to the ESP register value.
Why 4? Since this is a 32-bit program, we need exactly 4 bytes to pass an address through the stack. If it
were x64 code, we would need 8 bytes. ADD ESP, 4 is effectively equivalent to POP register but without
using any register^17.
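The arithmetic behind the constant can be sketched in C. The helper name cleanup_bytes is hypothetical (it does not appear in the original code), and the sketch assumes every argument occupies one pointer-sized stack slot, as the single string address does here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original text): number of bytes the
   caller removes from the stack after a call that pushed `nargs`
   pointer-sized arguments, i.e. the x in ADD ESP, x. */
size_t cleanup_bytes(size_t nargs)
{
    assert(nargs <= SIZE_MAX / sizeof(void *)); /* guard against overflow */
    return nargs * sizeof(void *);              /* one slot per argument  */
}
```

Built as a 32-bit x86 program (for example with GCC's -m32 option), cleanup_bytes(1) should evaluate to 4, matching ADD ESP, 4; a 64-bit build gives 8.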
For the same purpose, some compilers (like the Intel C++ Compiler) may emit POP ECX instead of ADD
(e.g., such a pattern can be observed in the Oracle RDBMS code as it is compiled with the Intel C++
compiler). This instruction has almost the same effect but the ECX register contents will be overwritten.
The Intel C++ compiler supposedly uses POP ECX since this instruction’s opcode is shorter than ADD ESP,
x (1 byte for POP against 3 for ADD).
Here is an example of using POP instead of ADD from Oracle RDBMS:
Listing 1.15: Oracle RDBMS 10.2 Linux (app.o file)
.text:0800029A push ebx
.text:0800029B call qksfroChild
.text:080002A0 pop ecx
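The difference between the two cleanup instructions can be modeled in a few lines of C. This is only a toy sketch, not an emulator: the Cpu structure, the 64-byte stack, and all function names are assumptions made for illustration. The byte arrays hold the usual 32-bit encodings of the two instructions, which is where the 1-byte versus 3-byte comparison comes from.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 64

/* Usual 32-bit encodings of the two cleanup instructions. */
const uint8_t ADD_ESP_4[] = { 0x83, 0xC4, 0x04 }; /* add esp, 4 */
const uint8_t POP_ECX[]   = { 0x59 };             /* pop ecx    */

/* Toy model of the relevant state (hypothetical, for illustration only). */
typedef struct {
    uint32_t esp;             /* stack pointer as a byte offset into mem */
    uint32_t ecx;
    uint8_t  mem[STACK_SIZE]; /* tiny stack area */
} Cpu;

/* Empty stack; ECX holds a marker value so that a clobber is visible. */
Cpu cpu_init(void)
{
    Cpu c = { .esp = STACK_SIZE, .ecx = 0xAAAAAAAAu };
    return c;
}

/* PUSH: ESP -= 4, then store the value at [ESP]. */
void push32(Cpu *c, uint32_t v)
{
    assert(c->esp >= 4);
    c->esp -= 4;
    memcpy(&c->mem[c->esp], &v, 4);
}

/* ADD ESP, 4: drop the top slot; no general-purpose register changes. */
void add_esp_4(Cpu *c)
{
    assert(c->esp + 4 <= STACK_SIZE);
    c->esp += 4;
}

/* POP ECX: load the top slot into ECX, then ESP += 4. */
void pop_ecx(Cpu *c)
{
    assert(c->esp + 4 <= STACK_SIZE);
    memcpy(&c->ecx, &c->mem[c->esp], 4);
    c->esp += 4;
}
```

Pushing one argument and then calling either add_esp_4() or pop_ecx() leaves esp at the same value; only the pop_ecx() path replaces the marker in ecx with the pushed argument.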
After calling printf(), the original C/C++ code contains the statement return 0, which returns 0 as the result
of the main() function.
In the generated code this is implemented by the instruction XOR EAX, EAX.
XOR is in fact just “eXclusive OR”^18 but the compilers often use it instead of MOV EAX, 0, again because
it is a slightly shorter opcode (2 bytes for XOR against 5 for MOV).
Some compilers emit SUB EAX, EAX, which means SUBtract the value in EAX from the value in EAX.
In any case, that results in zero.
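A short C sketch of why the three forms are interchangeable as far as the value in EAX is concerned. The function and array names are hypothetical. The byte arrays show one common 32-bit encoding of each instruction; an assembler may choose an alternative encoding of the same length for XOR and SUB (33 C0 and 2B C0).

```c
#include <assert.h>
#include <stdint.h>

/* One common 32-bit encoding of each zeroing instruction. */
const uint8_t XOR_EAX_EAX[] = { 0x31, 0xC0 };                   /* xor eax, eax */
const uint8_t SUB_EAX_EAX[] = { 0x29, 0xC0 };                   /* sub eax, eax */
const uint8_t MOV_EAX_0[]   = { 0xB8, 0x00, 0x00, 0x00, 0x00 }; /* mov eax, 0   */

/* XOR of a value with itself clears every bit. */
uint32_t zero_by_xor(uint32_t eax) { return eax ^ eax; }

/* Subtracting a value from itself also yields zero. */
uint32_t zero_by_sub(uint32_t eax) { return eax - eax; }

/* Both idioms give the same result as MOV EAX, 0 for any prior EAX. */
void check_zero_idioms(uint32_t eax)
{
    assert(zero_by_xor(eax) == 0);
    assert(zero_by_sub(eax) == 0);
}
```

Unlike MOV, both XOR and SUB also modify the CPU flags, which is harmless here because nothing after them in main() depends on the flags.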
The last instruction, RET, returns control to the caller. Usually, this is C/C++ CRT^19 code which in turn
returns control to the OS.
GCC
Now let’s try to compile the same C/C++ code with the GCC 4.4.1 compiler in Linux: gcc 1.c -o 1. Next,
with the assistance of the IDA disassembler, let’s see how the main() function was created. IDA, like
MSVC, uses Intel-syntax^20.
Listing 1.16: code in IDA
main proc near
var_10 = dword ptr -10h
push ebp
mov ebp, esp
and esp, 0FFFFFFF0h
sub esp, 10h
mov eax, offset aHelloWorld ; "hello, world\n"
mov [esp+10h+var_10], eax
call _printf
mov eax, 0
leave
retn
main endp
(^17) CPU flags, however, are modified
(^18) Wikipedia
(^19) C Runtime library
(^20) We could also have GCC produce assembly listings in Intel-syntax by applying the options -S -masm=intel.