Reverse Engineering for Beginners


CHAPTER 3. HELLO, WORLD!


xor eax, eax
add rsp, 40
ret 0
main ENDP


In x86-64, all registers were extended to 64 bits, and their names now have an R- prefix. To use the stack less often
(in other words, to access external memory/cache less often), there is a popular way to pass function arguments via
registers (fastcall): 64.3 on page 649. That is, some of the function arguments are passed in registers and the rest via the stack. In
Win64, the first 4 function arguments are passed in the RCX, RDX, R8 and R9 registers. That is what we see here: the pointer to the string
for printf() is now passed not on the stack but in the RCX register. Pointers are 64-bit now, so they are passed in the
64-bit registers (which have the R- prefix). However, for backward compatibility, it is still possible to access the 32-bit parts
using the E- prefix. This is what the RAX/EAX/AX/AL register looks like in x86-64:


Byte number   7th  6th  5th  4th  3rd  2nd  1st  0th
RAX (x64)     <------------------------------------>
EAX                               <---------------->
AX                                          <------>
AH, AL                                      AH   AL

The main() function returns an int-typed value, which in C/C++, for better backward compatibility and portability, is still
32-bit. That is why the EAX register (i.e., the 32-bit part of the register) is cleared at the end of the function instead of RAX. There
are also 40 bytes allocated on the local stack. This allocation provides the “shadow space” (32 bytes reserved for the four register
arguments; the remaining 8 bytes keep the stack 16-byte aligned), which we are going to discuss later: 8.2.1
on page 91.


3.2.2 GCC—x86-64


Let’s also try GCC in 64-bit Linux:


Listing 3.8: GCC 4.4.6 x64

.LC0:
        .string "hello, world\n"
main:
        sub     rsp, 8
        mov     edi, OFFSET FLAT:.LC0 ; "hello, world\n"
        xor     eax, eax              ; number of vector registers passed
        call    printf
        xor     eax, eax
        add     rsp, 8
        ret


This method of passing function arguments in registers is also used in Linux, *BSD and Mac OS X [Mit13].


The first 6 arguments are passed in the RDI, RSI, RDX, RCX, R8 and R9 registers, and the rest via the stack.


So the pointer to the string is passed in EDI (the 32-bit part of the register). But why not use the 64-bit part, RDI?


It is important to keep in mind that all MOV instructions in 64-bit mode that write something into the lower 32-bit part
of a register also clear the higher 32 bits [Int13]. That is, MOV EAX, 011223344h writes the value into RAX correctly, since the
higher bits will be cleared.


If we open the compiled object file (.o), we can also see all the instructions’ opcodes^9:


Listing 3.9: GCC 4.4.6 x64

.text:00000000004004D0                   main proc near
.text:00000000004004D0 48 83 EC 08       sub     rsp, 8
.text:00000000004004D4 BF E8 05 40 00    mov     edi, offset format ; "hello, world\n"
.text:00000000004004D9 31 C0             xor     eax, eax
.text:00000000004004DB E8 D8 FE FF FF    call    _printf
.text:00000000004004E0 31 C0             xor     eax, eax
.text:00000000004004E2 48 83 C4 08       add     rsp, 8
.text:00000000004004E6 C3                retn
.text:00000000004004E6                   main endp


As we can see, the instruction that writes into EDI at 0x4004D4 occupies 5 bytes. The same instruction writing a 64-bit
value into RDI occupies 7 bytes. Apparently, GCC is trying to save some space. Besides, it can be sure that the data segment
containing the string will not be allocated at addresses higher than 4 GiB.


We also see that the EAX register was cleared before the printf() function call. This is done because the number of used
vector registers is passed in EAX on *NIX systems on x86-64 ([Mit13]).


(^9) This must be enabled in Options → Disassembly → Number of opcode bytes.
