Reversing : The Hacker's Guide to Reverse Engineering

loaded at a different virtual address each time they are loaded (but they can
never be relocated after they have been loaded). Relocation happens because
an executable does not exist in a vacuum: it must coexist with other executables
that are loaded into the same address space. Sure, modern operating systems
provide each process with its own address space, but many executables are
loaded into each address space. Other than the main executable (that’s the .exe
file you launch when you run a program), every program has a certain number of
additional executables loaded into its address space, regardless of whether it
has DLLs of its own. The operating system loads quite a few DLLs into each
program’s address space; exactly which ones depends on which OS features the
program requires.

Because multiple executables are loaded into each address space, we effectively
have a mix of executables in each address space that wasn’t necessarily
preplanned. Therefore, it’s likely that two or more modules will try to use the
same memory address, which is not going to work. The solution is to relocate one
of these modules while it’s being loaded and simply load it at a different address
than the one it was originally planned to be loaded at. At this point you may be
wondering why an executable even needs to know in advance where it will be
loaded. Can’t it be like any regular file and just be loaded wherever there’s
room? The problem is that an executable contains many cross-references, where
one position in the code points at another position in the code. Consider,
for example, the sequence that accesses a global variable.

MOV EAX, DWORD PTR [pGlobalVariable]

The preceding instruction is a typical global variable access. The storage for
such a global variable resides inside the executable image (because many
variables have a preinitialized value). The question is, what address should
the compiler and linker write as the address of pGlobalVariable while
generating the executable? Usually, you would just write a relative address: an
address that’s relative to the beginning of the file. This way you wouldn’t have
to worry about where the file gets loaded. The problem is that this is a code
sequence that gets executed directly by the processor. You could theoretically
generate logic that would calculate the exact address by adding the relative
address to the base address where the executable is currently mapped, but that
would incur a significant performance penalty. Instead, the loader just goes
over the code and modifies all absolute addresses within it to make sure that
they point to the right place.
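
To make the loader's fix-up pass concrete, here is a minimal Python sketch. It is a toy model, not the actual Windows loader: the function name apply_relocations, the list of fix-up offsets, and the two base addresses are all assumptions made for illustration. The toy image holds the one-byte opcode A1 (the x86 encoding of MOV EAX with a 32-bit absolute memory operand) followed by the 4-byte address the linker wrote.

```python
import struct

def apply_relocations(image, fixup_offsets, preferred_base, actual_base):
    """Patch each 32-bit absolute address listed in fixup_offsets (toy model)."""
    # Every absolute address is off by the same amount: the distance between
    # where the linker assumed the image would live and where it really is.
    delta = actual_base - preferred_base
    for offset in fixup_offsets:
        (address,) = struct.unpack_from("<I", image, offset)
        struct.pack_into("<I", image, offset, (address + delta) & 0xFFFFFFFF)
    return image

# Hypothetical values: the linker assumed 0x00400000 and placed the global
# variable 0x2000 bytes into the image.
PREFERRED_BASE = 0x00400000
ACTUAL_BASE = 0x00500000

# MOV EAX, DWORD PTR [pGlobalVariable]  ->  A1 <4-byte absolute address>
image = bytearray(b"\xA1" + struct.pack("<I", PREFERRED_BASE + 0x2000))

# The address operand starts at offset 1, right after the opcode byte.
apply_relocations(image, [1], PREFERRED_BASE, ACTUAL_BASE)
print(hex(struct.unpack_from("<I", image, 1)[0]))  # prints 0x502000
```

Note that the instruction itself is untouched; only the embedded address changes, which is why the code can still run at full speed after the one-time patch.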

Instead of going through this process every time a module is loaded, each
module is assigned a base address while it is being created. The linker then
assumes that the executable is going to be loaded at the base address; if it
is, no relocation will take place. If the module’s base address is already
taken, the module is relocated.
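
The decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: choose_load_address, the search_start value, and the simple "scan upward in fixed steps" policy are hypothetical and are not how any real loader picks a fallback address. Occupied regions are modeled as (start, end) pairs with an exclusive end.

```python
def choose_load_address(preferred_base, size, occupied,
                        search_start=0x10000000, step=0x10000):
    """Return (load_address, was_relocated) for a module (toy model)."""
    def overlaps(base):
        # Two half-open ranges overlap if each one starts before the other ends.
        return any(base < end and start < base + size
                   for start, end in occupied)

    # Preferred base is free: load there, no relocation needed.
    if not overlaps(preferred_base):
        return preferred_base, False

    # Preferred base is taken: scan upward for a free range (assumed policy).
    base = search_start
    while overlaps(base):
        base += step
    return base, True

# One module already occupies 0x00400000-0x0040FFFF in this address space.
occupied = [(0x00400000, 0x00410000)]

print(choose_load_address(0x00600000, 0x8000, occupied))  # (6291456, False)
print(choose_load_address(0x00400000, 0x8000, occupied))  # (268435456, True)
```

Only in the second case, where the preferred base collides with an existing module, would the loader then have to patch the module's absolute addresses.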
