Reversing : The Hacker's Guide to Reverse Engineering

loaded at a different virtual address each time they are loaded (but they can never be relocated after they have been loaded). Relocation happens because an executable does not exist in a vacuum; it must coexist with other executables that are loaded in the same address space. Sure, modern operating systems provide each process with its own address space, but there are many executables that are loaded into each address space. Other than the main executable (that's the .exe file you launch when you run a program), every program has a certain number of additional executables loaded into its address space, regardless of whether it has DLLs of its own or not. The operating system loads quite a few DLLs into each program's address space; exactly which ones depends on which OS features the program requires.

Because multiple executables are loaded into each address space, we effectively have a mix of executables in each address space that wasn't necessarily preplanned. Therefore, it's likely that two or more modules will try to use the same memory address, which is not going to work. The solution is to relocate one of these modules while it's being loaded and simply load it at a different address than the one it was originally planned to be loaded at.

At this point you may be wondering why an executable even needs to know in advance where it will be loaded. Can't it be like any regular file and just be loaded wherever there's room? The problem is that an executable contains many cross-references, where one position in the code points at another position in the code. Consider, for example, the sequence that accesses a global variable.

MOV EAX, DWORD PTR [pGlobalVariable]

The preceding instruction is a typical global variable access. The storage for such a global variable resides inside the executable image (because many variables have a preinitialized value). The question is, what address should the compiler and linker write as the address of pGlobalVariable while generating the executable? Usually, you would just write a relative address: an address that's relative to the beginning of the file. This way you wouldn't have to worry about where the file gets loaded. The problem is that this is a code sequence that gets executed directly by the processor. You could theoretically generate logic that would calculate the exact address by adding the relative address to the base address where the executable is currently mapped, but that would incur a significant performance penalty. Instead, the loader just goes over the code and modifies all absolute addresses within it to make sure that they point to the right place.

Instead of going through this process every time a module is loaded, each module is assigned a base address while it is being created. The linker then assumes that the executable is going to be loaded at that base address; if it is, no relocation takes place. If the module's base address is already taken, the module is relocated.
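The patching step described above can be sketched in a few lines of Python. This is only an illustrative model, not the actual loader: the base address, the variable's offset, the fixup list, and the function names are all hypothetical, and a real executable records its fixup locations in its own relocation table format. The sketch builds a one-instruction "image" containing a 32-bit absolute address, then adds the difference between the actual and the preferred base to every recorded address.

```python
import struct

PREFERRED_BASE = 0x10000000  # hypothetical base address assigned at link time
GLOBAL_OFFSET = 0x2000       # hypothetical offset of pGlobalVariable in the image


def build_image():
    """Return (code bytes, fixup offsets) for a tiny made-up module."""
    # On 32-bit x86, opcode A1 followed by a 4-byte address encodes
    # MOV EAX, DWORD PTR [address]. The linker writes the address the
    # variable would have if the module loads at its preferred base.
    code = b"\xA1" + struct.pack("<I", PREFERRED_BASE + GLOBAL_OFFSET)
    fixups = [1]  # byte offset of each absolute address inside the code
    return bytearray(code), fixups


def overlaps(base_a, size_a, base_b, size_b):
    """True if two address ranges collide, which would force a relocation."""
    return base_a < base_b + size_b and base_b < base_a + size_a


def relocate(image, fixups, preferred_base, actual_base):
    """Patch every recorded absolute address by the load-address delta."""
    delta = (actual_base - preferred_base) & 0xFFFFFFFF
    if delta == 0:
        return image  # loaded at the preferred base: nothing to patch
    for offset in fixups:
        (addr,) = struct.unpack_from("<I", image, offset)
        struct.pack_into("<I", image, offset, (addr + delta) & 0xFFFFFFFF)
    return image


if __name__ == "__main__":
    image, fixups = build_image()
    relocate(image, fixups, PREFERRED_BASE, 0x20000000)
    print(hex(struct.unpack_from("<I", image, 1)[0]))  # prints 0x20002000
```

Note that when the module lands at its preferred base the delta is zero and the image is left untouched, which is why assigning each module a base address in advance avoids the patching work in the common case.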

