Reversing : The Hacker's Guide to Reverse Engineering

Because the code generator is responsible for the actual selection of specific
assembly language instructions, it is usually the only component that has
enough information to apply any significant platform-specific optimizations.
This matters for reversing because many of the transformations that make
compiler-generated assembly language code difficult to read take place at this
stage.
The following are three of the most important stages (at least from our
perspective) that take place during the code generation process:
■■ Instruction selection: This is where the code from the intermediate
representation is translated into platform-specific instructions. The
selection of each individual instruction is very important to overall program
performance and requires that the compiler be aware of the various
properties of each instruction.
■■ Register allocation: In many intermediate representations there is an
unlimited number of registers available, so that every local variable can
be placed in a register. The fact that the target processor has a limited
number of registers comes into play during code generation, when the
compiler must decide which variable gets placed in which register, and
which variable must be placed on the stack.
■■ Instruction scheduling: Because most modern processors can handle
multiple instructions at once, data dependencies between individual
instructions become an issue. This means that if an instruction performs
an operation and stores the result in a register, immediately reading
from that register in the following instruction would cause a delay,
because the result of the first operation might not be available yet. For
this reason the code generator employs platform-specific instruction
scheduling algorithms that reorder instructions to try to achieve the
highest possible level of parallelism. The end result is interleaved code,
where two instruction sequences dealing with two separate things are
interleaved to create one sequence of instructions. We will be seeing
such sequences in many of the reversing sessions in this book.
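
As a rough illustration of the last two stages, consider the following C sketch. The function compute_stats and the type stats_t are hypothetical names invented for this example; they do not come from any real program. The loop body contains two computations that do not depend on each other, which is the kind of code an instruction scheduler is free to interleave. The two local accumulators are also natural candidates for registers, assuming enough registers are available on the target processor.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type, used only for this illustration. */
typedef struct {
    uint32_t sum;   /* running sum of all elements */
    uint32_t mix;   /* running XOR of all elements */
} stats_t;

/*
 * Hypothetical example function with two independent dependency chains.
 * Neither accumulator reads the other, so a scheduler may interleave the
 * instructions that update them. What a given compiler actually emits
 * depends on the compiler, its settings, and the target processor.
 */
stats_t compute_stats(const uint32_t *data, size_t count)
{
    uint32_t sum = 0;   /* local variable, a likely register candidate */
    uint32_t mix = 0;   /* local variable, a likely register candidate */

    for (size_t i = 0; i < count; i++) {
        sum += data[i];   /* chain 1: depends only on sum and data[i] */
        mix ^= data[i];   /* chain 2: depends only on mix and data[i] */
    }

    stats_t result = { sum, mix };
    return result;
}
```

When reading the compiled form of a function like this, the instructions that update sum and those that update mix may appear mixed together rather than grouped by source line, which is the interleaving effect described above.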

Listing Files


A listing file is a compiler-generated text file that contains the assembly
language code produced by the compiler. It is true that this information can be
obtained by disassembling the binaries produced by the compiler, but a listing
file also conveniently shows how each assembly language line maps to the
original source code. Listing files are not strictly a reversing tool but more of a
research tool used when trying to study the behavior of a specific compiler by
feeding it different code and observing the output through the listing file.
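
A small experiment along these lines might start from a source file such as the one sketched below. The file name listing_demo.c and the function scale_and_offset are made up for this illustration. The commands in the comment are assumptions about the toolchain in use: the Microsoft compiler accepts /FAs to write an assembly listing that includes the source lines, and GCC accepts -S (optionally with -fverbose-asm) to write annotated assembly output.

```c
/*
 * listing_demo.c: hypothetical file used only to experiment with listings.
 *
 * Possible ways to produce a listing (assumed toolchains):
 *   cl /c /FAs listing_demo.c           Microsoft compiler, writes listing_demo.asm
 *   gcc -S -fverbose-asm listing_demo.c GCC, writes listing_demo.s
 */

/* Multiplies the input by 5 and adds 3, using unsigned arithmetic. */
unsigned int scale_and_offset(unsigned int value)
{
    return value * 5 + 3;
}
```

Comparing the listing produced with optimizations disabled against one produced with optimizations enabled is one way to see how a specific compiler translates the same arithmetic; the exact instructions chosen will vary between compilers and versions.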
