Reversing : The Hacker's Guide to Reverse Engineering


language is the language of reversing. To master the world of reversing, one
must develop a solid understanding of the chosen platform’s assembly language.
This brings us to the most basic point to remember about assembly language:
it is a class of languages, not one language. Every computer platform
has its own assembly language that is usually quite different from all the rest.
Another important concept to get out of the way is machine code (often called
binary code or object code). People sometimes make the mistake of thinking that
machine code is “faster” or “lower-level” than assembly language. That is a
misconception: machine code and assembly language are two different repre-
sentations of the same thing. A CPU reads machine code, which is nothing but
sequences of bits that contain a list of instructions for the CPU to perform.
Assembly language is simply a textual representation of those bits—we name
elements in these code sequences in order to make them human-readable.
Instead of cryptic hexadecimal numbers we can look at textual instruction
names such as MOV (Move), XCHG (Exchange), and so on.
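To make the "two representations of the same thing" point concrete, here is a minimal sketch in Python. The table `MNEMONICS` and the helper `describe` are hypothetical names invented for this illustration; the four byte values are single-byte IA-32 encodings, and the table deliberately covers nothing else.

```python
# Hypothetical lookup table for illustration only: a handful of
# single-byte IA-32 instructions and the names we give them.
MNEMONICS = {
    0x55: "PUSH EBP",
    0x90: "NOP",
    0x93: "XCHG EAX, EBX",
    0xC3: "RET",
}

def describe(code):
    """Render every byte three ways: as bits, as hex, and as text."""
    lines = []
    for byte in code:
        name = MNEMONICS.get(byte, "(not in this toy table)")
        lines.append(f"{byte:08b}  {byte:02X}  {name}")
    return lines

# The same four instructions, stored the way the CPU reads them.
machine_code = bytes([0x55, 0x93, 0x90, 0xC3])
for line in describe(machine_code):
    print(line)
```

Each printed row shows one instruction as raw bits, as a hexadecimal number, and as an assembly language name. Nothing is gained or lost between the columns; only the notation changes.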
Each assembly language command is represented by a number, called the
operation code, or opcode. Object code is essentially a sequence of opcodes and
other numbers used in connection with the opcodes to perform operations.
CPUs constantly read object code from memory, decode it, and act based on
the instructions embedded in it. When developers write code in assembly lan-
guage (a fairly rare occurrence these days), they use an assembler program to
translate the textual assembly language code into binary code, which can be
decoded by a CPU. In the other direction, and more relevant to our narrative, a
disassembler does the exact opposite: it reads object code and generates the
textual mapping of each instruction in it. This is a relatively simple operation to
perform because the textual assembly language is simply a different represen-
tation of the object code. Disassemblers are a key tool for reversers and are dis-
cussed in more depth later in this chapter.
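The round trip between text and object code can be sketched in a few lines of Python. This is a toy under stated assumptions, not how a real assembler or disassembler is built: the functions `assemble` and `disassemble` and the `REGISTERS` list are hypothetical names, and only four IA-32 instruction forms are handled (NOP, RET, PUSH of a 32-bit register, and MOV of a 32-bit constant into a register). The MOV case shows the "other numbers" that follow an opcode: the opcode byte is followed by a four-byte little-endian constant.

```python
import struct

# IA-32 numbers its 32-bit general-purpose registers in this order.
REGISTERS = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]

def assemble(line):
    """Translate one line of assembly text into bytes (toy subset only)."""
    mnemonic, _, operands = line.partition(" ")
    args = [a.strip() for a in operands.split(",")] if operands else []
    if mnemonic == "NOP" and not args:
        return b"\x90"
    if mnemonic == "RET" and not args:
        return b"\xC3"
    if mnemonic == "PUSH" and len(args) == 1 and args[0] in REGISTERS:
        # Opcode 0x50 plus the register number.
        return bytes([0x50 + REGISTERS.index(args[0])])
    if mnemonic == "MOV" and len(args) == 2 and args[0] in REGISTERS:
        # Opcode 0xB8 plus the register number, then a 32-bit constant.
        opcode = 0xB8 + REGISTERS.index(args[0])
        return bytes([opcode]) + struct.pack("<I", int(args[1], 0))
    raise ValueError(f"unsupported in this sketch: {line!r}")

def disassemble(code):
    """Translate bytes back into assembly text (same toy subset)."""
    lines = []
    i = 0
    while i < len(code):
        op = code[i]
        if op == 0x90:
            lines.append("NOP")
            i += 1
        elif op == 0xC3:
            lines.append("RET")
            i += 1
        elif 0x50 <= op <= 0x57:
            lines.append(f"PUSH {REGISTERS[op - 0x50]}")
            i += 1
        elif 0xB8 <= op <= 0xBF:
            (value,) = struct.unpack_from("<I", code, i + 1)
            lines.append(f"MOV {REGISTERS[op - 0xB8]}, 0x{value:X}")
            i += 5
        else:
            raise ValueError(f"unknown opcode 0x{op:02X} at offset {i}")
    return lines

source = ["PUSH EBP", "MOV EAX, 0x2A", "RET"]
binary = b"".join(assemble(line) for line in source)
print(binary.hex())         # the object code, shown as hex
print(disassemble(binary))  # the text recovered from the bytes
```

Because the text and the bytes carry the same information, disassembling the output of `assemble` gives back the original three lines. A production disassembler has to cope with hundreds of opcodes, variable-length operand encodings, and data mixed in with code, but the basic table-driven idea is the same.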
Because assembly language is a platform-specific affair, we need to choose a
specific platform to focus on while studying the language and practicing
reversing. I’ve decided to focus on the Intel IA-32 architecture, on which every
32-bit PC is based. This choice is an easy one to make, considering the popu-
larity of PCs and of this architecture. IA-32 is one of the most common CPU
architectures in the world, and if you’re planning on learning reversing and
assembly language and have no specific platform in mind, go with IA-32. The
architecture and assembly language of IA-32-based CPUs are introduced in
Chapter 2.


Compilers


So, considering that the CPU can only run machine code, how are the popular
programming languages such as C++ and Java translated into machine code?
A text file containing instructions that describe the program in a high-level
language is fed into a compiler. A compiler is a program that takes a source file

