Reversing : The Hacker's Guide to Reverse Engineering

managed in the program. This requires two perspectives: the high-level
perspective as viewed by software developers and the low-level perspective
as viewed by reversers.
High-level languages tend to isolate software developers from the details
surrounding data management at the system level. Developers are usually only
made aware of the simplified data flow described by the high-level language.
Naturally, most reversers are interested in obtaining a view of the program
that matches that simplified high-level view as closely as possible. That’s
because the high-level perspective is usually far more human-friendly than the
machine’s perspective. Unfortunately, most programming languages and soft-
ware development platforms strip (or mangle) much of that human-readable
information from binaries shipped to end users.
In order to be able to recover some or all of that high-level data flow infor-
mation from a program binary, you must understand how programs view and
treat data from both the programmer’s high-level perspective and the low-
level machine-generated code. The following sections take us through a brief
overview of high-level data constructs such as variables and the most common
types of data structures.

Variables

For a software developer, the key to managing and storing data is usually
named variables. All high-level languages provide developers with the means
to declare variables at various scopes and use them to store information.
Programming languages provide several abstractions for these variables.
The level at which a variable is defined determines which parts of the
program can access it and where it is physically stored. The
names of named variables are usually relevant only during compilation. Many
compilers completely strip the names of variables from a program’s binaries
and identify them using their address in memory. Whether or not this is done
depends on the target platform for which the program is being built.
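
As a rough illustration of how scope relates to visibility and storage, the
following C sketch declares variables at several levels. All of the names in
it (g_request_count, s_module_errors, next_ticket, sum_to, error_count) are
hypothetical and were invented for this example. The storage locations noted
in the comments describe what compilers typically do, not a guarantee.

```c
#include <assert.h>

/* Global: visible to the whole program, typically placed in a data section. */
int g_request_count = 0;

/* File-scope static: same lifetime as a global, but visible only in this file. */
static int s_module_errors = 0;

/* Returns how many times it has been called so far. */
int next_ticket(void)
{
    /* Function-scope static: private to next_ticket, yet it keeps its
       value between calls because it is not stored on the stack. */
    static int issued = 0;

    issued++;
    assert(issued > 0);   /* the counter persisted and advanced */
    return issued;
}

/* Sums the integers 1..n; negative input is counted as an error. */
int sum_to(int n)
{
    /* Locals: exist only during the call, typically on the stack or in registers. */
    int total = 0;
    int i;

    g_request_count++;
    if (n < 0) {
        s_module_errors++;
        return 0;
    }
    for (i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Exposes the file-private error counter to other source files. */
int error_count(void)
{
    return s_module_errors;
}
```

Calling sum_to(4) returns 10 and increments g_request_count, while two
successive calls to next_ticket return 1 and then 2. In a build without
symbol information, names such as total and issued would generally not
survive, and the generated code would refer to the underlying storage
directly.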

User-Defined Data Structures

User-defined data structures are simple constructs that represent a group of
data fields, each with its own type. The idea is that these fields are all somehow
related, which is why the program stores and handles them as a single unit. The
data types of the specific fields inside a data structure can either be simple data
types such as integers or pointers or they can be other data structures.
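
A minimal C sketch of such a structure follows. The types point and sprite
and the helper read_y_by_offset are hypothetical names chosen for this
example. The structure mixes an integer, a nested structure, and a pointer,
and the helper reads one field by adding a constant offset to the structure's
base address, which is roughly how field accesses tend to appear in compiled
code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A small structure used as a field inside another structure. */
struct point {
    int32_t x;
    int32_t y;
};

/* A user-defined structure grouping related fields of different types. */
struct sprite {
    uint32_t     id;        /* simple integer field */
    struct point position;  /* nested data structure */
    const char  *name;      /* pointer field */
};

/* Reads position.y without naming the field in the access itself:
   the value is fetched from "base address + constant offset". */
int32_t read_y_by_offset(const struct sprite *s)
{
    const unsigned char *base;
    size_t offset;
    int32_t value;

    assert(s != NULL);
    base = (const unsigned char *)s;
    offset = offsetof(struct sprite, position) + offsetof(struct point, y);

    /* memcpy avoids any alignment or aliasing concerns in the raw read. */
    memcpy(&value, base + offset, sizeof value);
    return value;
}
```

For a sprite initialized as { 7, { 3, 9 }, "ship" }, read_y_by_offset returns
9, the same value as s.position.y. The exact offsets of later fields such as
name depend on the platform's alignment rules, so they are best measured with
offsetof rather than assumed.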
While reversing, you will encounter a variety of user-defined data
structures. Properly identifying such data structures and deciphering their
contents is critical for achieving program comprehension. The key to doing
this successfully is to gradually record every tiny detail discovered about
them until
