Reversing : The Hacker's Guide to Reverse Engineering

managed in the program. This requires two perspectives: the high-level perspective as viewed by software developers and the low-level perspective as viewed by reversers. High-level languages tend to isolate software developers from the details of data management at the system level. Developers are usually made aware only of the simplified data flow described by the high-level language. Naturally, most reversers want a view of the program that matches that simplified high-level view as closely as possible, because the high-level perspective is usually far more human-friendly than the machine’s perspective. Unfortunately, most programming languages and software development platforms strip (or mangle) much of that human-readable information from binaries shipped to end users. To recover some or all of that high-level data flow information from a program binary, you must understand how programs view and treat data from both the programmer’s high-level perspective and the low-level perspective of machine-generated code. The following sections give a brief overview of high-level data constructs such as variables and the most common types of data structures.

Variables

For a software developer, the key to managing and storing data is usually the named variable. All high-level languages provide developers with the means to declare variables at various scopes and use them to store information. Programming languages provide several abstractions for these variables. The level at which a variable is defined determines which parts of the program can access it, and also where it is physically stored. Variable names are usually relevant only during compilation. Many compilers strip variable names from a program’s binaries entirely and identify each variable by its address in memory. Whether or not this is done depends on the target platform for which the program is being built.
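As a rough illustration of how the level of a declaration affects both visibility and storage, consider the following minimal C sketch. All of the names in it (g_request_count, next_ticket, sum_to, make_counter) are hypothetical and chosen only for this example, and the storage locations mentioned in the comments describe what compilers typically do, not a guarantee for any particular compiler or platform.

```c
#include <assert.h>
#include <stdlib.h>

/* Global scope: any function in the program can access this variable.
   It typically lives at a fixed address in a data section of the binary. */
int g_request_count = 0;

/* Hands out increasing ticket numbers, starting at 1. */
int next_ticket(void)
{
    /* Function scope, static storage: only this function can name the
       variable, but it keeps its value between calls, so it is typically
       stored next to the globals rather than on the stack. */
    static int ticket = 0;

    ticket++;
    g_request_count++;
    return ticket;
}

/* Returns 1 + 2 + ... + n using only local variables. */
int sum_to(int n)
{
    /* Function scope, automatic storage: these exist only while the
       function runs, typically in a register or a stack slot. */
    int total = 0;

    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Allocates a counter that outlives the function that created it. */
int *make_counter(int start)
{
    /* The pointer itself is a local variable; the integer it points to
       lives on the heap until the caller releases it with free(). */
    int *counter = malloc(sizeof *counter);

    assert(counter != NULL);
    *counter = start;
    return counter;
}
```

In a release build without symbol information, names such as ticket and total would usually not appear anywhere in the binary. A reverser would see only a fixed memory address for the first and a register or stack offset for the second.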

User-Defined Data Structures

User-defined data structures are simple constructs that represent a group of data fields, each with its own type. The idea is that these fields are all somehow related, which is why the program stores and handles them as a single unit. The fields inside a data structure can be simple data types such as integers or pointers, or they can be other data structures. While reversing, you will encounter a variety of user-defined data structures. Properly identifying such data structures and deciphering their contents is critical for achieving program comprehension. The key to doing this successfully is to gradually record every tiny detail discovered about them until
