Reversing : The Hacker's Guide to Reverse Engineering

Generally speaking, intermediate representations consist of tiny instruction sets, as opposed to the huge instruction sets of some processor architectures, such as IA-32. Tiny instruction sets are possible because complex expressions can be used in almost every instruction.

The following is a generic description of the instruction set typically used by decompilers. Notice that this example describes a generic instruction set that can be used throughout the decompilation process, so that it can directly represent both a low-level representation that is very similar to the original assembly language code and a high-level representation that can be translated into a high-level language.

Assignment: This is a very generic instruction that represents an assignment operation into a register, variable, or other memory location (such as a global variable). An assignment instruction can typically contain complex expressions on either side.

Push: Push a value onto the stack. Again, the value being pushed can be any kind of complex expression. These instructions are generally eliminated during data-flow analysis since they have no direct equivalent in high-level representations.

Pop: Pop a value from the stack. These instructions are generally eliminated during data-flow analysis since they have no direct equivalent in high-level representations.

Call: Call a subroutine and pass the listed parameters. Each parameter can be represented using a complex expression. Keep in mind that to obtain such a list of parameters, a decompiler would have to perform significant analysis of the low-level code.

Ret: Return from a subroutine. Typically supports a complex expression to represent the procedure's return value.

Branch: A branch instruction compares two operands using a specified condition code and jumps to the specified address if the comparison evaluates to True. The comparison is performed on two expression trees, where each tree can represent anything from a trivial expression (such as a constant) to a complex expression. Notice how this is a higher-level representation of what would require several instructions in native assembly language; that's a good example of how the intermediate representation has the flexibility of showing both an assembly-language-like low-level representation of the code and a higher-level representation that's closer to a high-level language.

Unconditional Jump: An unconditional jump is a direct translation of the unconditional jump instruction in the original program. It is used during the construction of the control flow graph. The meanings of unconditional jumps are analyzed during the control flow analysis stage.
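To make the instruction set described above more concrete, here is a minimal sketch in Python of how such an intermediate representation might be modeled. Every name in it (Const, Loc, BinOp, Assign, Push, Pop, Call, Ret, Branch, Jump, show, render) is hypothetical and chosen only for illustration; it does not reflect the data structures of any particular decompiler. The example at the end shows the same Branch type holding both a low-level, assembly-like form and a higher-level form whose operands are full expression trees.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union

# Expression trees (hypothetical names, for illustration only).
@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class Loc:
    name: str              # a register, local variable, or global location

@dataclass(frozen=True)
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"

Expr = Union[Const, Loc, BinOp]

# The seven generic instructions described in the text.
@dataclass(frozen=True)
class Assign:
    dest: Expr             # either side may be a complex expression
    src: Expr

@dataclass(frozen=True)
class Push:
    value: Expr

@dataclass(frozen=True)
class Pop:
    dest: Loc

@dataclass(frozen=True)
class Call:
    target: str
    args: Tuple[Expr, ...]

@dataclass(frozen=True)
class Ret:
    value: Optional[Expr] = None

@dataclass(frozen=True)
class Branch:
    cond: str              # condition code, written here as an operator
    left: Expr
    right: Expr
    target: str

@dataclass(frozen=True)
class Jump:
    target: str

def show(e: Expr) -> str:
    """Render an expression tree as text."""
    if isinstance(e, Const):
        return str(e.value)
    if isinstance(e, Loc):
        return e.name
    if isinstance(e, BinOp):
        return f"({show(e.left)} {e.op} {show(e.right)})"
    raise TypeError(f"not an expression: {e!r}")

def render(ins) -> str:
    """Render one instruction as text."""
    if isinstance(ins, Assign):
        return f"{show(ins.dest)} = {show(ins.src)}"
    if isinstance(ins, Push):
        return f"push {show(ins.value)}"
    if isinstance(ins, Pop):
        return f"pop {show(ins.dest)}"
    if isinstance(ins, Call):
        return f"call {ins.target}({', '.join(show(a) for a in ins.args)})"
    if isinstance(ins, Ret):
        return "ret" if ins.value is None else f"ret {show(ins.value)}"
    if isinstance(ins, Branch):
        return f"if {show(ins.left)} {ins.cond} {show(ins.right)} goto {ins.target}"
    if isinstance(ins, Jump):
        return f"goto {ins.target}"
    raise TypeError(f"not an instruction: {ins!r}")

# Low-level form: close to a load, an add, then a compare-and-jump pair.
low_level = [
    Assign(Loc("eax"), Loc("x")),
    Assign(Loc("eax"), BinOp("+", Loc("eax"), Const(4))),
    Branch("<", Loc("eax"), Loc("ebx"), "L1"),
]

# High-level form: one Branch whose operands are whole expression trees.
high_level = Branch("<", BinOp("+", Loc("x"), Const(4)), Loc("y"), "L1")

for ins in low_level:
    print(render(ins))
print(render(high_level))   # if (x + 4) < y goto L1
```

Using the same types for both forms is the design choice the text points to: later analysis stages can rewrite the instruction list in place, folding register assignments into larger expression trees, without switching to a different representation. The mapping of the register ebx to a variable y in the high-level form is assumed here purely for the sake of the example.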


