Reversing : The Hacker's Guide to Reverse Engineering

couldn’t directly update the EFLAGSregister (nowadays this is easy, because the two units are implemented on a single chip). This meant that the result of a floating-point comparison was written into a separate floating-point status register, which then had to be loaded into one of the general-purpose registers, and from there it was possible to test its value and perform a conditional branch. Let’s look at an example.

00401000 FLD DWORD PTR [ESP+4] 00401004 FCOMP DWORD PTR [ESP+8] 00401008 FSTSW AX 0040100A TEST AH,41 0040100D JNZ SHORT 0040101D

This snippet loads one floating-point value into the floating-point stack (essentially like a floating-point register), and compares another value against the first value. Because the older FCOMPinstruction is used, the result is stored in the floating-point status word. If the code were to use the newer FCOMIP instruction, the outcome would be written directly into EFLAGS, but this is a newer instruction that didn’t exist in older versions of the processor. Because the result is stored in the floating-point status word, you need to somehow get it out of there in order to test the result of the comparison and perform a conditional branch. This is done using the FSTSWinstruction, which copies the floating-point status word into the AXregister. Once that value is in AX, you can test the specific flags and perform the conditional branch. The bottom line of all of this is that to translate this sequence into the decompiler’s intermediate representation (which is not supposed to contain any architecture-specific details), the front end must “understand” this sequence for what it is, and eliminate the code that tests for specific flags (the constant 0x41) and so on. This is usually implemented by adding specific code in the front end that knows how to decipher these types of sequences.

Generating Control Flow Graphs

The code generated by a decompiler’s front end is represented in a graph structure, where each code block is called a basic block (BB). This graph structure simply represents the control flow instructions present in the low-level machine code. Each BB ends with a control flow instruction such as a branch instruction, a call, or a ret, or with a label that is referenced by some branch instruction elsewhere in the code (because labels represent a control flow join). Blocks are defined for each code segment that is referenced elsewhere in the code, typically by a branch instruction. Additionally, a BB is created after every conditional branch instruction, so that a conditional branch instruction

464 Chapter 13

Reversing : The Hacker's Guide to Reverse Engineering

Generating Control Flow Graphs

Get our desktop app

Company

Features

Documentation

Resources