locations. The resulting information from this type of analysis can be used for
a number of different things in the decompilation process. It is required for
eliminating the concept of registers and operations performed on individual
registers, and also for introducing the concept of variables and long expres-
sions that are made up of several machine-level instructions. Data-flow analy-
sis is also where conditional codes are eliminated. Conditional codes are easily
decompiled when dealing with simple comparisons, but they can also be used
in other, less obvious ways.
Let’s look at a trivial example where you must use data-flow analysis in
order for the decompiler to truly “understand” what the code is doing. Think
of function return values. It is customary for IA-32 code to use the EAXregister
for passing return values from a procedure to its caller, but a decompiler can-
not necessarily count on that. Different compilers might use different conven-
tions, especially when functions are defined as staticand the compiler
controls all points of entry into the specific function. In such a case, the com-
piler might decide to use some other register for passing the return value. How
does a decompiler know which register is used for passing back return values
and which registers are used for passing parameters into a procedure? This is
exactly the type of problem addressed by data-flow analysis.
Data-flow analysis is performed by defining a special notation that simpli-
fies this process. This notation must conveniently represent the concept of
defining a register, which means that it is loaded with a new value and using a
register, which simply means its value is read. Ideally, such a representation
should also simplify the process of identifying various points in the code
where a register is defined in parallel in two different branches in the control
flow graph.
The next section describes SSA, which is a commonly used notation for
implementing data-flow analysis (in both compilers and decompilers). After
introducing SSA, I proceed to demonstrate areas in the decompilation process
where data-flow analysis is required.
Single Static Assignment (SSA)
Single static assignment(SSA) is a special notation commonly used in compilers
that simplifies many data-flow analysis problems in compilers and can assist
in certain optimizations and register allocation. The idea is to treat each indi-
vidual assignment operation as a different instance of a single variable, so that
xbecomes x0, x1, x2, and so on with each new assignment operation. SSA can
be useful in decompilation because decompilers have to deal with the way
compilers reuse registers within a single procedure. It is very common for pro-
cedures that use a large number of variables to use a single register for two or
more different variables, often containing a different data type.
Decompilation 467