Confusing Decompilers
Because bytecode retains so much high-level detail, there are numerous
decompilers that are highly effective at decompiling bytecode executables.
One of the primary design goals of most bytecode obfuscators is to confuse
decompilers, so that the code cannot be easily restored to highly detailed
source code. One trick that does wonders is to modify the program binary so
that the bytecode contains statements that cannot be translated back into the
original high-level language. The example given in A Taxonomy of Obfuscating
Transformations by Christian Collberg, Clark Thomborson, and Douglas Low
[Collberg2] is the Java programming language, where the high-level language
does not have the goto statement, but the Java bytecode does. This means that
it's possible to add goto statements into the bytecode in order to completely
break the program's flow graph, so that a decompiler cannot later reconstruct
it (because it contains instructions that cannot be translated back to Java).
In native processor languages such as IA-32 machine code, decompilation is
such a complex and fragile process that almost any kind of obfuscation
transformation could easily cause a decompiler to fail or produce meaningless
code. Consider, for example, what would happen if a decompiler ran into the
OBFUSCATE macro from the previous section.
Table Interpretation
Converting a program or a function into a table interpretation layout is a
powerful obfuscation approach that, if done right, can repel both
deobfuscators and human reversers. The idea is to break a code sequence into
multiple short chunks and have the code loop through a conditional code
sequence that decides which of the chunks to jump to at any given
moment. This dramatically reduces the readability of the code because it
completely hides any kind of structure within it. Any code structures, such as
logical statements or loops, are buried inside this unintuitive structure.
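The general shape of such a layout can be sketched in C. The following is a
minimal illustration only, not the code behind any listing in this chapter:
the function names, the state names, and the choice of a 32-bit word loop are
all assumptions made for the example. The first function is a conventional
loop; the second performs the same work, but the loop has been cut into short
chunks that are selected by a dispatcher, so the original loop structure no
longer appears anywhere in the control flow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Conventional version: XOR every 32-bit word in the block with key. */
void xor_block_plain(uint32_t *data, size_t count, uint32_t key)
{
    assert(data != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        data[i] ^= key;
}

/* Hypothetical table-interpretation version of the same function.
 * Each case is one short chunk; the while/switch pair is the
 * dispatcher that decides which chunk runs next. */
void xor_block_dispatched(uint32_t *data, size_t count, uint32_t key)
{
    enum { ST_INIT, ST_TEST, ST_BODY, ST_STEP, ST_DONE };
    int state = ST_INIT;
    size_t i = 0;

    assert(data != NULL || count == 0);
    while (state != ST_DONE) {
        switch (state) {
        case ST_INIT:               /* loop initialization */
            i = 0;
            state = ST_TEST;
            break;
        case ST_TEST:               /* loop condition */
            state = (i < count) ? ST_BODY : ST_DONE;
            break;
        case ST_BODY:               /* loop body */
            data[i] ^= key;
            state = ST_STEP;
            break;
        case ST_STEP:               /* loop increment */
            i++;
            state = ST_TEST;
            break;
        default:                    /* unknown state: stop */
            state = ST_DONE;
            break;
        }
    }
}
```

In a real obfuscator the chunks would typically be reordered and the state
values made far less descriptive, so that a reader sees only a flat set of
fragments that all return to the same dispatcher.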
As an example, consider the simple data processing function in Listing 10.2.
00401000 push esi
00401001 push edi
00401002 mov edi,dword ptr [esp+10h]
00401006 xor eax,eax
00401008 xor esi,esi
0040100A cmp edi,3
0040100D jbe 0040103A
0040100F mov edx,dword ptr [esp+0Ch]
00401013 add edi,0FFFFFFFCh
00401016 push ebx
Listing 10.2 A simple data processing function that XORs a data block with a parameter
passed to it and writes the result back into the data block.