The disadvantage of all of these tricks is that they count on the disassembler
being relatively dumb. Luckily, most Windows disassemblers are dumb
enough that you can fool them. What would happen if you ran into a clever
disassembler that actually analyzes each line of code and traces the flow of
data? Such a disassembler would not fall for any of these tricks, because it
would detect your opaque predicate; how difficult is it to figure out that a
conditional jump that is taken when 2 equals 3 is never actually going to be taken?
Moreover, a simple data-flow analysis would expose the fact that the final JMP
sequence is essentially equivalent to a JMP After, which would probably be
enough to correct the disassembly anyhow.
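To make the point concrete, here is a minimal sketch of the kind of reasoning such a disassembler could apply. The types and function names (RegState, resolve_je, resolve_jmp_reg) are hypothetical and invented for this illustration; a real tool would have to track every register and flag across whole basic blocks rather than a single value.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical abstract register: either a known constant or unknown. */
typedef struct {
    bool     known;
    uint32_t value;
} RegState;

typedef enum { BRANCH_NEVER, BRANCH_ALWAYS, BRANCH_UNKNOWN } BranchVerdict;

/* Effect of "mov reg, imm": the register now holds a known constant. */
static RegState after_mov_imm(uint32_t imm)
{
    RegState r = { true, imm };
    return r;
}

/* "cmp reg, imm" followed by "je": decidable when reg is a known constant. */
static BranchVerdict resolve_je(RegState reg, uint32_t cmp_imm)
{
    if (!reg.known)
        return BRANCH_UNKNOWN;
    return (reg.value == cmp_imm) ? BRANCH_ALWAYS : BRANCH_NEVER;
}

/* "jmp reg": resolves to a direct jump when reg is a known constant. */
static bool resolve_jmp_reg(RegState reg, uint32_t *target)
{
    if (!reg.known)
        return false;
    *target = reg.value;
    return true;
}
```

With a register loaded with 2 and a comparison against 3, resolve_je reports BRANCH_NEVER, so the junk path can be discarded. Likewise, resolve_jmp_reg turns a MOV of a constant followed by JMP through the register into a plain direct jump.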
Still, even a cleverer disassembler could easily be fooled by moving the
real jump addresses into a central, runtime-generated data structure. It would
be borderline impossible to perform a global data-flow analysis so
comprehensive that it could find the real addresses without actually
running the program.
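A rough sketch of that idea in portable C follows. The handler functions, the g_targets table, and the seed parameter are all made up for this example, and function pointers stand in for raw jump addresses. Note that this simplified version still stores plain addresses inside init_targets; a real implementation would presumably have to compute or decode them at run time for the trick to hold up against a careful analysis.

```c
#include <stddef.h>

typedef int (*Handler)(int);

/* Two hypothetical destinations standing in for real jump targets. */
static int add_one(int x)   { return x + 1; }
static int times_two(int x) { return x * 2; }

#define SLOT_COUNT 2

/* Central table of targets. It is empty in the binary image and is
   only populated once the program runs. */
static Handler g_targets[SLOT_COUNT];

/* Fill the table at run time. Which slot receives which target depends
   on `seed`, a value assumed to be unknown until execution. */
static void init_targets(unsigned seed)
{
    size_t first = seed % SLOT_COUNT;
    g_targets[first] = add_one;
    g_targets[(first + 1) % SLOT_COUNT] = times_two;
}

/* Call sites reference only a slot index, never the target itself. */
static int dispatch(size_t slot, int arg)
{
    return g_targets[slot](arg);
}
```

After calling init_targets(1), dispatch(1, 5) reaches add_one and dispatch(0, 5) reaches times_two. A static analysis that sees only the dispatch call cannot tell which one it is without knowing the seed.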
Applications
Let’s see how one would use the previous techniques in a real program. I’ve
created a simple macro called OBFUSCATE, which adds a little assembly language
sequence to a C program (see Listing 10.1). This sequence would temporarily
confuse most disassemblers until they resynchronized. The number
of instructions it will take to resynchronize depends not only on the specific
disassembler used, but also on the specific code that comes after the macro.
#define paste(a, b) a##b
#define pastesymbols(a, b) paste(a, b)
#define OBFUSCATE() \
_asm { mov eax, __LINE__ * 0x635186f1 };\
_asm { cmp eax, __LINE__ * 0x9cb16d48 };\
_asm { je pastesymbols(Junk,__LINE__) };\
_asm { mov eax, pastesymbols(After, __LINE__) };\
_asm { jmp eax };\
_asm { pastesymbols(Junk, __LINE__): };\
_asm { _emit (0xd8 + __LINE__ % 8) };\
_asm { pastesymbols(After, __LINE__): };
Listing 10.1 A simple code obfuscation macro that aims at confusing disassemblers.
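The per-line values in Listing 10.1 can be explored without an assembler. The following sketch reproduces the macro's arithmetic in plain C, assuming the immediates wrap around at 32 bits; the helper names (mov_value, cmp_value, junk_byte) and DEMO_LINE are hypothetical and exist only for this illustration. It also shows why the listing needs the two-level pastesymbols macro: pasting directly with ## would not expand the line-number macro first.

```c
#include <stdint.h>

#define paste(a, b) a##b
#define pastesymbols(a, b) paste(a, b)

/* Same multipliers as the macro, evaluated with 32-bit wraparound
   (an assumption about how the immediates are computed). */
static uint32_t mov_value(uint32_t line) { return line * 0x635186f1u; }
static uint32_t cmp_value(uint32_t line) { return line * 0x9cb16d48u; }

/* Junk byte emitted after the jump: always in the range 0xD8..0xDF. */
static uint8_t junk_byte(uint32_t line)
{
    return (uint8_t)(0xd8 + line % 8);
}

/* DEMO_LINE stands in for __LINE__ so the result is predictable. */
#define DEMO_LINE 42

/* Two-level paste: the argument is expanded first, giving Junk42. */
static const int pastesymbols(Junk, DEMO_LINE) = 1;

/* Direct paste: no expansion happens, giving JunkDEMO_LINE. */
static const int paste(Junk, DEMO_LINE) = 2;
```

Because the two multipliers differ by an odd number (0x395fe657), the MOV and CMP values can never be equal for a nonzero line number under 32-bit arithmetic, which is what keeps the JE from ever being taken.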
This macro was tested on the Microsoft C/C++ compiler (version 13), and
contains pseudorandom values to make it slightly more difficult to search and
replace (the MOV and CMP instructions and the junk byte itself are all random,
calculated using the current code line number). Notice that the junk byte
ranges from D8 to DF; these are good opcodes to use because they are all