Polymorphism
The easiest way for antivirus programs to identify malicious programs is by
using unique signatures. The antivirus program maintains a frequently updated
database of virus signatures, which aims to contain a unique identification for
every known malware program. This identification is based on a unique byte
sequence that was found in a particular strain of the malicious program.
Polymorphism is a technique that thwarts signature-based identification
programs by randomly encoding or encrypting the program code in a way
that maintains its original functionality. The simplest approach to
polymorphism is based on encrypting the program using a random key and
decrypting it at runtime. Depending on when an antivirus program scans the
program for its signature, this might prevent accurate identification of a
malicious program, because each copy is encrypted with a different random
key and therefore looks entirely different on disk.
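The effect on a stored signature can be seen in a minimal sketch. It assumes a repeating-key XOR as a stand-in for whatever encoding is actually used, and a harmless placeholder byte string in place of real program code; the names xor_bytes, make_copy, PROGRAM, and SIGNATURE are invented for this illustration.

```python
import os

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key. Applying the same key twice restores data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Placeholder "program" and the byte sequence a scanner has on file for it.
# Both are invented for this illustration.
PROGRAM = b"header|UNIQUE-SEQUENCE|trailer"
SIGNATURE = b"UNIQUE-SEQUENCE"

def make_copy(program: bytes) -> tuple:
    """Return (key, encoded_copy) using a fresh random 16-byte key."""
    key = os.urandom(16)
    return key, xor_bytes(program, key)

if __name__ == "__main__":
    for n in (1, 2):
        key, copy = make_copy(PROGRAM)
        print(f"copy {n}: {copy.hex()}")
        print("  signature found on disk:", SIGNATURE in copy)
        print("  decodes to original:    ", xor_bytes(copy, key) == PROGRAM)
```

Each run produces two copies whose stored bytes differ from one another, yet both decode to the same original bytes, which is why a signature taken from the unencoded program is unlikely to match the copy on disk.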
There are two significant weaknesses with these kinds of solutions. First,
many antivirus programs scan for virus signatures in memory as well as on
disk. Because in most cases the program is going to be present in memory in
its original, unencrypted form, the antivirus program won’t have a problem
matching the running program with the signature it has on file. The second
weakness lies in the decryption code itself. Even if an antivirus program only
uses on-disk files in order to match malware signatures, there is still the
problem of the decryption code being static: it cannot be encrypted along with
the rest of the program, so it is identical in every copy. For the program to
actually be able to run, it must decrypt itself in memory, and it is this
decryption code that could theoretically be used as the signature.
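Both weaknesses can be illustrated from the scanner's point of view. The following sketch assumes a hypothetical file layout (a fixed decoder stub, then the key, then the encoded body) and a repeating-key XOR encoding; the layout, the placeholder bytes, and the names build_copy, memory_image, and scan are assumptions made for this example, not a description of any real program or antivirus product.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key. Applying the same key twice restores data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Placeholder bytes, invented for this illustration.
DECODER_STUB = b"<static decoder stub>"      # identical in every copy
BODY = b"header|UNIQUE-SEQUENCE|trailer"     # the part that gets encoded
BODY_SIGNATURE = b"UNIQUE-SEQUENCE"          # signature of the original body

def build_copy(key: bytes) -> bytes:
    """Assumed on-disk layout: stub, then key, then the encoded body."""
    return DECODER_STUB + key + xor_bytes(BODY, key)

def memory_image(copy: bytes, key_len: int) -> bytes:
    """What ends up in memory once the stub has decoded the body."""
    start = len(DECODER_STUB)
    key = copy[start:start + key_len]
    return xor_bytes(copy[start + key_len:], key)

def scan(image: bytes, signature: bytes) -> bool:
    """Naive signature match: is the byte sequence present anywhere?"""
    return signature in image

if __name__ == "__main__":
    for key in (b"\x20", b"\x01"):
        copy = build_copy(key)
        print("key", key.hex())
        print("  body signature on disk:  ", scan(copy, BODY_SIGNATURE))
        print("  body signature in memory:", scan(memory_image(copy, len(key)), BODY_SIGNATURE))
        print("  stub signature on disk:  ", scan(copy, DECODER_STUB))
```

With these two keys the body signature is absent from both on-disk copies, but it reappears in the decoded memory image, and the unchanged stub matches in every copy regardless of the key.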
The solution to these problems generally revolves around rotating or
scrambling certain elements in the decryption code (or in the entire program)
in ways that alter its signature yet preserve its original functionality.
Consider the following sequence as an example:
0040343B  8B45 CC    MOV EAX,[EBP-34]
0040343E  8B00       MOV EAX,[EAX]
00403440  3345 D8    XOR EAX,[EBP-28]
00403443  8B4D CC    MOV ECX,[EBP-34]
00403446  8901       MOV [ECX],EAX
00403448  8B45 D4    MOV EAX,[EBP-2C]
0040344B  8945 D8    MOV [EBP-28],EAX
0040344E  8B45 DC    MOV EAX,[EBP-24]
00403451  3345 D4    XOR EAX,[EBP-2C]
00403454  8945 DC    MOV [EBP-24],EAX
One almost trivial method that would make it a bit more difficult to identify
this sequence would consist of simply randomizing the use of registers in the
code. The code sequence uses registers separately at several different phases.
282 Chapter 8