Reversing : The Hacker's Guide to Reverse Engineering

Java

Java is an object-oriented, high-level language that differs from languages
such as C and C++ in that it is not compiled into any native processor’s
machine code, but into Java bytecode. Briefly, the Java instruction set and
bytecode are like a Java assembly language of sorts, with the difference
that this language is not usually executed directly by the hardware, but is
instead interpreted by software (the Java Virtual Machine).
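
To make the idea of software-interpreted bytecode concrete, here is a minimal
sketch of a stack-based interpreter loop. It uses the real JVM opcode values
for six integer instructions, but the class name TinyBytecodeInterpreter and
its run method are hypothetical names chosen for illustration. An actual JVM
does far more (class loading, verification, just-in-time compilation), so
treat this as a toy model rather than a description of any real
implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class TinyBytecodeInterpreter {
    // Real JVM opcode values for a handful of integer instructions.
    static final int ICONST_2 = 0x05;
    static final int ICONST_3 = 0x06;
    static final int BIPUSH   = 0x10;
    static final int IADD     = 0x60;
    static final int IMUL     = 0x68;
    static final int IRETURN  = 0xAC;

    // Hypothetical helper: interprets a tiny subset of bytecode on an operand stack.
    static int run(byte[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc++] & 0xFF;   // opcodes are unsigned bytes
            switch (opcode) {
                case ICONST_2: stack.push(2); break;
                case ICONST_3: stack.push(3); break;
                case BIPUSH:   stack.push((int) code[pc++]); break; // signed byte operand
                case IADD:     stack.push(stack.pop() + stack.pop()); break;
                case IMUL:     stack.push(stack.pop() * stack.pop()); break;
                case IRETURN:  return stack.pop();
                default:
                    throw new IllegalArgumentException(
                        "unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
        throw new IllegalStateException("code ended without ireturn");
    }

    public static void main(String[] args) {
        // Computes (2 + 3) * 10: iconst_2, iconst_3, iadd, bipush 10, imul, ireturn
        byte[] code = { ICONST_2, ICONST_3, IADD, BIPUSH, 10, IMUL, (byte) IRETURN };
        System.out.println(run(code));
    }
}
```

The hardware never sees these opcodes directly; the loop decodes each byte and
performs the operation on its behalf, which is the sense in which bytecode is
interpreted by software.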
Java’s primary strength is the ability to allow a program’s binary to run on
any platform for which the Java Virtual Machine (JVM) is available.
Because Java programs run on a virtual machine (VM), the process of
reversing a Java program is completely different from reversing programs
written in natively compiled languages such as C and C++. Java executables
don’t use the operating system’s standard executable format (because they are
not executed directly on the system’s CPU). Instead they use .class files, which
are loaded directly by the virtual machine.
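
As a rough illustration of what a .class file looks like from the outside,
the sketch below parses the first eight bytes of one. Every class file begins
with the four-byte magic value 0xCAFEBABE followed by a two-byte minor and a
two-byte major version number, all big-endian. The class name ClassFileHeader
and its parse method are hypothetical names introduced for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

public class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;

    ClassFileHeader(int magic, int minorVersion, int majorVersion) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    // Hypothetical helper: decodes the fixed-size header at the start of a class file.
    static ClassFileHeader parse(byte[] bytes) {
        if (bytes.length < 8) {
            throw new IllegalArgumentException("too short to be a class file");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);   // ByteBuffer is big-endian by default
        int magic = buf.getInt();
        int minor = buf.getShort() & 0xFFFF;       // u2 fields are unsigned
        int major = buf.getShort() & 0xFFFF;
        return new ClassFileHeader(magic, minor, major);
    }

    boolean isClassFile() {
        return magic == 0xCAFEBABE;
    }

    public static void main(String[] args) throws IOException {
        // Inspect this class's own compiled form, if it can be found on the class path.
        try (InputStream in = ClassFileHeader.class.getResourceAsStream("ClassFileHeader.class")) {
            if (in == null) {
                System.out.println("class file not found on the class path");
                return;
            }
            ClassFileHeader header = parse(in.readAllBytes());
            System.out.printf("magic=0x%08X major=%d minor=%d%n",
                header.magic, header.majorVersion, header.minorVersion);
        }
    }
}
```

A native executable in the operating system's own format would fail the
isClassFile check, which is one quick way to tell the two kinds of binary
apart before choosing a reversing approach.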
The Java bytecode is far more detailed than native processor machine code
such as IA-32, which makes decompilation a far more viable option. Java
classes can often be decompiled with a very high level of accuracy, so that
the process of reversing Java classes is usually much simpler than with
native code because it boils down to reading a source-code-level
representation of the program. Sure, it is still challenging to comprehend a
program’s undocumented source code, but it is far easier compared to
starting with a low-level assembly language representation.
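
One reason decompilation works so well is that a class file keeps the names
of its fields and methods as text in its constant pool. The sketch below
searches the raw bytes of a compiled class for two member names. The class
NamesInClassFile, the nested LicenseChecker, and the helper methods are all
hypothetical and exist only for this illustration. It assumes the class file
has not been run through an obfuscator, which would typically rename such
members.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class NamesInClassFile {
    // Hypothetical class whose member names we will look for in its compiled form.
    static class LicenseChecker {
        private int licenseKeyChecksum;

        private boolean validateSerialNumber(String serial) {
            return serial.hashCode() == licenseKeyChecksum;
        }
    }

    // Returns the offset of the first occurrence of needle in haystack, or -1.
    static int indexOf(byte[] haystack, byte[] needle) {
        outer:
        for (int i = 0; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // ASCII identifiers are encoded identically in UTF-8 and in the
    // constant pool's modified UTF-8, so a plain byte search is enough here.
    static boolean containsName(byte[] classBytes, String name) {
        return indexOf(classBytes, name.getBytes(StandardCharsets.UTF_8)) >= 0;
    }

    public static void main(String[] args) throws IOException {
        String resource = "NamesInClassFile$LicenseChecker.class";
        try (InputStream in = NamesInClassFile.class.getResourceAsStream(resource)) {
            if (in == null) {
                System.out.println("class file not found on the class path");
                return;
            }
            byte[] classBytes = in.readAllBytes();
            for (String name : new String[] { "licenseKeyChecksum", "validateSerialNumber" }) {
                System.out.println(name + " present: " + containsName(classBytes, name));
            }
        }
    }
}
```

A stripped native binary usually retains no such names for its internal
functions and variables, which is a large part of why native code has to be
read at the assembly level.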

C#

C# was developed by Microsoft as a Java-like object-oriented language that
aims to overcome many of the problems inherent in C++. C# was introduced
as part of Microsoft’s .NET development platform, and (like Java and quite a
few other languages) is based on the concept of using a virtual machine for
executing programs.
C# programs are compiled into an intermediate bytecode format (similar to
the Java bytecode) called the Microsoft Intermediate Language (MSIL). MSIL
programs run on top of the common language runtime (CLR), which is
essentially the .NET virtual machine. The CLR can be ported to any platform,
which means that .NET programs are not bound to Windows—they could be
executed on other platforms.
C# has quite a few advanced features such as garbage collection and type
safety that are implemented by the CLR. C# also has a special unmanaged mode
that enables direct pointer manipulation.
As with Java, reversing C# programs sometimes requires that you learn the
native language of the CLR (MSIL). On the other hand, in many cases manually
reading MSIL code will be unnecessary because MSIL code contains
