only be accessed in one or two places. To fully understand the entire file
format, you might actually have to reverse every single line of code in the
program. Cryptex is a tiny program, so this might actually be feasible, but in
most cases it won’t be.
So, what do you do with those missing details that you didn’t catch during
your intensive reversing session? One primitive, yet effective, approach is to
simply let the program update the file and observe changes using a binary file-
comparison program (Hex Workshop has this feature). One specific problem
you might have with Cryptex is that its files are encrypted. A single-byte
difference in the plaintext would likely alter the ciphertext written to the
file completely. One solution is to write a program that decrypts Cryptex
archives so that you can study their layout more accurately. You could then
easily compare two versions of the same Cryptex archive and determine precisely
what changed and what those changes reveal about the unknown fields. This
approach of observing the changes a program makes to a file it owns is quite
useful in data reverse engineering, and when combined with careful code-level
analysis it can usually produce extremely accurate results.
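If a tool such as Hex Workshop is not at hand, the comparison step can be
approximated with a few lines of script. The following is a minimal sketch in
Python, not part of Cryptex or of any particular tool; the function names
diff_ranges and print_diff are made up for illustration, and the sketch assumes
you already have two decrypted dumps of the archive on disk, taken before and
after the program updated it.

```python
def diff_ranges(old: bytes, new: bytes):
    """Return a list of (offset, old_bytes, new_bytes) tuples, one for
    each contiguous run of bytes that differs between the two buffers."""
    ranges = []
    common = min(len(old), len(new))
    start = None  # offset where the current differing run began, if any

    for i in range(common):
        if old[i] != new[i]:
            if start is None:
                start = i
        elif start is not None:
            # The run of differences ended at the previous byte.
            ranges.append((start, old[start:i], new[start:i]))
            start = None

    if start is not None:
        # A differing run extends to the end of the shared length.
        ranges.append((start, old[start:common], new[start:common]))

    if len(old) != len(new):
        # One buffer is longer; report its tail as a final difference.
        ranges.append((common, old[common:], new[common:]))

    return ranges


def print_diff(old_path: str, new_path: str) -> None:
    """Print every differing byte range between two files in hex.
    Both paths are hypothetical, e.g. decrypted archive dumps."""
    with open(old_path, "rb") as f:
        old = f.read()
    with open(new_path, "rb") as f:
        new = f.read()

    for offset, before, after in diff_ranges(old, new):
        print(f"{offset:#010x}: {before.hex()} -> {after.hex()}")
```

Running print_diff on a "before" and an "after" dump lists only the offsets that
changed, which you can then match against the fields you have not yet
identified. Run against the encrypted files, the same comparison would mostly
report large changed regions, for the reason given above.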
Conclusion
In this chapter, you have learned how to use reversing techniques to dig into
undocumented program data such as proprietary file formats or network proto-
cols to reach a point at which you can write code that deciphers such data or
even code that generates compatible data. Deciphering a file format is not as dif-
ferent from conventional code-level reversing as you might expect. As demon-
strated in this chapter, code-level reversing can, in many cases, provide almost
all the answers regarding a program’s data format and how it is structured.
Granted, Cryptex maintains a relatively simple file format. In many real-
world reversing scenarios you might run into file formats that employ a far
more complex structure. Still, the basic approach is the same: By combining
code-level reversing techniques with the process of observing the data modifi-
cations performed by the owning program while specific test cases are fed to
it, you can get a pretty good grip on most file formats and other types of pro-
prietary data.