index, but not before it multiplies it by 8. This line essentially takes ESI, which
was an index to the current file entry, and multiplies it by 19 * 8 = 152. Sounds
familiar, doesn’t it? You’re right: 152 is the file entry length. By computing
[ECX+EAX*8+8], Cryptex is obtaining the value of offset +8 at the current file
entry.
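The arithmetic can be checked with a short C sketch. The names used here (ENTRY_SIZE, entry_field_offset, entry_size_in_clusters) are hypothetical, and the sketch assumes that EAX already holds the entry index multiplied by 19 and that ECX points at the start of the file list.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { ENTRY_SIZE = 19 * 8 };   /* 152 bytes per file entry */

/* Byte offset, from the start of the file list, of the field located
   `field` bytes into entry number `index`. With field == 8 this mirrors
   [ECX+EAX*8+8], assuming EAX == index * 19. */
static size_t entry_field_offset(size_t index, size_t field)
{
    size_t eax = index * 19;    /* scaled index, as assumed to be in EAX */
    return eax * 8 + field;     /* EAX*8 plus the displacement */
}

/* Reads the DWORD at offset +8 of the given entry (the size in clusters). */
static uint32_t entry_size_in_clusters(const uint8_t *list, size_t index)
{
    uint32_t value;
    memcpy(&value, list + entry_field_offset(index, 8), sizeof value);
    return value;
}
```

For entry 1, the computed offset is 152 + 8 = 160 bytes from the start of the list, which is offset +8 inside the second 152-byte entry.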
We already know that offset +8 contains the file size in clusters, and this
value is being sent back to the caller using a parameter that was passed in to
receive this value. Cryptex needs the file size in order to extract the file. After
loading the file size, Cryptex checks what appears to be another output
parameter, this time at [ESP+28], that is supposed to receive additional data
from this function. If it is nonzero, Cryptex copies the value from offset +C
in the file entry to the address the pointer refers to, then copies offset +10
to offset +4 from that address, and so on, until a total of four DWORDs, or
16 bytes, have been copied. As a reminder, those 16 bytes are the ones that
looked like junk when you dumped the file list earlier. Before returning to the
caller, the function loads offset +4 of the current file entry into EAX, which
makes it the function's return value.
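The tail of the function described above can be sketched in C. This is a hypothetical reconstruction rather than Cryptex's actual source: the function name, parameter names, and constant names are invented for illustration, and only the offsets (+4, +8, and +C) come from the disassembly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Offsets within one 152-byte file entry, as observed in the disassembly. */
enum {
    OFF_UNKNOWN_DWORD = 0x04,   /* returned to the caller in EAX */
    OFF_SIZE_CLUSTERS = 0x08,   /* file size in clusters */
    OFF_UNKNOWN_16    = 0x0C    /* four DWORDs of unidentified data */
};

/* Hypothetical sketch of what happens once the matching entry is found.
   `extra` may be NULL, mirroring the nonzero check on the optional
   output parameter at [ESP+28]. */
static uint32_t read_entry_outputs(const uint8_t *entry,
                                   uint32_t *size_in_clusters,
                                   uint32_t *extra)
{
    uint32_t result;

    /* Offset +8 goes back through the first output parameter. */
    memcpy(size_in_clusters, entry + OFF_SIZE_CLUSTERS,
           sizeof *size_in_clusters);

    /* Offsets +C through +1B are copied only if the caller asked for them. */
    if (extra != NULL)
        memcpy(extra, entry + OFF_UNKNOWN_16, 4 * sizeof(uint32_t));

    /* Offset +4 becomes the return value. */
    memcpy(&result, entry + OFF_UNKNOWN_DWORD, sizeof result);
    return result;
}
```

The sketch uses memcpy on raw offsets instead of a struct because the meaning of the remaining bytes in the entry has not been established at this point.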
To summarize, this sequence scans the file list looking for a specific file name,
and once that entry is found it returns three individual items to the caller: the
file size in clusters, an unknown, seemingly random 16-byte sequence, and
another unknown DWORD from offset +4 in the file entry. Let’s proceed to see
how this data is used by the file extraction routine.
Decrypting the File
After returning from 004017B0, Cryptex proceeds to scan the supplied file
name for backslashes and loops until the last backslash is encountered. The
actual scanning is performed using the C runtime library function strchr,
which simply returns the address of the first instance of the character, if one is
found. The address that points to the last backslash is stored in [ESP+20]; this
is essentially the “clean” version of the file name without any path informa-
tion. One instruction that draws attention in this otherwise trivial sequence is
the one at 00401C9E.
00401C9E MOV EDI,EDI
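As an aside before examining that instruction, the backslash scan described above can be sketched in C. This is a minimal reconstruction, not Cryptex's actual source; the function name find_base_name is hypothetical, and the sketch assumes that the stored pointer addresses the character following the last backslash.

```c
#include <string.h>

/* Hypothetical helper: returns a pointer to the part of `path` that
   follows the last backslash, or `path` itself if it contains none. */
static const char *find_base_name(const char *path)
{
    const char *name = path;
    const char *slash;

    /* strchr returns only the first match, so the search is repeated
       from just past each hit until no backslash remains. */
    while ((slash = strchr(name, '\\')) != NULL)
        name = slash + 1;

    return name;
}
```

Called with a path such as docs\reports\summary.txt, the helper returns a pointer to summary.txt inside the original string; no copy is made.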
You might recall that we’ve already seen a similar instruction in the previous
chapter. In that case, it served as infrastructure that allows people to
trap system APIs in Windows. That use is not relevant here, so why would the
compiler insert an instruction that does nothing into the middle of a function?
The answer is simple. The address at which this instruction begins is
unaligned, which means that it doesn’t start on a 32-bit boundary. Executing
unaligned instructions (or accessing unaligned memory addresses in general)