Assembly Language for Beginners




fcomp [esp+50h+var_3C]
fnstsw ax
test ah, 41h
jz short loc_100040B7

The first FXCH instruction swaps ST(0) and ST(1), and the second does the same, so together they do nothing. This
program uses MFC42.dll, so it could have been built with MSVC 6.0, 5.0 or maybe even MSVC 4.2 from the 1990s.


Since this pair does nothing, it probably wasn't caught by the MSVC compiler tests. Or maybe I am wrong?
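
The claim that two consecutive FXCH instructions cancel out can be checked with a small C model. This is only a sketch: the struct fpu_top and the functions fxch() and double_fxch_is_noop() are hypothetical names invented for illustration, and the model tracks just ST(0) and ST(1), not the full x87 register stack.

```c
/* Hypothetical two-slot model of the top of the x87 register stack. */
struct fpu_top {
    double st0; /* models ST(0) */
    double st1; /* models ST(1) */
};

/* Models FXCH ST(1): exchange the contents of ST(0) and ST(1). */
void fxch(struct fpu_top *f)
{
    double tmp = f->st0;
    f->st0 = f->st1;
    f->st1 = tmp;
}

/* Returns 1 if two FXCH in a row leave both slots unchanged, 0 otherwise. */
int double_fxch_is_noop(double a, double b)
{
    struct fpu_top f = { a, b };

    fxch(&f); /* first FXCH: st0 = b, st1 = a */
    fxch(&f); /* second FXCH: st0 = a, st1 = b again */

    return f.st0 == a && f.st1 == b;
}
```

Calling double_fxch_is_noop() with any pair of ordinary (non-NaN) values should return 1, which is why the pair can be ignored when reading the disassembly.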


11.4.3 Summary


Other compiler anomalies in this book: 1.22.2 on page 315, 3.7.3 on page 493, 3.15.7 on page 532, 1.20.7
on page 302, 1.14.4 on page 147, 1.22.5 on page 332.


Such cases are demonstrated in this book to show that compiler errors like these are possible, and that
sometimes one should not rack one's brain wondering why the compiler generated such strange
code.


11.5 Itanium


Although it almost failed, Intel Itanium (IA64) is a very interesting architecture.


While OOE (out-of-order execution) CPUs decide how to rearrange their instructions and execute them in parallel, EPIC^2 was an
attempt to shift these decisions to the compiler: to let it group the instructions at the compile stage.


This resulted in notoriously complex compilers.


Here is one sample of IA64 code: a simple cryptographic algorithm from the Linux kernel:


Listing 11.3: Linux kernel 3.2.0.4

#define TEA_ROUNDS 32
#define TEA_DELTA 0x9e3779b9

static void tea_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
{
	u32 y, z, n, sum = 0;
	u32 k0, k1, k2, k3;
	struct tea_ctx *ctx = crypto_tfm_ctx(tfm);
	const __le32 *in = (const __le32 *)src;
	__le32 *out = (__le32 *)dst;

	y = le32_to_cpu(in[0]);
	z = le32_to_cpu(in[1]);

	k0 = ctx->KEY[0];
	k1 = ctx->KEY[1];
	k2 = ctx->KEY[2];
	k3 = ctx->KEY[3];

	n = TEA_ROUNDS;

	while (n-- > 0) {
		sum += TEA_DELTA;
		y += ((z << 4) + k0) ^ (z + sum) ^ ((z >> 5) + k1);
		z += ((y << 4) + k2) ^ (y + sum) ^ ((y >> 5) + k3);
	}

	out[0] = cpu_to_le32(y);
	out[1] = cpu_to_le32(z);
}
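
As an aside, the kernel function above depends on kernel-only types and helpers (u32, __le32, crypto_tfm_ctx()), so it cannot be compiled on its own. To experiment with the same round function in user space, a minimal sketch might look like the following. It is not the kernel code: the names tea_encrypt_block() and tea_decrypt_block() are hypothetical, the key is passed in directly instead of through struct tea_ctx, the data is handled as host-endian 32-bit words (the le32 conversions are left out), and the decryption routine is added here only so that a round trip can be checked.

```c
#include <stdint.h>

#define TEA_ROUNDS 32
#define TEA_DELTA  0x9e3779b9u

/* Encrypts one 64-bit block v[0..1] in place using the 128-bit key k[0..3]. */
void tea_encrypt_block(uint32_t v[2], const uint32_t k[4])
{
    uint32_t y = v[0], z = v[1], sum = 0;
    uint32_t n = TEA_ROUNDS;

    while (n-- > 0) {
        sum += TEA_DELTA;
        y += ((z << 4) + k[0]) ^ (z + sum) ^ ((z >> 5) + k[1]);
        z += ((y << 4) + k[2]) ^ (y + sum) ^ ((y >> 5) + k[3]);
    }

    v[0] = y;
    v[1] = z;
}

/* Inverse of tea_encrypt_block(): undoes the rounds in reverse order. */
void tea_decrypt_block(uint32_t v[2], const uint32_t k[4])
{
    uint32_t y = v[0], z = v[1];
    /* Final value of sum after all encryption rounds (wraps modulo 2^32). */
    uint32_t sum = (uint32_t)(TEA_DELTA * TEA_ROUNDS);
    uint32_t n = TEA_ROUNDS;

    while (n-- > 0) {
        z -= ((y << 4) + k[2]) ^ (y + sum) ^ ((y >> 5) + k[3]);
        y -= ((z << 4) + k[0]) ^ (z + sum) ^ ((z >> 5) + k[1]);
        sum -= TEA_DELTA;
    }

    v[0] = y;
    v[1] = z;
}
```

Encrypting a block with tea_encrypt_block() and then passing the result to tea_decrypt_block() with the same key should restore the original two words. The loop body is kept identical to the kernel listing so that it can be matched against compiler output.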


Here is how tea_encrypt() was compiled for IA64:


(^2) Explicitly Parallel Instruction Computing
