Assembly Language for Beginners




pmovmskb eax, xmm1


This instruction sets the first bit of EAX to 1 if the most significant bit of the first byte in XMM1 is 1. In other words,
if the first byte of the XMM1 register is 0xff, then the first bit of EAX is set to 1, too.


If the second byte in the XMM1 register is 0xff, then the second bit in EAX is set to 1. In other words,
the instruction answers the question “which bytes in XMM1 have the most significant bit set, i.e. are greater
than 0x7f?” and returns 16 bits in the EAX register. The other bits in the EAX register are cleared.
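To make the bit-per-byte mapping concrete, here is a minimal portable C sketch of what PMOVMSKB computes for one 16-byte block. The function name movemask16 and the convention that bytes[0] is the least significant byte of the register are assumptions made for illustration; real code would use the instruction itself.

```c
#include <stdint.h>

/* Portable emulation of PMOVMSKB for one 16-byte block.
   Assumption: bytes[0] is the least significant byte of the XMM register. */
uint32_t movemask16(const uint8_t bytes[16])
{
    uint32_t mask = 0;
    for (int i = 0; i < 16; i++) {
        /* copy the most significant bit of byte i into bit i of the result */
        mask |= (uint32_t)(bytes[i] >> 7) << i;
    }
    return mask; /* bits 16..31 are never set, so they stay cleared */
}
```

A byte of 0xff or 0x80 contributes a 1 bit, while 0x7f and anything below contributes 0.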


By the way, do not forget about this quirk of our algorithm. There might be 16 bytes in the input like:
Byte index:  0    1    2    3    4    5    6..12     13   14..15
Content:     'h'  'e'  'l'  'l'  'o'  0    garbage   0    garbage

It is the 'hello' string, a terminating zero, and some random noise in memory.


If we load these 16 bytes into XMM1 and compare them with the zeroed XMM0, we get something
like ^187:


XMM1: 0x0000ff00000000000000ff0000000000


This means that the instruction found two zero bytes, and it is not surprising.


PMOVMSKB in our case will set EAX to
0b0010000000100000.
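The compare-then-mask step can be reproduced with a small portable C sketch. The name zero_byte_mask16 is hypothetical, and the non-zero "garbage" values in the sample block are made up; only the positions of the two zero bytes (5 and 13) are taken from the XMM1 value shown above.

```c
#include <stdint.h>

/* Emulates PCMPEQB against a zeroed register followed by PMOVMSKB:
   bit i of the result is 1 when bytes[i] == 0. */
uint32_t zero_byte_mask16(const uint8_t bytes[16])
{
    uint32_t mask = 0;
    for (int i = 0; i < 16; i++) {
        uint8_t cmp = (bytes[i] == 0) ? 0xff : 0x00; /* per-byte compare result */
        mask |= (uint32_t)(cmp >> 7) << i;           /* keep only its top bit */
    }
    return mask;
}

/* Sample block: "hello", a terminating zero at byte 5, a stray zero at byte 13.
   The non-zero filler values are arbitrary and only for illustration. */
static const uint8_t sample_block[16] = {
    'h', 'e', 'l', 'l', 'o', 0,
    0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
    0,
    0x88, 0x99
};
```

Calling zero_byte_mask16(sample_block) yields 0x2020, which is 0b0010000000100000 in binary.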


Obviously, our function must take only the first zero byte (the lowest bit set in the mask) and ignore the rest.


The next instruction is BSF (Bit Scan Forward).


This instruction finds the first bit set to 1 and stores its position into the first operand.


EAX=0b0010000000100000


After the execution of bsf eax, eax, EAX contains 5, meaning 1 has been found at the 5th bit position
(starting from zero).
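As a rough illustration, the scan can be emulated in portable C. The function name bit_scan_forward and the -1 return value for a zero input are choices made for this sketch; the real BSF instruction leaves its destination undefined when the source is zero.

```c
#include <stdint.h>

/* Portable emulation of BSF: index of the lowest bit set to 1.
   Returns -1 for a zero input (a convention of this sketch only). */
int bit_scan_forward(uint32_t value)
{
    if (value == 0)
        return -1;

    int pos = 0;
    while ((value & 1u) == 0) { /* shift right until a 1 reaches bit 0 */
        value >>= 1;
        pos++;
    }
    return pos;
}
```

For the mask from our example, bit_scan_forward(0x2020) returns 5, so the second zero byte at position 13 is ignored.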


MSVC has an intrinsic for this instruction: _BitScanForward.


Now it is simple. If a zero byte has been found, its position is added to what we have already counted and
now we have the return result.
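Putting the pieces together, the whole idea might be sketched in portable C as follows. This is not the compiler's output, only an illustration of the algorithm: strlen_blocks is a hypothetical name, and the sketch assumes the buffer size is a multiple of 16 and that it contains a zero byte, so reading whole blocks never leaves the buffer.

```c
#include <stddef.h>
#include <stdint.h>

/* Block-wise strlen sketch.
   Assumptions: the buffer behind `s` has a size that is a multiple of 16
   and contains at least one zero byte. */
size_t strlen_blocks(const char *s)
{
    const uint8_t *p = (const uint8_t *)s;
    size_t counted = 0; /* bytes in blocks already known to have no zero */

    for (;;) {
        /* build the 16-bit mask: bit i is 1 when byte i of the block is zero */
        uint32_t mask = 0;
        for (int i = 0; i < 16; i++) {
            if (p[counted + i] == 0)
                mask |= 1u << i;
        }

        if (mask != 0) {
            /* find the lowest set bit, i.e. the first zero byte in the block */
            int pos = 0;
            while ((mask & 1u) == 0) {
                mask >>= 1;
                pos++;
            }
            return counted + (size_t)pos;
        }

        counted += 16; /* no zero byte here, move on to the next block */
    }
}
```

For a 32-byte zero-padded buffer holding "hello", the first block already contains a zero byte at position 5, so the result is 0 + 5.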


Almost all.


By the way, it should also be noted that the MSVC compiler emitted two loop bodies side by side, as an
optimization.


By the way, SSE 4.2 (which appeared in Intel Core i7) offers more instructions with which these string
manipulations might be even easier: http://go.yurichev.com/17331


1.30 64 bits


1.30.1 x86-64


It is a 64-bit extension to the x86 architecture.


From the reverse engineer’s perspective, the most important changes are:



  • Almost all registers (except FPU and SIMD) were extended to 64 bits and got an R- prefix. 8 additional
    registers were added. Now the GPRs are: RAX, RBX, RCX, RDX, RBP, RSP, RSI, RDI, R8, R9, R10, R11, R12,
    R13, R14, R15.


It is still possible to access the older register parts as usual. For example, it is possible to access the
lower 32-bit part of the RAX register using EAX:

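Byte number:  7th 6th 5th 4th 3rd 2nd 1st 0th
RAX (x64):    bytes 7th..0th (all 64 bits)
EAX:          bytes 3rd..0th (low 32 bits)
AX:           bytes 1st..0th (low 16 bits)
AH, AL:       1st byte (AH) and 0th byte (AL)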
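The same layout can be expressed with shifts and truncating casts in C. The helper names below are hypothetical and exist only to show which bytes of a 64-bit value each sub-register name refers to.

```c
#include <stdint.h>

/* Each helper extracts the part of a 64-bit value that the named
   sub-register would expose. Names are illustrative only. */
uint32_t part_eax(uint64_t rax) { return (uint32_t)rax; }       /* bytes 3..0 */
uint16_t part_ax(uint64_t rax)  { return (uint16_t)rax; }       /* bytes 1..0 */
uint8_t  part_al(uint64_t rax)  { return (uint8_t)rax; }        /* byte 0 */
uint8_t  part_ah(uint64_t rax)  { return (uint8_t)(rax >> 8); } /* byte 1 */
```

With rax = 0x1122334455667788, these give 0x55667788, 0x7788, 0x88 and 0x77 respectively.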

(^187) An order from MSB to LSB (^188) is used here.
