CHAPTER 15. SIMPLE C-STRINGS PROCESSING
; here we calculate the difference between two pointers
mov eax, DWORD PTR _eos$[ebp]
sub eax, DWORD PTR _str$[ebp]
sub eax, 1 ; subtract 1 and return result
mov esp, ebp
pop ebp
ret 0
strlen ENDP
We get two new instructions here: MOVSX and TEST.
The first one, MOVSX, takes a byte from an address in memory and stores the value in a 32-bit register. MOVSX stands for
MOV with Sign-Extend. MOVSX sets the rest of the bits, from the 8th to the 31st, to 1 if the source byte is negative or to 0 if
it is positive.
And here is why.
By default, the char type is signed in MSVC and GCC. Suppose we have two values, one of type char and the other of type int
(int is signed too). If the first value contains -2 (coded as 0xFE) and we just copy this byte into the int container, we get
0x000000FE, which, viewed as a signed int, is 254, not -2. In a signed int, -2 is coded as 0xFFFFFFFE. So if
we need to transfer 0xFE from a variable of type char to int, we need to identify its sign and extend it. That is what MOVSX
does.
You can also read about it in the “Signed number representations” section (30 on page 431).
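The difference between the two ways of widening a byte can be sketched in C. This is only an illustration under stated assumptions: the function names widen_with_sign and copy_low_byte_only are hypothetical and do not come from the listing, and the fixed-width types from stdint.h are used so the sizes are explicit.

```c
#include <stdint.h>

/* Roughly what MOVSX does: widen a signed 8-bit value so that its sign
   bit is copied into bits 8..31 of the 32-bit result.
   (Hypothetical helper name, for illustration only.) */
int32_t widen_with_sign(int8_t value)
{
    return (int32_t)value;
}

/* The naive copy described in the text: the byte lands in the low
   8 bits and the upper 24 bits stay zero.
   (Hypothetical helper name, for illustration only.) */
int32_t copy_low_byte_only(int8_t value)
{
    return (int32_t)(uint8_t)value;
}
```

With the byte 0xFE (that is, -2 as a signed 8-bit value), widen_with_sign returns -2 (bit pattern 0xFFFFFFFE), while copy_low_byte_only returns 254 (bit pattern 0x000000FE).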
It’s hard to say whether the compiler needs to store a char variable in EDX; it could just use an 8-bit register part (for example, DL).
Apparently, this is simply how the compiler’s register allocator works.
Then we see TEST EDX, EDX. You can read more about the TEST instruction in the section about bit fields (19 on page 289).
Here this instruction just checks whether the value in EDX equals 0.
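To tie the instructions together, here is a rough C sketch that mirrors the steps seen in the listing: load a byte with sign extension, advance the pointer, test the loaded value for zero, and finally return the pointer difference minus 1. It is not the original source of the function; the name strlen_sketch and the local variable edx (standing in for the register) are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical reconstruction of the loop, step by step.
   It mirrors the assembly, not the original C source. */
int strlen_sketch(const char *str)
{
    const char *eos = str;
    int32_t edx;                            /* stands in for the EDX register */

    do {
        edx = (int32_t)(signed char)*eos;   /* MOVSX: load one byte, sign-extend it */
        eos++;                              /* move the pointer to the next byte */
    } while (edx != 0);                     /* TEST EDX, EDX and a conditional jump */

    /* Difference between the two pointers, minus 1 for the terminating zero. */
    return (int)(eos - str) - 1;
}
```

Note that the pointer is advanced even when the zero byte has just been read, which is why 1 is subtracted at the end.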
Non-optimizing GCC
Let’s try GCC 4.4.1:
public strlen
strlen proc near
eos = dword ptr -4
arg_0 = dword ptr 8
push ebp
mov ebp, esp
sub esp, 10h
mov eax, [ebp+arg_0]
mov [ebp+eos], eax
loc_80483F0:
mov eax, [ebp+eos]
movzx eax, byte ptr [eax]
test al, al
setnz al
add [ebp+eos], 1
test al, al
jnz short loc_80483F0
mov edx, [ebp+eos]
mov eax, [ebp+arg_0]
mov ecx, edx
sub ecx, eax
mov eax, ecx
sub eax, 1
leave
retn
strlen endp
The result is almost the same as in MSVC, but here we see MOVZX instead of MOVSX. MOVZX stands for MOV with Zero-Extend.
This instruction copies an 8-bit or 16-bit value into a 32-bit register and sets the rest of the bits to 0. In fact, this instruction
is convenient only because it enables us to replace this instruction pair: xor eax, eax / mov al, [...].
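The equivalence between MOVZX and that instruction pair can be sketched in C. This is an illustration only: the names zero_extend and clear_then_load are hypothetical, and the local variable eax merely models the register.

```c
#include <stdint.h>

/* One step, like MOVZX: the byte goes into the low 8 bits and
   bits 8..31 of the result are zero. (Hypothetical helper name.) */
uint32_t zero_extend(uint8_t value)
{
    return (uint32_t)value;
}

/* Two steps, like the pair xor eax, eax / mov al, [...].
   (Hypothetical helper name.) */
uint32_t clear_then_load(uint8_t value)
{
    uint32_t eax = 0;                     /* xor eax, eax: clear the whole register */
    eax = (eax & 0xFFFFFF00u) | value;    /* mov al, [...]: replace only the low 8 bits */
    return eax;
}
```

Both functions return 254 for the byte 0xFE, in contrast to the sign-extending case, where the same byte becomes -2.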