Reverse Engineering for Beginners


CHAPTER 47. STRINGS TRIMMING


The second part of the for() statement (str_len>0 && (c=s[str_len-1])) relies on the so-called “short-circuit” evaluation
in C/C++, which is very convenient [Yur13, p. 1.3.8]. C/C++ guarantees that the operands of && are evaluated from left to
right, so if the first clause evaluates to false, the second one is never evaluated.
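
A minimal sketch of this behaviour, not taken from the book: the helper rhs() and the counter rhs_evaluations are hypothetical names, introduced only to make it visible whether the right-hand operand of && was evaluated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counter: how many times the right-hand operand was evaluated. */
static int rhs_evaluations = 0;

/* Hypothetical right-hand operand. It would read s[-1] if len were 0,
   so it must only run when the left-hand test has succeeded. */
static int rhs(const char *s, size_t len)
{
    rhs_evaluations++;
    return s[len - 1] != 0;
}

/* Returns 1 if the string is non-empty and its last character is non-zero. */
int last_char_is_nonzero(const char *s, size_t len)
{
    assert(s != NULL);
    /* && evaluates left to right: rhs() is called only when len > 0 */
    return len > 0 && rhs(s, len);
}
```

Calling last_char_is_nonzero("", 0) leaves rhs_evaluations unchanged, because the first clause is already false. This is the property the for() condition depends on to avoid reading s[str_len-1] when str_len is zero.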


47.1 x64: Optimizing MSVC 2013


Listing 47.1: Optimizing MSVC 2013 x64

s$ = 8
str_trim PROC


; RCX is the first function argument and it always holds a pointer to the string
mov rdx, rcx
; this is strlen() function inlined right here:
; set RAX to 0xFFFFFFFFFFFFFFFF (-1)
or rax, -1
$LL14@str_trim:
inc rax
cmp BYTE PTR [rcx+rax], 0
jne SHORT $LL14@str_trim
; is the input string length zero? exit then:
test rax, rax
je SHORT $LN15@str_trim
; RAX holds string length
dec rcx
; RCX = s-1
mov r8d, 1
add rcx, rax
; RCX = s-1+strlen(s), i.e., this is the address of the last character in the string
sub r8, rdx
; R8 = 1-s
$LL6@str_trim:
; load the last character of the string:
; jump, if its code is 13 or 10:
movzx eax, BYTE PTR [rcx]
cmp al, 13
je SHORT $LN2@str_trim
cmp al, 10
jne SHORT $LN15@str_trim
$LN2@str_trim:
; the last character has a 13 or 10 code
; write zero at this place:
mov BYTE PTR [rcx], 0
; decrement address of the last character,
; so it will point to the character before the one which has just been erased:
dec rcx
lea rax, QWORD PTR [r8+rcx]
; RAX = 1 - s + address of the current last character
; thus we can determine whether we have reached the first character and need to stop
test rax, rax
jne SHORT $LL6@str_trim
$LN15@str_trim:
mov rax, rdx
ret 0
str_trim ENDP
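
The control flow of the listing above can be sketched in C roughly as follows. This is a hand-written reconstruction and not compiler output or the author's source: the name str_trim_sketch and the variable remaining are assumptions, and the sketch calls the library strlen() where the compiler inlined its own loop. It uses an index instead of a raw pointer, because the value the listing computes in RAX (1 - s + address of the current last character) is exactly the number of characters still left in the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical C rendering of the optimized loop. */
char *str_trim_sketch(char *s)
{
    assert(s != NULL);

    /* the compiler inlined this call; here we use the library function */
    size_t remaining = strlen(s);

    /* "is the input string length zero? exit then" */
    if (remaining == 0)
        return s;

    do {
        /* load the last character of the string */
        char c = s[remaining - 1];

        /* stop at the first character that is neither 13 (\r) nor 10 (\n) */
        if (c != '\r' && c != '\n')
            break;

        /* write zero at this place and step back by one character */
        s[remaining - 1] = 0;
        remaining--;

        /* corresponds to the final TEST RAX, RAX / JNE pair */
    } while (remaining != 0);

    return s;
}
```

In this form the zero-length check happens only once, before the loop. That matches the listing, where the test on the string length is done once and the loop itself ends with a single compare-and-jump.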


First, MSVC inlined the strlen() function code, because it concluded that this would be faster than the usual strlen() work
plus the cost of calling it and returning from it. This is called inlining: 43 on page 481.


The first instruction of the inlined strlen() is OR RAX, 0xFFFFFFFFFFFFFFFF. It’s hard to say why MSVC uses OR
instead of MOV RAX, 0xFFFFFFFFFFFFFFFF, but it does this often. And of course, it is equivalent: all bits are set, and
a number with all bits set is −1 in two’s complement arithmetic: 30 on page 431.
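
The equivalence can be checked with a short C sketch. The function names or_all_ones, as_signed and check_or_minus_one are hypothetical; the sketch relies only on the fact that int64_t is an exact-width two's complement type.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors OR RAX, -1: every bit of the result is set, whatever the input was. */
uint64_t or_all_ones(uint64_t rax)
{
    return rax | UINT64_C(0xFFFFFFFFFFFFFFFF);
}

/* Reinterprets a 64-bit pattern as a signed two's complement value. */
int64_t as_signed(uint64_t bits)
{
    int64_t v;
    memcpy(&v, &bits, sizeof v);
    return v;
}

/* Small self-check: the old register contents do not matter. */
void check_or_minus_one(void)
{
    assert(or_all_ones(0) == UINT64_MAX);
    assert(or_all_ones(UINT64_C(0x1234)) == UINT64_MAX);
    assert(as_signed(or_all_ones(UINT64_C(0x1234))) == -1);
}
```

Whatever value the register held before, the result is the same as loading the constant directly, which is why the two instructions are interchangeable here.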


Why would the −1 number be used in strlen(), one might ask. Due to optimizations, of course. Here is the code that
MSVC generated:


Listing 47.2: Inlined strlen() by MSVC 2013 x64

; RCX = pointer to the input string
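
For comparison, the strlen() loop as it appears in Listing 47.1 (OR RAX, -1, then INC / CMP / JNE) can be written in C along these lines. This is a sketch under assumptions, not the compiler's own source: the name strlen_from_minus_one is hypothetical. One plausible reading is that starting the counter at −1 lets the loop body consist of just an increment followed by a test, so the first iteration already examines s[0].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C counterpart of the inlined strlen() loop. */
size_t strlen_from_minus_one(const char *s)
{
    assert(s != NULL);

    size_t n = (size_t)-1;   /* or rax, -1 : all bits set */

    do {
        n++;                 /* inc rax : wraps to 0 on the first pass */
    } while (s[n] != 0);     /* cmp BYTE PTR [rcx+rax], 0 / jne */

    return n;                /* n is the index of the terminating zero */
}
```

Unsigned wrap-around from the all-ones value to 0 is well defined in C, so the first increment is safe.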
