Assembly Language for Beginners

(Jeff_L) #1

1.17. MORE ABOUT STRINGS


MVNS^106 (inverts all bits, like NOT in x86) and ADD instructions compute eos − str − 1. In fact, these two
instructions compute R0 = ~str + eos (the bitwise NOT of str, plus eos), which is effectively equivalent to
what was in the source code; why this is so was already explained here (1.17.1 on page 208).
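The equivalence of eos − str − 1 and NOT(str) + eos follows from two's complement arithmetic, where inverting all bits of x yields −x − 1. A minimal sketch in C is given here as an illustration, not as compiler output: the function names len_by_subtraction() and len_by_mvns_add() are hypothetical, and addresses are modeled as 32-bit unsigned integers, as on 32-bit ARM.

```c
#include <stdint.h>

/* Hypothetical helper: the length as written in the source code,
   eos - str - 1, with addresses modeled as 32-bit unsigned values. */
uint32_t len_by_subtraction(uint32_t str, uint32_t eos)
{
    return eos - str - 1;
}

/* Hypothetical helper: the same length as the MVNS + ADD pair computes it. */
uint32_t len_by_mvns_add(uint32_t str, uint32_t eos)
{
    uint32_t inverted = ~str;  /* MVNS: invert all bits, equals -str - 1 modulo 2^32 */
    return inverted + eos;     /* ADD: (-str - 1) + eos */
}
```

Both functions return the same value for any pair of inputs, because unsigned arithmetic in C wraps around modulo 2^32, just like the registers do.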


Apparently, LLVM, just like GCC, concludes that this code can be shorter (or faster).


Optimizing Keil 6/2013 (ARM mode)


Listing 1.186: Optimizing Keil 6/2013 (ARM mode)

_strlen
MOV R1, R0


loc_2C8
LDRB R2, [R1],#1
CMP R2, #0
SUBEQ R0, R1, R0
SUBEQ R0, R0, #1
BNE loc_2C8
BX LR


Almost the same as what we saw before, with the exception that the eos − str − 1 expression can be computed
not at the function’s end, but right in the body of the loop. The -EQ suffix, as we may recall, implies that
the instruction executes only if the operands in the CMP that has been executed before were equal to each
other. Thus, if R2 contains 0, both SUBEQ instructions execute and the result is left in the R0 register.
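To make the effect of the conditional instructions easier to follow, the Keil listing can be modeled in C roughly as follows. This is a sketch written for illustration, not the original source: the function name strlen_keil_model() is hypothetical, and the local variables are simply named after the registers they stand for.

```c
#include <stddef.h>

/* Hypothetical C model of the Keil ARM-mode listing. */
size_t strlen_keil_model(const char *str)
{
    const char *r0 = str;  /* R0: pointer to the start of the string */
    const char *r1 = r0;   /* MOV R1, R0 */
    size_t result = 0;     /* stands for the final value of R0 */
    unsigned char r2;

    do {
        r2 = (unsigned char)*r1++;       /* LDRB R2, [R1],#1 (post-increment) */
        if (r2 == 0) {                   /* CMP R2, #0 */
            result = (size_t)(r1 - r0);  /* SUBEQ R0, R1, R0 */
            result = result - 1;         /* SUBEQ R0, R0, #1 */
        }
    } while (r2 != 0);                   /* BNE loc_2C8 */

    return result;                       /* BX LR */
}
```

The body of the if statement corresponds to the two SUBEQ instructions: it is skipped on every iteration except the last one, when the zero byte has been loaded.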


ARM64


Optimizing GCC (Linaro) 4.9


my_strlen:
mov x1, x0
; X1 is now temporary pointer (eos), acting like cursor
.L58:
; load byte from X1 to W2, increment X1 (post-index)
ldrb w2, [x1],1
; Compare and Branch if NonZero: compare W2 with 0, jump to .L58 if it is not
cbnz w2, .L58
; calculate difference between initial pointer in X0 and current address in X1
sub x0, x1, x0
; decrement lowest 32-bit of result
sub w0, w0, #1
ret


The algorithm is the same as in 1.17.1 on page 202: find a zero byte, calculate the difference between
the pointers and decrement the result by 1. Some comments were added by the author of this book.


The only thing worth noting is that our example is somewhat wrong:
my_strlen() returns a 32-bit int, while it has to return size_t or another 64-bit type.


The reason is that, theoretically, strlen() can be called for a huge block of memory that exceeds 4GB,
so it must be able to return a 64-bit value on 64-bit platforms.


Because of my mistake, the last SUB instruction operates on the 32-bit part of the register, while the penultimate
SUB instruction works on the full 64-bit register (it calculates the difference between the pointers).


It’s my mistake, but it is better to leave it as is, as an example of what the code can look like in such a case.
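The consequence of mixing the 64-bit and 32-bit SUB instructions can be sketched in C. The code below is an illustration under stated assumptions, not the original source: the names len_64() and len_32() are hypothetical, pointers are modeled as plain 64-bit integers, and the result is kept unsigned to keep the arithmetic well defined.

```c
#include <stdint.h>

/* Hypothetical model of a correct 64-bit strlen() tail:
   both subtractions are done on the full 64-bit value. */
uint64_t len_64(uint64_t str, uint64_t eos)
{
    uint64_t diff = eos - str;  /* sub x0, x1, x0 */
    return diff - 1;            /* 64-bit decrement */
}

/* Hypothetical model of the tail in the listing above:
   the decrement is applied to the low 32 bits only. */
uint32_t len_32(uint64_t str, uint64_t eos)
{
    uint64_t diff = eos - str;     /* sub x0, x1, x0 : full 64-bit difference */
    uint32_t low = (uint32_t)diff; /* W0 is the low 32 bits of X0 */
    return low - 1;                /* sub w0, w0, #1 */
}
```

For short strings both functions agree. If the terminating zero were found more than 4GB away from the start, for example with a pointer difference of 0x100000005, len_64() would give 0x100000004, while len_32() would give just 4, since the upper 32 bits of the difference are lost.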


(^106) MoVe Not
