1.20. ARRAYS
LSLS R0, R4, #2 ; R0=i<<2 (same as i4)
LDR R2, [R5,R0] ; load from (R5+R0) (same as R5+i*4)
MOVS R1, R4
ADR R0, aADD ; "a[%d]=%d\n"
BL __2printf
ADDS R4, R4, #1 ; i=i+1
CMP R4, #20 ; i<20?
BLT loc_1DC ; yes, i<20, run loop body again
MOVS R0, #0 ; value to return
; deallocate chunk, allocated for 20 int variables + one more variable
ADD SP, SP, #0x54
POP {R4,R5,PC}
Thumb code is very similar.
Thumb mode has special instructions for bit shifting (likeLSLS), which calculates the value to be written
into the array and the address of each element in the array as well.
The compiler allocates slightly more space in the local stack, however, the last 4 bytes are not used.
Non-optimizing GCC 4.9.1 (ARM64)
Listing 1.226: Non-optimizing GCC 4.9.1 (ARM64)
.LC0:
.string "a[%d]=%d\n"
main:
; save FP and LR in stack frame:
stp x29, x30, [sp, -112]!
; set stack frame (FP=SP)
add x29, sp, 0
; setting initial counter variable at 0 (WZR is the register always holding zero):
str wzr, [x29,108]
; jump to loop condition checking code:
b .L2
.L3:
; load value of "i" variable:
ldr w0, [x29,108]
; multiplicate it by 2:
lsl w2, w0, 1
; find a place of an array in local stack:
add x0, x29, 24
; load 32-bit integer from local stack and sign extend it to 64-bit one:
ldrsw x1, [x29,108]
; calculate address of element (X0+X1<<2=array address+i4) and store W2 (i2) there:
str w2, [x0,x1,lsl 2]
; increment counter (i):
ldr w0, [x29,108]
add w0, w0, 1
str w0, [x29,108]
.L2:
; check if we finished:
ldr w0, [x29,108]
cmp w0, 19
; jump to L3 (loop body begin) if not:
ble .L3
; second part of the function begins here.
; setting initial counter variable at 0.
; by the way, the same place in the local stack was used for counter,
; because the same local variable (i) is being used as counter.
str wzr, [x29,108]
b .L4
.L5:
; calculate array address:
add x0, x29, 24
; load "i" value:
ldrsw x1, [x29,108]
; load value from the array at the address (X0+X1<<2 = address of array + i*4)
ldr w2, [x0,x1,lsl 2]