CHAPTER 18. ARRAYS
_y$ = 12 ; size = 4
_z$ = 16 ; size = 4
_value$ = 20 ; size = 4
_insert PROC
push ebp
mov ebp, esp
mov eax, DWORD PTR _x$[ebp]
imul eax, 2400 ; eax=600*4*x
mov ecx, DWORD PTR _y$[ebp]
imul ecx, 120 ; ecx=30*4*y
lea edx, DWORD PTR _a[eax+ecx] ; edx=a + 600*4*x + 30*4*y
mov eax, DWORD PTR _z$[ebp]
mov ecx, DWORD PTR _value$[ebp]
mov DWORD PTR [edx+eax*4], ecx ; *(edx+z*4)=value
pop ebp
ret 0
_insert ENDP
_TEXT ENDS
Nothing special. The index is calculated from the three input arguments using the formula address = 600⋅4⋅x + 30⋅4⋅y + 4z,
which is what represents the array as multidimensional. Do not forget that the int type is 32-bit (4 bytes), so all coefficients must be
multiplied by 4.
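The formula can be checked against ordinary C indexing. The following is only a sketch under stated assumptions: the dimensions 20 and 30 follow from the coefficients 600 = 20⋅30 and 30, while the first dimension (10) and the names byte_offset and insert_flat are made up for illustration. It uses sizeof(int) where the listing uses the constant 4.

```c
#include <assert.h>
#include <stddef.h>

/* 20 and 30 follow from the coefficients 600 = 20*30 and 30;
   the first dimension (10) is an assumption for illustration. */
enum { DIM_X = 10, DIM_Y = 20, DIM_Z = 30 };

static int a[DIM_X][DIM_Y][DIM_Z];

/* Byte offset of a[x][y][z] from the start of a:
   600*4*x + 30*4*y + 4*z when int is 4 bytes wide. */
static size_t byte_offset(int x, int y, int z)
{
    return 600 * sizeof(int) * (size_t)x
         + 30 * sizeof(int) * (size_t)y
         + sizeof(int) * (size_t)z;
}

/* Hypothetical helper: stores value through base address + byte offset,
   mirroring what the compiled code does instead of writing a[x][y][z]. */
static void insert_flat(int x, int y, int z, int value)
{
    assert(x >= 0 && x < DIM_X);
    assert(y >= 0 && y < DIM_Y);
    assert(z >= 0 && z < DIM_Z);
    *(int *)((char *)a + byte_offset(x, y, z)) = value;
}
```

Calling insert_flat(3, 7, 11, 1234) and then reading a[3][7][11] should give back the stored value, since both forms address the same memory cell.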
Listing 18.26: GCC 4.4.1
public insert
insert proc near
x = dword ptr 8
y = dword ptr 0Ch
z = dword ptr 10h
value = dword ptr 14h
push ebp
mov ebp, esp
push ebx
mov ebx, [ebp+x]
mov eax, [ebp+y]
mov ecx, [ebp+z]
lea edx, [eax+eax] ; edx=y*2
mov eax, edx ; eax=y*2
shl eax, 4 ; eax=(y*2)<<4 = y*2*16 = y*32
sub eax, edx ; eax=y*32 - y*2=y*30
imul edx, ebx, 600 ; edx=x*600
add eax, edx ; eax=eax+edx=y*30 + x*600
lea edx, [eax+ecx] ; edx=y*30 + x*600 + z
mov eax, [ebp+value]
mov dword ptr ds:a[edx*4], eax ; *(a+edx*4)=value
pop ebx
pop ebp
retn
insert endp
The GCC compiler does it differently. For one of the operations in the calculation (30y), GCC produces code without
multiplication instructions. This is how it is done: (y+y) ≪ 4 − (y+y) = (2y) ≪ 4 − 2y = 2⋅16⋅y − 2y = 32y − 2y = 30y. Thus, for
the 30y calculation, only one addition operation, one bitwise shift operation and one subtraction operation are used. This
works faster than a multiplication instruction.
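The shift-and-subtract identity can be sketched in C as follows. This is an illustration rather than GCC's actual output: the function names mul30, element_index and check_mul30 are assumptions, and unsigned arithmetic is used so that the left shift is well defined in C.

```c
#include <assert.h>

/* Computes 30*y the way the GCC listing does: (2y << 4) - 2y. */
static unsigned mul30(unsigned y)
{
    unsigned t = y + y;   /* lea edx, [eax+eax] : t = 2y  */
    unsigned r = t << 4;  /* shl eax, 4         : r = 32y */
    return r - t;         /* sub eax, edx       : 30y     */
}

/* Element index as computed in the GCC listing: 30y + 600x + z.
   The multiplication by 4 is left to the addressing mode a[edx*4]. */
static unsigned element_index(unsigned x, unsigned y, unsigned z)
{
    return mul30(y) + 600u * x + z;
}

/* Hypothetical self-check: compares the trick against a plain multiply. */
static void check_mul30(void)
{
    unsigned y;
    for (y = 0; y < 1000; y++)
        assert(mul30(y) == 30u * y);
}
```

Note that here the index is counted in elements, not bytes, which is why the coefficients are 600 and 30 rather than 2400 and 120.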
ARM + Non-optimizing Xcode 4.6.3 (LLVM) (Thumb mode)
Listing 18.27: Non-optimizing Xcode 4.6.3 (LLVM) (Thumb mode)
_insert
value = -0x10
z = -0xC
y = -8
x = -4