Assembly Language for Beginners


3.12. C99 RESTRICT



  • calculate next sum_product[i]: at this stage, we need to load from memory the already calculated
    sum[i] and product[i]


Is it possible to optimize the last stage? Since we have already calculated sum[i] and product[i], it is
not necessary to load them again from memory.


Yes, but the compiler cannot be sure that nothing has been overwritten at the 3rd stage! This is called “pointer
aliasing”: a situation in which the compiler cannot be sure that the memory a pointer points to has not
been changed.
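To make the problem concrete, here is a minimal sketch of the stages above written twice: once reloading sum[i] and product[i] from memory, and once keeping them in local variables. The function names stages_reload and stages_cached are hypothetical and chosen only for this illustration. The two versions produce the same results only when update_me does not alias sum or product.

```c
#include <stddef.h>

/* Hypothetical name: the stages as written, no restrict.
   The last stage reloads sum[i] and product[i] from memory. */
void stages_reload(int *x, int *y, int *sum, int *product,
                   int *sum_product, int *update_me, size_t s)
{
    for (size_t i = 0; i < s; i++) {
        sum[i] = x[i] + y[i];
        product[i] = x[i] * y[i];
        update_me[i] = (int)i * 123;          /* may overwrite sum[i] or product[i] */
        sum_product[i] = sum[i] + product[i]; /* values come from memory */
    }
}

/* Hypothetical name: the "optimized" last stage done by hand.
   Equivalent to stages_reload only if update_me aliases nothing. */
void stages_cached(int *x, int *y, int *sum, int *product,
                   int *sum_product, int *update_me, size_t s)
{
    for (size_t i = 0; i < s; i++) {
        int a = x[i] + y[i];
        int p = x[i] * y[i];
        sum[i] = a;
        product[i] = p;
        update_me[i] = (int)i * 123;
        sum_product[i] = a + p;               /* values come from locals */
    }
}
```

If both functions are called with x = {1, 2, 3}, y = {4, 5, 6} and the same array passed as both sum and update_me, stages_reload leaves {4, 133, 264} in sum_product, while stages_cached leaves {9, 17, 27}. Without further information the compiler has to preserve the first behavior, which is why it cannot drop the reloads on its own.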


restrict in the C99 standard [ISO/IEC 9899:TC3 (C C99 standard), (2007) 6.7.3/1] is a promise given by
the programmer to the compiler that the function arguments marked with this keyword always point to different
memory locations and never overlap.


To be more precise and describe this formally, restrict indicates that only this pointer is to be used to access
an object, and no other pointer will be used for it.


It can even be said that the object will be accessed only via one single pointer if it is marked as restrict.
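A small sketch of what this promise looks like at a call site. The function add_to_all is a hypothetical example, not taken from the book. Because both pointers are restrict-qualified, a compiler is permitted (though not required) to read *delta once before the loop instead of on every iteration.

```c
#include <stddef.h>

/* Hypothetical example: adds *delta to every element of dst.
   The restrict qualifiers promise that dst[0..n-1] and *delta
   never refer to the same object. */
void add_to_all(int *restrict dst, const int *restrict delta, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += *delta;
}
```

Calling add_to_all(a, &d, 3) with a separate variable d keeps the promise. Calling add_to_all(a, &a[0], 3) breaks it, and the behavior is then undefined: the compiler does not check the promise, it only relies on it.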


Let’s add this keyword to each pointer argument:


void f2 (int* restrict x, int* restrict y, int* restrict sum, int* restrict product,
         int* restrict sum_product, int* restrict update_me, size_t s)
{
        for (int i=0; i<s; i++)
        {
                sum[i]=x[i]+y[i];
                product[i]=x[i]*y[i];
                update_me[i]=i*123; // some dummy value
                sum_product[i]=sum[i]+product[i];
        };
};
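Since keeping the promise is entirely the caller's job, a debug build might check it before calling such a function. The helper below is a hypothetical sketch, not part of the book's code. It assumes a flat address space (as on x86-64), where converting pointers to uintptr_t gives comparable addresses; the C standard only makes that conversion implementation-defined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true if the n-element int arrays
   starting at a and b share at least one byte.
   Assumes a flat address space; address comparison via uintptr_t
   is implementation-defined in ISO C. */
bool int_ranges_overlap(const int *a, const int *b, size_t n)
{
    uintptr_t a0 = (uintptr_t)a;
    uintptr_t b0 = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(int));

    /* Two half-open ranges [a0, a0+bytes) and [b0, b0+bytes)
       intersect when each one starts before the other ends. */
    return a0 < b0 + bytes && b0 < a0 + bytes;
}
```

A caller could assert that int_ranges_overlap() is false for each pair of arrays it is about to pass as restrict arguments, and compile the checks out in release builds.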


Let’s see the results:


Listing 3.53: GCC x64: f1()

f1:
push r15 r14 r13 r12 rbp rdi rsi rbx
mov r13, QWORD PTR 120[rsp]
mov rbp, QWORD PTR 104[rsp]
mov r12, QWORD PTR 112[rsp]
test r13, r13
je .L1
add r13, 1
xor ebx, ebx
mov edi, 1
xor r11d, r11d
jmp .L4
.L6:
mov r11, rdi
mov rdi, rax
.L4:
lea rax, 0[0+r11*4]
lea r10, [rcx+rax]
lea r14, [rdx+rax]
lea rsi, [r8+rax]
add rax, r9
mov r15d, DWORD PTR [r10]
add r15d, DWORD PTR [r14]
mov DWORD PTR [rsi], r15d ; store to sum[]
mov r10d, DWORD PTR [r10]
imul r10d, DWORD PTR [r14]
mov DWORD PTR [rax], r10d ; store to product[]
mov DWORD PTR [r12+r11*4], ebx ; store to update_me[]
add ebx, 123
mov r10d, DWORD PTR [rsi] ; reload sum[i]
add r10d, DWORD PTR [rax] ; reload product[i]
lea rax, 1[rdi]
cmp rax, r13
mov DWORD PTR 0[rbp+r11*4], r10d ; store to sum_product[]
