Assembly Language for Beginners


3.27. OPENMP

add esp, 12
; Line 56
mov ecx, DWORD PTR _checked
mov eax, ecx
cdq
mov esi, 100000 ; 000186a0H
idiv esi
test edx, edx
jne SHORT $LN1@check_nonc
; Line 57
push ecx
push OFFSET ??_C@_0M@NPNHLIOO@checked?$DN?$CFd?6?$AA@
call _printf
pop ecx
pop ecx
call __vcomp_leave_critsect
pop ecx

As it turns out, the vcomp_atomic_add_i4() function in vcomp*.dll is just a tiny function with the LOCK
XADD instruction^59 in it.
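
To make the idea of an atomic add concrete, here is a minimal C11 sketch. It is not the vcomp source: atomic_add_i4() and demo_atomic_add() are hypothetical names, and returning the previous value is an assumption made for illustration. On x86, compilers commonly translate atomic_fetch_add() into a LOCK-prefixed instruction such as LOCK XADD, though the exact choice depends on the compiler and on whether the result is used.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for vcomp_atomic_add_i4(): atomically add `value`
   to `*target` and return the value that was stored there before. */
int atomic_add_i4(atomic_int *target, int value)
{
    assert(target != NULL);
    return atomic_fetch_add(target, value);
}

/* Small demo: two additions to one counter, then read the result. */
int demo_atomic_add(void)
{
    atomic_int counter = 5;
    int previous = atomic_add_i4(&counter, 3);  /* previous is 5, counter is 8 */
    assert(previous == 5);
    atomic_add_i4(&counter, 2);                 /* counter is 10 */
    return atomic_load(&counter);
}
```

The point of the atomic operation is that several threads can call atomic_add_i4() on the same counter without losing updates, which a plain counter += value does not guarantee.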

vcomp_enter_critsect() eventually calls a Win32 API function^60.
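
The MSVC listing above can be read back into C roughly as follows. This is only a sketch under stated assumptions: a POSIX mutex stands in for the critical section that the vcomp runtime manages, report_progress() is a hypothetical name, and the counter is passed as a parameter instead of being read from the global variable checked. The mangled string name in the listing decodes to the format string "checked=%d\n".

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Assumption: a POSIX mutex plays the role of the lock handled by
   vcomp_enter_critsect() / vcomp_leave_critsect(). */
static pthread_mutex_t report_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns 1 if a progress line was printed, 0 otherwise. */
int report_progress(int checked)
{
    int printed = 0;
    int rc = pthread_mutex_lock(&report_lock);    /* enter critical section */
    assert(rc == 0);
    (void)rc;

    if (checked % 100000 == 0) {                  /* idiv esi / test edx, edx */
        printf("checked=%d\n", checked);          /* push ... / call _printf */
        printed = 1;
    }

    pthread_mutex_unlock(&report_lock);           /* leave critical section */
    return printed;
}
```

Only one thread at a time can be between the lock and unlock calls, so progress lines from different threads do not interleave.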

3.27.2 GCC

GCC 4.8.1 produces a program that prints exactly the same statistics table,
so GCC’s implementation divides the loop into parts in the same fashion.

Listing 3.125: GCC 4.8.1
mov edi, OFFSET FLAT:main._omp_fn.0
call GOMP_parallel_start
mov edi, 0
call main._omp_fn.0
call GOMP_parallel_end

Unlike MSVC’s implementation, the GCC code starts 3 threads and runs the fourth in the current
thread. So there are 4 threads instead of the 5 in MSVC.

Here is the main._omp_fn.0 function:

Listing 3.126: GCC 4.8.1

push rbp
mov rbp, rsp
push rbx
sub rsp, 40
mov QWORD PTR [rbp-40], rdi
call omp_get_num_threads
mov ebx, eax
call omp_get_thread_num
mov esi, eax
mov eax, 2147483647 ; 0x7FFFFFFF
cdq
idiv ebx
mov ecx, eax
mov eax, 2147483647 ; 0x7FFFFFFF
cdq
idiv ebx
mov eax, edx
cmp esi, eax
jl .L15
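
The fragment above computes the quotient and the remainder of 0x7FFFFFFF divided by the number of threads, then compares the thread number with the remainder. The listing stops at the branch, so what follows is an assumption: a common way to split a loop statically is to give each of the first "remainder" threads one extra iteration. The C sketch below shows that scheme; struct chunk and static_chunk() are hypothetical names, not taken from the GCC output.

```c
#include <assert.h>
#include <stdint.h>

/* Half-open iteration range [begin, end) assigned to one thread. */
struct chunk { int64_t begin; int64_t end; };

struct chunk static_chunk(int32_t total, int32_t num_threads, int32_t thread_num)
{
    assert(num_threads > 0 && thread_num >= 0 && thread_num < num_threads);

    int64_t quotient  = total / num_threads;   /* result of the first idiv (eax)  */
    int64_t remainder = total % num_threads;   /* result of the second idiv (edx) */

    /* Assumed meaning of "cmp esi, eax / jl": threads whose number is below
       the remainder take one extra iteration. */
    if (thread_num < remainder) {
        quotient += 1;
        remainder = 0;
    }

    struct chunk c;
    c.begin = quotient * thread_num + remainder;
    c.end   = c.begin + quotient;
    return c;
}
```

With total = 2147483647 and 4 threads, the quotient is 536870911 and the remainder is 3, so under this scheme threads 0 to 2 get 536870912 iterations each and thread 3 gets 536870911, covering the whole range without gaps or overlap.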

(^59) Read more about the LOCK prefix: .1.6 on page 1026
(^60) You can read more about critical sections here: 6.5.4 on page 787
