In [ 84 ]: from nested_loop import f_cy
Now, we can check the performance of the Cython function:
In [ 85 ]: %time res = f_cy(I, J)
Out[85]: CPU times: user 154 ms, sys: 0 ns, total: 154 ms
Wall time: 153 ms
In [ 86 ]: res
Out[86]: 125000000.0
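When working in IPython Notebook, there is a more convenient way to use Cython, namely the cythonmagic extension (in more recent IPython and Cython versions the extension ships with Cython itself and is loaded with %load_ext Cython):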
In [ 87 ]: %load_ext cythonmagic
Loading this extension from within the IPython Notebook allows us to compile code with
Cython from within the tool:
In [ 88 ]: %%cython
           #
           # Nested loop example with Cython
           #
           def f_cy(int I, int J):
               cdef double res = 0
               # double float much slower than int or long
               for i in range(I):
                   for j in range(J * I):
                       res += 1
               return res
The performance results should, of course, be (almost) the same:
In [ 89 ]: %time res = f_cy(I, J)
Out[89]: CPU times: user 156 ms, sys: 0 ns, total: 156 ms
Wall time: 154 ms
In [ 90 ]: res
Out[90]: 125000000.0
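The value 125000000.0 can be checked by hand: the outer loop runs I times and the inner loop J * I times, so res is incremented I * (J * I) times in total. The following minimal pure-Python sketch, assuming I = J = 500 and using the hypothetical helper names nested_loop and count_increments (neither is part of the original session), compares the loop with the closed-form count on a small input:

```python
def nested_loop(I, J):
    """Pure-Python version of the nested loop (illustrative name)."""
    res = 0.0
    for i in range(I):
        for j in range(J * I):
            res += 1
    return res


def count_increments(I, J):
    """Closed-form number of increments: I outer times (J * I) inner passes."""
    return float(I * (J * I))


# Small inputs keep the pure-Python loop fast
assert nested_loop(5, 7) == count_increments(5, 7)

# Assumed session values I = J = 500
print(count_increments(500, 500))
```

For I = J = 500 the closed form gives 500 * 250000 = 125000000 increments, which matches the value returned by f_cy.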
Let us see what Numba can do in this case. The application is as straightforward as before:
In [ 91 ]: import numba as nb
In [ 92 ]: f_nb = nb.jit(f_py)
On the first invocation, the performance is worse than that of the Cython version (recall that the first call of a Numba-compiled function always involves some compilation overhead):
In [ 93 ]: %time res = f_nb(I, J)
Out[93]: CPU times: user 285 ms, sys: 9 ms, total: 294 ms
Wall time: 273 ms
In [ 94 ]: res
Out[94]: 125000000.0
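The first-call overhead can be made visible by timing two consecutive calls of the same compiled function. The sketch below is illustrative only: f_py is redefined here as an assumption about the earlier pure-Python definition, timed_call is a hypothetical helper, and the import is guarded so that the code still runs (without any compilation effect) when Numba is not installed.

```python
import time

try:
    import numba as nb  # optional dependency
    jit = nb.jit
except ImportError:
    def jit(func):
        # Fallback (assumption): no compilation, return the function unchanged
        return func


def f_py(I, J):
    # Assumed to mirror the pure-Python nested loop used in the text
    res = 0.0
    for i in range(I):
        for j in range(J * I):
            res += 1
    return res


f_nb = jit(f_py)


def timed_call(func, *args):
    """Return (result, elapsed seconds) for a single call (hypothetical helper)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# With Numba available, the first call includes the compilation step
first_res, first_t = timed_call(f_nb, 50, 50)
second_res, second_t = timed_call(f_nb, 50, 50)
print("first call:  %.5f sec" % first_t)
print("second call: %.5f sec" % second_t)
```

With Numba available, the second call would typically be much faster than the first, since the compiled version is reused for the same argument types.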
Finally, a more rigorous comparison shows that the Numba version indeed keeps up with the Cython version(s):
In [ 95 ]: func_list = ['f_py', 'f_cy', 'f_nb']
           I, J = 500, 500
           data_list = 3 * ['I, J']
In [ 96 ]: perf_comp_data(func_list, data_list)
Out[96]: function: f_nb, av. time sec: 0.15162, relative: 1.0
function: f_cy, av. time sec: 0.15275, relative: 1.0
function: f_py, av. time sec: 14.08304, relative: 92.9
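The relative column is each average time divided by the fastest one (for example, 14.08304 / 0.15162 is roughly 92.9). The function perf_comp_data is defined outside this section; the following is not its implementation but a minimal sketch of the same idea using the standard library's timeit module. The names compare_runtimes, loop_sum, and builtin_sum are hypothetical and chosen for illustration only.

```python
import timeit


def compare_runtimes(funcs, args, number=3):
    """Time each function on the same arguments.

    Returns a list of (name, average seconds, relative to fastest),
    sorted from fastest to slowest.
    """
    timings = []
    for name, func in funcs.items():
        total = timeit.timeit(lambda: func(*args), number=number)
        timings.append((name, total / number))
    timings.sort(key=lambda item: item[1])
    fastest = timings[0][1]
    return [(name, avg, avg / fastest) for name, avg in timings]


# Two stand-in functions (hypothetical) that compute the same sum
def loop_sum(n):
    total = 0
    for k in range(n):
        total += k
    return total


def builtin_sum(n):
    return sum(range(n))


results = compare_runtimes(
    {'loop_sum': loop_sum, 'builtin_sum': builtin_sum}, (100000,))
for name, avg, rel in results:
    print("function: %s, av. time sec: %9.5f, relative: %6.1f"
          % (name, avg, rel))
```

Averaging over several runs reduces the influence of one-off effects such as the first-call compilation of a Numba function.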