Python for Finance: Analyze Big Financial Data

Chapter 8. Performance Python


Don’t lower your expectations to meet your performance. Raise your level of performance to meet your expectations.

— Ralph Marston

When it comes to performance-critical applications, two things should always be checked: are we using the right implementation paradigm, and are we using the right performance libraries? A number of performance libraries can be used to speed up the execution of Python code. Among others, you will find the following libraries useful, all of which are presented in this chapter (although in a different order):


Cython, for merging Python with C paradigms for static compilation

IPython.parallel, for the parallel execution of code/functions locally or over a cluster

numexpr, for fast numerical operations

multiprocessing, Python’s built-in module for (local) parallel processing

Numba, for dynamically compiling Python code for the CPU

NumbaPro, for dynamically compiling Python code for multicore CPUs and GPUs
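Before working through the examples, it can help to see which of these libraries are installed. The following is a minimal sketch, not part of the original text: the helper name check_libraries and the list LIBRARIES are hypothetical, and the entries are the assumed top-level import names (IPython.parallel is checked via its parent package IPython; NumbaPro is left out).

```python
from importlib.util import find_spec

# Assumed top-level import names for the libraries listed above.
# IPython.parallel is represented by its parent package, IPython.
LIBRARIES = ["Cython", "IPython", "numexpr", "multiprocessing", "numba"]


def check_libraries(names=LIBRARIES):
    """Map each top-level module name to True if it can be found."""
    # find_spec returns None when no module of that name is installed
    return {name: find_spec(name) is not None for name in names}


for name, found in sorted(check_libraries().items()):
    print("%-16s %s" % (name, "available" if found else "missing"))
```

Only multiprocessing is guaranteed to be reported as available, since it ships with the standard library; the others depend on the local installation.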


Throughout this chapter, we compare the performance of different implementations of the same algorithms. To make the comparison a bit easier, we define a convenience function that allows us to systematically compare the performance of different functions executed on the same or different data sets:


In [1]: def perf_comp_data(func_list, data_list, rep=3, number=1):
            ''' Function to compare the performance of different functions.

            Parameters
            ==========
            func_list : list
                list with function names as strings
            data_list : list
                list with data set names as strings
            rep : int
                number of repetitions of the whole comparison
            number : int
                number of executions for every function
            '''
            from timeit import repeat
            res_list = {}
            for name in enumerate(func_list):
                # name is an (index, function name) tuple
                stmt = name[1] + '(' + data_list[name[0]] + ')'
                setup = "from __main__ import " + name[1] + ', ' \
                        + data_list[name[0]]
                results = repeat(stmt=stmt, setup=setup,
                                 repeat=rep, number=number)
                # average time over all repetitions
                res_list[name[1]] = sum(results) / rep
            # sort by average time, fastest first
            res_sort = sorted(res_list.items(),
                              key=lambda kv: (kv[1], kv[0]))
            for item in res_sort:
                # time relative to the fastest function
                rel = item[1] / res_sort[0][1]
                print('function: ' + item[0] +
                      ', av. time sec: %9.5f, ' % item[1] +
                      'relative: %6.1f' % rel)
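Because perf_comp_data builds its setup string with "from __main__ import", it only finds functions and data sets defined in the interactive session. The following is a minimal sketch of the same idea that takes function objects instead of name strings and returns the results rather than printing them. The names perf_comp_callables, sum_loop and sum_builtin are hypothetical and do not appear in the original text.

```python
from timeit import repeat


def perf_comp_callables(funcs, data, rep=3, number=1):
    """Return (name, average seconds, relative) tuples, fastest first.

    Hypothetical variant of perf_comp_data: takes function objects
    and one shared data set instead of lists of name strings.
    """
    averages = {}
    for func in funcs:
        # timeit accepts a zero-argument callable as the statement
        timings = repeat(lambda: func(data), repeat=rep, number=number)
        averages[func.__name__] = sum(timings) / rep
    # sort by average time, then by name, as in the function above
    ranked = sorted(averages.items(), key=lambda kv: (kv[1], kv[0]))
    fastest = ranked[0][1]
    return [(name, avg, avg / fastest if fastest > 0 else float('nan'))
            for name, avg in ranked]


def sum_loop(values):
    """Sum with an explicit Python loop."""
    total = 0
    for v in values:
        total += v
    return total


def sum_builtin(values):
    """Sum with the built-in function."""
    return sum(values)


data = list(range(100000))
for name, avg, rel in perf_comp_callables([sum_loop, sum_builtin], data):
    print('function: %s, av. time sec: %9.5f, relative: %6.1f'
          % (name, avg, rel))
```

The output format mirrors that of perf_comp_data; the timings themselves depend on the machine, so no values are shown here.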