Python for Finance: Analyze Big Financial Data

Chapter 8. Performance Python

Don’t lower your expectations to meet your performance. Raise your level of performance to meet your expectations.

— Ralph Marston

When it comes to performance-critical applications, two things should always be checked: are we using the right implementation paradigm, and are we using the right performance libraries? A number of performance libraries can be used to speed up the execution of Python code. Among others, you will find the following libraries useful, all of which are presented in this chapter (although in a different order):

Cython, for merging Python with C paradigms for static compilation

IPython.parallel, for the parallel execution of code/functions locally or over a cluster

numexpr, for fast numerical operations

multiprocessing, Python’s built-in module for (local) parallel processing

Numba, for dynamically compiling Python code for the CPU

NumbaPro, for dynamically compiling Python code for multicore CPUs and GPUs

Throughout this chapter, we compare the performance of different implementations of the same algorithms. To make the comparison a bit easier, we define a convenience function that allows us to systematically compare the performance of different functions executed on the same or different data sets:

In [1]: def perf_comp_data(func_list, data_list, rep=3, number=1):
            '''Function to compare the performance of different functions.

            Parameters
            ==========
            func_list : list
                list with function names as strings
            data_list : list
                list with data set names as strings
            rep : int
                number of repetitions of the whole comparison
            number : int
                number of executions for every function
            '''
            from timeit import repeat
            res_list = {}
            for i, name in enumerate(func_list):
                # statement to time, e.g. 'f(data)'
                stmt = name + '(' + data_list[i] + ')'
                # both the function and the data set must live in __main__
                setup = 'from __main__ import ' + name + ', ' + data_list[i]
                results = repeat(stmt=stmt, setup=setup,
                                 repeat=rep, number=number)
                res_list[name] = sum(results) / rep
            # sort by average time first, then by function name
            res_sort = sorted(res_list.items(),
                              key=lambda kv: (kv[1], kv[0]))
            for item in res_sort:
                # time relative to the fastest function
                rel = item[1] / res_sort[0][1]
                print('function: ' + item[0] +
                      ', av. time sec: %9.5f, ' % item[1] +
                      'relative: %6.1f' % rel)
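To see the timing logic in isolation, the following is a minimal sketch under stated assumptions rather than the chapter's own helper. It assumes Python 3 and passes function objects to timeit.repeat instead of building statement strings, so nothing has to be imported from __main__. The names compare_callables, loop_sum, and builtin_sum are hypothetical and chosen only for illustration. It reports the same three quantities as the function above: name, average time, and time relative to the fastest candidate.

```python
from timeit import repeat


def compare_callables(funcs, data, rep=3, number=1):
    """Return (name, average seconds, relative time) tuples, fastest first.

    Hypothetical helper: takes function objects rather than name strings.
    """
    averages = {}
    for func in funcs:
        # timeit.repeat accepts a zero-argument callable as the statement
        timings = repeat(lambda: func(data), repeat=rep, number=number)
        averages[func.__name__] = sum(timings) / rep
    # sort by average time first, then by function name
    ranked = sorted(averages.items(), key=lambda kv: (kv[1], kv[0]))
    fastest = ranked[0][1]
    return [(name, avg, avg / fastest) for name, avg in ranked]


def loop_sum(values):
    # explicit Python loop
    total = 0
    for v in values:
        total += v
    return total


def builtin_sum(values):
    # same result via the built-in function
    return sum(values)


report = compare_callables([loop_sum, builtin_sum], list(range(100000)))
for name, avg, rel in report:
    print('function: %s, av. time sec: %9.5f, relative: %6.1f'
          % (name, avg, rel))
```

Each timing returned by repeat covers number executions of the statement, so the reported average is per repetition, not per single call. The fastest entry always has a relative value of 1.0.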