Python for Finance: Analyze Big Financial Data

U    Unicode    U24 (24 Unicode characters)
V    Other      V12 (12-byte data block)
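The two type codes in these rows can be inspected directly. The following is a minimal sketch, assuming NumPy is installed and imported as np; the sample string and the array sizes are made up for illustration.

```python
import numpy as np

# 'U24': fixed-width Unicode string holding at most 24 characters
u = np.dtype('U24')
# 'V12': raw ("void") data block of 12 bytes
v = np.dtype('V12')

# NumPy stores 4 bytes per Unicode character, so U24 occupies 96 bytes
print(u.kind, u.itemsize)
print(v.kind, v.itemsize)

# longer strings are silently truncated to the fixed width
names = np.array(['x' * 30], dtype='U24')
print(len(names[0]))

# two zero-filled 12-byte blocks
blocks = np.zeros(2, dtype='V12')
print(blocks.nbytes)
```

The fixed width is what lets NumPy lay out every element in a contiguous block of memory, which is the specialization the text goes on to measure.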

NumPy provides a generalization of regular arrays that loosens at least the dtype restriction, but let us stick with regular arrays for a moment and see what the specialization brings in terms of performance.


As a simple exercise, suppose we want to generate a matrix/array of shape 5,000 × 5,000 elements, populated with (pseudo)random, standard normally distributed numbers. We then want to calculate the sum of all elements. First, the pure Python approach, where we make heavy use of list comprehensions and functional programming methods as well as lambda functions:


In [111]: import random
          from functools import reduce  # built in on Python 2; must be imported on Python 3
          I = 5000
In [112]: %time mat = [[random.gauss(0, 1) for j in range(I)] for i in range(I)]
            # a nested list comprehension
Out[112]: CPU times: user 36.5 s, sys: 408 ms, total: 36.9 s
          Wall time: 36.4 s
In [113]: %time reduce(lambda x, y: x + y, \
               [reduce(lambda x, y: x + y, row) \
                       for row in mat])
Out[113]: CPU times: user 4.3 s, sys: 52 ms, total: 4.35 s
          Wall time: 4.07 s

          678.5908519876674
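As a cross-check of the nested reduce construction, the same total can be computed with the built-in sum. This is a minimal sketch assuming Python 3; the size n = 200 and the seed are arbitrary choices made here to keep the run short and repeatable, not values from the text.

```python
import math
import random
from functools import reduce

random.seed(42)  # arbitrary seed, only for repeatability
n = 200          # far smaller than the 5,000 used in the text

# nested list comprehension, as in the session above
mat = [[random.gauss(0, 1) for j in range(n)] for i in range(n)]

# row sums via reduce, then a reduce over the row sums
total_reduce = reduce(lambda x, y: x + y,
                      [reduce(lambda x, y: x + y, row) for row in mat])

# the same total with the built-in sum
total_builtin = sum(sum(row) for row in mat)

# the two may differ in the last digits because of floating-point rounding
print(total_reduce, total_builtin)
print(math.isclose(total_reduce, total_builtin, rel_tol=1e-9, abs_tol=1e-9))
```

Both variants still loop over Python float objects one at a time, so neither removes the interpreter overhead that dominates the timings above.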

Let us now turn to NumPy and see how the same problem is solved there. For convenience, the NumPy subpackage random offers a multitude of functions to initialize a numpy.ndarray object and populate it at the same time with (pseudo)random numbers:


In [114]: %time mat = np.random.standard_normal((I, I))
Out[114]: CPU times: user 1.83 s, sys: 40 ms, total: 1.87 s
          Wall time: 1.87 s
In [115]: %time mat.sum()
Out[115]: CPU times: user 36 ms, sys: 0 ns, total: 36 ms
          Wall time: 34.6 ms

          349.49777911439384
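Outside of IPython, where %time is not available, a comparison along these lines can be sketched with time.perf_counter. The snippet assumes NumPy is imported as np; the size n = 500 and the seed are arbitrary choices made here, and the measured times will differ from machine to machine.

```python
import math
import time

import numpy as np

np.random.seed(42)  # arbitrary seed, only for repeatability
n = 500             # smaller than the 5,000 used in the text

arr = np.random.standard_normal((n, n))
rows = arr.tolist()  # the same numbers as nested Python lists

# time the vectorized sum over the ndarray
t0 = time.perf_counter()
np_total = arr.sum()
t_numpy = time.perf_counter() - t0

# time the pure Python sum over the nested lists
t0 = time.perf_counter()
py_total = sum(sum(row) for row in rows)
t_python = time.perf_counter() - t0

print(f"NumPy:  {t_numpy:.6f} s")
print(f"Python: {t_python:.6f} s")
# both totals agree up to floating-point rounding
print(math.isclose(np_total, py_total, rel_tol=1e-9, abs_tol=1e-6))
```

A single measurement like this is noisy; for anything beyond a rough impression, repeated runs (for example with the timeit module) give more reliable numbers.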

We observe the following:

Syntax
Although we use several approaches to compact the pure Python code, the NumPy version is even more compact and readable.

Performance
The generation of the numpy.ndarray object is roughly 20 times faster and the calculation of the sum is roughly 100 times faster than the respective operations in pure Python.
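The two factors follow from the wall times reported in the session output. A short sketch of the arithmetic, using only those quoted timings (the variable names are chosen here for illustration):

```python
# wall times quoted in the session output, in seconds
py_gen, np_gen = 36.4, 1.87      # matrix generation
py_sum, np_sum = 4.07, 34.6e-3   # sum of all elements

gen_speedup = py_gen / np_gen
sum_speedup = py_sum / np_sum

print(f"generation: {gen_speedup:.1f}x faster")
print(f"sum:        {sum_speedup:.1f}x faster")
```

The ratios come out at about 19.5 and about 118, which the text rounds to 20 and 100.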


USING NUMPY ARRAYS

The use of NumPy for array-based operations and algorithms generally results in compact, easily readable code and significant performance improvements over pure Python code.