Python for Finance: Analyze Big Financial Data

Memory Layout and Performance

NumPy allows the specification of a so-called dtype per ndarray object: for example,

np.int32 or f8. NumPy also allows us to choose from two different memory layouts when

initializing an ndarray object. Depending on the structure of the object, one layout can

have advantages compared to the other. This is illustrated in the following:

In [21]: import numpy as np

In [22]: np.zeros((3, 3), dtype=np.float64, order='C')
Out[22]: array([[ 0., 0., 0.],
                [ 0., 0., 0.],
                [ 0., 0., 0.]])
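
To make the dtype choice concrete, here is a minimal sketch comparing the string shorthand f8 with np.float64 and with np.int32. The variable names a and b are illustrative only and do not appear in the original session.

```python
import numpy as np

# 'f8' is the string shorthand for an 8-byte float, i.e. the same dtype as np.float64
a = np.zeros((3, 3), dtype='f8')
b = np.zeros((3, 3), dtype=np.int32)

print(a.dtype == np.float64)   # the two spellings refer to the same dtype
print(a.itemsize, b.itemsize)  # bytes per element: 8 versus 4
print(a.nbytes, b.nbytes)      # total bytes for the nine elements: 72 versus 36
```

The dtype fixes how many bytes each element occupies, which is why it matters for memory use independently of the layout.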

The way you initialize a NumPy ndarray object can have a significant influence on the

performance of operations on these arrays (once the arrays are large enough). In summary, the

initialization of an ndarray object (e.g., via np.zeros or np.array) takes as input:

shape

Either an int or a sequence of ints (as for np.zeros); np.array instead takes the data itself, for example a nested list or another numpy.ndarray

dtype (optional)

A numpy.dtype — these are NumPy-specific basic data types for numpy.ndarray

objects

order (optional)

The order in which to store elements in memory: C for C-like (i.e., row-wise) or F for

Fortran-like (i.e., column-wise)
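
The three inputs listed above can be tried out in a short sketch. The names v, zc, and zf are hypothetical, and the flags attribute is used here only to confirm which layout NumPy actually chose.

```python
import numpy as np

v = np.zeros(4)                                      # shape given as a single int -> 1-D array
zc = np.zeros((2, 3), dtype=np.float64, order='C')   # row-wise (C-like) layout
zf = np.zeros((2, 3), dtype=np.float64, order='F')   # column-wise (Fortran-like) layout

print(v.shape)
# flags reports whether the buffer is contiguous in C order and/or Fortran order
print(zc.flags['C_CONTIGUOUS'], zc.flags['F_CONTIGUOUS'])
print(zf.flags['C_CONTIGUOUS'], zf.flags['F_CONTIGUOUS'])
```

Both two-dimensional arrays compare equal element by element; only the arrangement of the elements in memory differs.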

Consider the C-like (i.e., row-wise) storage:

In [23]: c = np.array([[1., 1., 1.], [2., 2., 2.], [3., 3., 3.]], order='C')

In this case, the 1s, the 2s, and the 3s are stored next to each other. By contrast, consider

the Fortran-like (i.e., column-wise) storage:

In [24]: f = np.array([[1., 1., 1.], [2., 2., 2.], [3., 3., 3.]], order='F')

Now, the data is stored in such a way that 1, 2, and 3 are next to each other in each

column. Let’s see whether the memory layout makes a difference in some way when the

array is large:

In [25]: x = np.random.standard_normal((3, 1500000))
         C = np.array(x, order='C')
         F = np.array(x, order='F')
         x = 0.0
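
Before timing anything, it can help to inspect the two layouts on the small 3 x 3 example from earlier. The following is a minimal sketch, assuming 8-byte floats. The helper name data is illustrative, while strides and ravel(order='K') are standard ndarray features that expose the physical arrangement of the elements.

```python
import numpy as np

data = [[1., 1., 1.], [2., 2., 2.], [3., 3., 3.]]
c = np.array(data, order='C')
f = np.array(data, order='F')

# strides: bytes to step in memory to reach the next element along each axis
print(c.strides)   # (24, 8): moving along a row is the small step
print(f.strides)   # (8, 24): moving down a column is the small step

# order='K' reads the elements in the order they sit in memory
print(c.ravel(order='K'))   # 1s, then 2s, then 3s
print(f.ravel(order='K'))   # 1, 2, 3 repeated for each column

# logically, both arrays are the same matrix
print(np.array_equal(c, f))
```

An operation that walks through memory in small strides tends to make better use of the CPU cache, which is one plausible reason for layout-dependent timings on large arrays.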

Now let’s implement some standard operations on the C-like layout array. First, calculating

sums:

In [26]: %timeit C.sum(axis=0)
Out[26]: 100 loops, best of 3: 11.3 ms per loop

In [27]: %timeit C.sum(axis=1)
Out[27]: 100 loops, best of 3: 5.84 ms per loop

Summing over the first axis (axis=0) takes roughly twice as long as summing over the second axis (axis=1).
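
Outside of IPython, the same comparison can be sketched with the standard library's timeit module. This is only a rough sketch: the helper best_ms is hypothetical, the array is smaller than the one above to keep the run short, and the absolute numbers (even the ranking) may differ across machines and NumPy versions.

```python
import timeit
import numpy as np

x = np.random.standard_normal((3, 150000))
C = np.array(x, order='C')   # row-wise copy
F = np.array(x, order='F')   # column-wise copy


def best_ms(func, repeat=3, number=10):
    """Return the best per-call time in milliseconds over several repeats."""
    times = timeit.repeat(func, repeat=repeat, number=number)
    return min(times) / number * 1e3


for name, arr in (('C', C), ('F', F)):
    for axis in (0, 1):
        t = best_ms(lambda: arr.sum(axis=axis))
        print(f"{name}.sum(axis={axis}): {t:.3f} ms")

# the layout changes the speed, not the (numerically equal) results
print(np.allclose(C.sum(axis=0), F.sum(axis=0)))
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit reports in the session above.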