Python for Finance: Analyze Big Financial Data

Figure 12-3. Screenshot of workbook in Excel written with pandas

As a final use case for pandas and Excel, consider the reading and writing of larger amounts of data. Although this is not a fast operation, it might be useful in some circumstances. First, the sample data to be used:


In [58]: data = np.random.rand(20, 100000)
In [59]: data.nbytes
Out[59]: 16000000
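
The reported size can be checked by hand. The following is a minimal sketch, assuming the default float64 dtype that np.random.rand returns; the name expected_bytes is introduced here for illustration only:

```python
import numpy as np

# Same shape as the sample data in the text: 20 rows, 100,000 columns.
data = np.random.rand(20, 100000)

# np.random.rand returns float64 values, i.e., 8 bytes per element.
expected_bytes = data.shape[0] * data.shape[1] * data.itemsize

print(data.dtype)                     # float64
print(expected_bytes)                 # 16000000
print(data.nbytes == expected_bytes)  # True
```

In other words, 20 x 100,000 elements at 8 bytes each gives the 16,000,000 bytes shown above.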

Second, generate a DataFrame object out of the sample data:


In [60]: df = pd.DataFrame(data)

Third, write it to disk as an Excel file:


In [61]: %time df.to_excel(path + 'data.xlsx', 'data_sheet')
Out[61]: CPU times: user 1min 25s, sys: 460 ms, total: 1min 26s
         Wall time: 1min 25s

This takes quite a while. For comparison, see how fast native storage of the NumPy ndarray object is (on an SSD):


In [62]: %time np.save(path + 'data', data)
Out[62]: CPU times: user 8 ms, sys: 20 ms, total: 28 ms
         Wall time: 159 ms
In [63]: ll $path*
Out[63]: -rw------- 1 yhilpisch 7372 Sep 28 18:18 data/chart.xlsx
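
To see what np.save actually puts on disk, the array can be saved, measured, and loaded again. This is only a sketch under stated assumptions: a temporary directory stands in for the path variable used above, and the names file_name, overhead, and round_trip_ok are hypothetical helpers, not part of the original session:

```python
import os
import tempfile

import numpy as np

data = np.random.rand(20, 100000)

# Assumption: a temporary directory replaces the `path` used in the text.
with tempfile.TemporaryDirectory() as tmp_dir:
    file_name = os.path.join(tmp_dir, 'data.npy')
    np.save(file_name, data)                    # binary .npy format
    size_on_disk = os.path.getsize(file_name)   # file size in bytes
    loaded = np.load(file_name)                 # read fully into memory

# The file holds the raw array bytes plus a small header.
overhead = size_on_disk - data.nbytes
round_trip_ok = np.array_equal(data, loaded)

print(overhead)
print(round_trip_ok)
```

The file is only slightly larger than the array itself, since the .npy format stores the raw bytes plus a short header describing dtype and shape. This goes some way toward explaining why writing it is so much faster than generating an Excel workbook.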