Figure 12-3. Screenshot of workbook in Excel written with pandas
As a final use case for pandas and Excel, consider reading and writing larger
amounts of data. Although this is not a fast operation, it can be useful in some
circumstances. First, the sample data to be used:
In [58]: data = np.random.rand(20, 100000)
In [59]: data.nbytes
Out[59]: 16000000
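The byte count follows directly from the array's shape and data type: 20 x 100,000 elements at 8 bytes per float64 value. The following is a minimal sketch of that arithmetic; the names `sample`, `computed`, and `full_size_bytes` are illustrative and do not appear in the original session, and a smaller array stands in for the full one.

```python
import numpy as np

# Smaller stand-in for the 20 x 100000 array used in the text.
sample = np.random.rand(20, 1000)

# nbytes equals the number of elements times the bytes per element
# (np.random.rand returns float64 values, i.e., 8 bytes each).
computed = sample.size * sample.itemsize

# The same arithmetic for the full-size array, without allocating it.
full_size_bytes = 20 * 100000 * np.dtype(np.float64).itemsize

print(computed == sample.nbytes, full_size_bytes)  # True 16000000
```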
Second, generate a DataFrame object out of the sample data:
In [60]: df = pd.DataFrame(data)
Third, write it to disk as an Excel file:
In [61]: %time df.to_excel(path + 'data.xlsx', sheet_name='data_sheet')
Out[61]: CPU times: user 1min 25s, sys: 460 ms, total: 1min 26s
Wall time: 1min 25s
This takes quite a while. For comparison, see how fast native storage of the NumPy
ndarray object is (on an SSD):
In [62]: %time np.save(path + 'data', data)
Out[62]: CPU times: user 8 ms, sys: 20 ms, total: 28 ms
Wall time: 159 ms
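To try the native storage step outside an interactive session, something like the following sketch could be used. It assumes a temporary directory instead of the `path` variable from the session; the names `tmp_dir`, `target`, `save_seconds`, and `overhead_bytes` are illustrative. Timings depend on hardware and are not meant to reproduce the figures above.

```python
import os
import tempfile
import time

import numpy as np

data = np.random.rand(20, 100000)

with tempfile.TemporaryDirectory() as tmp_dir:
    # np.save appends the '.npy' suffix when the name lacks it.
    target = os.path.join(tmp_dir, 'data')

    start = time.perf_counter()
    np.save(target, data)
    save_seconds = time.perf_counter() - start

    # Size on disk: the raw array bytes plus a small .npy header.
    file_size = os.path.getsize(target + '.npy')

    # Read the array back into memory before the directory is removed.
    restored = np.load(target + '.npy')

roundtrip_ok = np.array_equal(data, restored)
overhead_bytes = file_size - data.nbytes

print('saved in %.4f s, header overhead %d bytes, round trip ok: %s'
      % (save_seconds, overhead_bytes, roundtrip_ok))
```

The file on disk is only marginally larger than `data.nbytes` because the binary format stores the raw array buffer plus a short header, which is one reason this route is so much faster than writing a spreadsheet.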
In [63]: ll $path*
Out[63]: -rw------- 1 yhilpisch 7372 Sep 28 18:18 data/chart.xlsx