Python for Finance: Analyze Big Financial Data

1 -0.788818 -0.985819  0.403796 -1.753784
2 -0.155881 -1.752672  1.037444 -0.400793
3 -0.777546  1.730278  0.417114  0.184079
4 -1.763660 -0.375469  0.098678 -1.553824
5 -1.134258  1.401821  1.227124  0.979389
6  0.458838 -0.143187  1.565701 -2.085863
7 -0.103058 -0.366170 -0.478036 -0.032810
8  1.040318 -0.128799  0.786187  0.414084

The column names provide an efficient mechanism to access data in the DataFrame object, again similar to structured arrays:

In [23]: df['No2'][3]  # value in column No2 at index position 3
Out[23]: 1.7302783624820191
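The same lookup can also be written with the explicit indexers. The following is a minimal sketch, not the session from the text: it assumes a freshly generated 9 x 4 DataFrame of random numbers (so the values differ from those shown above), and the names chained, by_label, and by_position are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# Illustrative data: nine rows, four columns of standard normal draws.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame(rng.standard_normal((9, 4)),
                  columns=['No1', 'No2', 'No3', 'No4'])

# Column first, then row label (the form used in the text).
chained = df['No2'][3]

# One-step label-based access: row label 3, column 'No2'.
by_label = df.loc[3, 'No2']

# One-step position-based access: fourth row, second column.
by_position = df.iloc[3, 1]

print(chained, by_label, by_position)
```

With the default integer index, all three expressions select the same element. Once the index is replaced by something other than integers, only iloc keeps accepting the plain position 3.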

To work with financial time series data efficiently, you must be able to handle time indices well. This can also be considered a major strength of pandas. For example, assume that our nine data entries in the four columns correspond to month-end data, beginning in January 2015. A DatetimeIndex object is then generated with date_range as follows:

In [24]: dates = pd.date_range('2015-1-1', periods=9, freq='M')
         dates
Out[24]: <class 'pandas.tseries.index.DatetimeIndex'>
         [2015-01-31, ..., 2015-09-30]
         Length: 9, Freq: M, Timezone: None
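As a hedged aside, recent pandas releases have started to replace the 'M' alias for month-end frequency with 'ME'. One way to sidestep the alias question is to pass an offset object instead of a string. The sketch below assumes a current pandas installation; the variable name month_ends is chosen here for illustration.

```python
import pandas as pd

# Nine month-end dates starting from January 2015.
# pd.offsets.MonthEnd() is the offset object behind the month-end alias.
month_ends = pd.date_range('2015-01-01', periods=9,
                           freq=pd.offsets.MonthEnd())

# The start date is not a month end, so the first entry is rolled
# forward to the last calendar day of January.
print(month_ends[0].date(), month_ends[-1].date())

# Every generated date should fall on the last day of its month.
print(month_ends.is_month_end.all())
```

The resulting index should span the same nine dates as the one shown above, from 2015-01-31 to 2015-09-30.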

Table 6-2 lists the parameters that the date_range function takes.

Table 6-2. Parameters of date_range function

Parameter   Format                 Description
start       string/datetime        left bound for generating dates
end         string/datetime        right bound for generating dates
periods     integer/None           number of periods (if start or end is None)
freq        string/DateOffset      frequency string, e.g., 5D for 5 days
tz          string/None            time zone name for localized index
normalize   bool, default False    normalize start and end to midnight
name        string, default None   name of resulting index
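To make the parameters more concrete, the following minimal sketch combines several of them. The dates, the 'UTC' time zone, and the index name 'trading_day' are illustrative assumptions, not values from the text.

```python
import pandas as pd

# start and end given: the number of periods follows from the bounds
# and the frequency string ('5D' = every five days).
by_bounds = pd.date_range(start='2015-01-01', end='2015-01-31', freq='5D')

# tz localizes the index to a time zone; name labels the index.
localized = pd.date_range(start='2015-01-01', periods=3, freq='D',
                          tz='UTC', name='trading_day')

# normalize=True resets the time-of-day of the start bound to midnight
# before the dates are generated.
normalized = pd.date_range(start='2015-01-01 14:30', periods=3, freq='D',
                           normalize=True)

print(by_bounds)
print(localized)
print(normalized)
```

Note that only three of start, end, periods, and freq need to be specified in any one call; supplying both bounds together with periods and freq would overdetermine the range.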

So far, we have only encountered indices composed of string and int objects. For time series data, however, a DatetimeIndex object generated with the date_range function is what is needed.

As with the columns, we assign the newly generated DatetimeIndex as the new Index object to the DataFrame object:

