Python for Finance: Analyze Big Financial Data

1 -0.788818 -0.985819 0.403796 -1.753784
2 -0.155881 -1.752672 1.037444 -0.400793
3 -0.777546 1.730278 0.417114 0.184079
4 -1.763660 -0.375469 0.098678 -1.553824
5 -1.134258 1.401821 1.227124 0.979389
6 0.458838 -0.143187 1.565701 -2.085863
7 -0.103058 -0.366170 -0.478036 -0.032810
8 1.040318 -0.128799 0.786187 0.414084

The column names provide an efficient mechanism to access data in the DataFrame object, again similar to structured arrays:

In [23]: df['No2'][3]  # value in column No2 at index position 3
Out[23]: 1.7302783624820191
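The lookup above can be reproduced on stand-in data. The following is a minimal sketch, assuming a DataFrame of nine rows and four columns named No1 to No4 (the names other than No2 are an assumption) filled with random numbers from an arbitrary seed, so the values will not match those printed in the text. It also shows two equivalent single-step lookups.

```python
import numpy as np
import pandas as pd

# Illustrative data: 9 rows, 4 columns of standard normal draws.
# The seed is arbitrary, so the values differ from those printed in the text.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame(rng.standard_normal((9, 4)),
                  columns=['No1', 'No2', 'No3', 'No4'])

chained = df['No2'][3]        # select the column, then the row labeled 3
by_label = df.loc[3, 'No2']   # same lookup in a single label-based step
by_position = df.iloc[3, 1]   # purely positional: fourth row, second column

print(chained, by_label, by_position)
```

With the default integer index, the row label 3 and the row position 3 coincide, which is why all three expressions return the same value here.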

To work with financial time series data efficiently, you must be able to handle time indices well. Handling them is one of the major strengths of pandas. For example, assume that our nine data entries in the four columns correspond to month-end data, beginning in January 2015. A DatetimeIndex object is then generated with date_range as follows:

In [24]: dates = pd.date_range('2015-1-1', periods=9, freq='M')
         dates
Out[24]: <class 'pandas.tseries.index.DatetimeIndex'>
         [2015-01-31, ..., 2015-09-30]
         Length: 9, Freq: M, Timezone: None
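As a hedged variant of the call above, the month-end frequency can also be passed as a DateOffset object rather than as a string. This sketch uses pd.offsets.MonthEnd() because recent pandas releases have renamed the 'M' alias, so the string form may warn or fail depending on the installed version. The object form is a substitution made here for portability, not the form used in the text.

```python
import pandas as pd

# Month-end dates starting in January 2015, nine periods in total.
# A DateOffset object is used instead of the alias string 'M' so the
# call does not depend on which alias spelling the installed pandas accepts.
dates = pd.date_range('2015-1-1', periods=9, freq=pd.offsets.MonthEnd())

print(dates[0], dates[-1], len(dates))
```

The start date 2015-1-1 is not itself a month end, so the first generated entry is rolled forward to 2015-01-31.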

Table 6-2 lists the parameters that the date_range function takes.

Table 6-2. Parameters of date_range function

Parameter    Format                 Description
start        string/datetime        left bound for generating dates
end          string/datetime        right bound for generating dates
periods      integer/None           number of periods (if start or end is None)
freq         string/DateOffset      frequency string, e.g., 5D for 5 days
tz           string/None            time zone name for localized index
normalize    bool, default False    normalize start and end to midnight
name         string, default None   name of resulting index
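The parameters in Table 6-2 can be combined in several ways. The following is a small sketch with arbitrary illustrative dates, the time zone 'UTC', and the index name 'date', none of which come from the text.

```python
import pandas as pd

# start and end with a frequency string: one entry every 5 days.
five_day = pd.date_range(start='2015-01-01', end='2015-01-31', freq='5D')

# start and periods, with a time zone and a name for the resulting index.
localized = pd.date_range(start='2015-01-01', periods=3, freq='D',
                          tz='UTC', name='date')

# normalize=True resets the time-of-day of the start bound to midnight.
normalized = pd.date_range(start='2015-01-01 15:30', periods=2, freq='D',
                           normalize=True)

print(len(five_day), localized.name, normalized[0])
```

Here five_day covers January 1 through January 31 in seven steps, and normalized begins at midnight even though the start bound carries a time of 15:30.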

So far, we have only encountered indices composed of string and int objects. For time series data, however, a DatetimeIndex object generated with the date_range function is of course what is needed.

As with the columns, we assign the newly generated DatetimeIndex as the new Index object to the DataFrame object:
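The assignment might look like the following minimal sketch. It assumes stand-in random data with columns No1 to No4 (names other than No2 are an assumption) and builds the month-end index with a MonthEnd offset object, a substitution for the 'M' alias that avoids version-specific alias spellings.

```python
import numpy as np
import pandas as pd

# Stand-in data; the values differ from those printed in the text.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame(rng.standard_normal((9, 4)),
                  columns=['No1', 'No2', 'No3', 'No4'])

# Nine month-end dates beginning in January 2015.
dates = pd.date_range('2015-1-1', periods=9, freq=pd.offsets.MonthEnd())

# Replace the default integer index with the DatetimeIndex.
df.index = dates

# Rows can now be addressed by date; April 2015 is the fourth row.
april_no2 = df.loc[pd.Timestamp('2015-04-30'), 'No2']
print(df.index[0], april_no2)
```

The assignment requires the index to have exactly as many entries as the DataFrame has rows, nine in this case.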
