Python for Finance: Analyze Big Financial Data


High-Frequency Data


By now, you should have a feeling for the strengths of pandas when it comes to financial time series data. One aspect in this regard has become prevalent in the financial analytics sphere and represents quite a high burden for some market players: high-frequency data. This brief section illustrates how to cope with tick data instead of daily financial data. To begin with, a couple of imports:


In [86]: import numpy as np
         import pandas as pd
         import datetime as dt
         from urllib.request import urlretrieve
         %matplotlib inline

The Norwegian online broker Netfonds provides tick data for a multitude of stocks, in particular for American names. The web-based API has basically the following format:


In [87]: url1 = 'http://hopey.netfonds.no/posdump.php?'
         url2 = 'date=%s%s%s&paper=AAPL.O&csv_format=csv'
         url = url1 + url2
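The three `%s` placeholders in the template are filled with year, month, and day strings through Python's `%` operator. A minimal sketch of that step follows; the helper name `build_url` is hypothetical and introduced only for illustration, and no request is sent:

```python
# URL template as given in the text
URL_TEMPLATE = ('http://hopey.netfonds.no/posdump.php?'
                'date=%s%s%s&paper=AAPL.O&csv_format=csv')


def build_url(year, month, day):
    """Fill the date placeholders (hypothetical helper, not part of any API)."""
    return URL_TEMPLATE % (year, month, day)


example_url = build_url('2014', '09', '22')
print(example_url)
```

Passing the date parts as zero-padded strings keeps the `date=` parameter in the `YYYYMMDD` form that the template implies.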

We want to download, combine, and analyze a week's worth of tick data for the Apple Inc. stock, a quite actively traded name. Let us start with the dates of interest:[27]

In [88]: year = '2014'
         month = '09'
         days = ['22', '23', '24', '25']
           # dates might need to be updated
In [89]: AAPL = pd.concat([pd.read_csv(url % (year, month, day),
                                       index_col=0, header=0,
                                       parse_dates=True)
                           for day in days])
           # one DataFrame per day, concatenated in date order
         AAPL.columns = ['bid', 'bdepth', 'bdeptht',
                         'offer', 'odepth', 'odeptht']
           # shorter column names
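The combination step can be tried without network access. The following sketch uses made-up numbers in place of the downloaded files and assumes `pd.concat` as the combining function; only the six column names are taken from the text, everything else is illustrative:

```python
import numpy as np
import pandas as pd

columns = ['bid', 'bdepth', 'bdeptht', 'offer', 'odepth', 'odeptht']
rng = np.random.default_rng(0)

daily_frames = []
for day in ['2014-09-22', '2014-09-23']:
    # five synthetic ticks per day, one second apart (made-up data)
    index = pd.date_range(day + ' 10:00:01', periods=5, freq='s')
    data = rng.random((5, len(columns)))
    daily_frames.append(pd.DataFrame(data, index=index, columns=columns))

# stack the per-day frames into a single time-indexed DataFrame
ticks = pd.concat(daily_frames)
print(ticks.shape)
```

Collecting the frames in a list and concatenating once avoids copying the growing data set on every loop iteration.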

The data set now consists of almost 100,000 rows:


In [90]: AAPL.info()
Out[90]: <class 'pandas.core.frame.DataFrame'>
         DatetimeIndex: 95871 entries, 2014-09-22 10:00:01 to 2014-09-25 22:19:25
         Data columns (total 6 columns):
         bid        95871 non-null float64
         bdepth     95871 non-null float64
         bdeptht    95871 non-null float64
         offer      95871 non-null float64
         odepth     95871 non-null float64
         odeptht    95871 non-null float64
         dtypes: float64(6)

Figure 6-11 shows the bid column graphically. One can identify a number of periods without any trading activity, i.e., times when the markets are closed:


In [91]: AAPL['bid'].plot()
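Periods without quotes can also be located numerically rather than read off the plot. The sketch below measures the time between consecutive ticks on made-up timestamps and flags long pauses; the one-hour threshold is an arbitrary assumption, not a value from the text:

```python
import pandas as pd

# made-up timestamps: two ticks per session, sessions a day apart
index = pd.to_datetime(['2014-09-22 21:59:58', '2014-09-22 21:59:59',
                        '2014-09-23 10:00:01', '2014-09-23 10:00:02'])
bid = pd.Series([100.0, 100.1, 100.2, 100.3], index=index)

# time elapsed since the previous tick (NaT for the first tick)
elapsed = bid.index.to_series().diff()

# flag gaps longer than one hour (threshold is an arbitrary assumption)
gaps = elapsed[elapsed > pd.Timedelta(hours=1)]
print(gaps)
```

Each entry in `gaps` is indexed by the first tick after a pause, so the index marks when quoting resumed.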

Over the course of a single trading day when markets are open, there is usually a high level of activity. Figure 6-12 shows the trading activity for the first day in the sample and the first three hours of the second. Times are given in the Norwegian time zone, and you can easily see when pre-trading starts, when US stock markets are open, and when they close:


In [92]: to_plot = AAPL[['bid', 'bdeptht']][
             (AAPL.index > dt.datetime(2014, 9, 22, 0, 0))
              & (AAPL.index < dt.datetime(2014, 9, 23, 2, 59))]
           # adjust dates to given data set
         to_plot.plot(subplots=True, style='b', figsize=(8, 5))
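Because the timestamps are in Norwegian local time, it can help to express a selected window in US Eastern time before reading off market hours. The following sketch applies the same kind of boolean mask to made-up quotes and then converts the index; the zone names `Europe/Oslo` and `America/New_York` are assumptions about which zones apply:

```python
import datetime as dt
import pandas as pd

# made-up bid quotes stamped in Norwegian local time (naive timestamps)
index = pd.to_datetime(['2014-09-22 15:30:00', '2014-09-22 22:00:00',
                        '2014-09-23 15:30:00'])
quotes = pd.DataFrame({'bid': [100.0, 101.0, 102.0]}, index=index)

# same kind of boolean mask as in the text: keep the first day only
first_day = quotes[(quotes.index > dt.datetime(2014, 9, 22, 0, 0))
                   & (quotes.index < dt.datetime(2014, 9, 23, 2, 59))]

# attach the Oslo time zone, then express the stamps in New York time
first_day_ny = (first_day.tz_localize('Europe/Oslo')
                         .tz_convert('America/New_York'))
print(first_day_ny)
```

With this conversion, 15:30 and 22:00 in Oslo on that date correspond to 09:30 and 16:00 in New York.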