Figure 6-11. Apple stock tick data for a week
Figure 6-12. Apple stock tick data and volume for a trading day
Financial tick data series usually come with a DatetimeIndex that is highly irregular. In
other words, the time intervals between two observation points are highly heterogeneous.
Against this background, resampling such data sets is often useful, or even necessary,
depending on the task at hand. pandas provides the resample method on the DataFrame object
for this purpose. In what follows, we simply take the mean over each resampling interval;
this is a sensible choice for some columns (e.g., “bid”) but not for others (e.g., “bdepth”):
In [93]: AAPL_resam = AAPL.resample(rule='5min', how='mean')
         np.round(AAPL_resam.head(), 2)
Out[93]:                         bid  bdepth  bdeptht   offer  odepth  odeptht
         2014-09-22 10:00:00  100.49  366.67   366.67  100.95     200      200
         2014-09-22 10:05:00  100.49  100.00   100.00  100.84     200      200
         2014-09-22 10:10:00  100.54  150.00   150.00  100.74     100      100
         2014-09-22 10:15:00  100.59  200.00   200.00  100.75    1500     1500
         2014-09-22 10:20:00  100.50  100.00   100.00  100.75    1500     1500
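The how='mean' keyword reflects the pandas version used for this session; more recent pandas releases expect the aggregation to be chained onto the resampler instead. The following is a minimal sketch of that style, using a small synthetic tick sample (the values are made up and are not the AAPL data). It also illustrates the point about column-specific aggregation: the choice of "last" for the depth column is only an assumption for illustration, and the appropriate function depends on the task at hand.

```python
import pandas as pd

# Synthetic, irregularly spaced quotes (illustrative values, not the AAPL data).
offsets_s = [5, 40, 170, 310, 330, 900, 1000, 1190]  # seconds after 10:00:00
index = pd.Timestamp("2014-09-22 10:00:00") + pd.to_timedelta(offsets_s, unit="s")
ticks = pd.DataFrame(
    {
        "bid": [100.50, 100.48, 100.49, 100.52, 100.54, 100.60, 100.58, 100.59],
        "bdepth": [100, 500, 200, 100, 100, 300, 200, 200],
    },
    index=index,
)

# Resample to regular 5-minute bins with one aggregation per column:
# the mean for the price, the most recent value for the depth (an assumption).
resampled = ticks.resample("5min").agg({"bid": "mean", "bdepth": "last"})

print(resampled.round(2))
```

No tick falls into the 10:10 bin of this sample, so that row contains NaN in both columns; intervals like this are the ones that need to be filled before plotting.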
The plot of the resampled bid series in Figure 6-13 looks a bit smoother. Here, we have
also filled empty time intervals with the most recent available values (before the empty
time interval):
In [94]: AAPL_resam['bid'].fillna(method='ffill').plot()
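To make the forward fill concrete, here is a small sketch on a hypothetical resampled bid series with two empty intervals (the numbers are illustrative only). It uses the ffill method, which performs the same forward fill as fillna(method='ffill'); recent pandas releases flag the method keyword of fillna as deprecated, so the method form is probably the safer spelling there.

```python
import numpy as np
import pandas as pd

# Hypothetical 5-minute bid series; NaN marks intervals without any ticks.
index = pd.date_range("2014-09-22 10:00", periods=5, freq="5min")
bid = pd.Series([100.49, np.nan, np.nan, 100.59, 100.50], index=index, name="bid")

# Carry the most recent available value forward into the empty intervals.
filled = bid.ffill()

print(filled)
```

Both empty intervals take over the 10:00 value. An empty interval at the very start of a series would remain NaN, since there is no earlier observation to carry forward.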