In [30]: plt.figure(figsize=(8, 4))
         plt.scatter(x, y, c=y, marker='v')
         plt.colorbar()
         plt.grid(True)
         plt.xlabel('x')
         plt.ylabel('y')
         for i in range(len(trace)):
             plt.plot(x, trace['alpha'][i] + trace['beta'][i] * x)
Figure 11-21. Sample data and regression lines from Bayesian regression
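The loop above draws one line per sample in the trace. As a minimal sketch of the same computation without a running PyMC3 model, the following uses NumPy arrays as stand-ins for trace['alpha'] and trace['beta']. The sample values, the number of draws, and the thinning step are assumptions made for illustration only; they are not output of the model estimated in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)

# Hypothetical stand-ins for trace['alpha'] and trace['beta']:
# 1,000 draws each, NOT actual PyMC3 output.
alpha_samples = rng.normal(4.0, 0.1, size=1000)
beta_samples = rng.normal(2.0, 0.05, size=1000)

# One regression line per draw via broadcasting; the result has
# shape (number of draws, len(x)), and row i equals
# alpha_samples[i] + beta_samples[i] * x, as in the plotting loop.
lines = alpha_samples[:, None] + beta_samples[:, None] * x

# Keep every 10th line only, so that a plot stays readable.
thinned = lines[::10]

# Pointwise average over all sampled regression lines.
mean_line = lines.mean(axis=0)
```

Each row of thinned can be passed to plt.plot(x, row) in place of the expression used in the loop; plotting a thinned subset is a presentation choice and does not change the estimation itself.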
Real Data
Having seen Bayesian regression with PyMC3 in action with dummy data, we now move on
to real market data. In this context, we introduce yet another Python library: zipline,
a Pythonic, open source algorithmic trading library that powers the community
backtesting platform Quantopian.
It must also be installed separately, e.g., by using pip:
$ pip install zipline
After installation, import zipline as well as pytz and datetime as follows:
In [31]: import warnings
         warnings.simplefilter('ignore')
         import zipline
         import pytz
         import datetime as dt
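pytz and datetime are imported because the data loading call further below takes a timezone-aware end date. The following minimal sketch, which uses only these two libraries, shows two equivalent ways of building such a date in UTC. The variable names are chosen for illustration only.

```python
import datetime as dt
import pytz

# Timezone-aware datetime: pytz.utc is passed as the tzinfo argument
# (year, month, day, hour, minute, second, microsecond, tzinfo).
end = dt.datetime(2014, 3, 15, 0, 0, 0, 0, pytz.utc)

# Equivalent construction: attach UTC to a naive datetime.
same_end = pytz.utc.localize(dt.datetime(2014, 3, 15))

# A naive datetime carries no timezone information at all.
naive_end = dt.datetime(2014, 3, 15)
```

Passing pytz.utc directly as tzinfo is safe for UTC; for time zones with daylight saving time, pytz recommends the localize method instead.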
Similar to pandas, zipline provides a convenience function to load financial data from
different sources. Under the hood, zipline also uses pandas.
The example we use is a “classical” pair trading strategy based on gold and the stocks of
gold mining companies. These are represented by ETFs with the following symbols,
respectively:
GLD
GDX
We can load the data using zipline as follows:
In [32]: data = zipline.data.load_from_yahoo(stocks=['GLD', 'GDX'],
                  end=dt.datetime(2014, 3, 15, 0, 0, 0, 0, pytz.utc)).dropna()
         data.info()