Python for Finance: Analyze Big Financial Data

In [30]: plt.figure(figsize=(8, 4))
         plt.scatter(x, y, c=y, marker='v')
         plt.colorbar()
         plt.grid(True)
         plt.xlabel('x')
         plt.ylabel('y')
         for i in range(len(trace)):
             plt.plot(x, trace['alpha'][i] + trace['beta'][i] * x)

Figure 11-21. Sample data and regression lines from Bayesian regression
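
The loop in the listing draws one regression line per posterior sample. As a minimal sketch of that idea, the following uses only the standard library; the trace dictionary is a hypothetical stand-in for the PyMC3 trace object, its values are made up, and regression_lines is a helper name introduced here for illustration:

```python
import random

random.seed(0)

# Hypothetical stand-in for a PyMC3 trace: one alpha and one beta per sample.
# The numbers are invented for illustration, not taken from a fitted model.
trace = {
    "alpha": [random.gauss(4.0, 0.1) for _ in range(5)],
    "beta": [random.gauss(2.0, 0.05) for _ in range(5)],
}

# Evaluation grid from 0.0 to 1.0 in steps of 0.1.
x = [i / 10 for i in range(11)]


def regression_lines(trace, x):
    """Return one list of fitted values alpha + beta * x per posterior sample."""
    return [
        [a + b * xi for xi in x]
        for a, b in zip(trace["alpha"], trace["beta"])
    ]


lines = regression_lines(trace, x)
print(len(lines), "lines with", len(lines[0]), "points each")
```

Each inner list corresponds to one call to plt.plot in the listing; the spread across the lines gives a visual impression of the uncertainty in the estimated parameters.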

Real Data


Having seen Bayesian regression with PyMC3 in action with dummy data, we now move on to real market data. In this context, we introduce yet another Python library: zipline (cf. https://github.com/quantopian/zipline and https://pypi.python.org/pypi/zipline). zipline is a Pythonic, open source algorithmic trading library that powers the community backtesting platform Quantopian.


It must also be installed separately, e.g., by using pip:


$ pip install zipline

After installation, import zipline as well as pytz and datetime as follows:


In [31]: import warnings
         warnings.simplefilter('ignore')
         import zipline
         import pytz
         import datetime as dt

Similar to pandas, zipline provides a convenience function to load financial data from different sources. Under the hood, zipline also uses pandas.


The example we use is a “classical” pair trading strategy, namely with gold and stocks of gold mining companies. These are represented by ETFs with the following symbols, respectively:


GLD

GDX
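
A pair trading strategy relies on a stable relationship between the two instruments. As a rough sketch of how such a relationship can be quantified, the following fits an ordinary least-squares line of one price series on the other and computes the residual spread. Note the assumptions: this is plain least squares rather than the Bayesian approach of this section, the prices are made up and are not real GLD or GDX quotes, and ols_fit is a helper name introduced here for illustration:

```python
def ols_fit(x, y):
    """Return (alpha, beta) of the least-squares line y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


# Made-up prices for illustration only (NOT real market data).
gld = [120.0, 121.0, 122.0, 123.0, 124.0]
gdx = [24.1, 24.4, 25.0, 25.6, 25.9]

alpha, beta = ols_fit(gld, gdx)

# The spread is the part of the second series the linear relationship
# does not explain; a pair strategy would watch it for large deviations.
spread = [g2 - (alpha + beta * g1) for g1, g2 in zip(gld, gdx)]
print(round(alpha, 2), round(beta, 2))
```

By construction, least-squares residuals sum to zero, so the spread fluctuates around zero as long as the fitted relationship holds.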

We can load the data using zipline as follows:


In [32]: data = zipline.data.load_from_yahoo(stocks=['GLD', 'GDX'],
                  end=dt.datetime(2014, 3, 15, 0, 0, 0, 0, pytz.utc)).dropna()
         data.info()
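
Two details of the call above may be worth spelling out: the end argument is a timezone-aware datetime built from positional arguments, and dropna() discards rows with missing values. The following sketch illustrates both with the standard library only. It assumes that datetime.timezone.utc is an acceptable stand-in for pytz.utc here; the rows are made-up values, and drop_missing is a hypothetical helper that only mimics what dropna() does on a pandas DataFrame:

```python
import datetime as dt
import math

# Positional arguments: year, month, day, hour, minute, second,
# microsecond, tzinfo. Standard-library UTC stands in for pytz.utc.
end = dt.datetime(2014, 3, 15, 0, 0, 0, 0, dt.timezone.utc)

# Equivalent, more readable keyword form.
end_kw = dt.datetime(2014, 3, 15, tzinfo=dt.timezone.utc)

# Hypothetical rows with made-up prices; NaN marks a missing quote.
rows = [
    {"GLD": 130.0, "GDX": 25.0},
    {"GLD": float("nan"), "GDX": 25.5},
    {"GLD": 131.0, "GDX": 26.0},
]


def drop_missing(rows):
    """Keep only rows in which no value is NaN."""
    return [r for r in rows if not any(math.isnan(v) for v in r.values())]


clean = drop_missing(rows)
print(end.isoformat(), len(clean))
```

Dropping incomplete rows ensures that both series cover the same dates, which matters when one series is regressed on the other.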