B2 is twice as high as for the same event happening with box B1. Therefore, having drawn
a black ball, the updated probability for hypothesis H2 is two times as high as the
updated probability for hypothesis H1.
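To make the update explicit, the following short calculation applies Bayes' formula. The box compositions used here are illustrative assumptions only (they are not given in the text), chosen so that the likelihood of drawing a black ball from B2 is twice that from B1:
# Bayes update for the two-box example; the likelihood values are assumed
# for illustration (any pair with p_black_h2 == 2 * p_black_h1 works)
p_h1, p_h2 = 0.5, 0.5        # equal prior probabilities for the two boxes
p_black_h1 = 0.2             # assumed P(black ball | H1)
p_black_h2 = 0.4             # assumed P(black ball | H2), twice as high

p_black = p_black_h1 * p_h1 + p_black_h2 * p_h2   # total probability of black
post_h1 = p_black_h1 * p_h1 / p_black             # P(H1 | black) = 1/3
post_h2 = p_black_h2 * p_h2 / p_black             # P(H2 | black) = 2/3
print(post_h2 / post_h1)                          # ratio of 2, as argued above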
PyMC3
With PyMC3, the Python ecosystem provides a powerful and performant library to
implement Bayesian statistics. PyMC3 is (at the time of this writing) not part of
the Anaconda distribution recommended in Chapter 2. On a Linux or Mac OS X operating
system, the installation mainly comprises the following steps.
First, you need to install the Theano compiler package required by PyMC3:
$ git clone git://github.com/Theano/Theano.git
$ cd Theano
$ sudo python setup.py install
On a Mac OS X system you might need to add the following line to your .bash_profile
file (to be found in your home directory):
export DYLD_FALLBACK_LIBRARY_PATH= \
$DYLD_FALLBACK_LIBRARY_PATH:/Library/anaconda/lib:
Once Theano is installed, the installation of PyMC3 is straightforward:
$ git clone https://github.com/pymc-devs/pymc.git
$ cd pymc
$ sudo python setup.py install
If successful, you should be able to import the library named pymc as usual:
In [22]: import warnings
warnings.simplefilter('ignore')
import pymc as pm
import numpy as np
np.random.seed(1000)
import matplotlib.pyplot as plt
%matplotlib inline
PYMC3
PyMC3 is already a powerful library at the time of this writing. However, it is still in its early stages, so you should
expect further enhancements, changes to the API, etc. Make sure to stay up to date by regularly checking the
website when using PyMC3.
Introductory Example
Consider now an example where we have noisy data around a straight line:
In [23]: x = np.linspace(0, 10, 500)
y = 4 + 2 * x + np.random.standard_normal(len(x)) * 2
As a benchmark, consider first an ordinary least-squares regression given the noisy data,
using NumPy’s polyfit function (cf. Chapter 9). The regression is implemented as follows:
In [24]: reg = np.polyfit(x, y, 1)
# linear regression
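As a side note (not part of the original example): np.polyfit returns the coefficients with the highest degree first, so reg[0] is the estimated slope and reg[1] the estimated intercept; np.polyval can be used to evaluate the fitted line:
# reg = [slope, intercept] for a degree-1 fit (highest degree first)
y_fit = np.polyval(reg, x)   # fitted values, equivalent to reg[1] + reg[0] * x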
Figure 11-19 shows the data and the regression line graphically:
In [25]: plt.figure(figsize=(8, 4))
plt.scatter(x, y, c=y, marker='v')
plt.plot(x, reg[1] + reg[0] * x, lw=2.0)
plt.colorbar()
plt.grid(True)
plt.xlabel('x')