
        ax.set_zlabel('f(x, y)')
        fig.colorbar(surf, shrink=0.5, aspect=5)

Figure 9-9. Function with two parameters

To get good regression results we compile a set of basis functions, including both a sin and a sqrt function, which leverages our knowledge of the example function:


In [33]: matrix = np.zeros((len(x), 6 + 1))
         matrix[:, 6] = np.sqrt(y)
         matrix[:, 5] = np.sin(x)
         matrix[:, 4] = y ** 2
         matrix[:, 3] = x ** 2
         matrix[:, 2] = y
         matrix[:, 1] = x
         matrix[:, 0] = 1
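
For reference, the arrays x and y and the example function fm used here carry over from the 3D-plot example above; their exact definitions appear earlier in the chapter. A minimal sketch of one consistent setup, reconstructed as an assumption from the fitted coefficients shown further below (0.25 for x, 0.05 for y ** 2, 1.0 for sin(x) and sqrt(y)), might look as follows:

import numpy as np

def fm(p):
    # assumed example function, not the author's verbatim definition
    x, y = p
    return np.sin(x) + 0.25 * x + np.sqrt(y) + 0.05 * y ** 2

# hypothetical sample grid, flattened to 1D coordinate arrays
x = np.linspace(0, 10, 20)
y = np.linspace(0, 10, 20)
X, Y = np.meshgrid(x, y)
x, y = X.flatten(), Y.flatten()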

The statsmodels library offers the quite general and helpful function OLS for least-squares regression both in one dimension and multiple dimensions:

In [34]: import statsmodels.api as sm

In [35]: model = sm.OLS(fm((x, y)), matrix).fit()
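
As a plausibility check (not part of the original listing), the same coefficients can also be recovered with NumPy's general least-squares solver; a sketch, assuming matrix, x, y, and fm as defined above:

# equivalent fit via NumPy's lstsq; a_np should agree with
# model.params up to numerical noise
a_np, residuals, rank, sv = np.linalg.lstsq(matrix, fm((x, y)), rcond=None)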

One advantage of using the OLS function is that it provides a wealth of additional information about the regression and its quality. A summary of the results is accessed by calling model.summary(). Single statistics, like the coefficient of determination, can in general also be accessed directly:


In [36]: model.rsquared
Out[36]: 1.0
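
Other single statistics can be read off the results object in the same way; a brief sketch using attributes from the statsmodels regression-results API:

print(model.summary())   # full regression report
model.rsquared_adj       # adjusted coefficient of determination
model.fvalue             # F-statistic of the regression
model.bse                # standard errors of the parameter estimates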

For our purposes, we of course need the optimal regression parameters, which are stored in the params attribute of our model object:


In [37]: a = model.params
         a
Out[37]: array([  7.14706072e-15,   2.50000000e-01,  -2.22044605e-16,
                 -1.02348685e-16,   5.00000000e-02,   1.00000000e+00,
                  1.00000000e+00])

The function reg_func gives back, for the given optimal regression parameters and the independent data points, the function values for the regression function:


In [38]: def reg_func(a, (x, y)):
             f6 = a[6] * np.sqrt(y)
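The listing breaks off at this point and uses Python 2 tuple-parameter unpacking in the function signature, which Python 3 no longer allows. A sketch of what the complete function presumably evaluates, written in Python 3 style and following the column layout of matrix above:

def reg_func(a, xy):
    x, y = xy
    # linear combination of the basis functions, weighted by the
    # fitted parameters, in the same order used to build matrix
    f6 = a[6] * np.sqrt(y)
    f5 = a[5] * np.sin(x)
    f4 = a[4] * y ** 2
    f3 = a[3] * x ** 2
    f2 = a[2] * y
    f1 = a[1] * x
    f0 = a[0] * 1
    return f6 + f5 + f4 + f3 + f2 + f1 + f0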