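The plotting code below relies on the example function f, the sample points x, and the regression values ry defined earlier in the chapter. As a minimal, self-contained sketch, the setup could look as follows; the definitions of f, x, and the order 7 monomial fit are assumptions based on the chapter's running example:

import numpy as np
import matplotlib.pyplot as plt

# example function to be approximated (assumption: the chapter's running example)
def f(x):
    return np.sin(x) + 0.5 * x

# sample points and a monomial regression of order 7 (cf. Figure 9-4)
x = np.linspace(-2 * np.pi, 2 * np.pi, 50)
reg = np.polyfit(x, f(x), deg=7)
ry = np.polyval(reg, x)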
In [10]: plt.plot(x, f(x), 'b', label='f(x)')
         plt.plot(x, ry, 'r.', label='regression')
         plt.legend(loc=0)
         plt.grid(True)
         plt.xlabel('x')
         plt.ylabel('f(x)')

Figure 9-4. Regression with monomials up to order 7

A brief check reveals that the result is not perfect:


In [11]: np.allclose(f(x), ry)
Out[11]: False

However, the mean squared error (MSE) is not too large, at least over this narrow range of x values:


In [12]: np.sum((f(x) - ry) ** 2) / len(x)
Out[12]: 0.0017769134759517413
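Dividing the sum of squared deviations by the number of data points is just an average, so the same MSE can also be computed with np.mean:

np.mean((f(x) - ry) ** 2)  # equivalent to the calculation in In [12]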

Individual basis functions


In general, you can reach better regression results when you can choose better sets of basis functions, e.g., by exploiting knowledge about the function to approximate. In this case, the individual basis functions have to be defined via a matrix approach (i.e., using a NumPy ndarray object). First, the case with monomials up to order 3:


In [13]: matrix = np.zeros((3 + 1, len(x)))
         matrix[3, :] = x ** 3  # basis function x**3
         matrix[2, :] = x ** 2  # basis function x**2
         matrix[1, :] = x       # basis function x
         matrix[0, :] = 1       # constant basis function
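As an aside, NumPy can generate such a monomial design matrix in a single call. With increasing=True, np.vander returns the columns 1, x, x ** 2, x ** 3, i.e., the transpose of the matrix object constructed above:

# equivalent design matrix of shape (len(x), 4); columns are 1, x, x**2, x**3
A = np.vander(x, N=4, increasing=True)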

The sublibrary numpy.linalg provides the function lstsq to solve least-squares optimization problems like the one in Equation 9-1:


In [14]: reg = np.linalg.lstsq(matrix.T, f(x))[0]
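In matrix notation, the problem this call solves is the standard least-squares minimization over the parameter vector \(\alpha\):

\min_{\alpha}\ \left\lVert \mathtt{matrix}^{\mathsf{T}}\,\alpha - f(\mathbf{x}) \right\rVert_2^2

Note also that more recent NumPy versions emit a warning unless the rcond parameter is passed explicitly; np.linalg.lstsq(matrix.T, f(x), rcond=None)[0] is the forward-compatible form of the call.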

Applying lstsq to our problem in this way yields the optimal parameters for the individual basis functions:


In [15]: reg
Out[15]: array([  1.13968447e-14,   5.62777448e-01,  -8.88178420e-16,
                 -5.43553615e-03])

Note that the parameters for the basis functions 1 and x ** 2 vanish to machine precision, as one would expect for an odd target function like the one sketched above. To get the regression estimates, we apply the dot function to the reg and matrix arrays; np.dot(a, b) simply gives the dot product of the two arrays a and b. Figure 9-5 shows the result:


In [16]: ry = np.dot(reg, matrix)
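As a quick sanity check (a sketch, assuming the objects defined above), the same estimates can be obtained with the @ matrix-multiplication operator, and the MSE of the improved fit can be recomputed as in In [12]:

ry_check = reg @ matrix                   # identical to np.dot(reg, matrix)
mse = np.sum((f(x) - ry) ** 2) / len(x)   # MSE of the fit with individual basis functions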