Python for Finance: Analyze Big Financial Data

In [21]: np.sum((f(x) - ry) ** 2) / len(x)
Out[21]: 2.2749084503102031e-31

In fact, the minimization routine recovers the correct parameters of 1 for the sin part and 0.5 for the linear part:


In [22]: reg
Out[22]: array([ 1.55428020e-16, 5.00000000e-01, 0.00000000e+00,
                 1.00000000e+00])
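
The parameter recovery described above can be reproduced with a minimal, self-contained sketch. It assumes that the example function is f(x) = sin(x) + 0.5x and that the basis functions are 1, x, x ** 2, and sin(x); both are assumptions inferred from the parameter values shown here, and the names matrix and coeffs are illustrative only.

```python
import numpy as np


def f(x):
    # Assumed form of the example function: a sin part plus a linear part.
    return np.sin(x) + 0.5 * x


x = np.linspace(-2 * np.pi, 2 * np.pi, 50)

# One column per basis function: constant, x, x ** 2, sin(x) (assumed choice).
matrix = np.column_stack([np.ones_like(x), x, x ** 2, np.sin(x)])

# Least-squares coefficients, one per basis function.
coeffs = np.linalg.lstsq(matrix, f(x), rcond=None)[0]

# Regression values and mean squared error against the function values.
ry = np.dot(matrix, coeffs)
mse = np.mean((f(x) - ry) ** 2)

print(coeffs)
print(mse)
```

Because the assumed function lies exactly in the span of the basis functions, the coefficients for x and sin(x) should come out at 0.5 and 1 up to floating-point error, and the remaining two close to zero.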

Noisy data


Regression can cope equally well with noisy data, be it data from simulation or from (imperfect) measurements. To illustrate this point, let us generate observations in which both the independent variable and the dependent variable are subject to noise:


In [23]: xn = np.linspace(-2 * np.pi, 2 * np.pi, 50)
         xn = xn + 0.15 * np.random.standard_normal(len(xn))
         yn = f(xn) + 0.25 * np.random.standard_normal(len(xn))

The regression itself is performed in the same way as before:


In [24]: reg = np.polyfit(xn, yn, 7)
         ry = np.polyval(reg, xn)

Figure 9-7 reveals that the regression results are closer to the original function than the noisy data points. In a sense, the regression averages out the noise to some extent:


In [25]: plt.plot(xn, yn, 'b^', label='f(x)')
         plt.plot(xn, ry, 'ro', label='regression')
         plt.legend(loc=0)
         plt.grid(True)
         plt.xlabel('x')
         plt.ylabel('f(x)')

Figure 9-7. Regression with noisy data
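
The averaging effect can also be checked numerically rather than visually. The following sketch assumes f(x) = sin(x) + 0.5x for the example function and uses a fixed seed (an arbitrary choice, for reproducibility); the names mse_noise and mse_reg are illustrative. It compares how far the noisy observations and the regression values each lie from the noise-free function values.

```python
import numpy as np


def f(x):
    # Assumed form of the example function.
    return np.sin(x) + 0.5 * x


rng = np.random.RandomState(0)  # fixed seed, chosen arbitrarily

# Noisy independent and dependent observations, as in the text.
xn = np.linspace(-2 * np.pi, 2 * np.pi, 50)
xn = xn + 0.15 * rng.standard_normal(len(xn))
yn = f(xn) + 0.25 * rng.standard_normal(len(xn))

# Polynomial regression of degree 7 on the noisy data.
reg = np.polyfit(xn, yn, 7)
ry = np.polyval(reg, xn)

# Mean squared deviation from the noise-free function values.
mse_noise = np.mean((yn - f(xn)) ** 2)  # noisy observations
mse_reg = np.mean((ry - f(xn)) ** 2)    # regression values

print(mse_noise)
print(mse_reg)
```

If the regression averages out part of the noise, mse_reg should come out smaller than mse_noise.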

Unsorted data


Another important aspect of regression is that the approach also works seamlessly with unsorted data. The previous examples all rely on sorted x data. This does not have to be the case. To make the point, let us randomize the independent data points as follows:


In [26]: xu = np.random.rand(50) * 4 * np.pi - 2 * np.pi
         yu = f(xu)

In this case, you can hardly identify any structure by just visually inspecting the raw data:


In [27]: print(xu[:10].round(2))
         print(yu[:10].round(2))
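
That the ordering of the data points is irrelevant for the regression can be sketched as follows. The example assumes f(x) = sin(x) + 0.5x, a fixed seed, and a polynomial of degree 5; all three are illustrative choices. It fits the same observations once in random order and once sorted by x, and then compares the resulting coefficients.

```python
import numpy as np


def f(x):
    # Assumed form of the example function.
    return np.sin(x) + 0.5 * x


rng = np.random.RandomState(0)  # fixed seed, chosen arbitrarily

# Randomly ordered x values on [-2 * pi, 2 * pi) and their function values.
xu = rng.rand(50) * 4 * np.pi - 2 * np.pi
yu = f(xu)

# Fit the data as generated (unsorted) ...
reg_unsorted = np.polyfit(xu, yu, 5)

# ... and fit the same (x, y) pairs after sorting by x.
order = np.argsort(xu)
reg_sorted = np.polyfit(xu[order], yu[order], 5)

# Least squares sums over all pairs, so the order should not matter.
same_fit = np.allclose(reg_unsorted, reg_sorted)
print(same_fit)
```

Least-squares regression minimizes a sum over all (x, y) pairs, which is why reordering the pairs should leave the fitted coefficients unchanged up to floating-point error.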