Python for Finance: Analyze Big Financial Data

Figure 11-15 shows the results for normalized data, which are already not too bad given the rather simple application of the approach:

In [14]: import matplotlib.pyplot as plt
         %matplotlib inline
         dax.apply(scale_function).plot(figsize=(8, 4))

Figure 11-15. German DAX index and PCA index with one component

Let us see if we can improve the results by adding more components. To this end, we need to calculate a weighted average from the single resulting components:

In [15]: pca = KernelPCA(n_components=5).fit(data.apply(scale_function))
         pca_components = pca.transform(-data)
         weights = get_we(pca.lambdas_)
         dax['PCA_5'] = np.dot(pca_components, weights)
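The weighting step can be illustrated in isolation. The following is a minimal sketch, not the author's code: it assumes that get_we (not defined in this section) rescales the eigenvalues so that they sum to one, and it uses made-up eigenvalues and component scores in place of the KernelPCA output. The name normalize_weights is hypothetical.

```python
import numpy as np


def normalize_weights(eigenvalues):
    """Rescale eigenvalues so they sum to 1.

    Hypothetical stand-in for get_we, whose definition is assumed here.
    """
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    return eigenvalues / eigenvalues.sum()


# Made-up eigenvalues for three components (illustrative only).
eigenvalues = np.array([6.0, 3.0, 1.0])
weights = normalize_weights(eigenvalues)

# Made-up component scores: 4 observations (rows) x 3 components (columns).
components = np.array([
    [1.0, 0.0, 2.0],
    [2.0, 1.0, 0.0],
    [0.0, 4.0, 1.0],
    [3.0, 2.0, 5.0],
])

# One index value per observation: the weighted sum across the components.
pca_index = np.dot(components, weights)
print(weights)
print(pca_index)
```

Under this assumption, components with larger eigenvalues contribute more to the combined index, and np.dot collapses the component columns into a single series with one value per observation.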

The results as presented in Figure 11-16 are still “good,” but not that much better than before, at least upon visual inspection:

In [16]: import matplotlib.pyplot as plt
         %matplotlib inline
         dax.apply(scale_function).plot(figsize=(8, 4))

Figure 11-16. German DAX index and PCA indices with one and five components

In view of the results so far, we want to inspect the relationship between the DAX index and the PCA index in a different way: via a scatter plot, adding date information to the mix. First, we convert the DatetimeIndex of the DataFrame object to a matplotlib-compatible format:

In [17]: import matplotlib as mpl
         mpl_dates = mpl.dates.date2num(data.index)
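To make the idea of a "matplotlib-compatible format" more concrete: date2num maps dates to floating-point day counts relative to an epoch. The sketch below mimics that idea with the standard library only. It is an illustration, not matplotlib's implementation. The epoch of 1970-01-01 is an assumption (it is the default in recent matplotlib versions but is configurable and differs in older releases), and to_day_number is a hypothetical helper name.

```python
from datetime import datetime

# Assumed epoch; matplotlib's actual epoch depends on version and configuration.
EPOCH = datetime(1970, 1, 1)


def to_day_number(moment, epoch=EPOCH):
    """Return the days (including fractions) elapsed between epoch and moment."""
    return (moment - epoch).total_seconds() / 86400.0


# Made-up sample dates: two consecutive days and one intraday timestamp.
dates = [
    datetime(2014, 1, 2),
    datetime(2014, 1, 3),
    datetime(2014, 1, 3, 12),
]
day_numbers = [to_day_number(d) for d in dates]
print(day_numbers)
```

Because the result is a plain sequence of floats, it can serve as a numeric attribute in a plot (for example, as the color dimension of a scatter plot), with consecutive calendar days exactly 1.0 apart.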