Figure 11-15 shows the results for normalized data. The fit is already reasonably good,
given how simple the approach is:
In [14]: import matplotlib.pyplot as plt
         %matplotlib inline
         dax.apply(scale_function).plot(figsize=(8, 4))
Figure 11-15. German DAX index and PCA index with one component
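The plotting call relies on scale_function, which is defined outside this section. The following is a minimal sketch, assuming it is a column-wise z-score normalization (subtract the mean, divide by the standard deviation). The DataFrame and its column names are made up purely for illustration:

```python
import pandas as pd


def scale_function(x):
    # Assumption: standardize a column to zero mean and unit standard
    # deviation; the actual definition lies outside this section.
    return (x - x.mean()) / x.std()


# Hypothetical data: two series on very different scales.
idx = pd.date_range('2014-01-01', periods=5, freq='D')
demo = pd.DataFrame({'index_level': [9400.0, 9450.0, 9380.0, 9500.0, 9520.0],
                     'pca_value': [-1.2, -0.8, -1.4, -0.3, -0.1]},
                    index=idx)

# DataFrame.apply passes each column to the function in turn, so both
# series end up on a common scale and can share one plot axis.
normalized = demo.apply(scale_function)
```

Under this assumption, series with very different levels become directly comparable in a single chart, which is what makes the visual comparison of the DAX index and the PCA index meaningful.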
Let us see if we can improve the results by adding more components. To this end, we need
to calculate a weighted average of the individual resulting components:
In [15]: pca = KernelPCA(n_components=5).fit(data.apply(scale_function))
         pca_components = pca.transform(-data)
         # newer scikit-learn releases expose lambdas_ as eigenvalues_
         weights = get_we(pca.lambdas_)
         dax['PCA_5'] = np.dot(pca_components, weights)
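The weighted average can be illustrated in isolation. The sketch below assumes that get_we, which is defined outside this section, rescales the eigenvalues so that they sum to one. The component scores and eigenvalues are hypothetical numbers, not results from the DAX data:

```python
import numpy as np


def get_we(x):
    # Assumption: weights are the eigenvalues rescaled to sum to one.
    return x / x.sum()


# Hypothetical inputs: 4 observations with 3 component scores each.
components = np.array([[1.0, 0.5, 0.1],
                       [2.0, 0.4, 0.2],
                       [3.0, 0.3, 0.3],
                       [4.0, 0.2, 0.4]])
eigenvalues = np.array([6.0, 3.0, 1.0])

weights = get_we(eigenvalues)

# The dot product collapses the component columns into one index value
# per observation, each component contributing according to its weight.
pca_index = np.dot(components, weights)
```

With these inputs the first component dominates the combined index, since it carries the largest weight.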
The results presented in Figure 11-16 are still “good,” but not much better than
before, at least upon visual inspection:
In [16]: import matplotlib.pyplot as plt
         %matplotlib inline
         dax.apply(scale_function).plot(figsize=(8, 4))
Figure 11-16. German DAX index and PCA indices with one and five components
In view of the results so far, we want to inspect the relationship between the DAX index
and the PCA index in a different way: via a scatter plot that also carries date
information. First, we convert the DatetimeIndex of the DataFrame object to a
matplotlib-compatible format:
In [17]: import matplotlib as mpl
         import matplotlib.dates  # make sure the dates submodule is loaded
         mpl_dates = mpl.dates.date2num(data.index)
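To make the conversion concrete, here is a small self-contained sketch. The short daily index is a hypothetical stand-in for data.index. Converting it to Python datetime objects first is an assumption made here for portability across library versions, not something the original code does:

```python
import matplotlib.dates as mdates
import pandas as pd

# Hypothetical daily index standing in for data.index.
idx = pd.date_range('2014-01-01', periods=4, freq='D')

# date2num maps datetimes to floats that count days since matplotlib's
# epoch, so consecutive calendar days differ by exactly 1.0.
mpl_dates = mdates.date2num(idx.to_pydatetime())

# num2date reverses the mapping and returns timezone-aware datetimes.
round_trip = [d.date() for d in mdates.num2date(mpl_dates)]
```

Because the dates are now plain floats, they can be passed to a scatter plot as the color values, which is one way of adding date information to the picture.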