Python for Finance: Analyze Big Financial Data

Out[32]: <class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 1967 entries, 2006-05-22 00:00:00+00:00 to 2014-03-14 00:00:00+00:00
Data columns (total 2 columns):
GDX 1967 non-null float64
GLD 1967 non-null float64
dtypes: float64(2)

Figure 11-22 shows the historical data for both ETFs:


In [33]: data.plot(figsize=(8, 4))

Figure 11-22. Comovements of trading pair

The absolute performance differs significantly:


In [34]: data.iloc[-1] / data.iloc[0] - 1
Out[34]: GDX -0.216002
GLD 1.038285
dtype: float64
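
The calculation divides the last row by the first row and subtracts one, which gives the total return over the whole period for each column. The following is a minimal sketch with made-up prices for two hypothetical tickers (AAA and BBB, not the GDX/GLD data). It uses the position-based .iloc indexer, since the .ix indexer is no longer available in current pandas releases:

```python
import pandas as pd

# Hypothetical prices for two made-up tickers (illustration only).
index = pd.date_range("2020-01-01", periods=4, freq="D")
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 95.0, 80.0],
     "BBB": [50.0, 55.0, 70.0, 100.0]},
    index=index,
)

# Last row divided by first row, minus one: total return per column.
performance = prices.iloc[-1] / prices.iloc[0] - 1
print(performance)
```

With these numbers, AAA loses 20% and BBB doubles, mirroring the kind of divergence seen in the real data.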

However, inspection of Figure 11-22 suggests that both time series are quite strongly positively correlated, which is also reflected in the correlation data:


In [35]: data.corr()
Out[35]:           GDX       GLD
         GDX  1.000000  0.466962
         GLD  0.466962  1.000000
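
To make the structure of this output concrete, here is a small sketch on made-up numbers (the column names AAA and BBB are hypothetical). By default, corr() computes pairwise Pearson correlations, so the off-diagonal entry should agree with NumPy's corrcoef function:

```python
import numpy as np
import pandas as pd

# Hypothetical values, chosen only for illustration.
toy = pd.DataFrame({"AAA": [1.0, 2.0, 3.0, 4.0, 5.0],
                    "BBB": [2.0, 1.0, 4.0, 3.0, 5.0]})

corr_matrix = toy.corr()  # pairwise Pearson correlation by default
manual = np.corrcoef(toy["AAA"], toy["BBB"])[0, 1]

print(corr_matrix)
print(manual)
```

The matrix is symmetric with ones on the diagonal, so a two-column DataFrame carries a single informative number.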

As usual, the DatetimeIndex object of the DataFrame object consists of Timestamp objects:


In [36]: data.index
Out[36]: <class 'pandas.tseries.index.DatetimeIndex'>
[2006-05-22, ..., 2014-03-14]
Length: 1967, Freq: None, Timezone: UTC

To use the date-time information with matplotlib in the plots that follow, we first have to convert it to an ordinal date representation:


In [37]: import matplotlib as mpl
mpl_dates = mpl.dates.date2num(data.index)
mpl_dates
Out[37]: array([ 732453., 732454., 732455., ..., 735304., 735305., 735306.])
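
The conversion can be tried in isolation. The sketch below is only an illustration: it applies date2num to a short list of plain datetime objects (three dates picked from the range of the index above) and converts the result back with num2date. The absolute numbers depend on the epoch that the installed matplotlib version uses, so they may differ from the values printed above. The differences between them are always measured in days:

```python
import datetime as dt
import matplotlib.dates as mdates

# Three dates picked for illustration.
dates = [dt.datetime(2006, 5, 22),
         dt.datetime(2006, 5, 23),
         dt.datetime(2014, 3, 14)]

nums = mdates.date2num(dates)  # floats: days since matplotlib's epoch
back = mdates.num2date(nums)   # timezone-aware datetime objects

print(nums[1] - nums[0])       # one day apart
print(nums[2] - nums[0])       # days between first and last date
print(back[0].date())
```

Because the result is a plain float array, it can be passed to any matplotlib function that expects numeric values, for example as a color argument.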

Figure 11-23 shows a scatter plot of the time series data, plotting the GLD values against the GDX values and illustrating the dates of each data pair with different colorings:



In [38]: plt.figure(figsize=(8, 4))
plt.scatter(data['GDX'], data['GLD'], c=mpl_dates, marker='o')
plt.grid(True)