Python for Finance: Analyze Big Financial Data

2015-03-31   -0.155881
2015-04-30   -0.777546
2015-05-31   -1.763660
2015-06-30   -1.134258
2015-07-31    0.458838
2015-08-31   -0.103058
2015-09-30    1.040318
Freq: M, Name: No1, dtype: float64
In [36]: type(df['No1'])
Out[36]: pandas.core.series.Series

The main DataFrame methods are available for Series objects as well, and we can, for instance, plot the results as before (cf. Figure 6-2):


In [37]: import matplotlib.pyplot as plt
         df['No1'].cumsum().plot(style='r', lw=2.)
         plt.xlabel('date')
         plt.ylabel('value')

Figure 6-2. Line plot of a Series object
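As a minimal sketch of the point that a single column is a Series and shares methods such as cumsum with DataFrame, the following rebuilds only the No1 column. The values and dates are copied from the DataFrame printed in this section; the variable names (no1, running_total) are illustrative and not from the original, and the plotting step is omitted so that the snippet runs without a display.

```python
import pandas as pd

# Month-end dates and No1 values copied from the DataFrame printed in the text.
index = pd.to_datetime([
    "2015-01-31", "2015-02-28", "2015-03-31",
    "2015-04-30", "2015-05-31", "2015-06-30",
    "2015-07-31", "2015-08-31", "2015-09-30",
])
df = pd.DataFrame(
    {"No1": [-0.737304, -0.788818, -0.155881,
             -0.777546, -1.763660, -1.134258,
             0.458838, -0.103058, 1.040318]},
    index=index,
)

# Selecting a single column returns a Series, not a DataFrame.
no1 = df["No1"]
print(type(no1))

# cumsum is available on the Series just as on the DataFrame;
# this running total is the quantity that the plot call draws.
running_total = no1.cumsum()
print(running_total)
```

Calling running_total.plot(style='r', lw=2.) on this object would then draw the line, assuming matplotlib is installed.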

GroupBy Operations


pandas has powerful and flexible grouping capabilities. They work similarly to GROUP BY in SQL and to pivot tables in Microsoft Excel. To have something to group by, we add a column indicating the quarter to which each row's index date belongs:


In [38]: df['Quarter'] = ['Q1', 'Q1', 'Q1', 'Q2', 'Q2', 'Q2', 'Q3', 'Q3', 'Q3']
         df
Out[38]:                  No1       No2       No3       No4 Quarter
         2015-01-31 -0.737304  1.065173  0.073406  1.301174      Q1
         2015-02-28 -0.788818 -0.985819  0.403796 -1.753784      Q1
         2015-03-31 -0.155881 -1.752672  1.037444 -0.400793      Q1
         2015-04-30 -0.777546  1.730278  0.417114  0.184079      Q2
         2015-05-31 -1.763660 -0.375469  0.098678 -1.553824      Q2
         2015-06-30 -1.134258  1.401821  1.227124  0.979389      Q2
         2015-07-31  0.458838 -0.143187  1.565701 -2.085863      Q3
         2015-08-31 -0.103058 -0.366170 -0.478036 -0.032810      Q3
         2015-09-30  1.040318 -0.128799  0.786187  0.414084      Q3
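The quarter labels above are typed in by hand. As an alternative sketch (not the approach used in the text), they can be derived from the index itself, assuming the index is a DatetimeIndex as the printed dates suggest. Only the No1 column is reproduced here to keep the example short.

```python
import pandas as pd

# Month-end dates and No1 values copied from the table above.
index = pd.to_datetime([
    "2015-01-31", "2015-02-28", "2015-03-31",
    "2015-04-30", "2015-05-31", "2015-06-30",
    "2015-07-31", "2015-08-31", "2015-09-30",
])
df = pd.DataFrame(
    {"No1": [-0.737304, -0.788818, -0.155881,
             -0.777546, -1.763660, -1.134258,
             0.458838, -0.103058, 1.040318]},
    index=index,
)

# DatetimeIndex.quarter gives the calendar quarter (1 to 4) of each date;
# the list comprehension turns it into the 'Q1', 'Q2', ... labels used in the text.
df["Quarter"] = ["Q%d" % q for q in df.index.quarter]
print(df)
```

Deriving the labels this way avoids a length mismatch if rows are later added or removed, at the cost of tying the labels to calendar quarters.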

Now we can group by the Quarter column and output statistics for the individual groups:


In [39]: groups = df.groupby('Quarter')
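The object returned by groupby does not print any data by itself. As a rough sketch of how it can be inspected, the following rebuilds two of the four columns from the table above (No1 and No2; the other columns are left out for brevity) and looks at the groups it contains. The names n_groups, q1, and sizes are illustrative only.

```python
import pandas as pd

# Dates, No1, and No2 values copied from the table above.
index = pd.to_datetime([
    "2015-01-31", "2015-02-28", "2015-03-31",
    "2015-04-30", "2015-05-31", "2015-06-30",
    "2015-07-31", "2015-08-31", "2015-09-30",
])
df = pd.DataFrame(
    {
        "No1": [-0.737304, -0.788818, -0.155881,
                -0.777546, -1.763660, -1.134258,
                0.458838, -0.103058, 1.040318],
        "No2": [1.065173, -0.985819, -1.752672,
                1.730278, -0.375469, 1.401821,
                -0.143187, -0.366170, -0.128799],
    },
    index=index,
)
df["Quarter"] = ["Q1"] * 3 + ["Q2"] * 3 + ["Q3"] * 3

groups = df.groupby("Quarter")

# Number of distinct group keys found in the Quarter column.
n_groups = groups.ngroups

# The rows that belong to a single group.
q1 = groups.get_group("Q1")

# Number of rows per group.
sizes = groups.size()

# Iterating yields (group key, sub-DataFrame) pairs.
for name, frame in groups:
    print(name, len(frame))
```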

For example, we can easily get the mean, max, and size of every group bucket as follows:


In [40]: groups.mean()
Out[40]:               No1       No2       No3       No4
         Quarter
         Q1      -0.560668 -0.557773  0.504882 -0.284468