No2 0.049462
No3 0.570157
No4 -0.327594
dtype: float64
In [29]: df.cumsum()
Out[29]:                  No1       No2       No3       No4
         2015-01-31 -0.737304  1.065173  0.073406  1.301174
         2015-02-28 -1.526122  0.079354  0.477201 -0.452609
         2015-03-31 -1.682003 -1.673318  1.514645 -0.853403
         2015-04-30 -2.459549  0.056960  1.931759 -0.669323
         2015-05-31 -4.223209 -0.318508  2.030438 -2.223147
         2015-06-30 -5.357467  1.083313  3.257562 -1.243758
         2015-07-31 -4.898629  0.940126  4.823263 -3.329621
         2015-08-31 -5.001687  0.573956  4.345227 -3.362430
         2015-09-30 -3.961370  0.445156  5.131414 -2.948346
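To make the meaning of the cumulative sum concrete, here is a minimal sketch on a small, hypothetical DataFrame (the name `demo` and its values are illustrative assumptions, not the `df` object used in this chapter). Each row of the result holds the sum of all rows up to and including that row, so the last row coincides with the column-wise total:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, chosen so the sums are easy to verify by hand.
demo = pd.DataFrame({"No1": [1.0, -2.0, 0.5],
                     "No2": [0.25, 0.25, 1.0]})

# Running total down each column.
running = demo.cumsum()
print(running)

# The final row of the cumulative sum equals the plain column sums.
print(np.allclose(running.iloc[-1], demo.sum()))
```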
There is also a shortcut to a number of often-used statistics for numerical data sets, the
describe method:
In [30]: df.describe()
Out[30]:             No1       No2       No3       No4
         count  9.000000  9.000000  9.000000  9.000000
         mean  -0.440152  0.049462  0.570157 -0.327594
         std    0.847907  1.141676  0.642904  1.219345
         min   -1.763660 -1.752672 -0.478036 -2.085863
         25%   -0.788818 -0.375469  0.098678 -1.553824
         50%   -0.737304 -0.143187  0.417114 -0.032810
         75%   -0.103058  1.065173  1.037444  0.414084
         max    1.040318  1.730278  1.565701  1.301174
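The rows of the describe output correspond to statistics that are also available as individual methods. The following sketch checks this on a hypothetical DataFrame of standard normal random numbers (the name `sample` and the seed are assumptions made for illustration only):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: 9 rows, 4 columns of standard normal draws.
rng = np.random.default_rng(0)
sample = pd.DataFrame(rng.standard_normal((9, 4)),
                      columns=["No1", "No2", "No3", "No4"])

summary = sample.describe()

# Each row of the summary matches the corresponding single-statistic method.
print(np.allclose(summary.loc["mean"], sample.mean()))
print(np.allclose(summary.loc["std"], sample.std()))
print(np.allclose(summary.loc["50%"], sample.median()))
print(np.allclose(summary.loc["max"], sample.max()))
```

Individual rows can be picked out with `summary.loc[...]` when only one or two of the statistics are needed.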
You can also apply the majority of NumPy universal functions to DataFrame objects:
In [31]: np.sqrt(df)
Out[31]:                  No1       No2       No3       No4
         2015-01-31       NaN  1.032072  0.270935  1.140690
         2015-02-28       NaN       NaN  0.635449       NaN
         2015-03-31       NaN       NaN  1.018550       NaN
         2015-04-30       NaN  1.315400  0.645844  0.429045
         2015-05-31       NaN       NaN  0.314131       NaN
         2015-06-30       NaN  1.183985  1.107756  0.989641
         2015-07-31  0.677376       NaN  1.251280       NaN
         2015-08-31       NaN       NaN       NaN       NaN
         2015-09-30  1.019960       NaN  0.886672  0.643494
NUMPY UNIVERSAL FUNCTIONS
In general, you can apply NumPy universal functions to pandas DataFrame objects whenever they could be applied
to an ndarray object containing the same data.
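As a small illustration of this rule, the sketch below applies np.sqrt both to a hypothetical DataFrame (`frame`, an assumed name with made-up values) and to the ndarray holding the same data. The numerical results agree; the DataFrame version additionally keeps the index and column labels. The np.errstate context only silences the NumPy warning about invalid values:

```python
import numpy as np
import pandas as pd

# Hypothetical data with some negative entries.
frame = pd.DataFrame({"No1": [-1.0, 4.0, 9.0],
                      "No2": [0.25, -2.0, 16.0]},
                     index=["a", "b", "c"])

with np.errstate(invalid="ignore"):     # suppress "invalid value" warnings
    via_frame = np.sqrt(frame)          # ufunc applied to the DataFrame
    via_array = np.sqrt(frame.values)   # ufunc applied to the raw ndarray

# Same numbers (NaN where the root is undefined), but labels are preserved.
print(type(via_frame).__name__, type(via_array).__name__)
print(np.allclose(via_frame.values, via_array, equal_nan=True))
```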
pandas is quite error tolerant: where a mathematical operation is undefined, such as the
square root of a negative number, the result is simply a NaN value instead of an
exception. Moreover, as briefly shown already, in a number of cases you can work with
such incomplete data sets as if they were complete:
In [32]: np.sqrt(df).sum()
Out[32]: No1    1.697335
         No2    3.531458
         No3    6.130617
         No4    3.202870
         dtype: float64
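The column sums above are finite even though np.sqrt(df) contains NaN entries. A minimal sketch with a hypothetical Series (`s`, an assumed name with made-up values) shows this default behavior and the `skipna` parameter of the aggregation methods that controls it:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value.
s = pd.Series([1.0, np.nan, 3.0])

total_default = s.sum()             # NaN is skipped by default
total_strict = s.sum(skipna=False)  # NaN propagates into the result
n_valid = s.count()                 # number of non-NaN entries
average = s.mean()                  # mean over the non-NaN entries only

print(total_default, total_strict, n_valid, average)
```

Passing `skipna=False` is useful when a missing value should invalidate the aggregate rather than be silently ignored.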
In such cases, pandas simply skips the NaN values and computes with the remaining
values. Plotting data generally takes only a single line of code as well (cf.
Figure 6-1):
In [33]: %matplotlib inline
         df.cumsum().plot(lw=2.0)