Python for Finance: Analyze Big Financial Data

Out[8]: numbers    100
        dtype: int64
In [9]: df.apply(lambda x: x ** 2)  # square of every element
Out[9]:    numbers
        a      100
        b      400
        c      900
        d     1600

In general, you can apply the same vectorized operations to a DataFrame object as to a NumPy ndarray object:


In [10]: df ** 2  # again square, this time NumPy-like
Out[10]:    numbers
         a      100
         b      400
         c      900
         d     1600
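
As a quick check of the equivalence described above, the following minimal sketch rebuilds the one-column frame from the values shown in the outputs and compares three ways of squaring it. The variable names (squared_apply, squared_vec, squared_np) are illustrative only and do not appear in the original session.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the values shown in the outputs above.
df = pd.DataFrame({'numbers': [10, 20, 30, 40]}, index=['a', 'b', 'c', 'd'])

squared_apply = df.apply(lambda x: x ** 2)  # function applied column by column
squared_vec = df ** 2                       # vectorized operator on the whole frame
squared_np = np.square(df)                  # NumPy universal function on the frame

# All three approaches should yield the same DataFrame.
print(squared_apply.equals(squared_vec))
print(squared_vec.equals(squared_np))
```

For simple arithmetic such as this, the operator form is usually the more readable choice; apply is mainly useful when no vectorized equivalent exists.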

Enlarging the DataFrame object in both dimensions is possible:


In [11]: df['floats'] = (1.5, 2.5, 3.5, 4.5)
           # new column is generated
         df
Out[11]:    numbers  floats
         a       10     1.5
         b       20     2.5
         c       30     3.5
         d       40     4.5
In [12]: df['floats']  # selection of column
Out[12]: a    1.5
         b    2.5
         c    3.5
         d    4.5
         Name: floats, dtype: float64
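
The sketch below illustrates enlargement in both dimensions under a few assumptions: the frame is rebuilt from the values shown above, the column name too_short and the row label e are hypothetical, and the row is added by label-based assignment with .loc, which is not shown in the original session. It also shows that a plain sequence assigned as a column must have exactly one value per row.

```python
import pandas as pd

# Rebuild the example frame from the values shown in the outputs above.
df = pd.DataFrame({'numbers': [10, 20, 30, 40]}, index=['a', 'b', 'c', 'd'])

# Column dimension: one value per existing row.
df['floats'] = (1.5, 2.5, 3.5, 4.5)

# A sequence of the wrong length is rejected instead of being padded.
try:
    df['too_short'] = (1.5, 2.5, 3.5)  # hypothetical column, three values for four rows
    length_error = None
except ValueError as exc:
    length_error = exc

# Row dimension: assigning to a new label (hypothetical 'e') adds a row.
df.loc['e'] = [50, 5.5]

print(df)
print(length_error)
```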

A whole DataFrame object can also be taken to define a new column. In such a case, indices are aligned automatically:


In [13]: df['names'] = pd.DataFrame(['Yves', 'Guido', 'Felix', 'Francesc'],
                                    index=['d', 'a', 'b', 'c'])
         df
Out[13]:    numbers  floats     names
         a       10     1.5     Guido
         b       20     2.5     Felix
         c       30     3.5  Francesc
         d       40     4.5      Yves
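
To make the alignment behavior more explicit, here is a minimal sketch that uses a Series rather than a DataFrame as the new column (a substitution made for this illustration). The column name partial is hypothetical; it shows that labels missing from the assigned object are filled with NaN rather than shifted.

```python
import pandas as pd

# Rebuild the example frame from the values shown in the outputs above.
df = pd.DataFrame({'numbers': [10, 20, 30, 40],
                   'floats': [1.5, 2.5, 3.5, 4.5]},
                  index=['a', 'b', 'c', 'd'])

# The values are given in the order d, a, b, c; pandas matches them by label.
names = pd.Series(['Yves', 'Guido', 'Felix', 'Francesc'],
                  index=['d', 'a', 'b', 'c'])
df['names'] = names

# Hypothetical column: only label 'd' is provided, the other rows become NaN.
df['partial'] = pd.Series(['Yves'], index=['d'])

print(df)
```

Because matching happens on labels, the order of the values in the assigned object does not matter, only its index does.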

Appending data works similarly. However, the following example has a side effect that is usually to be avoided: the index gets replaced by a simple integer range index.


In [14]: df.append({'numbers': 100, 'floats': 5.75, 'names': 'Henry'},
                   ignore_index=True)
           # temporary object; df not changed
Out[14]:    numbers  floats     names
         0       10    1.50     Guido
         1       20    2.50     Felix
         2       30    3.50  Francesc
         3       40    4.50      Yves
         4      100    5.75     Henry
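
Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the call above only runs on older versions. A minimal sketch of the same operation with pd.concat follows; it assumes the frame as shown in the outputs above, and the names new_row and appended are illustrative only.

```python
import pandas as pd

# Rebuild the example frame from the values shown in the outputs above.
df = pd.DataFrame({'numbers': [10, 20, 30, 40],
                   'floats': [1.5, 2.5, 3.5, 4.5],
                   'names': ['Guido', 'Felix', 'Francesc', 'Yves']},
                  index=['a', 'b', 'c', 'd'])

# Wrap the dict in a one-row DataFrame, then concatenate.
new_row = pd.DataFrame([{'numbers': 100, 'floats': 5.75, 'names': 'Henry'}])

# ignore_index=True discards the labels a..d and renumbers the rows from 0.
appended = pd.concat([df, new_row], ignore_index=True)

print(appended)
print(df.index.tolist())  # the original frame keeps its labels
```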

It is often better to append a DataFrame object, providing the appropriate index information. This preserves the index:


In [15]: df = df.append(pd.DataFrame({'numbers': 100, 'floats': 5.75,
                                      'names': 'Henry'}, index=['z',]))
         df
Out[15]:    floats  names  numbers
         a    1.50  Guido       10
         b    2.50  Felix       20
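
On pandas 2.0 and later, where DataFrame.append is no longer available, the index-preserving variant can be sketched with pd.concat. This assumes the frame as shown earlier in this section; the name new_row is illustrative, and the column order of the result may differ from the older output shown above.

```python
import pandas as pd

# Rebuild the example frame from the values shown earlier in this section.
df = pd.DataFrame({'numbers': [10, 20, 30, 40],
                   'floats': [1.5, 2.5, 3.5, 4.5],
                   'names': ['Guido', 'Felix', 'Francesc', 'Yves']},
                  index=['a', 'b', 'c', 'd'])

# The new row carries its own label, so no renumbering is needed.
new_row = pd.DataFrame({'numbers': 100, 'floats': 5.75, 'names': 'Henry'},
                       index=['z'])

# Without ignore_index, the labels of both frames are kept.
df = pd.concat([df, new_row])

print(df)
```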