Out[8]: numbers 100
dtype: int64
In [9]: df.apply(lambda x: x ** 2)  # square of every element
Out[9]: numbers
a 100
b 400
c 900
d 1600
In general, you can implement the same vectorized operations on a DataFrame object as on
a NumPy ndarray object:
In [10]: df ** 2  # again square, this time NumPy-like
Out[10]: numbers
a 100
b 400
c 900
d 1600
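The equivalence of the two approaches can be checked directly. The following is a minimal sketch that assumes the example frame holds the values 10, 20, 30, 40 under the labels a to d, as the outputs above suggest. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Assumed reconstruction of the example frame from the outputs shown above
df = pd.DataFrame([10, 20, 30, 40], columns=['numbers'],
                  index=['a', 'b', 'c', 'd'])

squared_apply = df.apply(lambda x: x ** 2)  # function applied column by column
squared_vector = df ** 2                    # vectorized operator on the whole frame
squared_numpy = np.square(df)               # NumPy universal function on the frame

print(squared_vector)
print(squared_apply.equals(squared_vector))
```

All three calls return a DataFrame object with the original index and column labels intact.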
Enlarging the DataFrame object in both dimensions is possible:
In [11]: df['floats'] = (1.5, 2.5, 3.5, 4.5)
         # new column is generated
         df
Out[11]: numbers floats
a 10 1.5
b 20 2.5
c 30 3.5
d 40 4.5
In [12]: df['floats']  # selection of column
Out[12]: a 1.5
b 2.5
c 3.5
d 4.5
Name: floats, dtype: float64
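A plain sequence such as the tuple above is assigned by position, so it needs exactly one value per row. The sketch below illustrates this on an assumed reconstruction of the example frame. The column name too_short is hypothetical and serves only to trigger the error.

```python
import pandas as pd

# Assumed reconstruction of the example frame
df = pd.DataFrame([10, 20, 30, 40], columns=['numbers'],
                  index=['a', 'b', 'c', 'd'])

df['floats'] = (1.5, 2.5, 3.5, 4.5)  # four values for four rows: accepted

try:
    df['too_short'] = (1.0, 2.0)     # only two values for four rows
    error_raised = False
except ValueError:
    error_raised = True              # pandas rejects the length mismatch

print(error_raised)
```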
A whole DataFrame object can also be taken to define a new column. In such a case,
indices are aligned automatically:
In [13]: df['names'] = pd.DataFrame(['Yves', 'Guido', 'Felix', 'Francesc'],
                                    index=['d', 'a', 'b', 'c'])
         df
Out[13]: numbers floats names
a 10 1.5 Guido
b 20 2.5 Felix
c 30 3.5 Francesc
d 40 4.5 Yves
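Alignment works on index labels rather than on position, which also means that labels without a match are left empty. The following sketch shows this with a Series object instead of a DataFrame object (a swapped-in simplification, not the approach used above) and deliberately omits the label c.

```python
import pandas as pd

# Assumed reconstruction of the example frame
df = pd.DataFrame({'numbers': [10, 20, 30, 40]},
                  index=['a', 'b', 'c', 'd'])

# Labels are out of order, and there is no entry for 'c'
names = pd.Series(['Yves', 'Guido', 'Felix'], index=['d', 'a', 'b'])

df['names'] = names  # values are matched on index labels, not on position
print(df)
```

Row c receives a missing value (NaN) because the Series object has no entry for that label.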
Appending data works similarly. However, the following example shows a side effect
that is usually to be avoided, namely that the index gets replaced by a simple numbered index:
In [14]: df.append({'numbers': 100, 'floats': 5.75, 'names': 'Henry'},
                   ignore_index=True)
         # temporary object; df not changed
Out[14]: numbers floats names
0 10 1.50 Guido
1 20 2.50 Felix
2 30 3.50 Francesc
3 40 4.50 Yves
4 100 5.75 Henry
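Note that the DataFrame.append method was deprecated in pandas 1.4 and removed in pandas 2.0. With newer releases, the same renumbering effect can be reproduced with pd.concat. The sketch below uses a shortened, two-row version of the example frame. The names new_row and renumbered are illustrative.

```python
import pandas as pd

# Shortened stand-in for the example frame (two rows only)
df = pd.DataFrame({'numbers': [10, 20], 'floats': [1.5, 2.5],
                   'names': ['Guido', 'Felix']}, index=['a', 'b'])

# The row to append, wrapped in its own one-row DataFrame
new_row = pd.DataFrame({'numbers': [100], 'floats': [5.75],
                        'names': ['Henry']})

# ignore_index=True discards the labels 'a', 'b' and renumbers the rows
renumbered = pd.concat([df, new_row], ignore_index=True)
print(renumbered)
```

As with the append call above, the result is a new object, and df itself is left unchanged.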
It is often better to append a DataFrame object, providing the appropriate index
information. This preserves the index:
In [15]: df = df.append(pd.DataFrame({'numbers': 100, 'floats': 5.75,
                        'names': 'Henry'}, index=['z',]))
         df
Out[15]: floats names numbers
a 1.50 Guido 10
b 2.50 Felix 20