In [20]: a = np.random.standard_normal((9, 4))
         a.round(6)
Out[20]: array([[-0.737304, 1.065173, 0.073406, 1.301174],
[-0.788818, -0.985819, 0.403796, -1.753784],
[-0.155881, -1.752672, 1.037444, -0.400793],
[-0.777546, 1.730278, 0.417114, 0.184079],
[-1.76366 , -0.375469, 0.098678, -1.553824],
[-1.134258, 1.401821, 1.227124, 0.979389],
[ 0.458838, -0.143187, 1.565701, -2.085863],
[-0.103058, -0.36617 , -0.478036, -0.03281 ],
[ 1.040318, -0.128799, 0.786187, 0.414084]])
Although you can construct DataFrame objects more directly (as we have seen before),
using an ndarray object is generally a good choice since pandas will retain the basic
structure and will “only” add meta-information (e.g., index values). It also represents a
typical use case for financial applications and scientific research in general. For example:
In [21]: df = pd.DataFrame(a)
         df
Out[21]:           0         1         2         3
         0 -0.737304  1.065173  0.073406  1.301174
         1 -0.788818 -0.985819  0.403796 -1.753784
         2 -0.155881 -1.752672  1.037444 -0.400793
         3 -0.777546  1.730278  0.417114  0.184079
         4 -1.763660 -0.375469  0.098678 -1.553824
         5 -1.134258  1.401821  1.227124  0.979389
         6  0.458838 -0.143187  1.565701 -2.085863
         7 -0.103058 -0.366170 -0.478036 -0.032810
         8  1.040318 -0.128799  0.786187  0.414084
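As a quick check of the claim that pandas keeps the array's structure and "only" adds labels, the following minimal sketch compares a DataFrame with the ndarray it was built from. It uses its own seeded random array rather than the data shown above; the seed and the variable names are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Illustrative data: a seeded generator makes the run repeatable
rng = np.random.default_rng(100)
a = rng.standard_normal((9, 4))

df = pd.DataFrame(a)

# Shape and values carry over unchanged from the ndarray
same_shape = df.shape == a.shape
same_values = np.array_equal(df.to_numpy(), a)

# pandas adds default integer labels for rows and columns
row_labels = list(df.index)
col_labels = list(df.columns)

print(same_shape, same_values)
print(row_labels)
print(col_labels)
```

The row and column labels are the only information the DataFrame holds beyond what the array already contained.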
Table 6-1 lists the parameters that the DataFrame function takes. In the table, “array-like”
means a data structure similar to an ndarray object — a list, for example. Index is an
instance of the pandas Index class.
Table 6-1. Parameters of DataFrame function
Parameter  Format                  Description
data       ndarray/dict/DataFrame  Data for DataFrame; dict can contain Series, ndarrays, lists
index      Index/array-like        Index to use; defaults to range(n)
columns    Index/array-like        Column headers to use; defaults to range(n)
dtype      dtype, default None     Data type to use/force; otherwise, it is inferred
copy       bool, default None      Copy data from inputs
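To make the parameters in Table 6-1 concrete, here is a small sketch that passes each of them explicitly. The row labels, column names, and variable names are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Integer ndarray with 3 rows and 4 columns
data = np.arange(12).reshape((3, 4))

df2 = pd.DataFrame(
    data,                                  # data: the values to store
    index=['a', 'b', 'c'],                 # index: one label per row
    columns=['No1', 'No2', 'No3', 'No4'],  # columns: one label per column
    dtype=float,                           # dtype: force float instead of inferring int
    copy=True,                             # copy: do not share memory with the input
)

# df2 holds its own copy of the data, so this change to the
# source array is not reflected in the DataFrame
data[0, 0] = 99

print(df2)
print(df2.dtypes)
```

Omitting index and columns would fall back to the default integer labels, and omitting dtype would leave the columns as integers here.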
As with structured arrays, and as we have already seen, DataFrame objects have column
names that can be defined directly by assigning a list with the right number of elements.
This illustrates that you can define/change the attributes of the DataFrame object as you go:
In [22]: df.columns = ['No1', 'No2', 'No3', 'No4']
         df
Out[22]:         No1       No2       No3       No4
         0 -0.737304  1.065173  0.073406  1.301174