Python for Finance: Analyze Big Financial Data

In [20]: a = np.random.standard_normal((9, 4))
a.round(6)
Out[20]: array([[-0.737304,  1.065173,  0.073406,  1.301174],
                [-0.788818, -0.985819,  0.403796, -1.753784],
                [-0.155881, -1.752672,  1.037444, -0.400793],
                [-0.777546,  1.730278,  0.417114,  0.184079],
                [-1.76366 , -0.375469,  0.098678, -1.553824],
                [-1.134258,  1.401821,  1.227124,  0.979389],
                [ 0.458838, -0.143187,  1.565701, -2.085863],
                [-0.103058, -0.36617 , -0.478036, -0.03281 ],
                [ 1.040318, -0.128799,  0.786187,  0.414084]])

Although you can construct DataFrame objects more directly (as we have seen before), using an ndarray object is generally a good choice since pandas will retain the basic structure and will “only” add meta-information (e.g., index values). It also represents a typical use case for financial applications and scientific research in general. For example:


In [21]: df = pd.DataFrame(a)
df
Out[21]:           0         1         2         3
         0 -0.737304  1.065173  0.073406  1.301174
         1 -0.788818 -0.985819  0.403796 -1.753784
         2 -0.155881 -1.752672  1.037444 -0.400793
         3 -0.777546  1.730278  0.417114  0.184079
         4 -1.763660 -0.375469  0.098678 -1.553824
         5 -1.134258  1.401821  1.227124  0.979389
         6  0.458838 -0.143187  1.565701 -2.085863
         7 -0.103058 -0.366170 -0.478036 -0.032810
         8  1.040318 -0.128799  0.786187  0.414084
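The claim that pandas retains the structure of the array and only adds meta-information can be checked directly. The following is a minimal sketch, not the session above: the seeded generator and the names `same_shape`, `same_values`, `row_labels`, and `col_labels` are assumptions introduced for illustration, so the random numbers will differ from those printed in the text.

```python
import numpy as np
import pandas as pd

# Seed chosen arbitrarily so the sketch is reproducible (an assumption,
# not part of the original session).
rng = np.random.default_rng(seed=100)
a = rng.standard_normal((9, 4))

df = pd.DataFrame(a)

# The DataFrame keeps the shape and the values of the source array ...
same_shape = df.shape == a.shape
same_values = np.array_equal(df.to_numpy(), a)

# ... and adds default integer labels for rows and columns.
row_labels = list(df.index)
col_labels = list(df.columns)

print(same_shape, same_values)
print(row_labels)
print(col_labels)
```

The row and column labels are the only information added here; the numerical content is unchanged.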

Table 6-1 lists the parameters that the DataFrame function takes. In the table, “array-like” means a data structure similar to an ndarray object (a list, for example). Index is an instance of the pandas Index class.


Table 6-1. Parameters of DataFrame function


Parameter  Format                  Description
data       ndarray/dict/DataFrame  Data for DataFrame; dict can contain Series, ndarrays, lists
index      Index/array-like        Index to use; defaults to range(n)
columns    Index/array-like        Column headers to use; defaults to range(n)
dtype      dtype, default None     Data type to use/force; otherwise, it is inferred
copy       bool, default None      Copy data from inputs
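To make the parameters in Table 6-1 concrete, here is a small sketch that passes each of them explicitly. The sample values and the names `data`, `df2`, and `df3` are assumptions made for illustration and do not appear in the original text.

```python
import numpy as np
import pandas as pd

# Hypothetical integer data, used only for this illustration.
data = np.arange(12).reshape((3, 4))

df2 = pd.DataFrame(
    data,                                  # data: an ndarray
    index=['a', 'b', 'c'],                 # index: array-like row labels
    columns=['No1', 'No2', 'No3', 'No4'],  # columns: array-like headers
    dtype=float,                           # dtype: force float instead of inferring int
    copy=True,                             # copy: the frame gets its own copy of the data
)

# Changing the source array afterward does not change the DataFrame.
data[0, 0] = 99
print(df2)

# data can also be a dict whose values are lists or Series objects.
df3 = pd.DataFrame({'x': [1, 2], 'y': pd.Series([3.0, 4.0])})
print(df3)
```

When `index` and `columns` are omitted, as for the row labels of `df3`, pandas falls back to the integer defaults described in the table.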

As with structured arrays, and as we have already seen, DataFrame objects have column names that can be defined directly by assigning a list with the right number of elements. This illustrates that you can define/change the attributes of the DataFrame object as you go:


In [22]: df.columns = ['No1', 'No2', 'No3', 'No4']
df
Out[22]: No1 No2 No3 No4
0 -0.737304 1.065173 0.073406 1.301174
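The phrase “the right number of elements” matters in practice: pandas rejects a list whose length does not match the number of columns. The sketch below illustrates this on a small stand-in object; the zero-filled frame and the flag `mismatch_raised` are assumptions for illustration, not part of the session above.

```python
import numpy as np
import pandas as pd

# Small stand-in DataFrame with four columns (illustrative data only).
df = pd.DataFrame(np.zeros((2, 4)))

# A flat list with one name per column replaces the default labels.
df.columns = ['No1', 'No2', 'No3', 'No4']

# A list of the wrong length is rejected with a ValueError.
try:
    df.columns = ['No1', 'No2', 'No3']
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(mismatch_raised)
print(list(df.columns))

# Columns can now be selected by name.
print(df['No2'])
```

Note that the names are passed as a flat list. Wrapping the list in a second pair of brackets would be interpreted by pandas as a request for hierarchical column labels rather than simple column names.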