Python for Finance: Analyze Big Financial Data


Portfolio Optimization


Modern or mean-variance portfolio theory (MPT) is a major cornerstone of financial
theory. Based on this theoretical breakthrough, the Nobel Prize in Economics was awarded
to its inventor, Harry Markowitz, in 1990. Although formulated in the 1950s, it is still a
theory taught to finance students and applied in practice today (often with some minor or
major modifications). This section illustrates the fundamental principles of the theory.
Chapter 5 in the book by Copeland, Weston, and Shastri (2005) provides a good
introduction to the formal topics associated with MPT. As pointed out previously, the
assumption of normally distributed returns is fundamental to the theory:


By looking only at mean and variance, we are necessarily assuming that no other statistics are necessary to
describe the distribution of end-of-period wealth. Unless investors have a special type of utility function
(quadratic utility function), it is necessary to assume that returns have a normal distribution, which can be
completely described by mean and variance.

The Data


Let us begin our Python session by importing a couple of by now well-known libraries:


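In [33]: import numpy as np
         import pandas as pd
         # pandas.io.data was removed from later pandas releases;
         # DataReader now lives in the separate pandas-datareader package
         import pandas_datareader.data as web
         import matplotlib.pyplot as plt
         %matplotlib inline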

We pick five different assets for the analysis: American tech stocks Apple Inc., Yahoo!
Inc., and Microsoft Corp., as well as German Deutsche Bank AG and gold as a commodity
via an exchange-traded fund (ETF). The basic idea of MPT is diversification to achieve
minimal portfolio risk for a given level of return, or maximal portfolio return for a
given level of risk. One would expect such results for the right combination of a large
enough number of assets and a certain diversity in the assets. However, to convey the
basic ideas and to show typical effects, these five assets shall suffice:


In [34]: symbols = ['AAPL', 'MSFT', 'YHOO', 'DB', 'GLD']
         noa = len(symbols)
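
The diversification idea described above can be illustrated with a minimal sketch before any market data is loaded. The function name portfolio_volatility and all input values (two assets, 20% volatility each, equal weights) are assumptions made for illustration only; they are not taken from the five assets in this section.

```python
import numpy as np

def portfolio_volatility(weights, vols, corr):
    """Return the standard deviation of a portfolio.

    weights: portfolio weights (expected to sum to 1)
    vols:    per-asset volatilities (standard deviations)
    corr:    correlation matrix of the assets
    """
    weights = np.asarray(weights, dtype=float)
    vols = np.asarray(vols, dtype=float)
    # Covariance matrix: element (i, j) is vol_i * vol_j * corr_ij.
    cov = np.outer(vols, vols) * np.asarray(corr, dtype=float)
    # Portfolio variance is w' * Cov * w.
    return float(np.sqrt(weights @ cov @ weights))

# Hypothetical inputs: two equally weighted assets with 20% volatility each.
weights = [0.5, 0.5]
vols = [0.2, 0.2]

for rho in (1.0, 0.5, 0.0):
    corr = [[1.0, rho], [rho, 1.0]]
    print(rho, round(portfolio_volatility(weights, vols, corr), 4))
# prints:
# 1.0 0.2
# 0.5 0.1732
# 0.0 0.1414
```

With perfect correlation the portfolio is as risky as each single asset. As the correlation falls, the portfolio volatility drops below 20% although neither asset has become less risky on its own.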

Using the DataReader function of pandas (cf. Chapter 6) makes getting the time series
data rather efficient. As in the previous example, we are only interested in the adjusted
closing prices (the Adj Close column) of each asset:


In [35]: data = pd.DataFrame()
         for sym in symbols:
             data[sym] = web.DataReader(sym, data_source='yahoo',
                                        end='2014-09-12')['Adj Close']
         data.columns = symbols

Figure 11-11 shows the time series data in normalized fashion graphically:


In [36]: (data / data.iloc[0] * 100).plot(figsize=(8, 5))
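
The normalization used for the plot can be checked offline on a small example. The sketch below uses made-up prices for two hypothetical tickers (AAA and BBB), not real market data, and shows only the rebasing step: every column is divided by its first value and scaled to 100.

```python
import pandas as pd

# Made-up prices for two hypothetical tickers; not real market data.
prices = pd.DataFrame(
    {"AAA": [50.0, 55.0, 60.0], "BBB": [200.0, 190.0, 210.0]},
    index=pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-06"]),
)

# prices.iloc[0] is the first row as a Series indexed by column name, so the
# division is aligned column by column; each series then starts at 100.
normalized = prices / prices.iloc[0] * 100
print(normalized)
```

After this step, the two series share a common starting point, so their relative performance is directly comparable on one chart even though their price levels differ.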