Chapter 11. Statistics
I can prove anything by statistics except the truth.
— George Canning
Statistics is a vast field. The tools and results the field provides have become indispensable
for finance. This also explains the popularity of domain-specific languages like R in the
finance industry. The more elaborate and complex statistical models become, the more
important it is to have easy-to-use, high-performing computational solutions available.
A single chapter in a book like this one cannot do justice to the richness and breadth
of the field of statistics. Therefore, the approach, as in many other chapters, is to
focus on selected topics that seem of paramount importance or that provide a good starting
point when it comes to the use of Python for the particular tasks at hand. The chapter has
four focal points:
Normality tests
A large number of important financial models, like the mean-variance portfolio
theory and the capital asset pricing model (CAPM), rest on the assumption that
returns of securities are normally distributed; therefore, this chapter presents some
approaches to test a given time series for normality of returns.
Portfolio theory
Modern portfolio theory (MPT) can be considered one of the biggest successes of
statistics in finance; starting in the early 1950s with the work of pioneer Harry
Markowitz, this theory began to replace people’s reliance on judgment and
experience with rigorous mathematical and statistical methods when it comes to the
investment of money in financial markets. In that sense, it is arguably the first real
quantitative approach in finance.
Principal component analysis
Principal component analysis (PCA) is quite a popular tool in finance, for example,
when it comes to implementing equity investment strategies or analyzing the
principal components that explain the movement in interest rates. Its major benefit is
“complexity reduction,” achieved by deriving a small set of linearly independent
(uncorrelated, orthogonal) components from a potentially large set of possibly highly
correlated time series; we illustrate the application based on the German
DAX index and the 30 stocks contained in that index.
Bayesian regression
On a fundamental level, Bayesian statistics introduces into statistics the notion of
agents' beliefs and the updating of those beliefs; when it comes to linear regression, for
example, this might take the form of having a statistical distribution for the regression
parameters instead of single point estimates (e.g., for the intercept and slope of the
regression line). Nowadays, Bayesian methods are popular and important in
finance, which is why we illustrate some (advanced) applications in this chapter.
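To make the first focal point concrete, the following is a minimal sketch of one common normality check, the Jarque-Bera test, written with the Python standard library only. It is an illustration under stated assumptions, not necessarily the approach the chapter itself takes: the function name `jarque_bera` and the two simulated return series are hypothetical and stand in for real market data.

```python
import math
import random


def jarque_bera(returns):
    """Return the Jarque-Bera statistic and its asymptotic p-value."""
    n = len(returns)
    mean = sum(returns) / n
    # Central moments (population versions, as used by the JB statistic).
    m2 = sum((r - mean) ** 2 for r in returns) / n
    m3 = sum((r - mean) ** 3 for r in returns) / n
    m4 = sum((r - mean) ** 4 for r in returns) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2  # equals 3 for a normal distribution
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Under normality, JB is asymptotically chi-squared with 2 degrees of
    # freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical daily log returns (assumption: simulated, not market data).
normal_returns = [random.gauss(0.0, 0.01) for _ in range(5000)]
# Mixing in occasional large moves produces fat tails (excess kurtosis).
fat_tailed_returns = [
    random.gauss(0.0, 0.01) * random.choice([1, 1, 1, 5])
    for _ in range(5000)
]

jb_normal, p_normal = jarque_bera(normal_returns)
jb_fat, p_fat = jarque_bera(fat_tailed_returns)
print(f"normal sample:     JB = {jb_normal:10.2f}, p-value = {p_normal:.4f}")
print(f"fat-tailed sample: JB = {jb_fat:10.2f}, p-value = {p_fat:.4f}")
```

A small p-value suggests rejecting the hypothesis of normally distributed returns. With real data, one would typically rely on tested library routines, such as those in `scipy.stats`, rather than a hand-rolled statistic like this one.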
Many aspects in this chapter relate to date and/or time information. Refer to Appendix C