Python for Finance: Analyze Big Financial Data


Chapter 6. Financial Time Series


The only reason for time is so that everything doesn’t happen at once.

— Albert Einstein

One of the most important types of data one encounters in finance is the financial time series: data indexed by date and/or time. For example, stock prices are financial time series data. Similarly, the USD-EUR exchange rate is a financial time series; the rate is quoted at brief intervals of time, and a collection of such quotes is a time series of exchange rates.
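To make "data indexed by date and/or time" concrete, here is a minimal sketch using a pandas Series with a DatetimeIndex. The prices and dates are made up for illustration; they are not taken from any real data source.

```python
import pandas as pd

# Hypothetical closing prices; the values are invented for illustration.
dates = pd.to_datetime(["2014-01-02", "2014-01-03", "2014-01-06"])
prices = pd.Series([100.0, 101.5, 99.8], index=dates, name="close")

# Each observation is identified by its date, not by its position.
price_on_day = prices.loc[pd.Timestamp("2014-01-03")]

# Label-based slicing on dates includes both endpoints.
first_two = prices.loc["2014-01-02":"2014-01-03"]

print(prices)
print(price_on_day)
print(len(first_two))
```

The point of the date index is that selection, slicing, and alignment are expressed in terms of time rather than row numbers.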


There is no financial discipline that gets by without considering time an important factor. In this respect, finance is much like physics and other sciences. The major tool for working with time series data in Python is the pandas library. Wes McKinney, the main author of pandas, started developing the library when working as an analyst at AQR Capital Management, a large hedge fund. It is safe to say that pandas has been designed from the ground up to work with financial time series. As this chapter demonstrates, the main inspiration for the fundamental classes, such as the DataFrame and Series classes, is drawn from the R statistical analysis language, which without doubt has a strength in that kind of modeling and analysis.


The chapter is mainly based on a couple of examples drawn from a financial context. It proceeds along the following lines:


First and second steps


We start exploring the capabilities of pandas by using very simple and small data sets; we then proceed by using a NumPy ndarray object and transforming this to a DataFrame object. As we go, basic analytics and visualization capabilities are illustrated.


Data from the Web


pandas allows us to conveniently retrieve data from the Web (e.g., from Yahoo! Finance) and to analyze such data in many ways.


Using data from CSV files


Comma-separated value (CSV) files represent a global standard for the exchange of financial time series data; pandas makes reading data from such files an efficient task. Using data for two indices, we implement a regression analysis with pandas.


High-frequency data


In recent years, available financial data has increasingly shifted from daily quotes to tick data. The volume of tick data for a single stock on a single day regularly surpasses the volume of daily data collected over 30 years.
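The first step in the outline above, moving from a NumPy ndarray to a DataFrame, might look roughly like the following sketch. The array shape, column names, start date, and business-day frequency are all assumptions chosen for illustration, not the chapter's own data.

```python
import numpy as np
import pandas as pd

# Pseudo-random numbers standing in for real data (assumption for illustration).
rng = np.random.default_rng(seed=0)
a = rng.standard_normal((9, 4))  # 9 observations of 4 variables

# Wrap the array in a DataFrame and give the columns names (hypothetical labels).
df = pd.DataFrame(a, columns=["No1", "No2", "No3", "No4"])

# Attach a DatetimeIndex so that the rows become a time series.
df.index = pd.date_range("2015-01-01", periods=9, freq="B")

# Basic analytics operate column-wise by default.
summary = df.describe()   # count, mean, std, min, quartiles, max
cumulative = df.cumsum()  # running sum along the time axis

print(df.head())
print(summary)
```

Once the data carries a date index, visualization is typically a single call such as `df.cumsum().plot()`, which requires matplotlib to be installed.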



All financial time series data contains date and/or time information, by definition. Appendix C provides an overview of how to handle such data with Python, NumPy, and pandas, and of how to convert between the typical date-time object types.
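As a rough illustration of what converting between date-time object types involves, the sketch below moves one timestamp between Python's `datetime`, NumPy's `datetime64`, and pandas' `Timestamp`. It is not taken from Appendix C, and the date used is arbitrary.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# An arbitrary point in time as a standard-library datetime object.
dt = datetime(2014, 1, 2, 9, 30)

# datetime -> pandas Timestamp and NumPy datetime64
ts = pd.Timestamp(dt)
np_dt = np.datetime64(dt)

# pandas Timestamp -> datetime and datetime64
back_from_ts = ts.to_pydatetime()
np_from_ts = ts.to_datetime64()

# NumPy datetime64 -> pandas Timestamp -> datetime
ts_from_np = pd.Timestamp(np_dt)
back_from_np = ts_from_np.to_pydatetime()

print(type(ts), type(np_dt), type(back_from_np))
```

Going through `pd.Timestamp` on the way back from `datetime64` avoids having to reason about the NumPy value's time unit.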
