Python for Finance: Analyze Big Financial Data

In [25]: content = resp.read()
         content[:100]
           # first 100 characters of the file
Out[25]: '<!doctype html>\n<html lang="en">\n\n\t<head>\n\t\t<meta charset="utf-8">\n\n\t\t<title>Dr. Yves J. Hilpisch \xe2\x80'

Once you have the content of a particular web page, there are many potential use cases. You might want to look up certain information, for example. You might know that you can find the email address on the page by looking for E (in this very particular case). Since content is a string object, you can apply the find method to look for E: [52]

In [26]: index = content.find(' E ')
         index
Out[26]: 2071

Equipped with the index value for the information you are looking for, you can inspect the subsequent characters of the object:

In [27]: content[index:index + 29]
Out[27]: ' E contact [at] dyjh [dot] de'
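
The find-then-slice pattern above can be wrapped in a small helper that also covers the case where the marker is missing, since find returns -1 rather than raising an error. This is only a minimal sketch: it assumes Python 3 (where read returns bytes), the helper name extract_after is hypothetical, and the sample page is a made-up stand-in, not the page used in the text.

```python
# Hypothetical stand-in for a downloaded page (not the real page from the text).
content = b"<html><body><p> E contact [at] example [dot] org</p></body></html>"


def extract_after(page, marker, length):
    """Return `length` bytes starting at the first `marker`, or None if absent."""
    index = page.find(marker)
    if index == -1:  # find reports "not found" as -1
        return None
    return page[index:index + length]


snippet = extract_after(content, b" E ", 33)
missing = extract_after(content, b" F ", 33)

print(snippet)  # the marker plus the characters that follow it
print(missing)  # None, because b" F " does not occur in the page
```

Checking for -1 matters because a slice such as content[-1:28] would otherwise silently return an empty or misleading result.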

Once you are finished, you should again close the connection to the server:


In [28]: http.close()

urllib


There is another Python library that supports the use of different web protocols. It is called urllib. There is also a related library called urllib2. Both libraries are designed to work with arbitrary web resources, in the spirit of the "uniform" in URL (uniform resource locator). [53] A standard use case, for example, is to retrieve files, like CSV data files, via the Web. Begin by importing urllib:

In [29]: import urllib

The application of the library's functions resembles that of both ftplib and httplib. Of course, we need a URL representing the web resource of interest (HTTP or FTP server, in general). For this example, we use the URL of Yahoo! Finance to retrieve stock price information in CSV format:

In [30]: url = 'http://ichart.finance.yahoo.com/table.csv?g=d&ignore=.csv'
         url += '&s=YHOO&a=01&b=1&c=2014&d=02&e=6&f=2014'
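
Concatenating the query string by hand works, but it is easy to drop an ampersand or mistype a parameter. As a possible alternative, the following sketch builds the same URL from a list of key-value pairs with urlencode. It assumes Python 3, where this function lives in urllib.parse; the parameter names and values are copied verbatim from the URL above, and the variable names base and params are illustrative.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base address and query parameters, taken verbatim from the URL in the text.
base = "http://ichart.finance.yahoo.com/table.csv"
params = [
    ("g", "d"), ("ignore", ".csv"), ("s", "YHOO"),
    ("a", "01"), ("b", "1"), ("c", "2014"),
    ("d", "02"), ("e", "6"), ("f", "2014"),
]

# urlencode joins the pairs with "&" and escapes unsafe characters.
url = base + "?" + urlencode(params)
print(url)

# Parsing the query string back recovers the individual parameters.
query = parse_qs(urlsplit(url).query)
print(query["s"])  # the ticker symbol, as a one-element list
```

A list of pairs is used instead of a dictionary so that the parameter order matches the hand-built string exactly.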

Next, one has to establish a connection to the resource:


In [31]: connect = urllib.urlopen(url)

With the connection established, read out the content by calling the read method on the connection object:

In [32]: data = connect.read()
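
The open-then-read steps can also be tried without any network access. The sketch below assumes Python 3, where urlopen is found in urllib.request and read returns bytes rather than a string. To keep it self-contained, it opens a data: URL that embeds a tiny CSV snippet (one row taken from the output shown in this section) instead of contacting a server; the names csv_text and data_url are illustrative.

```python
from urllib.parse import quote
from urllib.request import urlopen

# A tiny CSV payload embedded in a data: URL, so no server is contacted.
csv_text = "Date,Close\n2014-03-06,39.66\n"
data_url = "data:text/csv," + quote(csv_text)

# The with statement closes the connection object automatically.
with urlopen(data_url) as connect:
    raw = connect.read()  # bytes in Python 3

data = raw.decode("ascii")  # convert to text before further processing
print(data)
```

The same two calls apply to an HTTP or FTP address; only the URL string changes, which is the point of a uniform interface.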

The result in this case is historical stock price information for Yahoo! itself:


In [33]: print data
Out[33]: Date,Open,High,Low,Close,Volume,Adj Close
2014-03-06,39.60,39.98,39.50,39.66,10626700,39.66
2014-03-05,39.83,40.15,39.19,39.50,12536800,39.50
2014-03-04,38.76,39.79,38.68,39.63,16139400,39.63
2014-03-03,37.65,38.66,37.43,38.25,14714700,38.25
2014-02-28,38.55,39.38,38.22,38.67,16957100,38.67
2014-02-27,37.80,38.48,37.74,38.47,15489400,38.47
2014-02-26,37.35,38.10,37.34,37.62,15778900,37.62
2014-02-25,37.48,37.58,37.02,37.26,9756900,37.26
2014-02-24,37.23,37.71,36.82,37.42,15738900,37.42
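
Since the result is plain CSV text, it can be turned into structured records with the standard csv module. The following is a minimal sketch, assuming Python 3; it uses the first three rows of the output above as a literal string so that it runs without a download, and the names rows, closes, and highest are illustrative.

```python
import csv
import io

# First rows of the output shown above, as a literal string.
data = """Date,Open,High,Low,Close,Volume,Adj Close
2014-03-06,39.60,39.98,39.50,39.66,10626700,39.66
2014-03-05,39.83,40.15,39.19,39.50,12536800,39.50
2014-03-04,38.76,39.79,38.68,39.63,16139400,39.63
"""

# DictReader uses the header line as keys for every following row.
rows = list(csv.DictReader(io.StringIO(data)))

# All values arrive as strings, so convert before doing arithmetic.
closes = [float(row["Close"]) for row in rows]
highest = max(rows, key=lambda row: float(row["High"]))

print(closes)
print(highest["Date"])  # date of the row with the largest High value
```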