
But Requests also performs automatic decoding for us. To get the decoded content,
do this:





>>> response.text
'<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">\n<html lang="en">\n\n
...


Notice that this is now a str rather than bytes. The Requests library uses the values in
the headers to choose a character set and decode the content to Unicode for
us. If it can't get a character set from the headers, then it uses the chardet library
(http://pypi.python.org/pypi/chardet) to make an estimate from the content
itself. We can see which encoding Requests has chosen here:





>>> response.encoding
'ISO-8859-1'

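To see where that value comes from, we can compare the character set declared in the Content-Type header with the guess that Requests makes from the body itself, which it exposes as apparent_encoding. The following is only a minimal sketch, using an arbitrary URL as an example:

import requests

response = requests.get('http://www.debian.org')

# The charset declared by the server, if any, is part of the Content-Type header.
print(response.headers.get('Content-Type'))

# encoding is what Requests will use when decoding response.text;
# apparent_encoding is the chardet-based estimate made from the raw bytes.
print(response.encoding)
print(response.apparent_encoding)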

We can even ask it to change the encoding that it has used:





>>> response.encoding = 'utf-8'





After changing the encoding, subsequent references to the text attribute for this
response will return the content decoded by using the new encoding setting.
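
To check this, the same response can be decoded twice with different settings; a minimal sketch, again using an arbitrary URL as an example:

import requests

response = requests.get('http://www.debian.org')

# The raw bytes are kept in response.content; response.text decodes them
# each time it is accessed, using the current value of response.encoding.
first = response.text

response.encoding = 'utf-8'    # override whatever Requests detected
second = response.text         # decoded again with the new setting

# The two strings differ only where the encodings disagree about the bytes.
print(first == second)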


The Requests library automatically handles cookies. Give the following a try:





>>> response = requests.get('http://www.github.com')
>>> print(response.cookies)
<<class 'requests.cookies.RequestsCookieJar'>[<Cookie logged_in=no for .github.com/>, <Cookie _gh_sess=eyJzZxNz... for .github.com/>]>

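The returned RequestsCookieJar can also be read like a dictionary, so individual cookie values are easy to pull out. A small sketch, using the logged_in cookie shown in the output above:

import requests

response = requests.get('http://www.github.com')

# RequestsCookieJar supports dict-style access by cookie name.
print(response.cookies.get('logged_in'))    # e.g. 'no'

# Or convert the whole jar to a plain dict of name/value pairs.
print(dict(response.cookies))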

The Requests library also has a Session class, which allows the reuse of
cookies; this is similar to using the http.cookiejar module's CookieJar and the
urllib.request module's HTTPCookieProcessor objects. Do the following to reuse the
cookies in subsequent requests:





>>> s = requests.Session()
>>> s.get('http://www.google.com')
>>> response = s.get('http://google.com/preferences')

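The same idea written out as a short script; the URLs are only examples, and the Cookie header of the second request is printed to show that the session's jar is being sent back automatically:

import requests

# A Session keeps a single cookie jar and reuses it for every request made on it.
with requests.Session() as s:
    s.get('http://www.google.com')      # any cookies set here land in s.cookies
    print(s.cookies)                    # the session's jar

    response = s.get('http://google.com/preferences')
    # The prepared request records what was actually sent, including cookies.
    print(response.request.headers.get('Cookie'))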


