Foundations of Python Network Programming

Chapter 9 ■ HTTP Clients


Note that an HTTPConnection object that gets hung up on will not return an error, but it will silently create a new
TCP connection to replace the old one when you ask it to perform another request. The HTTPSConnection class offers
a TLS-protected version of the same object.
The Requests library Session object, by contrast, is backed by a third-party package named urllib3 that will
maintain a connection pool of open connections to HTTP servers with which you have recently communicated so that
it can attempt to reuse them automatically when you ask it for another resource from the same site.
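
The Standard Library behavior described above can be observed with a small experiment. The following is only a sketch under stated assumptions: the HelloHandler class, the fetch_twice() function, and the throwaway server on 127.0.0.1 are hypothetical names invented for this illustration, not code from the chapter. It sends two requests through one HTTPConnection object and records the local port of the socket each time, so that reuse of the TCP connection becomes visible.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that exists only for this demonstration."""

    protocol_version = "HTTP/1.1"  # lets the server keep connections open

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default per-request logging


def fetch_twice():
    """Send two GET requests through a single HTTPConnection object."""
    # Port 0 asks the operating system for any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    conn = http.client.HTTPConnection(host, port, timeout=5)
    results = []
    try:
        for path in ("/first", "/second"):
            conn.request("GET", path)
            response = conn.getresponse()
            # The body has to be read in full before the next request.
            body = response.read()
            local_port = conn.sock.getsockname()[1]
            results.append((response.status, body, local_port))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results


if __name__ == "__main__":
    for status, body, local_port in fetch_twice():
        print(status, body, local_port)
```

If both requests report the same local port, they traveled over the same TCP connection. With Requests, the analogous pattern would be to create one requests.Session() and call its get() method repeatedly, leaving the pooling to urllib3.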


Summary


The HTTP protocol is used to fetch resources based on their hostname and path. The urllib client in the Standard
Library will work in simple cases, but it is underpowered and lacks the features of Requests, an Internet sensation of a
Python library that is the go-to tool of programmers wanting to fetch information from the Web.
HTTP runs in the clear on port 80, under the protection of TLS on port 443, and it uses the same basic layout on
the wire for the client request and the server response: a line of information followed by name-value headers, finally
followed by a blank line, and then, optionally, a body that can be encoded and delimited in several different ways. The
client always speaks first, sending a request, and then it waits until the server has completed a response.
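
The shared layout of requests and responses can be sketched in a few lines of Python. This is a minimal illustration rather than a real HTTP implementation: build_request() and parse_response() are hypothetical helpers, and the parser assumes that the entire response is already in memory and that the body is delimited by nothing more elaborate than the end of the data.

```python
def build_request(method, path, host, headers=None, body=b""):
    """Serialize a request: request line, headers, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_response(raw):
    """Split a complete response into status, reason, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    parts = status_line.split(" ", 2)
    code = int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them here.
        headers[name.strip().lower()] = value.strip()
    return code, reason, headers, body


if __name__ == "__main__":
    print(build_request("GET", "/index.html", "example.com"))
    raw = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    print(parse_response(raw))
```

Both directions share the same shape: one line of information, then headers, then a blank line, then an optional body.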
The most common HTTP methods are GET, for fetching a resource, and POST, for sending updated information
to a server. Several other methods exist, but they each tend to be either something like GET or something like POST.
The server returns a status code with each response, indicating whether the request has simply succeeded, has simply
failed, or requires the client to be redirected to load another resource in order to finish.
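
The three broad outcomes can be told apart from the leading digit of the status code. The classify() function below is a hypothetical helper, sketched here on the assumption that a caller only needs the general category; the HTTPStatus enumeration comes from the Standard Library's http module.

```python
from http import HTTPStatus


def classify(status):
    """Map a status code to the broad outcome it represents."""
    code = int(status)
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "informational or unknown"


if __name__ == "__main__":
    for status in (HTTPStatus.OK, HTTPStatus.MOVED_PERMANENTLY,
                   HTTPStatus.NOT_FOUND, HTTPStatus.SERVICE_UNAVAILABLE):
        print(int(status), status.phrase, "->", classify(status))
```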
There are several concentric layers of design built into HTTP. Caching headers might allow a resource to be
cached and reused repeatedly on a client without being fetched again, or the headers might let the server skip
redelivering a resource that has not changed. Both optimizations can be crucial to the performance of busy sites.
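
The second optimization, in which the server skips redelivering an unchanged resource, can be sketched from the server's side as follows. The respond() function is a hypothetical helper, and it makes a simplifying assumption: the client sends at most one entity tag in If-None-Match, whereas a real request may carry several tags or weak validators. The max-age value is likewise arbitrary.

```python
def respond(request_headers, etag, body):
    """Return (status, headers, body) for a GET of one resource."""
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is current, so no body is sent.
        return 304, {"ETag": etag}, b""
    # Otherwise deliver the resource and say how long it may be cached.
    headers = {"ETag": etag, "Cache-Control": "max-age=3600"}
    return 200, headers, body


if __name__ == "__main__":
    status, headers, body = respond({}, '"v1"', b"<html>...</html>")
    print(status, headers)
    # The client repeats the request, quoting the tag it was given.
    revalidation = {"If-None-Match": headers["ETag"]}
    print(respond(revalidation, '"v1"', b"<html>...</html>"))
```

The first optimization needs no server involvement at all: while the max-age period lasts, a client is free to reuse its copy without making any request.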
Content negotiation holds the promise of tailoring data formats and human languages to the exact preferences of
the client and the human using it, but it runs into problems in practice that make it less than universally employed.
Built-in HTTP authentication was a poor design for interactive use and has been replaced by custom login pages and
cookies, but Basic Auth is still sometimes used to authenticate requests to TLS-secured APIs.
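
Basic Auth is simple enough that its header can be built by hand, which also shows why it is safe only over TLS: the credentials are merely Base64-encoded, not encrypted. The basic_auth_header() function is a hypothetical helper written for this sketch, and encoding the credentials as UTF-8 is an assumption about what the server expects.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header used by HTTP Basic Auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}


if __name__ == "__main__":
    print(basic_auth_header("Aladdin", "open sesame"))
```

Anyone who can read the header can recover the password with base64.b64decode(), so the protection has to come from the TLS layer underneath.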
HTTP/1.1 connections can survive and be reused by default, and the Requests library is careful to do so whenever
possible.
In the next chapter, you will take all that you have learned here and reverse the perspective, looking at the
task of programming from the point of view of writing a server.
