Foundations of Python Network Programming

Chapter 9 ■ HTTP Clients


libraries but need to perform advanced HTTP operations, then you will want to consult not only the urllib library’s
own documentation but also two other resources: its Python Module of the Week entry and the chapter on HTTP in
the online Dive Into Python book.


http://pymotw.com/2/urllib2/index.html#module-urllib2
http://www.diveintopython.net/http_web_services/index.html


These resources were both written in the days of Python 2 and therefore call the library urllib2 instead of
urllib.request, but you should find that they still work as a basic guide to urllib’s awkward and outdated
object-oriented design.


Ports, Encryption, and Framing


Port 80 is the standard port for plain-text HTTP conversations. Port 443 is the standard port for clients that want first
to negotiate an encrypted TLS conversation (see Chapter 6) and then begin speaking HTTP only once the encryption
has been established—a variant of the protocol that is named Hypertext Transfer Protocol Secure (HTTPS). Inside the
encrypted channel, HTTP is spoken exactly as it would be normally over an unencrypted socket.
As you will learn in Chapter 11, the choice between HTTP and HTTPS, and between a standard and a nonstandard
port, is generally expressed, from the user's point of view, in the URLs that they construct or are given.
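
As a minimal sketch of how a URL carries both choices, the Standard Library's urllib.parse.urlsplit() can report the scheme and any explicit port. The helper name effective_port() and the example URLs are assumptions made for illustration, not code from this chapter.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed above.
DEFAULT_PORTS = {'http': 80, 'https': 443}

def effective_port(url):
    """Return the (scheme, port) pair that a client would use for this URL."""
    parts = urlsplit(url)
    # parts.port is None when the URL does not name an explicit port.
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.scheme, port

for url in ['http://example.com/ip',
            'https://example.com/ip',
            'http://localhost:8000/ip']:
    print(url, '->', effective_port(url))
```

The third URL names the nonstandard port 8000 explicitly, so the default for its scheme is never consulted.
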
Remember that the purpose of TLS is not only to protect traffic from eavesdropping but also to verify the identity
of the server to which the client is connecting (moreover, if a client certificate is presented, to allow the server to verify
the client identity in return). Never use an HTTPS client that does not perform a check of whether the certificate
presented by the server matches the hostname to which the client is attempting to connect. All of the clients covered
in this chapter do perform such a check.
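
If you ever build an HTTPS connection by hand, one way to confirm that the hostname check is active is to inspect the TLS context before using it. The following is only a sketch: the hostname example.com is a placeholder, and no connection is actually opened.

```python
import http.client
import ssl

# create_default_context() builds a client-side context that requires a
# valid server certificate and checks it against the hostname.
context = ssl.create_default_context()

print(context.verify_mode == ssl.CERT_REQUIRED)   # certificate is mandatory
print(context.check_hostname)                     # hostname must match

# Placeholder host; constructing the object does not open a socket.
# The connection is only made when a request is issued.
conn = http.client.HTTPSConnection('example.com', context=context)
```

Both print() calls should report True. A context on which check_hostname has been switched off, or whose verify_mode is ssl.CERT_NONE, gives you encryption without any assurance about who is on the other end.
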
In HTTP, it is the client that speaks first, transmitting a request that names a document. Once the entire request
is on the wire, the client then waits until it has received a complete response from the server that either indicates an
error condition or provides information about the document that the client has requested. The client, at least in the
HTTP/1.1 version of the protocol that is popular today, is not permitted to begin transmitting a second request over
the same socket until the response is finished.
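
The exchange just described can be sketched with a plain socket. So that the example needs no outside network, it starts a tiny throwaway server from the Standard Library on a port chosen by the operating system. The Handler class, the fetch() function, and the "hello" body are all assumptions made for illustration.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    """Tiny stand-in server so that the sketch is self-contained."""
    protocol_version = 'HTTP/1.1'

    def do_GET(self):
        body = b'hello\n'
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.send_header('Content-Length', str(len(body)))
        self.send_header('Connection', 'close')
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet

def fetch(host, port, path):
    """Send one request, then read until the server closes the socket."""
    request = ('GET {} HTTP/1.1\r\n'
               'Host: {}:{}\r\n'
               'Connection: close\r\n'
               '\r\n').format(path, host, port).encode('ascii')
    with socket.create_connection((host, port)) as sock:
        sock.sendall(request)      # the client speaks first
        chunks = []
        while True:                # then it waits for the whole response
            data = sock.recv(4096)
            if not data:           # empty read: the server has closed
                break
            chunks.append(data)
    return b''.join(chunks)

server = HTTPServer(('127.0.0.1', 0), Handler)   # port 0: let the OS choose
threading.Thread(target=server.serve_forever, daemon=True).start()
try:
    reply = fetch('127.0.0.1', server.server_port, '/')
    print(reply.decode('ascii'))
finally:
    server.shutdown()
    server.server_close()
```

Because the request asks for "Connection: close", the client here can simply read until the socket closes. A client that wants to reuse the socket for a second request must instead work out where the response ends.
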
There is an important symmetry built into HTTP: the request and response use the same rules to establish
formatting and framing. Here is an example request and response to which you can refer as you read the description
of the protocol that follows:


GET /ip HTTP/1.1
User-Agent: curl/7.35.0
Host: localhost:8000
Accept: */*


HTTP/1.1 200 OK
Server: gunicorn/19.1.1
Date: Sat, 20 Sep 2014 00:18:00 GMT
Connection: close
Content-Type: application/json
Content-Length: 27
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true


{
  "origin": "127.0.0.1"
}
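

To make the framing of the example response concrete, here is a rough sketch that splits it into its first line, headers, and body. It assumes CRLF line endings on the wire and a body indented by two spaces, which is what makes the body exactly 27 bytes long. The function name split_message() is hypothetical, and the sketch handles only the Content-Length case, not every way that HTTP can delimit a body.

```python
# Part of the example response, written out as the bytes sent on the wire.
RESPONSE = (
    b'HTTP/1.1 200 OK\r\n'
    b'Server: gunicorn/19.1.1\r\n'
    b'Content-Type: application/json\r\n'
    b'Content-Length: 27\r\n'
    b'\r\n'
    b'{\n  "origin": "127.0.0.1"\n}'
)

def split_message(raw):
    """Split a raw HTTP message into (first line, headers dict, body bytes)."""
    # A blank line (CRLF CRLF) marks the end of the headers.
    head, _, rest = raw.partition(b'\r\n\r\n')
    lines = head.decode('ascii').split('\r\n')
    first_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(':')
        # Header names are case-insensitive, so store them in lowercase.
        headers[name.strip().lower()] = value.strip()
    # Content-Length says how many body bytes belong to this message.
    length = int(headers.get('content-length', 0))
    return first_line, headers, rest[:length]

first_line, headers, body = split_message(RESPONSE)
print(first_line)
print(headers['content-length'], len(body))
```

The same function works on the request, since both directions share one layout: a first line, headers, a blank line, and an optional body.
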
