Chapter 9 ■ HTTP Clients
You can read RFC 7235 to learn about the most recent HTTP authentication mechanisms. The earliest attempts
were not encouraging.
The first mechanism, Basic Authentication (or “Basic Auth”), involved the server including a string called a realm
in the WWW-Authenticate header of its 401 Unauthorized response. The realm string allows a single server to protect
different parts of its document tree with different passwords because the browser can keep track of which username
and password go with which realm.
The client then repeats its request with an Authorization header giving the username and password (base-64
encoded, as though that helps), and it is ideally given a 200 reply.
GET / HTTP/1.1
...
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="engineering team"
...
GET / HTTP/1.1
Authorization: Basic YnJhbmRvbjphdGlnZG5nbmF0d3dhbA==
...
HTTP/1.1 200 OK
...
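The Authorization value in the exchange above is nothing more than the string username:password run through Base64. The following minimal sketch, using only the Standard Library, rebuilds that header and then reverses it, which shows why the encoding offers no secrecy. The helper name basic_auth_header is invented for this illustration.

```python
import base64

def basic_auth_header(username, password):
    """Build the value of a Basic Auth Authorization header (illustrative helper)."""
    credentials = '{}:{}'.format(username, password).encode('utf-8')
    return 'Basic ' + base64.b64encode(credentials).decode('ascii')

header = basic_auth_header('brandon', 'atigdngnatwwal')
print(header)

# Anyone who can see the header can recover the credentials just as easily.
encoded = header.split(' ', 1)[1]
print(base64.b64decode(encoded).decode('utf-8'))
```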
Passing the username and password in the clear sounds unconscionable today, but in that earlier and more
innocent era, there were as yet no wireless networks, and switching equipment tended to be solid-state instead of
running software that could be compromised. As protocol designers began to contemplate the dangers, an updated
“Digest access authentication” scheme was created where the server issues a challenge and the client replies with
an MD5 hash of the challenge-plus-password instead. But the result is still something of a disaster. Even with Digest
authentication in use, your username is still visible in the clear. All form data submitted and all resources returned
from the web site are visible in the clear. An ambitious enough attacker can then launch a man-in-the-middle attack:
mistaking the attacker for the server, you sign a challenge that the attacker has just received from the real server,
and the attacker can then turn around and use your signed response to impersonate you.
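To make the challenge-and-hash idea concrete, here is a simplified sketch of how a client might compute a Digest response. It assumes the original form of the scheme, without the qop, nc, and cnonce fields that later revisions added. The function names and the nonce value are invented for this illustration. Note that the result depends only on the password and on values that travel in the clear, which is why a response computed for a relayed challenge is just as useful to an attacker as to you.

```python
import hashlib

def md5_hex(text):
    """Return the hexadecimal MD5 digest of a string."""
    return hashlib.md5(text.encode('utf-8')).hexdigest()

def digest_response(username, realm, password, method, uri, nonce):
    """Compute a simplified Digest response (no qop, nc, or cnonce)."""
    ha1 = md5_hex('{}:{}:{}'.format(username, realm, password))
    ha2 = md5_hex('{}:{}'.format(method, uri))
    return md5_hex('{}:{}:{}'.format(ha1, nonce, ha2))

# The nonce is the server's challenge; this value is made up.
nonce = 'dcd98b7102dd2f0e8b11d0f600bfb0c093'
response = digest_response('brandon', 'engineering team', 'atigdngnatwwal',
                           'GET', '/', nonce)
print(response)
```

The password itself never crosses the wire, but the username, the realm, and the nonce all do.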
Web sites needed real security if banks wanted to show you your balance and if Amazon wanted you to type in
your credit card information. Thus, SSL was invented to create HTTPS, and it was followed by the various versions of
TLS that you enjoy today, as detailed in Chapter 6.
The addition of TLS meant, in principle, that there was no longer anything wrong with Basic Auth. Many
simple HTTPS-protected APIs and web applications use it today. While urllib supports it only if you build a series of
objects to install in your URL opener (see the documentation for details), Requests supports Basic Auth with a single
keyword parameter.
>>> r = requests.get('http://example.com/api/',
...                  auth=('brandon', 'atigdngnatwwal'))
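For comparison, the series of objects that urllib requires might look something like the following sketch. It assumes Python 3, where the relevant classes live in urllib.request. The URL is a placeholder, and the fetch() helper is invented for this illustration and is never called here, so the code performs no network access when run.

```python
import urllib.request

API_URL = 'https://example.com/api/'  # placeholder URL, not a real API

# A password manager maps URL prefixes to credentials; a realm of None
# tells it to use these credentials for any realm the server names.
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, API_URL, 'brandon', 'atigdngnatwwal')

# The handler answers 401 responses by retrying with an Authorization header.
auth_handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
opener = urllib.request.build_opener(auth_handler)

def fetch(url=API_URL):
    """Fetch a URL through the authenticating opener (hypothetical helper)."""
    with opener.open(url) as response:
        return response.read()
```

Three objects have to be wired together before the first request can be made, which is the verbosity that the single auth keyword in Requests avoids.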
You can also prepare a Requests Session for authentication to avoid having to repeat it yourself with every get()
or post().
>>> s = requests.Session()
>>> s.auth = 'brandon', 'atigdngnatwwal'
>>> s.get('http://httpbin.org/basic-auth/brandon/atigdngnatwwal')
<Response [200]>