A Functional Approach to Web Services
To avoid the details of any specific web framework, we'll focus on the Web Server
Gateway Interface (WSGI) design pattern. This will allow us to implement a simple
web server. A great deal of information is available at the following link:
http://wsgi.readthedocs.org/en/latest/
Some important background on WSGI can be found at
https://www.python.org/dev/peps/pep-0333/
We'll start by looking at the HTTP protocol. From there, we can consider how servers
such as Apache httpd implement this protocol and see how mod_wsgi becomes
a sensible extension to a base server. With this background, we can look at the
functional nature of WSGI and how we can leverage functional design to implement
sophisticated web search and retrieval tools.
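As a preview of that functional nature, here is a minimal sketch of a WSGI application, assuming the interface described in PEP 333: a callable that accepts a request environment and a start_response function and returns an iterable of bytes. The names hello_app and call_app are hypothetical and used only for illustration; call_app is a small test harness, not part of any library.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """Hypothetical WSGI application: a function of the request environment."""
    body = f"Hello from {environ['PATH_INFO']}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path):
    """Hypothetical harness: call a WSGI app directly, with no server involved."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the application reports instead of writing to a socket.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_app(hello_app, "/demo")
print(status, body)
```

Because the application is an ordinary function, it can be exercised without starting a server, which is part of what makes the functional view convenient.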
The HTTP request-response model
The essential HTTP protocol is, ideally, stateless. A user agent or client can take
a functional view of the protocol. We can build a client using the http.client
or urllib library. An HTTP user agent essentially executes something similar
to the following:
import urllib.request
with urllib.request.urlopen("http://slott-softwarearchitect.blogspot.com") as response:
    print(response.read())
A program like wget or curl does this at the command line; the URL is taken from
the arguments. A browser does this in response to the user pointing and clicking;
the URL is taken from the user's actions, in particular, the action of clicking on linked
text or images.
The practical considerations of the internetworking protocols, however, lead to some
implementation details that are stateful. Some of the HTTP status codes indicate
that an additional action on the part of the user agent is required.
Many status codes in the 3xx range indicate that the requested resource has
been moved. The user agent is then required to request a new location based
on information sent in the Location header. The 401 status code indicates that
authentication is required; the user agent can respond with an authorization header
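that contains credentials for access to the server. The urllib.request module handles
much of this stateful overhead: it follows redirects automatically and can supply
credentials once an authentication handler is installed. The http.client library
doesn't automatically follow 3xx redirect status codes.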
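The difference between the two libraries can be seen with a small experiment. The following is a sketch only, assuming a throwaway local server whose /old path answers with a 302 redirect to /new. The handler class, the paths, and the helper function names are all hypothetical; the proxy-free opener is there only so that the request reaches the local server directly.

```python
import http.client
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class RedirectingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: /old redirects to /new; any other path returns a body."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"moved here"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging for the demo


def get_with_http_client(host, port, path):
    """One request, one response: the redirect is reported, not followed."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def get_with_urllib(url):
    """urllib.request follows the redirect before returning a response."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as response:
        return response.status, response.geturl(), response.read()


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), RedirectingHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
host, port = server.server_address

raw_result = get_with_http_client(host, port, "/old")
followed_result = get_with_urllib(f"http://{host}:{port}/old")

server.shutdown()
server.server_close()
print(raw_result)
print(followed_result)
```

With http.client, the caller sees the 302 status and the Location header and must decide what to do next. With urllib.request, the second request happens inside the library, so the caller sees only the final 200 response and the final URL.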