Foundations of Python Network Programming

Chapter 7 ■ Server Architecture
then the load balancer simply stops forwarding requests there until it comes back up, which can make server failures
nearly invisible to a large client base. The biggest Internet services combine these approaches: a load balancer and
server farm in each machine room with a public DNS name that returns the IP addresses for the load balancer whose
machine room appears to be closest to you geographically.
However simple or grandiose your service architecture, you will need some way of running your Python server
code on a physical or virtual machine, a process called deployment. There are two schools of thought regarding
deployment. The old-fashioned technique is to suit up every single server program you write with all of the features
of a service: double-forking to become a Unix daemon (or registering itself as a Windows service), arranging for
system-level logging, supporting a configuration file, and offering a mechanism by which it can be started up, shut
down, and restarted. You can do this either by using a third-party library that has solved these problems already or
by doing it all over again in your own code.
A competing approach has been popularized by manifestos like The Twelve-Factor App. They advocate a minimalist
approach in which each service is written as a normal program that runs in the foreground and makes no effort to
become a daemon. Such a program takes any configuration options that it needs from its environment (the os.environ
dictionary in Python) instead of expecting a system-wide configuration file. It connects to any back-end services that
the environment names. And it prints its logging messages directly to the screen—even through as naïve a mechanism
as Python’s own print() function. Network requests are accepted by opening and listening at whatever port the
environment configuration dictates.
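A service in this minimalist style might look like the following sketch. The variable name PORT, the default-free lookup, and the greeting it sends are illustrative choices of this example, not requirements of the twelve-factor approach; the essential habits are that configuration arrives through os.environ and that "logging" is nothing fancier than print().

```python
import os
import socket

def serve(max_requests=None):
    """A tiny twelve-factor-style TCP service.

    Configuration comes from the environment, not a config file; log
    messages go straight to standard output via print().  The
    max_requests parameter is only a convenience for testing; a real
    service would loop forever.
    """
    port = int(os.environ['PORT'])            # configuration from os.environ
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(('', port))
    listener.listen(1)
    print('Listening on port', port)          # "logging" is simply print()
    handled = 0
    while max_requests is None or handled < max_requests:
        sock, address = listener.accept()
        print('Connection from', address)
        sock.sendall(b'Hello, client\r\n')
        sock.close()
        handled += 1
    listener.close()
```

At a shell prompt, a developer can run such a program in the foreground with nothing more than an environment variable: `PORT=8000 python app.py`. Exactly the same invocation, wrapped in different scaffolding, serves in production.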
A service written in this minimalist style is easy for developers to run right at a shell prompt for testing. Yet it
can then be made into a daemon or system service or deployed to a web-scale server farm by simply surrounding
the application with the right scaffolding. The scaffolding could, for example, pull the environment variable settings
from a central configuration service, connect the application’s standard output and standard error to a remote logging
server, and restart the service if it either fails or seems to freeze up. Because the program itself does not know this and
is simply printing to standard output as usual, the programmer has the confidence that the service code is running in
production exactly as it runs when under development.
There are now large platform-as-a-service providers that will host such applications for you, spinning up dozens
or even hundreds of copies of your application behind a single public-facing domain name and TCP load balancer
and then aggregating all of the resulting logs for analysis. Some providers allow you to submit Python application code
directly. Others prefer that you bundle up your code, a Python interpreter, and any dependencies you need inside a
container (“Docker” containers in particular are becoming a popular mechanism) that can be tested on your own
laptop and then deployed, assuring you that your Python code will run in production from an image that is byte-for-
byte identical to the one you use in testing. Either way, you are absolved from writing a service that spawns multiple
processes itself; all redundancy/duplication of your service is handled by the platform.
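A container image for a service like this is typically described by a short recipe. The sketch below is a hypothetical Dockerfile; the base image tag, file names, and port are illustrative, not taken from any particular provider's requirements.

```dockerfile
# Hypothetical recipe bundling code, interpreter, and dependencies
# into one image that is byte-for-byte identical in test and production.
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY app.py ./
ENV PORT=8000
CMD ["python", "app.py"]
```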
More modest efforts at getting programmers out of the business of having to write stand-alone services have
long existed in the Python community. The popular supervisord tool is an excellent example. It can run one or more
copies of your program, divert your standard output and error to log files, restart a process if it fails, and even send
alerts if a service begins failing too frequently.
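A supervisord deployment is driven by a small INI-format configuration. The program section below is a sketch; the program name, command, and log paths are illustrative. (Running several copies is a matter of adding a numprocs setting together with a process_name template.)

```ini
; A hypothetical supervisord program section; all names and paths
; here are illustrative.
[program:myservice]
command=python /srv/myservice/app.py
environment=PORT="8000"
autorestart=true
stdout_logfile=/var/log/myservice/out.log
stderr_logfile=/var/log/myservice/err.log
```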
If, despite all of these temptations, you do decide to write a process that knows how to turn itself into a daemon,
you should find good patterns for doing so available in the Python community. A good starting point is PEP 3143
(available at http://python.org), whose section “Other daemon implementations” is a well-curated list of resources
on the steps required. The supervisord source code might also be of interest, along with the documentation for
Python’s Standard Library module logging.
Whether you have a stand-alone Python process or a platform-powered web-scale service, the question of how
you can most efficiently use an operating system network stack plus an operating system process to serve network
requests is the same. It is to this problem that you will turn your attention for the rest of the chapter, with the goal of
keeping the system as busy as possible so that clients wait as little as possible before having their network requests
answered.