Chapter 8 ■ Caches and Message Queues
digits match the filter string. Your pair of subscribers, then, are guaranteed to receive every bit string produced by the
bitsource since among their four filters is every possible combination of two leading binary digits.
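The guarantee follows from simple prefix arithmetic: every bit string begins with one of the four possible two-digit prefixes, so the two subscribers' filters between them cover every case. Here is a plain-Python illustration of that reasoning — the helper names are invented for this sketch and are not part of the zmq API, which performs this prefix matching inside the library:

```python
# Pure-Python illustration (not actual ØMQ code): a SUB socket delivers
# a message when its bytes start with one of the subscribed filters.
# Four two-digit filters together cover every possible bit string.
import random

FILTERS_A = ['00', '01']   # filters held by the first subscriber
FILTERS_B = ['10', '11']   # filters held by the second subscriber

def receivers(bitstring, filter_sets):
    """Return the indexes of the subscribers that would receive this string."""
    return [i for i, filters in enumerate(filter_sets)
            if any(bitstring.startswith(f) for f in filters)]

for _ in range(1000):
    bits = ''.join(random.choice('01') for _ in range(16))
    matched = receivers(bits, [FILTERS_A, FILTERS_B])
    assert len(matched) == 1   # exactly one subscriber matches each string
```

Note that if the two subscribers instead shared a filter, or left a prefix uncovered, messages would be duplicated or silently dropped — the filters partition the space exactly.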
The relationship between judge and pythagoras is a classic RPC request-and-reply where the client holding
the REQ socket must speak first in order to assign its message to one of the waiting agents that are attached to its
socket. (In this case, of course, only one agent is attached.) The messaging fabric automatically adds a return address
to the request behind the scenes. Once the agent is done with its work and replies, the return address can be used
to transmit the reply over the REP socket so that it will arrive at the correct client, even if dozens or hundreds are
currently attached.
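The return-address mechanism can be sketched in plain Python. The queue and mailbox structures below are hypothetical stand-ins for bookkeeping that ØMQ performs invisibly inside the messaging fabric; they are not part of the zmq API:

```python
# Sketch of the return-address idea behind REQ/REP routing; the names
# here are invented for illustration, not drawn from the zmq library.
from collections import deque

request_queue = deque()          # shared queue feeding the REP agent
mailboxes = {}                   # one reply mailbox per client

def req_send(client_id, payload):
    """Client side: the fabric stamps a return address onto each request."""
    mailboxes.setdefault(client_id, deque())
    request_queue.append((client_id, payload))

def rep_serve(handler):
    """Agent side: the stamped address routes the reply back correctly."""
    return_address, payload = request_queue.popleft()
    mailboxes[return_address].append(handler(payload))

# Dozens of clients could be attached; each reply finds its own sender.
req_send('client-1', (3, 4))
req_send('client-2', (5, 12))
rep_serve(lambda sides: (sides[0]**2 + sides[1]**2) ** 0.5)
rep_serve(lambda sides: (sides[0]**2 + sides[1]**2) ** 0.5)
print(mailboxes['client-1'][0])  # 5.0
print(mailboxes['client-2'][0])  # 13.0
```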
Finally, the tally worker illustrates the way that a push-pull arrangement guarantees that each item pushed will
be received by one, and only one, of the agents connected to the socket; if you were to start up several tally workers,
then each new datum from upstream would arrive at only one of them, and they would each converge separately on π.
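A round-robin scheduler captures the essential delivery guarantee. The following stdlib sketch simulates that behavior — the worker names and scheduling are hypothetical, and real PUSH sockets choose among connected peers inside the library:

```python
# Plain-Python sketch of PUSH/PULL delivery: the push side hands each
# message to exactly one connected pull worker, typically round-robin.
# This simulation is not the zmq API itself.
from itertools import cycle

workers = {'tally-1': [], 'tally-2': [], 'tally-3': []}
next_worker = cycle(workers)     # round-robin over the connected workers

def push(item):
    workers[next(next_worker)].append(item)

for n in range(9):
    push(n)

# Every item landed in exactly one worker's inbox.
print(sum(len(inbox) for inbox in workers.values()))  # 9
print(workers['tally-1'])                             # [0, 3, 6]
```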
Note that, unlike in all of the other socket programming featured in this book, this listing does not have to be
at all careful about whether bind() or connect() occurs first! This is a feature of ØMQ, which uses timeouts and
polling to keep retrying a failed connect() behind the scenes in case the endpoint described by the URL comes up
later. This makes it robust against agents that come and go while an application is running.
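The same retry-until-the-endpoint-appears behavior can be sketched with raw TCP sockets from the standard library. Here the listener deliberately does not bind() until well after the client starts calling connect(), and the `patient_connect()` helper is an invented name, not something zmq exposes:

```python
# Stdlib sketch of the retry behaviour ØMQ performs internally: keep
# re-attempting connect() until the endpoint finally comes up.
import socket
import threading
import time

# Grab a port that is currently free, then release it for the demo.
probe = socket.socket()
probe.bind(('127.0.0.1', 0))
port = probe.getsockname()[1]
probe.close()

def patient_connect(port, retries=50, delay=0.1):
    """Retry connect() until a listener appears, as ØMQ does for us."""
    for _ in range(retries):
        try:
            return socket.create_connection(('127.0.0.1', port))
        except OSError:
            time.sleep(delay)
    raise ConnectionError('endpoint never came up')

def late_server(port):
    time.sleep(0.5)              # the bind() happens *after* connect()
    listener = socket.socket()
    listener.bind(('127.0.0.1', port))
    listener.listen(1)
    conn, _ = listener.accept()
    conn.sendall(b'hello')
    conn.close()
    listener.close()

threading.Thread(target=late_server, args=(port,), daemon=True).start()
client = patient_connect(port)
data = b''
while True:                      # read until the server closes
    chunk = client.recv(1024)
    if not chunk:
        break
    data += chunk
client.close()
print(data)                      # b'hello'
```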
The resulting system of workers, when run, is able to compute π to about three digits on my laptop by the time
the program exits.
$ python queuepi.py
...
3.1406089633937735
This modest example may make ØMQ programming look overly simple. In real life, you will typically want more
sophisticated patterns than the ones provided here in order to assure the delivery of messages, persist them in case
they cannot yet be processed, and do flow control to make sure that a slow agent will not be overwhelmed by the
number of messages that eventually wind up queued and waiting for it. See the official documentation for extended
discussions of how to implement these patterns for a production service. In the end, many programmers find that a
full-fledged message broker like RabbitMQ, Qpid, or Redis behind Celery gives them the assurances that they want
with the least work and potential for mistakes.
Summary
Serving thousands or millions of customers has become a routine assignment for application developers in the
modern world. Several key technologies have emerged to help them meet this scale, and they can easily be accessed
from Python.
One popular service is Memcached, which combines the free RAM across all of the servers on which it is installed
into a single large LRU cache. As long as you have some procedure for invalidating or replacing entries that become
out of date—or are dealing with data that can be expired on a fixed, predictable schedule—Memcached can remove
a massive amount of load from your database or other back-end storage. It can be inserted at several different points
in your processing. Instead of saving the result of an expensive database query, for example, it might be even better
simply to cache the web widget that ultimately gets rendered.
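The combination of LRU eviction and fixed-schedule expiry can be sketched in a few lines of standard-library Python. The class below is a stand-in illustration of the caching policy, not the pymemcache client or the memcached wire protocol:

```python
# Minimal TTL + LRU cache sketch in the spirit of Memcached; this is
# an illustration of the policy, not the real memcached client API.
import time
from collections import OrderedDict

class TTLCache:
    def __init__(self, maxsize=2, ttl=30.0):
        self.maxsize, self.ttl = maxsize, ttl
        self.data = OrderedDict()          # key -> (expires_at, value)

    def set(self, key, value):
        self.data.pop(key, None)
        self.data[key] = (time.monotonic() + self.ttl, value)
        if len(self.data) > self.maxsize:  # evict the least recently used
            self.data.popitem(last=False)

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:  # expired on a fixed schedule
            del self.data[key]
            return None
        self.data.move_to_end(key)         # mark as most recently used
        return value

cache = TTLCache(maxsize=2)
cache.set('widget:home', '<div>rendered home widget</div>')
cache.set('widget:about', '<div>rendered about widget</div>')
cache.get('widget:home')                     # touch: now most recent
cache.set('widget:news', '<div>news</div>')  # evicts widget:about
print(cache.get('widget:about'))             # None
print(cache.get('widget:home') is not None)  # True
```

Note that the keys here cache fully rendered widgets rather than raw query results, matching the suggestion above that caching later stages of processing can save even more work.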
Message queues are another general mechanism that provides a point of coordination and integration for
different parts of your application, which may require different hardware, load balancing techniques, platforms, or
even programming languages. They can take responsibility for distributing messages among many waiting consumers
or servers in a way that is not possible with the single point-to-point links offered by normal TCP sockets, and they can
also use a database or other persistent storage to assure that messages are not lost if the server goes down. Message
queues also offer resilience and flexibility, since, if some part of your system temporarily becomes a bottleneck, the
message queue can then absorb the shock by allowing many messages to queue up for that service. By hiding the
population of servers or processes that serve a particular kind of request, the message queue pattern also makes it
easy to disconnect, upgrade, reboot, and reconnect servers without the rest of your infrastructure noticing.
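Both properties — delivery to exactly one consumer, and absorbing a burst while slow consumers catch up — can be sketched with the standard library's `queue.Queue` standing in for a real broker:

```python
# Stdlib sketch of message-queue shock absorption: a burst of messages
# queues up while slow workers drain it, and each message reaches
# exactly one worker.  queue.Queue stands in for a real broker here.
import queue
import threading
import time

broker = queue.Queue()
handled = {i: [] for i in range(3)}        # per-worker tally

def worker(worker_id):
    while True:
        msg = broker.get()
        if msg is None:                    # shutdown sentinel
            break
        time.sleep(0.001)                  # a deliberately slow consumer
        handled[worker_id].append(msg)
        broker.task_done()

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()

for n in range(100):                       # a sudden burst of traffic
    broker.put(n)
broker.join()                              # the queue absorbs the spike

for _ in threads:
    broker.put(None)
for t in threads:
    t.join()

total = sum(len(msgs) for msgs in handled.values())
print(total)   # 100: each message was delivered to exactly one worker
```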