Functional Python Programming


The Multiprocessing and Threading Modules


In this case, however, we need to distribute the I/O processing to as many CPUs or
cores as we have available. Most of these potential refactorings perform all of
the I/O in the parent process and distribute only the computations to multiple
concurrent processes, which yields little benefit. We therefore want to focus on the
mappings, as these distribute the I/O to as many cores as possible.
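
As a rough sketch of this idea, the mapping below hands each worker only a filename,
so that opening, decompressing, and parsing all happen in the worker rather than in
the parent. The function names (count_paths, analyze, analyze_in_parallel) are
hypothetical, and the parsing assumes Common Log Format, where the request path is
the seventh whitespace-delimited field; the real logfile layout may differ.

```python
import gzip
from collections import Counter
from multiprocessing import Pool


def count_paths(filename):
    """Worker: open, read, and parse one gzip-compressed logfile.

    Assumption: Common Log Format, with the request path as the seventh
    whitespace-delimited field. All I/O happens inside the worker.
    """
    with gzip.open(filename, "rt") as log:
        return Counter(
            fields[6] for fields in map(str.split, log) if len(fields) > 6
        )


def analyze(filenames, mapping=map):
    """Merge the per-file Counters produced by any map-like function."""
    total = Counter()
    for counts in mapping(count_paths, filenames):
        total.update(counts)
    return total


def analyze_in_parallel(filenames, processes=None):
    """Pass only short filename strings to a pool of worker processes."""
    with Pool(processes) as workers:
        return analyze(filenames, workers.map)
```

Calling analyze(names) runs sequentially with the built-in map(), while
analyze_in_parallel(names) swaps in the pool's map() without changing the rest of
the design. On platforms that start workers by spawning a fresh interpreter, the
parallel call belongs under an if __name__ == "__main__": guard.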


It's often important to minimize the amount of data being passed from process to
process. In this example, we provided just short filename strings to each worker
process. The resulting Counter object was considerably smaller than the 10 MB
of compressed detail data in each logfile. We can further reduce the size of each
Counter object by eliminating items that occur only once, or we can limit our
application to only the 20 most popular items.
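
Both reductions can be sketched in a few lines. The helper names drop_singletons and
top_n are made up for illustration; only Counter and its most_common() method come
from the standard library.

```python
from collections import Counter


def drop_singletons(counts):
    """Return a new Counter without the items that occur only once."""
    return Counter({item: n for item, n in counts.items() if n > 1})


def top_n(counts, n=20):
    """Return a new Counter holding only the n most common items."""
    return Counter(dict(counts.most_common(n)))
```

If either reduction is applied in each worker before the results are merged, the
combined totals become approximate: an item that appears once in each of many files
would be discarded everywhere.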


The fact that we can reorganize the design of this application freely doesn't mean
we should reorganize the design. We can run a few benchmarking experiments to
confirm our suspicion that logfile parsing is dominated by the time required to read
the files.
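
One possible shape for such an experiment is to time a pass that only reads each
file against a pass that reads and parses it. This is a minimal sketch, not a
rigorous benchmark: the function names are hypothetical, the parsing again assumes
the request path is the seventh field of a Common Log Format line, and a single
timing says little without repetition.

```python
import gzip
import time
from collections import Counter


def read_only(filename):
    """Read and decompress every line, but do no parsing; return the line count."""
    with gzip.open(filename, "rt") as log:
        return sum(1 for _ in log)


def read_and_parse(filename):
    """Read every line and count request paths (assumed to be field seven)."""
    with gzip.open(filename, "rt") as log:
        return Counter(
            fields[6] for fields in map(str.split, log) if len(fields) > 6
        )


def timed(function, *args):
    """Return (result, elapsed_seconds) for a single call of function(*args)."""
    start = time.perf_counter()
    result = function(*args)
    return result, time.perf_counter() - start
```

If the two elapsed times are close, reading dominates and parsing adds little. Note
that the read_only time includes gzip decompression, which is CPU work rather than
pure I/O.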


Summary


In this chapter, we've looked at two ways to support concurrent processing of
multiple pieces of data:



  • The multiprocessing module: Specifically, the Pool class and the various
    kinds of mappings available to a pool of workers.

  • The concurrent.futures module: Specifically, the ProcessPoolExecutor
    and ThreadPoolExecutor classes. These classes also support a mapping that
    will distribute work among workers that are threads or processes.
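
The executor mapping summarized above can be sketched as follows. The wrapper
map_with is a hypothetical helper; the executor classes and their map() method are
part of concurrent.futures. Because both classes share the same interface, the
choice between threads and processes becomes a parameter.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def map_with(executor_class, function, items, max_workers=4):
    """Apply function to items using a pool of threads or processes.

    Executor.map() yields results in the same order as the inputs.
    """
    with executor_class(max_workers=max_workers) as executor:
        return list(executor.map(function, items))


# Threads: no pickling is needed, so any callable can be used.
lengths = map_with(ThreadPoolExecutor, len, ["a", "bb", "ccc"])

# Processes: the function and its arguments must be picklable, and on
# platforms that spawn workers the call belongs under a __main__ guard:
#     map_with(ProcessPoolExecutor, len, ["a", "bb", "ccc"])
```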


We've also noted some alternatives that don't seem to fit well with functional
programming. The multiprocessing module has numerous other features, and the
threading and queue modules can be used to build multithreaded applications, but
these features aren't a good fit with functional design.


In the next chapter, we'll look at the operator module. This can be used to simplify
some kinds of algorithms. We can use a built-in operator function instead of defining
a lambda form. We'll also look at some techniques to design flexible decision-making
and allow expressions to be evaluated in a non-strict order.
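
As a small preview of that substitution, the sketch below writes the same product
reduction twice, once with a lambda and once with operator.mul, and uses
operator.itemgetter in place of a lambda sort key. The function names and sample
data are invented for illustration.

```python
from functools import reduce
from operator import itemgetter, mul


def product_with_lambda(values):
    """Multiply the values together using a lambda form."""
    return reduce(lambda x, y: x * y, values, 1)


def product_with_operator(values):
    """The same reduction, using the built-in operator function."""
    return reduce(mul, values, 1)


# itemgetter(1) plays the role of: lambda pair: pair[1]
pairs = [("b", 2), ("a", 3), ("c", 1)]
by_count = sorted(pairs, key=itemgetter(1))
```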
