Chapter 12
On a computer with multiple CPUs or multiple cores in a single CPU, there can
be some actual concurrent processing of CPU instructions. All other concurrency
is simulated through time slicing at the OS level. A Mac OS X laptop can have 200
concurrent processes that share the CPU; this is far more processes than the number
of available cores. From this, we can see that the OS time slicing is responsible for
most of the apparently concurrent behavior.
The boundary conditions
Let's consider a hypothetical algorithm which has On()^2. Assume that there is an
inner loop that involves 1,000 bytes of Python code. When processing 10,000 objects,
we're executing 100 billion Python operations. This is the essential processing
budget. We can try to allocate as many processes and threads as we feel might be
helpful, but the processing budget can't change.
The individual CPython bytecode doesn't have a simple execution timing. However,
a long-term average on a Mac OS X laptop shows that we can expect about 60 MB of
code to be executed per second. This means that our 100 billion bytecode operation
will take about 1,666 seconds, or 28 minutes.
If we have a dual processor, four-core computer, then we might cut the elapsed time
to 25 percent of the original total: 7 minutes. This presumes that we can partition the
work into four (or more) independent OS processes.
The important consideration here is that our budget of 100 billion bytecodes can't be
changed. Parallelism won't magically reduce the workload. It can only change the
schedule to, perhaps, reduce the elapsed time.
Switching to a better algorithm which is On( logn) can reduce the workload to 132 MB
of operations. At 60 MBps, this workload is considerably smaller. Parallelism won't
have the kind of dramatic improvements that algorithm change will have.
Sharing resources with process or threads
The OS assures that there is little or no interaction between processes. For two
processes to interact, some common OS resource must be explicitly shared. This can
be a common file, a specific shared memory object, or a semaphore with a shared
state between the processes. Processes are inherently independent, interaction is
exceptional.