Functional Python Programming

(Wang) #1
Chapter 16

Optimizing storage


There's no general rule for optimization. We often focus on optimizing performance
because we have tools like the Big O measure of complexity that show us whether
or not an algorithm is an effective solution to a given problem. Optimizing storage is
usually tackled separately: we can look at the steps in an algorithm and estimate the
size of the storage required for the various storage structures.


In many cases, the two considerations are opposed. In some cases, an algorithm that
has outstandingly good performance requires a large data structure. This algorithm
can't scale without dramatic increases in the amount of storage required. Our goal is
to design an algorithm that is reasonably fast and also uses an acceptable amount
of storage.


We may have to spend time researching algorithmic alternatives to locate
a way to make the space-time trade off properly. There are some common
optimization techniques. We can often follow links from Wikipedia:
http://en.wikipedia.org/wiki/Space–time_tradeoff.


One memory optimization technique we have in Python is to use an iterable.
This has some properties of a proper materialized collection, but doesn't necessarily
occupy storage. There are few operations (such as the len() function) that can't
work on an iterable. For other operations, the memory saving feature can allow a
program to work with very large collections.


Optimizing accuracy


In a few cases, we need to optimize the accuracy of a calculation. This can be
challenging and may require some fairly advanced math to determine the limits
on the accuracy of a given approach.


An interesting thing we can do in Python is replace floating point approximations
with fractions.Fraction value. For some applications, this can create more
accurate answers than floating point, because more bits are used for numerator
and denominator than a floating point mantissa.


It's important to use decimal.Decimal values to work with currency. It's a common
error to use a float value. When using a float value, additional noise bits are
introduced because of the mismatch between Decimal values provided as input
and the binary approximation used by floating point values. Using Decimal values
prevents the introduction of tiny inaccuracies.

Free download pdf