Chapter 4
One common application of for loop iterable processing is the
unwrap(process(wrap(iterable))) design pattern. A wrap() function will first
transform each item of an iterable into a two tuples with a derived sort key or other
value and then the original immutable item. We can then process these two tuples
based on the wrapped value. Finally, we'll use an unwrap() function to discard the
value used to wrap, which recovers the original item.
This happens so often in a functional context that we have two functions that are
used heavily for this; they are as follows:
fst = lambda x: x[0]
snd = lambda x: x[1]
These two functions pick the first and second values from a tuple, and both are
handy for the process() and unwrap() functions.
Another common pattern is wrap(wrap(wrap())). In this case, we're starting
with simple tuples and then wrapping them with additional results to build
up larger and more complex tuples. A common variation on this theme is
extend(extend(extend())) where the additional values build new, more
complex namedtuple instances without actually wrapping the original tuples.
We can summarize both of these as the Accretion design pattern.
We'll apply the Accretion design to work with a simple sequence of latitude and
longitude values. The first step will convert the simple points (lat, lon) on a path
into pairs of legs (begin, end). Each pair in the result will be ((lat, lon), (lat, lon)).
In the next sections, we'll show how to create a generator function that will iterate over
the content of a file. This iterable will contain the raw input data that we will process.
Once we have the data, later sections will show how to decorate each leg with the
haversine distance along the leg. The final result of the wrap(wrap(iterable())))
processing will be a sequence of three tuples: ((lat, lon), (lat, lon), distance). We
can then analyze the results for the longest, shortest distance, bounding rectangle,
and other summaries of the data.
Parsing an XML file
We'll start by parsing an XML (short for Extensible Markup Language) file to get
the raw latitude and longitude pairs. This will show how we can encapsulate some
not-quite functional features of Python to create an iterable sequence of values.
We'll make use of the xml.etree module. After parsing, the resulting ElementTree
object has a findall() method that will iterate through the available values.