Functional Python Programming

Chapter 6

A common operation that can be approached either as a stateful map or as a
materialized, sorted object is computing the mode of a set of data values. When we
look at our trip data, the variables are all continuous. To compute a mode, we'll need
to quantize the distances covered. This is also called binning: we'll group the data
into different bins. Binning is also common in data visualization applications. In this
case, we'll use 5 nautical miles as the size of each bin.


The quantized distances can be produced with a generator expression:


quantized = (5*(dist//5) for start, stop, dist in trip)


This will divide each distance by 5 – discarding any fractions – and then multiply by
5 to compute a number that represents the distance rounded down to the nearest 5
nautical miles.
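
As a minimal sketch of this quantization, assume trip is a sequence of (start, stop, distance) triples. The sample values and the single-letter waypoint labels below are hypothetical stand-ins, not the actual trip data:

```python
# Hypothetical sample data: (start, stop, distance) triples,
# with distances in nautical miles. The labels are placeholders.
trip = [
    ("A", "B", 17.72),
    ("B", "C", 30.75),
    ("C", "D", 4.98),
    ("D", "E", 12.0),
]

# Floor-divide by the bin size, then scale back up to get the
# lower edge of the 5 nautical mile bin each distance falls into.
quantized = (5 * (dist // 5) for start, stop, dist in trip)

bins = list(quantized)
print(bins)  # [15.0, 30.0, 0.0, 10.0]
```

Because the distances are floats, the // operator returns float results, so the bins come out as 15.0 rather than 15.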


Building a mapping with Counter


A mapping like the collections.Counter class is a great optimization for doing
reductions that create counts (or totals) grouped by some value in the collection.
A more typical functional programming solution to grouping data is to sort the
original collection, and then use a recursive loop to identify when each group begins.
This involves materializing the raw data, performing an O(n log n) sort, and then
doing a reduction to get the sums or counts for each key.
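
For comparison, the sort-then-reduce approach might look like the following sketch. It swaps the recursive loop for itertools.groupby, which detects the group boundaries in already-sorted data. The function name count_by_sorting and the sample bins are hypothetical:

```python
from itertools import groupby


def count_by_sorting(values):
    """Count occurrences by sorting, then reducing each run of equal keys."""
    # sorted() materializes the whole collection and costs O(n log n).
    ordered = sorted(values)
    # groupby() yields one (key, group) pair per run of equal values;
    # counting the items in each group is the reduction step.
    return {key: sum(1 for _ in group) for key, group in groupby(ordered)}


# Hypothetical quantized distances, in nautical miles.
distances = [15.0, 30.0, 0.0, 10.0, 15.0, 30.0, 15.0]
counts = count_by_sorting(distances)
print(counts)  # {0.0: 1, 10.0: 1, 15.0: 3, 30.0: 2}
```

A Counter produces the same totals in a single pass, without the sort and without holding a sorted copy of the data.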


We'll use the following generator to create a simple sequence of distances
transformed into bins:


quantized = (5*(dist//5) for start, stop, dist in trip)


We divided each distance by 5 using floor division (the // operator), which discards
the fractional part, and then multiplied by 5 to create a value that's rounded down
to the nearest 5 nautical miles.


The following expression creates a mapping from distance to frequency:


from collections import Counter


Counter(quantized)


This is a stateful object that was created by what is, technically, imperative
object-oriented programming. Since it looks like a function, however, it seems a good
fit for a design based on functional programming ideas.
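
Putting the pieces together, the mode can be read off the Counter with its most_common() method. This is a sketch under the same assumption as before: trip is a sequence of (start, stop, distance) triples, and the sample values here are hypothetical:

```python
from collections import Counter

# Hypothetical sample data: (start, stop, distance) triples,
# with distances in nautical miles. The labels are placeholders.
trip = [
    ("A", "B", 17.72),
    ("B", "C", 30.75),
    ("C", "D", 16.1),
    ("D", "E", 4.98),
    ("E", "F", 32.0),
    ("F", "G", 18.3),
]

quantized = (5 * (dist // 5) for start, stop, dist in trip)

# Counter consumes the generator once, tallying each bin as it goes.
frequency = Counter(quantized)

# most_common(1) returns a list holding the single most frequent
# (value, count) pair; that value is the mode.
mode_bin, mode_count = frequency.most_common(1)[0]
print(mode_bin, mode_count)  # 15.0 3
```

Note that the generator is exhausted after Counter reads it, so quantized would have to be rebuilt before it could be used again.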
