Functional Python Programming

Chapter 7

The second step wraps these two-tuples in yet another layer. This time we sort by
the x value in the original raw data, so the second enumeration ranks each pair by
its x value.


We'll create more deeply nested objects that should look like the following:


((0, (0, Pair(x=4.0, y=4.26))), (1, (2, Pair(x=5.0, y=5.68))), ...,
(10, (9, Pair(x=14.0, y=9.96))))
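As a minimal sketch, the two wrapping passes might look like the following. It assumes Pair is a named tuple with x and y fields, as the output above suggests. The four-item sample and the names by_y and by_x are illustrative only and are not the full data set.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    """Assumed shape of the Pair objects shown in the text."""
    x: float
    y: float


# Illustrative sample only; not the complete data set.
sample = (
    Pair(14.0, 9.96),
    Pair(4.0, 4.26),
    Pair(7.0, 4.82),
    Pair(5.0, 5.68),
)

# First wrap: sort by y and enumerate, giving (y_rank, Pair).
by_y = tuple(enumerate(sorted(sample, key=lambda pair: pair.y)))

# Second wrap: sort those two-tuples by the x value of the raw Pair
# and enumerate again, giving (x_rank, (y_rank, Pair)).
by_x = tuple(enumerate(sorted(by_y, key=lambda ranked: ranked[1].x)))

for item in by_x:
    print(item)
```

Each pass only wraps the existing object in a new two-tuple; nothing is unpacked or copied.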


In principle, we can now compute rank-order correlations between the two variables
by using the x and y rankings. The extraction expression, however, is rather awkward.
For each ranked sample in the data set, r, we have to compare r[0] with r[1][0].


To overcome these awkward references, we can write selector functions as follows:


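x_rank = lambda ranked: ranked[0]       # rank by x, from the outer enumeration
y_rank = lambda ranked: ranked[1][0]    # rank by y, from the inner enumeration
raw = lambda ranked: ranked[1][1]       # the original Pair object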


This allows us to compute correlation using x_rank(r) and y_rank(r), making
references to values less awkward.
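The following sketch shows one way the selectors could feed a rank-order correlation. It uses the textbook Spearman formula, 1 - 6*sum(d**2) / (n*(n**2 - 1)), which is only valid when there are no tied ranks. The function name rank_correlation and the four-item sample are assumptions made for illustration.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    """Assumed shape of the Pair objects shown in the text."""
    x: float
    y: float


# Selector functions for the (x_rank, (y_rank, Pair)) structure.
x_rank = lambda ranked: ranked[0]
y_rank = lambda ranked: ranked[1][0]
raw = lambda ranked: ranked[1][1]


def rank_correlation(ranked_data):
    """Spearman's rho from rank differences; assumes no tied ranks."""
    n = len(ranked_data)
    sum_d2 = sum((x_rank(r) - y_rank(r)) ** 2 for r in ranked_data)
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Illustrative sample only; not the complete data set.
ranked_sample = (
    (0, (0, Pair(4.0, 4.26))),
    (1, (2, Pair(5.0, 5.68))),
    (2, (1, Pair(7.0, 4.82))),
    (3, (3, Pair(14.0, 9.96))),
)

rho = rank_correlation(ranked_sample)
print(round(rho, 3))  # 0.8
```

The body of rank_correlation never indexes into the nested tuples directly; only the three selectors know the layout.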


We've wrapped the original Pair object twice, each time creating a new tuple that
carries a ranking value. This builds a complex data structure incrementally, without
resorting to stateful class definitions.


Why create deeply nested tuples? The answer is simple: laziness. Unpacking a tuple
and building a new, flat tuple takes extra processing time; wrapping an existing
tuple involves less work. There are, however, some compelling reasons for giving up
the deeply nested structure.
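To make the trade-off concrete, here is a minimal sketch of the unpack-and-rebuild step that wrapping avoids. The class name RankedPair is hypothetical, and Pair is assumed to be a named tuple with x and y fields.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    """Assumed shape of the Pair objects shown in the text."""
    x: float
    y: float


class RankedPair(NamedTuple):
    """Hypothetical flat alternative to (x_rank, (y_rank, Pair))."""
    x_rank: int
    y_rank: int
    raw: Pair


# Two items of the nested structure, copied from the example output.
nested = (
    (0, (0, Pair(x=4.0, y=4.26))),
    (1, (2, Pair(x=5.0, y=5.68))),
)

# Flattening means unpacking every nested item and building a new tuple.
flat = tuple(
    RankedPair(x_rank, y_rank, pair)
    for x_rank, (y_rank, pair) in nested
)

print(flat[1].x_rank, flat[1].y_rank, flat[1].raw)
```

The flat version reads better at the point of use (flat[1].y_rank instead of nested[1][1][0]), at the cost of one extra pass over the data.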


There are two improvements we'd like to make; they are as follows:


  • We'd like a flatter data structure. The use of a nested tuple of (x rank,
    (y rank, Pair())) doesn't feel expressive or succinct.

  • The enumerate() function doesn't deal properly with ties. If two
    observations have the same value, they should get the same rank. The
    general rule is to average the positions of equal observations. The sequence
    [0.8, 1.2, 1.2, 2.3, 18] should have rank values of 1, 2.5, 2.5, 4, 5.
    The two tied values in positions 2 and 3 have the midpoint value of 2.5 as
    their common rank.
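The averaging rule for ties can be sketched as follows. This is one possible approach, not necessarily the one developed later in the chapter; the function name average_ranks is an assumption, and positions are counted from 1 to match the example.

```python
from collections import defaultdict


def average_ranks(values):
    """Return one rank per value; equal values share the mean of their positions."""
    positions = defaultdict(list)
    # Record the 1-based sorted position of every occurrence of each value.
    for position, value in enumerate(sorted(values), start=1):
        positions[value].append(position)
    # A tied value's rank is the average of the positions it occupies.
    mean_position = {
        value: sum(places) / len(places)
        for value, places in positions.items()
    }
    return [mean_position[value] for value in values]


ranks = average_ranks([0.8, 1.2, 1.2, 2.3, 18])
print(ranks)  # [1.0, 2.5, 2.5, 4.0, 5.0]
```

The result has five ranks, one per observation, and the two 1.2 values share the rank 2.5.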
