Functional Python Programming


The Itertools Module


One application of running totals is quartiling data. We can compute the running
total at each sample and assign each sample to a quarter with an
int(4*value/total) calculation, where total is the final running total.


In the Assigning numbers with enumerate() section, we introduced a sequence of
latitude-longitude coordinates that describe a sequence of legs on a voyage. We can
use the distances as a basis for quartiling the waypoints. This allows us to determine
the midpoint in the trip.


The value of the trip variable looks as follows:


(Leg(start=Point(latitude=37.54901619777347, longitude=
-76.33029518659048), end=Point(latitude=37.840832, longitude=
-76.273834), distance=17.7246),
Leg(start=Point(latitude=37.840832, longitude=-76.273834),
end=Point(latitude=38.331501, longitude=-76.459503),
distance=30.7382), ...,
Leg(start=Point(latitude=38.330166, longitude=-76.458504),
end=Point(latitude=38.976334, longitude=-76.473503),
distance=38.8019))


Each Leg object has a start point, an end point, and a distance. The calculation of
quartiles looks like the following example:


from itertools import accumulate

distances = (leg.distance for leg in trip)
distance_accum = tuple(accumulate(distances))
total = distance_accum[-1] + 1.0
quartiles = tuple(int(4 * d / total) for d in distance_accum)


We extracted the distance values and computed the accumulated distances for each
leg. The last of the accumulated distances is the total. We've added 1.0 to the total
to ensure that, for the final leg, 4*d/total is 3.9983, which truncates to 3. Without
the +1.0, the final item would have a value of 4, which is an impossible fifth
quartile. For some kinds of data (with extremely large values), we might have to add
a larger value.
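
The effect of the +1.0 padding can be seen with a small sketch. The distances
below are hypothetical round numbers, not the voyage data, and the name
sample_distances is an assumption made for illustration only. The clamped
variant at the end is an alternative that is not used in the text; it is shown
only for comparison.

```python
from itertools import accumulate

# Hypothetical leg distances (not the real trip data).
sample_distances = [10.0, 20.0, 30.0, 40.0]

# Running totals: (10.0, 30.0, 60.0, 100.0)
distance_accum = tuple(accumulate(sample_distances))

exact_total = distance_accum[-1]
padded_total = exact_total + 1.0

# Without padding, the last sample lands in an impossible quartile 4.
unpadded = tuple(int(4 * d / exact_total) for d in distance_accum)

# With padding, every sample lands in quartiles 0 through 3.
padded = tuple(int(4 * d / padded_total) for d in distance_accum)

# Alternative (an assumption, not the text's approach): clamp to 3.
clamped = tuple(min(3, int(4 * d / exact_total)) for d in distance_accum)

print(unpadded)  # (0, 1, 2, 4)
print(padded)    # (0, 1, 2, 3)
print(clamped)   # (0, 1, 2, 3)
```

Padding shifts every ratio slightly downward, so a sample that sits very close
to a quartile boundary may land one quartile lower than it would under clamping.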


The value of the quartiles variable is as follows:


(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3)


We can use the zip() function to merge this sequence of quartile numbers with
the original data points. We can also use functions like groupby() to create distinct
collections of the legs in each quartile.
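
One way the zip() and groupby() steps might look is sketched below. The
SimpleLeg class and the sample_trip values are hypothetical stand-ins for the
Leg objects in trip. Because groupby() only groups adjacent items, this relies
on the quartile numbers being non-decreasing, which holds here because they are
derived from a running total of non-negative distances.

```python
from itertools import accumulate, groupby
from typing import NamedTuple


class SimpleLeg(NamedTuple):
    """Hypothetical stand-in for the Leg objects in the trip data."""
    name: str
    distance: float


# Hypothetical data, not the real voyage.
sample_trip = (
    SimpleLeg("a", 10.0),
    SimpleLeg("b", 20.0),
    SimpleLeg("c", 30.0),
    SimpleLeg("d", 40.0),
    SimpleLeg("e", 50.0),
)

distance_accum = tuple(accumulate(leg.distance for leg in sample_trip))
total = distance_accum[-1] + 1.0
quartiles = tuple(int(4 * d / total) for d in distance_accum)

# Pair each quartile number with its original leg.
pairs = tuple(zip(quartiles, sample_trip))

# Collect the legs of each quartile; groupby() needs the key to be in
# contiguous runs, which the non-decreasing quartile numbers provide.
legs_by_quartile = {
    q: tuple(leg for _, leg in group)
    for q, group in groupby(pairs, key=lambda pair: pair[0])
}

for q, legs in legs_by_quartile.items():
    print(q, [leg.name for leg in legs])
```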
