Functional Python Programming

(Wang) #1

Higher-order Functions


We often want to use the filter() function with defined functions instead of
lambda forms. The following is an example of reusing a predicate defined earlier:





from ch01_ex1 import isprimeg








list(filter(isprimeg, range(100)))





[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61,
67, 71, 73, 79, 83, 89, 97]


In this example, we imported a function from another module called isprimeg().
We then applied this function to a collection of values to pass the prime numbers
and reject any non-prime numbers from the collection.


This can be a remarkably inefficient way to generate a table of prime numbers. The
superficial simplicity of this is the kind of thing lawyers call an attractive nuisance.
It looks like it might be fun, but it doesn't scale well at all. A better algorithm is the
Sieve of Eratosthenes; this algorithm retains the previously located prime numbers
and uses them to prevent a lot of inefficient recalculation.


Using filter() to identify outliers


In the previous chapter, we defined some useful statistical functions to compute
mean and standard deviation and normalize a value. We can use these functions to
locate outliers in our trip data. What we can do is apply the mean() and stdev()
functions to the distance value in each leg of a trip to get the population mean and
standard deviation.


We can then use the z() function to compute a normalized value for each leg. If the
normalized value is more than 3, the data is extremely far from the mean. If we reject
this outliers, we have a more uniform set of data that's less likely to harbor reporting
or measurement errors.


The following is how we can tackle this:


from stats import mean, stdev, z
dist_data = list(map(dist, trip))
μ_d = mean(dist_data)
σ_d = stdev(dist_data)
outlier = lambda leg: z(dist(leg),μ_d,σ_d) > 3
print("Outliers", list(filter(outlier, trip)))


We've mapped the distance function to each leg in the trip collection. As we'll do
several things with the result, we must materialize a list object. We can't rely on
the iterator as the first function will consume it. We can then use this extraction to
compute population statistics μ_d and σ_d with the mean and standard deviation.

Free download pdf