Functional Python Programming

(Wang) #1
Chapter 12

This version has the advantage of being slightly easier to expand when we add new
filter criteria.


The use of lazy iterators (such as those returned by the built-in filter()
function) means that we aren't creating large intermediate objects. Each of the
intermediate variables, ne, nx_name, and nx_ext, is a proper lazy
iterator; no processing is done until the data is consumed
by a client process.
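
The laziness can be observed in a small sketch. The names sample_paths, calls, and noisy_non_empty are hypothetical stand-ins and are not part of the pipeline described here; the only point is that filter() does no work until something consumes its result.

```python
# Hypothetical stand-in data; the real pipeline works on AccessDetails objects.
sample_paths = ["/book/python", "", "/favicon.ico", "/book/"]

calls = []  # records each time the predicate actually runs


def noisy_non_empty(path):
    """Predicate that logs its calls so that laziness can be observed."""
    calls.append(path)
    return bool(path)


# Building the pipeline does no work: filter() returns a lazy iterator.
non_empty = filter(noisy_non_empty, sample_paths)
calls_before = len(calls)

# Consuming the iterator is what drives the predicate.
result = list(non_empty)
calls_after = len(calls)
```

Here calls_before is zero because nothing has been consumed yet, and calls_after equals the number of input items.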

While elegant, this suffers from a small inefficiency because each filter function
must parse the path in the AccessDetails object separately. To make this more efficient,
we can wrap a function that performs path.split('/') with the lru_cache decorator.
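
One way this caching might look is sketched below. The function name split_path and the cache size of 128 are assumptions chosen for illustration, not names taken from the text; functools.lru_cache itself is part of the standard library.

```python
from functools import lru_cache


@lru_cache(maxsize=128)  # cache size is an arbitrary, assumed value
def split_path(path):
    """Return the non-empty components of a URL path as a tuple.

    split_path is a hypothetical helper name. A tuple is returned so
    that the cached value is immutable and safe to share.
    """
    return tuple(name for name in path.split("/") if name)


first = split_path("/book/python/chapter12")
second = split_path("/book/python/chapter12")  # answered from the cache
info = split_path.cache_info()  # reports hits and misses so far
```

Because lru_cache keys on the arguments, the path must be hashable, which a string is. Each filter rule could then call split_path(detail.url.path) instead of splitting the path again.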


Analyzing the access details


We'll look at two analysis functions we can use to filter and analyze the individual
AccessDetails objects. The first function, a filter, will pass only specific
paths. The second function will summarize the occurrences of each distinct path.


We'll define the filter rule as a small function and combine it with the
built-in filter() function to apply the rule to the details. Here is the composite
filter function:


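def book_filter(access_details_iter):
    def book_in_path(detail):
        # Split the URL path into its non-empty components.
        path = tuple(name for name in detail.url.path.split('/') if name)
        # Test the length first so that an empty path cannot raise IndexError.
        return len(path) > 1 and path[0] == 'book'
    return filter(book_in_path, access_details_iter)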


We've defined a rule, the book_in_path() function, that we'll apply to each
AccessDetails object. If the path has more than one component and its first
component is book, then we're interested in the object. All other AccessDetails objects
can be quietly rejected.
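
The rule can be tried on a few sample URLs. In this sketch, Detail is a hypothetical one-field stand-in for AccessDetails, and the example.com URLs are invented; it also assumes the url attribute holds a result of urllib.parse.urlparse(). The sketch tests the length before indexing, so that a bare root path is rejected rather than raising IndexError.

```python
from collections import namedtuple
from urllib.parse import urlparse

# Hypothetical stand-in: the real AccessDetails carries more fields.
Detail = namedtuple("Detail", ["url"])


def book_filter(access_details_iter):
    def book_in_path(detail):
        # Split the URL path into its non-empty components.
        path = tuple(name for name in detail.url.path.split("/") if name)
        # Length is checked first, so an empty path is simply rejected.
        return len(path) > 1 and path[0] == "book"
    return filter(book_in_path, access_details_iter)


# Invented sample URLs, for illustration only.
samples = [
    Detail(urlparse(u))
    for u in (
        "http://example.com/book/python/",
        "http://example.com/book/",
        "http://example.com/",
        "http://example.com/blog/2014/",
    )
]

kept = [detail.url.path for detail in book_filter(samples)]
```

Only the first sample survives: /book/ has a single component, / has none, and /blog/2014/ does not start with book.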


Here is the final reduction that we're interested in:


from collections import Counter

def reduce_book_total(access_details_iter):
    counts = Counter()
    for detail in access_details_iter:
        counts[detail.url.path] += 1
    return counts
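
A short usage sketch follows. Detail is again a hypothetical stand-in for AccessDetails, and the sample URLs are invented; the url attribute is assumed to be a urllib.parse.urlparse() result.

```python
from collections import Counter, namedtuple
from urllib.parse import urlparse

# Hypothetical stand-in: the real AccessDetails carries more fields.
Detail = namedtuple("Detail", ["url"])


def reduce_book_total(access_details_iter):
    counts = Counter()
    for detail in access_details_iter:
        counts[detail.url.path] += 1
    return counts


# Invented sample data, for illustration only.
samples = [
    Detail(urlparse(u))
    for u in (
        "http://example.com/book/python/",
        "http://example.com/book/django/",
        "http://example.com/book/python/",
    )
]

totals = reduce_book_total(samples)
top = totals.most_common(1)  # most frequently requested path and its count
```

Since Counter accepts any iterable, Counter(detail.url.path for detail in samples) would produce the same totals; the explicit loop simply makes the reduction step visible.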
