The Functools Module
We've used the operator.add function to sum our values instead of the longer
lambda x, y: x + y form.
Following is how we can count values in an iterable:
def count_mr(iterable):
    return map_reduce(lambda y: 1, operator.add, iterable)
We've used the lambda y: 1 mapping to transform each value into a constant 1. The
count is then computed by the reduce() function using the operator.add function.
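The map_reduce() helper used by count_mr() was defined earlier; as a reminder, a minimal sketch consistent with this usage (the exact definition may differ) maps each item and then reduces the mapped values:

```python
from functools import reduce
import operator

def map_reduce(mapper, reducer, iterable):
    # Apply the mapper to each item, then fold the results
    # into a single value with the reducer.
    return reduce(reducer, map(mapper, iterable))
```

With this definition, map_reduce(lambda y: 1, operator.add, iterable) maps every item to 1 and sums the ones, yielding the count.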
The general-purpose reduce() function allows us to create any species of reduction
from a large dataset to a single value. There are some limitations, however, on what
we should do with the reduce() function.
We should avoid executing commands such as the following:
reduce(operator.add, ["1", ",", "2", ",", "3"], "")
Yes, it works. However, the "".join(["1", ",", "2", ",", "3"]) method
is considerably more efficient: we measured 0.23 seconds per million repetitions
for the "".join() method versus 0.69 seconds for the reduce() function.
Using reduce() and partial()
The sum() function can be seen as partial(reduce,
operator.add). This, too, gives us a hint as to how we can
create other mappings and other reductions. We can, indeed, define many
of the commonly used reductions as partials instead of lambdas.
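To make the equivalence concrete, here is a sum()-like reduction built as a partial application of reduce(); note that, unlike the built-in sum(), it raises TypeError on an empty iterable unless an initial value is also supplied:

```python
from functools import partial, reduce
import operator

# A sum()-like function: reduce() with operator.add pre-bound.
my_sum = partial(reduce, operator.add)
```

Calling my_sum([1, 2, 3]) folds the items with +, exactly as sum([1, 2, 3]) does; my_sum([], 0) handles the empty case by providing 0 as the initial value.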
Following are two examples:
sum2 = partial(reduce, lambda x, y: x + y**2)
count = partial(reduce, lambda x, y: x + 1)
We can now use these functions as sum2(some_data, 0) or count(some_iter, 0);
the explicit initial value of 0 ensures that the first item also passes through
the lambda. As we noted previously, it's not clear how much benefit this has.
It's possible that a particularly complex calculation can be explained simply with
functions like this.
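A short worked example shows both reductions in action; the initial value of 0 is important, because without it reduce() would use the first item unmodified rather than feeding it through the lambda:

```python
from functools import partial, reduce

# Sum of squares and a count, each as a partially applied reduce().
sum2 = partial(reduce, lambda x, y: x + y**2)
count = partial(reduce, lambda x, y: x + 1)

data = [2, 3, 4]
print(sum2(data, 0))   # 4 + 9 + 16 = 29
print(count(data, 0))  # 3
```

Without the initial value, sum2([2, 3, 4]) would compute 2 + 9 + 16 = 27, silently skipping the square of the first item.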