Chapter 15
    with open("Anscombe.txt") as source:
        data = tuple(head_map_filter(row_iter(source)))
        mapping = dict((id_str, tuple(series(id_num, data)))
            for id_num, id_str in enumerate(['I', 'II', 'III', 'IV'])
        )
    return mapping
We opened the local data file and applied a simple row_iter() function to
return each line of the file parsed into a row of separate fields. We applied the
head_map_filter() function to remove the heading from the file. The result
is a tuple-of-tuple structure with all of the data.
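The parsing step described above can be sketched as follows. This is only an illustration: the real row_iter() and head_map_filter() functions are defined elsewhere in the book, so the bodies shown here, the tab-delimited layout, and the in-memory sample file are all assumptions.

```python
import csv
import io

def row_iter(source):
    """Stand-in: parse each line of a tab-delimited file into a row of fields."""
    return csv.reader(source, delimiter="\t")

def head_map_filter(rows):
    """Stand-in: drop heading rows and convert the remaining fields to floats."""
    def all_numeric(row):
        # A heading row contains at least one field that is not a number.
        try:
            for value in row:
                float(value)
            return True
        except ValueError:
            return False
    return (
        tuple(float(value) for value in row)
        for row in rows
        if row and all_numeric(row)
    )

# Hypothetical sample data standing in for the first rows of Anscombe.txt.
sample = io.StringIO(
    "Anscombe's quartet\n"
    "I\tII\n"
    "x\ty\tx\ty\n"
    "10.0\t8.04\t10.0\t9.14\n"
    "8.0\t6.95\t8.0\t8.14\n"
)

# The heading rows are filtered out; the numeric rows become tuples of floats.
data = tuple(head_map_filter(row_iter(sample)))
```

Because both functions are lazy, nothing is read from the file until tuple() consumes the result.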
We transformed the tuple-of-tuple into a more useful dict object by
selecting particular series from the source data. Each series is a pair of
columns. For series "I", it's columns 0 and 1. For series "II", it's columns 2 and 3.
We used the dict() function with a generator expression for consistency with the
list() and tuple() functions. While it's not essential, it's sometimes helpful to see
the similarities among these three data structures and their use of generator expressions.
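To make the parallel concrete, here is a small sketch that feeds the same kind of generator expression to list(), tuple(), and dict(). The two sample rows and the variable names are illustrative only, and just two of the four series are shown.

```python
# Two hypothetical rows, each holding the x,y columns for series I and II.
rows = ((10.0, 8.04, 10.0, 9.14), (8.0, 6.95, 8.0, 8.14))
names = ['I', 'II']

# list() and tuple() consume a generator expression of (index, name) pairs.
as_list = list((n, name) for n, name in enumerate(names))
as_tuple = tuple((n, name) for n, name in enumerate(names))

# dict() consumes a generator expression of (key, value) pairs.
# Series number n uses columns 2*n and 2*n+1 of each row.
as_dict = dict(
    (name, tuple((row[2*n], row[2*n + 1]) for row in rows))
    for n, name in enumerate(names)
)
```

A dict comprehension of the form {name: ... for n, name in enumerate(names)} builds the same mapping; the dict() call is used here only to keep the three constructions visually alike.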
The series() function creates the individual Pair objects for each x,y pair in the
dataset. In retrospect, it would be better to modify this function so that the
resulting namedtuple class is an argument to the function, not an implicit
feature of it. We'd prefer a call of the form series(id_num, Pair, data),
which makes it clear where the Pair objects are created. This extension requires
rewriting some of the examples in Chapter 3, Functions, Iterators, and Generators.
We'll leave that as an exercise for the reader.
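One possible shape for that revised function is sketched below. It is not the book's implementation: the parameter name pair_class, the column arithmetic, and the sample rows are assumptions chosen to match the description above.

```python
from collections import namedtuple

Pair = namedtuple("Pair", ("x", "y"))

def series(id_num, pair_class, data):
    """Yield one pair_class object per row for series number id_num (0-based).

    Series 0 uses columns 0 and 1, series 1 uses columns 2 and 3, and so on.
    """
    for row in data:
        yield pair_class(*row[id_num*2:id_num*2 + 2])

# Hypothetical rows in the tuple-of-tuple layout described in the text.
data = ((10.0, 8.04, 10.0, 9.14), (8.0, 6.95, 8.0, 8.14))

# The caller now decides which class wraps each x,y pair.
series_II = tuple(series(1, Pair, data))
```

Passing the class explicitly means a test can substitute a plain tuple or a different namedtuple without touching the function body.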
The important change here is that we're showing the formal doctest test case. As
we noted earlier, web applications as a whole are difficult to test. The web server
must be started and then a web client must be used to run the test cases. Problems
then have to be resolved by reading the web log, which can be difficult unless
complete tracebacks are displayed. It's much better to debug as much of the web
application as possible using ordinary doctest and unittest testing techniques.
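As a small illustration of this approach, the following sketch runs a doctest example for an ordinary function with no web server involved. The first_pairs() function is hypothetical; the doctest calls are from the standard library.

```python
import doctest

def first_pairs(series):
    """Return the series unchanged; the docstring carries the test case.

    >>> first_pairs(((10.0, 8.04), (8.0, 6.95), (13.0, 7.58)))  # doctest: +ELLIPSIS
    ((10.0, 8.04), ...
    """
    return series

# Parse the docstring into a test and run it directly.
parser = doctest.DocTestParser()
test = parser.get_doctest(
    first_pairs.__doc__,             # text containing the examples
    {"first_pairs": first_pairs},    # names visible to the examples
    "first_pairs",                   # test name used in reports
    None,                            # no source file name
    0,                               # starting line number
)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)           # TestResults(failed=..., attempted=...)
```

In a module, the more common idiom is to call doctest.testmod(), or to run python -m doctest on the file; the explicit parser and runner are used here only so the sketch does not depend on how the module is loaded.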
Applying a filter
In this application, we're using a very simple filter. The entire filter process is
embodied in the following function:
def anscombe_filter(set_id, raw_data):
    """
    >>> anscombe_filter("II", raw_data()) #doctest: +ELLIPSIS