Functions, Iterators, and Generators
We applied the tuple() function to a composite function based on the
head_split_fixed() and row_iter() functions. This will create an object that we can reuse in
several other functions. If we don't materialize a tuple object, only the first sample
will have any data. After that, the source iterator will be exhausted, and all other
attempts to access it will yield empty sequences.
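The exhaustion behavior described above can be seen in a minimal sketch. The rows() generator and its values are hypothetical stand-ins for the real row source, not the functions used in this chapter:

```python
# Hypothetical stand-in for the row source; the values are illustrative only.
def rows():
    yield ["10.0", "8.04"]
    yield ["8.0", "6.95"]
    yield ["13.0", "7.58"]

source = rows()                 # a generator: it can be consumed only once
first_pass = list(source)       # consumes every row
second_pass = list(source)      # the generator is already exhausted

materialized = tuple(rows())    # a tuple can be traversed any number of times
pass_a = [row[0] for row in materialized]
pass_b = [row[1] for row in materialized]

print(len(first_pass), len(second_pass))   # 3 0
print(pass_a, pass_b)
```

The second pass over the bare generator is empty, while the materialized tuple supports both passes.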
The series() function will pick pairs of items to create the Pair objects. Again, we
applied an overall tuple() function to materialize the resulting tuple-of-namedtuple
sequences so that we can do further processing on each one.
The sample_I sequence looks like the following:
(Pair(x='10.0', y='8.04'), Pair(x='8.0', y='6.95'),
Pair(x='13.0', y='7.58'), Pair(x='9.0', y='8.81'),
Etc.
Pair(x='5.0', y='5.68'))
The other three sequences are similar in structure. The values, however,
are quite different.
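As a rough sketch of how such tuple-of-namedtuple sequences might be built, assume each source row holds several series side by side as (x0, y0, x1, y1, ...). The row layout, the series() signature shown here, and the names sample_a and sample_b are assumptions made for illustration:

```python
from collections import namedtuple

Pair = namedtuple("Pair", ("x", "y"))

def series(n, row_iter):
    """Yield a Pair built from columns 2*n and 2*n+1 of each row."""
    for row in row_iter:
        yield Pair(*row[n * 2: n * 2 + 2])

# Illustrative rows only: two series per row, laid out as (x0, y0, x1, y1).
rows = (
    ["10.0", "8.04", "10.0", "9.14"],
    ["8.0", "6.95", "8.0", "8.14"],
)

# rows is a tuple, so it can be traversed once per series.
sample_a = tuple(series(0, rows))
sample_b = tuple(series(1, rows))
print(sample_a)
print(sample_b)
```

Because rows is already materialized, each call to series() sees the full data. With a one-shot iterator, only the first call would.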
The final thing we'll need to do is create proper numeric values from the strings
that we've accumulated so that we can compute some statistical summary values.
We can apply the float() function conversion as the last step. There are many
alternative places to apply the float() function, and we'll look at some choices
in Chapter 5, Higher-order Functions.
Here is an example that uses the float() function:
mean = sum(float(pair.y) for pair in sample_I)/len(sample_I)
This will provide the mean of the y values across all of the Pair objects in sample_I.
We can gather a number of statistics as follows:
for subset in sample_I, sample_II, sample_III, sample_IV:
    mean = sum(float(pair.y) for pair in subset)/len(subset)
    print(mean)
We computed the mean of the y values for each sequence of pairs built from the source data.
We created a common tuple-of-namedtuple structure so that we can have reasonably
clear references to members of the source dataset. Using pair.y is a bit less obscure
than pair[1].
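One of the alternative places to apply float() is a single conversion pass before any statistics are computed. The following is a sketch under that assumption. The to_float() helper and the sample values are hypothetical, and the last lines compare access by name with access by index:

```python
from collections import namedtuple

Pair = namedtuple("Pair", ("x", "y"))

# Illustrative string-valued data in the same shape as the samples above.
sample = (Pair("10.0", "8.04"), Pair("8.0", "6.95"), Pair("13.0", "7.58"))

def to_float(pairs):
    """Hypothetical helper: convert each Pair's fields to float once."""
    return tuple(Pair(float(p.x), float(p.y)) for p in pairs)

numeric = to_float(sample)

# Both expressions read the same field; the named form states the intent.
mean_by_name = sum(p.y for p in numeric) / len(numeric)
mean_by_index = sum(p[1] for p in numeric) / len(numeric)
print(mean_by_name == mean_by_index)
```

Converting once keeps the later summary expressions free of repeated float() calls, at the cost of building a second tuple.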