Functional Python Programming

(Wang) #1
Chapter 16

We can assign this single source to the variable defects. The value looks like this:


Counter({('3', 'C'): 49, ('1', 'C'): 45, ('2', 'C'): 34,
('3', 'A'): 33, ('2', 'B'): 31, ('2', 'A'): 26, ('1', 'B'): 21,
('3', 'D'): 20, ('3', 'B'): 17, ('1', 'A'): 15, ('1', 'D'): 13,
('2', 'D'): 5})


This matches the detail summary shown previously. The source data, however, was
already summarized. This is often the case when data is extracted from a database
and SQL is used to do group-by operations.
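As a rough illustration of working with pre-summarized data, the following sketch loads rows that already carry a count into a Counter. The summary_rows name and the (shift, defect type, count) row layout are assumptions made for this example; the three counts are copied from the value shown above.

```python
from collections import Counter

# Hypothetical rows in the shape a GROUP BY query might return:
# (shift, defect_type, count). Only three of the twelve rows are shown.
summary_rows = [
    ('1', 'A', 15),
    ('1', 'B', 21),
    ('2', 'D', 5),
]

# The counts are already aggregated, so we build the Counter from a
# mapping of key to count instead of counting raw events one by one.
defects_sample = Counter(
    {(shift, defect_type): count for shift, defect_type, count in summary_rows}
)

print(defects_sample['1', 'A'])
```

Because the rows are already grouped, each key appears once and the Counter simply stores the supplied counts.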


Computing probabilities from a Counter object


We need to compute the probabilities of defects by shift and defects by type.
In order to compute the expected probabilities, we need to start with some simple
sums. The first is the overall sum of all defects, which can be calculated with
the following statement:


total = sum(defects.values())


The sum is computed directly from the values in the Counter object assigned to the
defects variable. The result shows that there are 309 total defects in the sample set.
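As a quick check of this total, the following sketch rebuilds the Counter from the value displayed earlier and sums it. The data is copied from that display; nothing else is assumed.

```python
from collections import Counter

# Summarized defect counts keyed by (shift, defect type), as shown above.
defects = Counter({
    ('3', 'C'): 49, ('1', 'C'): 45, ('2', 'C'): 34,
    ('3', 'A'): 33, ('2', 'B'): 31, ('2', 'A'): 26, ('1', 'B'): 21,
    ('3', 'D'): 20, ('3', 'B'): 17, ('1', 'A'): 15, ('1', 'D'): 13,
    ('2', 'D'): 5,
})

# Counter is a dict subclass, so values() yields the individual counts.
total = sum(defects.values())
print(total)  # 309
```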


We need to get defects by shift as well as defects by type. This means that we'll
extract two kinds of subsets from the raw defect data. The "by-shift" extract will use
just the shift part of the (shift, defect type) key in the Counter object. The "by-type"
extract will use the other half of the key pair.


We can summarize by creating additional Counter objects extracted from the
initial Counter object assigned to the defects variable. Here's the
by-shift summary:


shift_totals = sum((Counter({s: defects[s, d]}) for s, d in defects),
    Counter())


The generator expression creates a sequence of single-entry Counter objects, each
with a shift, s, as the key and the count of defects for one (shift, defect type) pair,
defects[s,d], as the value. It will create 12 such Counter objects, one for each
combination of four defect types and three shifts. We'll combine them with the sum()
function, using an empty Counter() as the starting value, to get a single Counter
object with three totals, one per shift.
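To make the by-shift summary concrete, here is a minimal, self-contained sketch that runs it against the data shown earlier. The type_totals and shift_totals_loop names are assumptions introduced for illustration: the first applies the same pattern to the other half of the key, and the second is an equivalent explicit loop, not necessarily the form used elsewhere in this chapter.

```python
from collections import Counter

# Summarized defect counts keyed by (shift, defect type).
defects = Counter({
    ('3', 'C'): 49, ('1', 'C'): 45, ('2', 'C'): 34,
    ('3', 'A'): 33, ('2', 'B'): 31, ('2', 'A'): 26, ('1', 'B'): 21,
    ('3', 'D'): 20, ('3', 'B'): 17, ('1', 'A'): 15, ('1', 'D'): 13,
    ('2', 'D'): 5,
})

# One single-entry Counter per (shift, type) key, added together.
# The Counter() start value is required: sum() starts from 0 by
# default, and an int cannot be added to a Counter.
shift_totals = sum(
    (Counter({s: defects[s, d]}) for s, d in defects),
    Counter(),
)

# Assumed by-type analog: keep the defect type instead of the shift.
type_totals = sum(
    (Counter({d: defects[s, d]}) for s, d in defects),
    Counter(),
)

# Equivalent explicit loop for the by-shift totals.
shift_totals_loop = Counter()
for (s, d), count in defects.items():
    shift_totals_loop[s] += count

print(dict(shift_totals))
print(dict(type_totals))
```

Both summaries add up to the same 309 defects as the overall total. Note that adding Counter objects drops entries whose count is zero or negative; every count here is positive, so nothing is lost.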
