For instance, given a set of engineers and a set of writers, you can pick out individuals
who do both activities by intersecting the two sets. A union of such sets would contain
either type of individual, but would include any given individual only once. This latter
property also makes sets ideal for removing duplicates from collections: simply convert
to and from a set to filter out repeats.
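As a minimal sketch of these ideas, the following uses hypothetical sample names (not data from any earlier example) to show intersection, union, and duplicate removal. The set round trip does not preserve the original order; the dict.fromkeys alternative shown last is an addition that assumes Python 3.7 or later, where dictionaries retain insertion order.

```python
# Hypothetical sample data: the names are invented for illustration only.
engineers = {'bob', 'sue', 'ann', 'vic'}
writers = {'sue', 'tom', 'ann'}

both = engineers & writers      # individuals who do both activities
either = engineers | writers    # either type, each individual listed once

# Remove duplicates by converting to a set and back (order is not preserved).
recipients = ['sue', 'bob', 'sue', 'ann', 'bob']
unique = list(set(recipients))

# Order-preserving alternative (assumes Python 3.7+ insertion-ordered dicts).
unique_ordered = list(dict.fromkeys(recipients))

print(sorted(both))       # ['ann', 'sue']
print(sorted(either))     # ['ann', 'bob', 'sue', 'tom', 'vic']
print(sorted(unique))     # ['ann', 'bob', 'sue']
print(unique_ordered)     # ['sue', 'bob', 'ann']
```

The results are sorted before printing only because sets have no defined order; the sorting is not part of the set operations themselves.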
In fact, we relied on such operations in earlier chapters; PyMailGUI in Chapter 14, for
example, used intersection, union, and difference to manage the set of active mail
downloads, and filtered out duplicate recipients in multiple contexts with set conversion.
Sets are a widely relevant tool in practical programs.
Built-in Options
If you’ve studied the core Python language, you already know that, as with stacks,
Python comes with built-in support for sets too. Here, though, the support is even
more direct: Python’s set datatype provides standard, optimized set operations.
As a quick review, built-in set usage is straightforward: set objects are
created by calling the type name with an iterable giving the components
of the set, or by running a set comprehension expression:
>>> x = set('abcde') # make set from an iterable/sequence
>>> y = {c for c in 'bdxyz'} # same via set comprehension expression
>>> x
{'a', 'c', 'b', 'e', 'd'}
>>> y
{'y', 'x', 'b', 'd', 'z'}
Once you have a set, all the usual operations are available; here are the most common:
>>> 'e' in x # membership
True
>>> x - y # difference
{'a', 'c', 'e'}
>>> x & y # intersection
{'b', 'd'}
>>> x | y # union
{'a', 'c', 'b', 'e', 'd', 'y', 'x', 'z'}
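For comparison, here is a brief sketch of the method forms of the same operations, along with two related operations not shown above (symmetric difference and the subset test). The operator forms require sets on both sides, whereas the method forms accept any iterable; the values used here are illustrative only.

```python
x = set('abcde')
y = set('bdxyz')

# Each operator has an equivalent method form.
same_difference = (x - y) == x.difference(y)
same_intersection = (x & y) == x.intersection(y)
same_union = (x | y) == x.union(y)

# Method forms accept any iterable, not just another set.
merged = x.union('xyz')                     # a string works as the argument
common = x.intersection(['b', 'd', 'q'])    # so does a list

# Operator forms insist on sets: x | 'xyz' raises TypeError.
try:
    x | 'xyz'
    operator_rejected_str = False
except TypeError:
    operator_rejected_str = True

exclusive = x ^ y               # symmetric difference: in one set but not both
is_subset = {'b', 'd'} <= x     # subset test

print(sorted(merged))       # ['a', 'b', 'c', 'd', 'e', 'x', 'y', 'z']
print(sorted(common))       # ['b', 'd']
print(sorted(exclusive))    # ['a', 'c', 'e', 'x', 'y', 'z']
print(is_subset)            # True
```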
Like dictionaries, built-in sets are unordered, and they require that all
set components be hashable (which, for built-in types, generally means immutable).
Making a set from a dictionary works, but only because set uses the dictionary
iterator, which returns the next key on each iteration (the values are ignored):
>>> x = set(['spam', 'ham', 'eggs']) # sequence of immutables
>>> x
{'eggs', 'ham', 'spam'}
>>> x = {'spam', 'ham', 'eggs'} # same but set literal if items known
>>> x
{'eggs', 'ham', 'spam'}
>>> x = set([['spam', 'ham'], ['eggs']]) # mutables do not work as items
TypeError: unhashable type: 'list'
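The dictionary case and the hashability rule can be sketched as follows. The menu dictionary and its values are hypothetical, and the exact wording of the TypeError message can vary between Python versions, so the code only records it rather than depending on it.

```python
# Hypothetical dictionary, used only to show which parts end up in the set.
menu = {'spam': 1, 'ham': 2, 'eggs': 3}

keys = set(menu)                # iterating a dict yields its keys only
values = set(menu.values())     # ask for the values explicitly if needed

# Tuples and frozensets are hashable (if their contents are), so they
# can stand in for lists and sets as items of a set.
pairs = {('spam', 'ham'), ('eggs',)}
nested = {frozenset(['spam', 'ham']), frozenset(['eggs'])}

# A list is mutable and unhashable, so it cannot be a set item.
try:
    bad = {['spam', 'ham']}
    error = None
except TypeError as exc:
    error = str(exc)            # message wording varies by Python version

print(sorted(keys))             # ['eggs', 'ham', 'spam']
print(sorted(values))           # [1, 2, 3]
print(('eggs',) in pairs)       # True
print(error is not None)        # True
```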