Hacking Secret Ciphers with Python

Chapter 12 – Detecting English Programmatically

Finding Items is Faster with Dictionaries Than Lists


detectEnglish.py


englishWords = {}


In the loadDictionary() function, we will store all the words from the “dictionary file” (as in,
a file that lists all the words in an English dictionary book) in a dictionary value (as in, the Python
data type). The similar names are unfortunate, but they are two completely different things.
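
As a rough illustration of the idea (not the book's actual loadDictionary() code), the sketch below stores words from a word list as keys of a dictionary value. The names fakeDictionaryFile and sampleWords are made up for this example, and an in-memory io.StringIO object stands in for a real dictionary file with one word per line.

```python
import io

# Hypothetical stand-in for a dictionary file: one word per line.
fakeDictionaryFile = io.StringIO('SPAM\nEGGS\nBACON\n')

sampleWords = {}
for line in fakeDictionaryFile:
    word = line.strip()  # drop the newline at the end of each line
    if word:
        # Only the key matters here; the value is just a placeholder.
        sampleWords[word] = None

print('BACON' in sampleWords)
print('TOAST' in sampleWords)
```

Only the keys are ever looked up with the in operator, so the value stored with each key can be anything.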


We could also have used a list to store the string value of each word from the dictionary file. We
use a dictionary because the in operator works faster on dictionaries than on lists.
Imagine that we had the following list and dictionary values:





listVal = ['spam', 'eggs', 'bacon']
dictionaryVal = {'spam':0, 'eggs':0, 'bacon':0}





Python can evaluate the expression 'bacon' in dictionaryVal a little bit faster than
'bacon' in listVal. The reason is technical and you don’t need to know it for the
purposes of this book (but you can read more about it at http://invpy.com/listvsdict). The
speed difference hardly matters for lists and dictionaries with only a few items in
them, like the ones in the example above. But our detectEnglish module will have tens of thousands
of items, and the expression word in ENGLISH_WORDS will be evaluated many times when
the isEnglish() function is called, so the speed difference really adds up for the
detectEnglish module.
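
If you want to see the difference for yourself, here is a minimal sketch that times the in operator on a large list and on a dictionary with the same items, using the timeit module from the standard library. The 50,000 made-up strings are only an assumption standing in for a real word list, and the exact timings will vary from computer to computer.

```python
import timeit

# Hypothetical stand-in data: 50,000 made-up "words", not a real dictionary file.
words = ['word%d' % i for i in range(50000)]
listVal = list(words)
dictionaryVal = dict.fromkeys(words, 0)

# The last item is the worst case for the list, which is searched from the start.
target = 'word49999'

# Time 200 membership checks against each data type.
listSeconds = timeit.timeit(lambda: target in listVal, number=200)
dictSeconds = timeit.timeit(lambda: target in dictionaryVal, number=200)

print('list lookups took:', listSeconds, 'seconds')
print('dictionary lookups took:', dictSeconds, 'seconds')
```

On a typical computer the dictionary lookups should finish far sooner than the list lookups, and the gap grows as more items are added.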


The split() Method


The split() string method returns a list of strings. When it is called with no argument, the “split”
between each string occurs wherever there is whitespace (such as a space). For an example of how
the split() string method works, try typing this into the shell:





'My very energetic mother just served us Nutella.'.split()
['My', 'very', 'energetic', 'mother', 'just', 'served', 'us', 'Nutella.']





The result is a list of eight strings, one for each of the words in the original string. The
spaces are dropped from the items in the list (even if there is more than one space in a row). You can
pass an optional argument to the split() method to tell it to split on a different string instead of
a space. Try typing the following into the interactive shell:





'helloXXXworldXXXhowXXXareXXyou?'.split('XXX')
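
Separately from the shell example above, the short sketch below compares calling split() with no argument against calling it with an explicit separator string. The strings and variable names are invented for this illustration.

```python
spaced = 'My   very  energetic mother'

# With no argument, runs of spaces count as a single split point.
noArgument = spaced.split()
print(noArgument)

# With ' ' passed explicitly, every single space is a split point,
# so blank strings show up wherever two spaces sit next to each other.
explicitSpace = spaced.split(' ')
print(explicitSpace)

# Any other string can be the separator, such as a comma.
commaSeparated = 'spam,eggs,bacon'.split(',')
print(commaSeparated)
```

Note that passing ' ' explicitly is not the same as passing nothing: only the no-argument form treats several spaces in a row as one split.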



