Hacking Secret Ciphers with Python

Chapter 18 – Hacking the Simple Substitution Cipher 251

Computing Word Patterns


A simple substitution cipher has 26! (about 4 × 10^26) possible keys, far too many to brute-force.
To crack a substitution ciphertext we need a more intelligent attack. Let’s
examine one word from an example ciphertext:


HGHHU

Think about what we can learn from this one word of ciphertext (which we will call a
cipherword in this book). We can tell that whatever the original plaintext word is, it must:



  1. Be five letters long.

  2. Have the first, third, and fourth letters be the same.

  3. Have exactly three different letters in the word, where the first, second, and fifth letters in
    the word are all different from each other.


What words in the English language fit this pattern? “Puppy” is one. It
is five letters long (P, U, P, P, Y) and uses three different letters (P, U, Y) in the same arrangement: P for
the first, third, and fourth letters, U for the second letter, and Y for the fifth letter. “Mommy”,
“Bobby”, “lulls”, “nanny”, and “lilly” fit the pattern too. (“Lilly” is a name, not to be confused
with “lily” the flower. But since “Lilly” can appear in an English message, it is a possible word
that fits the pattern.) If we had a lot of time on our hands, we could go through an entire
dictionary and find all the words that fit this pattern. Even better, we could have a computer go
through a dictionary file for us.
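
The dictionary search described above can be sketched in a few lines of Python. This is only a minimal illustration, not the book's program: the helper name fits_pattern and the short candidates list (a stand-in for a real dictionary file) are assumptions made for this example.

```python
def fits_pattern(cipherword, candidate):
    """Return True if candidate repeats its letters in the same
    positions as cipherword does."""
    cipherword = cipherword.upper()
    candidate = candidate.upper()
    if len(cipherword) != len(candidate):
        return False
    # For each position, record where that letter first appeared.
    # Two words repeat letters the same way exactly when these lists match.
    cipher_first_seen = [cipherword.index(ch) for ch in cipherword]
    candidate_first_seen = [candidate.index(ch) for ch in candidate]
    return cipher_first_seen == candidate_first_seen


# Hypothetical stand-in for a dictionary file.
candidates = ['puppy', 'mommy', 'happy', 'bobby', 'lulls', 'hello', 'nanny', 'cat']

matches = [word for word in candidates if fits_pattern('HGHHU', word)]
print(matches)
```

With a full dictionary file in place of the short list, the same filter would produce every candidate plaintext word for the cipherword.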


In this book, a word pattern is a sequence of numbers separated by periods that
describes the pattern of letters in a word, whether the word is ciphertext or plaintext.


Creating word patterns for cipherwords is easy: the first letter gets the number 0, and the first
occurrence of each different letter after that gets the next number. A letter that has already
appeared reuses the number it was given the first time. For example:


  • The word pattern for “cat” is 0.1.2.
  • The word pattern for “catty” is 0.1.2.2.3.
  • The word pattern for “roofer” is 0.1.1.2.3.0.
  • The word pattern for “blimp” is 0.1.2.3.4.
  • The word pattern for “classification” is 0.1.2.3.3.4.5.4.0.2.6.4.7.8.
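
The numbering rule can be written as a short function. The sketch below is one possible way to do it; the name word_pattern is chosen for this illustration and is not taken from the text above.

```python
def word_pattern(word):
    """Return the word pattern of word, e.g. 'catty' -> '0.1.2.2.3'."""
    word = word.upper()
    next_number = 0
    letter_numbers = {}  # maps each letter to the number it was first given
    pattern = []
    for letter in word:
        if letter not in letter_numbers:
            # First occurrence of this letter: assign the next unused number.
            letter_numbers[letter] = str(next_number)
            next_number += 1
        # Repeated letters reuse the number they were given earlier.
        pattern.append(letter_numbers[letter])
    return '.'.join(pattern)


for example in ['cat', 'catty', 'roofer', 'blimp', 'classification']:
    print(example, word_pattern(example))
```

Converting the word to uppercase first means that “Bobby” and “bobby” get the same pattern.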


A plaintext word and its cipherword will always have the same word pattern, no matter
which simple substitution key was used to do the encryption.
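
This claim can be spot-checked by encrypting a word with several random keys and comparing patterns. The sketch below is a quick check under assumptions, not a proof: the names word_pattern and encrypt are invented for this example, and a key is assumed to be a 26-letter rearrangement of the alphabet.

```python
import random
import string

LETTERS = string.ascii_uppercase


def word_pattern(word):
    """Return the word pattern of word, e.g. 'catty' -> '0.1.2.2.3'."""
    numbers = {}
    for letter in word.upper():
        # A new letter gets the count of distinct letters seen so far.
        numbers.setdefault(letter, str(len(numbers)))
    return '.'.join(numbers[letter] for letter in word.upper())


def encrypt(word, key):
    """Encrypt word with a simple substitution key
    (assumed to be a rearrangement of LETTERS)."""
    return word.upper().translate(str.maketrans(LETTERS, key))


plaintext = 'CLASSIFICATION'
for _ in range(5):
    key_letters = list(LETTERS)
    random.shuffle(key_letters)
    key = ''.join(key_letters)
    cipherword = encrypt(plaintext, key)
    # The letters change from key to key, but the pattern does not.
    assert word_pattern(cipherword) == word_pattern(plaintext)
    print(key, cipherword, word_pattern(cipherword))
```

The check passes for every key because a substitution key maps each plaintext letter to exactly one cipher letter, so equal letters stay equal and different letters stay different.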
