Chapter 21 – Hacking the Vigenère Cipher 321
The readlines() File Object Method
- words = fo.readlines()
File objects returned from open() have a readlines() method. Unlike the read() method,
which returns the full contents of the file as a single string, the readlines() method
returns a list of strings, where each string is a single line from the file. Each string
in the list ends with a \n newline character, except possibly the very last one, since the
file might not end with a newline.
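As a small illustration of that behavior, the sketch below writes a three-word stand-in file and reads it back with readlines(). The temporary file and its contents are made up for this example and are not the book's dictionary file.

```python
import os
import tempfile

# Write a tiny stand-in word list (hypothetical contents).
# The last line deliberately has no trailing newline.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('AARHUS\nAARON\nABABA')
    path = tmp.name

fo = open(path)
words = fo.readlines()  # one string per line, newlines kept
fo.close()
os.remove(path)

print(words)  # ['AARHUS\n', 'AARON\n', 'ABABA']

# strip() removes the trailing newline (and any other surrounding whitespace).
stripped = [word.strip() for word in words]
print(stripped)  # ['AARHUS', 'AARON', 'ABABA']
```

Because the newline stays attached to each string, a word read this way usually needs a call to strip() before it can be used as a key.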
Aside from the new readlines() method, the source code for this program uses nothing we
haven’t already seen in previous hacking programs in this book. The hackVigenere() function
reads in the contents of the dictionary file, uses each word in that file as a key to decrypt
the ciphertext, and if the decrypted text looks like readable English, prompts the user to quit or continue.
As such, we won’t do a line-by-line explanation for this program, and instead continue on with a
program that can hack the Vigenère cipher even when the key was not a word that can be found
in the dictionary.
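For readers who want the overall shape of the dictionary attack in one place, here is a minimal sketch. It is not the book's program: vigenere_decrypt(), looks_like_english(), and dictionary_attack() are hypothetical stand-ins written for this example, the word lists are tiny, and the function returns the first plausible key instead of prompting the user.

```python
LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def vigenere_decrypt(key, message):
    """Decrypt message with key, keeping case and non-letters unchanged."""
    key = key.upper()
    result = []
    key_index = 0
    for ch in message:
        pos = LETTERS.find(ch.upper())
        if pos == -1:
            result.append(ch)  # not a letter: copy it through
            continue
        shift = LETTERS.find(key[key_index % len(key)])
        plain = LETTERS[(pos - shift) % len(LETTERS)]
        result.append(plain if ch.isupper() else plain.lower())
        key_index += 1  # only letters consume a subkey
    return ''.join(result)

def looks_like_english(text, known_words, threshold=0.6):
    """Crude check: the fraction of words found in known_words (assumed threshold)."""
    candidates = [''.join(c for c in w if c.isalpha()) for w in text.upper().split()]
    candidates = [w for w in candidates if w]
    if not candidates:
        return False
    hits = sum(1 for w in candidates if w in known_words)
    return hits / len(candidates) >= threshold

def dictionary_attack(ciphertext, dictionary_lines, known_words):
    """Try each dictionary word as the key; return (key, plaintext) or None."""
    for line in dictionary_lines:
        word = line.strip()  # readlines() leaves the newline attached
        if not word:
            continue
        candidate = vigenere_decrypt(word, ciphertext)
        if looks_like_english(candidate, known_words):
            return word, candidate
    return None

# Lines shaped like the output of readlines(); the words are made up for this example.
dictionary_lines = ['APPLE\n', 'LEMON\n', 'ZEBRA']
known_words = {'ATTACK', 'AT', 'DAWN'}
result = dictionary_attack('Lxfopv ef rnhr', dictionary_lines, known_words)
print(result)  # ('LEMON', 'Attack at dawn')
```

The attack only succeeds when the key is one of the words in the list, which is the limitation that motivates the approach in the rest of this chapter.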
The Babbage Attack & Kasiski Examination
Charles Babbage is known to have broken the Vigenère cipher, but he never published his results.
Later studies revealed that he used a method subsequently published in 1863 by Friedrich
Kasiski, a 19th-century Prussian army officer and cryptographer.
“Kasiski Examination” is a process used to determine the length of the Vigenère key that was
used to encrypt a ciphertext. Once the key length is known, frequency analysis can be used to
break each of the subkeys.
Kasiski Examination, Step 1 – Find Repeat Sequences’ Spacings
The first part of Kasiski Examination is to find every repeated sequence of letters at least three
letters long in the ciphertext. These repeats are significant because they could indicate that the
same letters of plaintext were encrypted with the same subkeys of the key. For example, if the ciphertext is
“Ppqca xqvekg ybnkmazu ybngbal jon i tszm jyim. Vrag voht vrau c tksg. Ddwuo xitlazu vavv
raz c vkb qp iwpou.” and we remove the non-letters and uppercase the rest, the ciphertext looks like this:
PPQCAXQVEKGYBNKMAZUYBNGBALJONITSZMJYIMVRAGVOHTVRAUCTKSGDDWUOXITLA
ZUVAVVRAZCVKBQPIWPOU
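The clean-up above, and the search for repeated sequences that it prepares for, can be sketched in a few lines. This is an illustrative sketch, not the book's hacking program: the function names strip_non_letters() and find_repeat_spacings() are hypothetical, and checking sequence lengths of 3 to 5 is an assumption made for this example.

```python
import re
from collections import defaultdict

def strip_non_letters(text):
    """Uppercase the text and drop every character that is not A-Z."""
    return re.sub(r'[^A-Z]', '', text.upper())

def find_repeat_spacings(message, min_len=3, max_len=5):
    """Map each repeated sequence to the sorted distances between its occurrences."""
    starts_by_seq = defaultdict(list)
    for seq_len in range(min_len, max_len + 1):
        for start in range(len(message) - seq_len + 1):
            starts_by_seq[message[start:start + seq_len]].append(start)

    spacings = {}
    for seq, starts in starts_by_seq.items():
        if len(starts) > 1:  # keep only sequences that occur more than once
            spacings[seq] = sorted(
                later - earlier
                for i, earlier in enumerate(starts)
                for later in starts[i + 1:]
            )
    return spacings

ciphertext = ('Ppqca xqvekg ybnkmazu ybngbal jon i tszm jyim. Vrag voht vrau c tksg. '
              'Ddwuo xitlazu vavv raz c vkb qp iwpou.')
message = strip_non_letters(ciphertext)
repeats = find_repeat_spacings(message)
print(repeats)  # {'YBN': [8], 'AZU': [48], 'VRA': [8, 24, 32]}
```

A sequence that occurs three times, like VRA here, produces three spacings, one for each pair of occurrences.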
You can see that the sequences VRA, AZU, and YBN repeat in this ciphertext: