Chapter 20 – Frequency Analysis 301
Figure 20-2. Letter frequency of normal English, sorted.
The word “ETAOIN” is a handy way to remember the six most frequent letters. The full list of
letters ordered by frequency is “ETAOINSHRDLCUMWFGYPBVKJXQZ”.
Think about the transposition cipher: messages encrypted with it contain all the letters of the
original English plaintext, only in a different order. Because they are the very same letters, the
frequency of each letter in the ciphertext is the same as in the plaintext: E, T, and A should
still occur much more often than Q and Z.
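
As a rough illustration of this point, the sketch below counts the letters of a message before and
after its characters are rearranged. The random shuffle is only a stand-in for a real transposition
cipher (which rearranges letters according to a key), and the name letter_counts is made up for
this example.

```python
from collections import Counter
import random

def letter_counts(message):
    """Count only the letters A-Z, ignoring case (hypothetical helper)."""
    return Counter(ch for ch in message.upper() if 'A' <= ch <= 'Z')

plaintext = "Common sense is not so common."

# Stand-in for a transposition cipher: any rearrangement of the characters.
# A fixed seed keeps the shuffle repeatable from run to run.
chars = list(plaintext)
random.Random(42).shuffle(chars)
scrambled = ''.join(chars)

# Rearranging characters never changes how many of each letter there are.
print(letter_counts(plaintext) == letter_counts(scrambled))  # True
```

The comparison holds for any rearrangement, which is why letter counts alone cannot tell a
transposition ciphertext apart from its plaintext.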
The Caesar and simple substitution ciphers replace each letter with another, but you can still count
the frequency of the letters. The letters are different, but the pattern of frequencies is the same:
each cipherletter occurs exactly as often as the plaintext letter it stands for. The letters that
occur most often in the ciphertext are good candidates for being the cipherletters for E, T, or A.
The letters that occur least often in the ciphertext are more likely to stand for X, Q, or Z.
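
A small sketch can make this concrete. It encrypts a short message with a Caesar cipher and checks
that the counts carry over to different letters. The function names, the sample message, and the
key of 3 are all chosen just for this illustration.

```python
from collections import Counter

LETTERS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def caesar_encrypt(message, key):
    """Shift each letter forward by key positions; leave other characters alone."""
    result = []
    for ch in message.upper():
        if ch in LETTERS:
            result.append(LETTERS[(LETTERS.index(ch) + key) % 26])
        else:
            result.append(ch)
    return ''.join(result)

def letter_counts(message):
    """Count only the letters A-Z, ignoring case."""
    return Counter(ch for ch in message.upper() if ch in LETTERS)

plaintext = "Meet me here when the bell rings"
ciphertext = caesar_encrypt(plaintext, 3)

# The same counts appear in both texts, just attached to different letters.
same_pattern = (sorted(letter_counts(plaintext).values()) ==
                sorted(letter_counts(ciphertext).values()))
print(same_pattern)                               # True

# E is the most common plaintext letter, so its cipherletter (H for key 3)
# is the most common ciphertext letter.
print(letter_counts(plaintext).most_common(1))    # [('E', 8)]
print(letter_counts(ciphertext).most_common(1))   # [('H', 8)]
```

On a message this short the guess "most common cipherletter means E" happens to be right. Short
messages will not always match normal English frequencies this closely.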
This counting of letters and how frequently they appear in both plaintexts and ciphertexts is
called frequency analysis.
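
One possible way to put frequency analysis into code is sketched below: count each letter in a
message, then list the 26 letters from most to least frequent so the result can be compared against
the ETAOIN ordering. The name frequency_order and the tie-breaking rule are assumptions made for
this sketch, not a fixed part of the technique.

```python
from collections import Counter

ETAOIN = 'ETAOINSHRDLCUMWFGYPBVKJXQZ'

def frequency_order(message):
    """Return all 26 letters, ordered from most to least frequent in message."""
    counts = Counter(ch for ch in message.upper() if ch in ETAOIN)
    # Sort by count, highest first. Letters with equal counts are ordered by
    # their position in ETAOIN (an arbitrary choice made for this sketch).
    return ''.join(sorted(ETAOIN, key=lambda ch: (-counts[ch], ETAOIN.index(ch))))

print(frequency_order("AAABBC")[:3])   # ABC
```

A text whose frequency order starts with letters close to ETAOIN and ends with letters close to
JXQZ is more likely to be English than one whose order looks scrambled.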
Since the Vigenère cipher is essentially multiple Caesar cipher keys used in the same message,
we can use frequency analysis to hack each subkey one at a time based on the letter frequency of
the attempted decryptions. We can’t use English word detection, since any word in the ciphertext
will have been encrypted with multiple subkeys. But we don’t need full words; we can analyze