Solving Substitution Ciphers Using Frequency Analysis
2009.06.29
Substitution ciphers, codes in which letters or symbols are used in place of other letters, can often be cracked using frequency analysis. Frequency analysis uses how often each letter appears in the coded text to work out which letter it stands for.
To start, you will, of course, need a code to decrypt. In this explanation, the following code will serve as an example: ett iyp twiitp aywtmkpr ekp wr iyp terp. You are also advised to grab a sheet or two of scrap paper and a pencil. Using a pen isn't suggested; it is common to make a few mistakes and revisions in the process of solving a cipher. In addition, knowledge of letter frequencies in English, or a reference sheet listing them, is required.
First, make a chart of how often each letter appears within the cryptogram. This will allow you to easily keep track of the letters used most often. On average, in the English language, the letter e is most common, followed by t, a, o, i, n, s, h, and then r. If one letter or character occurs significantly more often than the others, replace it with e. Next, look for a three-letter combination ending with e. If you find one, it is usually safe to conclude that the word is the, especially if the combination is used more than once. Of course, there are other three-letter words ending in e, so note that this is only an assumption, and it can be revised later if it proves incorrect. The is the three-letter word used most often, although it is closely followed by the words and, for, are, but, not, and all. Now go through and replace the encrypted characters for t and h with the appropriate letters.
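For longer cryptograms, the chart can be tallied by a short program instead of by hand. The Python sketch below is only one possible way to do it and is not part of the pencil-and-paper method; the variable names are illustrative. It counts each letter in the example code and finds the three-letter word that repeats most.

```python
from collections import Counter

# The example cryptogram used throughout this explanation.
ciphertext = "ett iyp twiitp aywtmkpr ekp wr iyp terp"

# Tally every letter, ignoring the spaces between words.
letter_counts = Counter(ch for ch in ciphertext if ch.isalpha())

# Tally the three-letter words; a repeated one is a candidate for "the".
three_letter_words = Counter(w for w in ciphertext.split() if len(w) == 3)

# Print the chart, most frequent letters first.
for letter, count in letter_counts.most_common():
    print(letter, count)

print(three_letter_words.most_common(1))  # [('iyp', 2)]
```

In this short example the letters p and t tie at six appearances each, so the chart alone does not single out e; the repeated word iyp is what points to p standing for e.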
At this point, start looking in the cipher for two of the same letter beside one another. When the same letter is repeated, it is most likely to be (in descending order) ee, ll, tt, or ss. Once you've found such a pair, think for a moment about which letters it could stand for. As a model of the required reasoning, return to the example, which would now read (deciphered letters are capitalized; encoded letters remain lowercase): ett THE twTTtE aHwtmkEr ekE wr THE terE. It is clear that the first word, encoded as ett, has a repeated letter. Since a word must contain a vowel, and the letter e has been decrypted already, the word must contain a, i, o, or u. Because none of these vowels is among the most often doubled letters, and a doubled vowel is unlikely at the end of a word, the vowel must be the first character of the encrypted word. Through trial and error, we can establish that all is a suitable translation, because it not only contains common letters but is also one of the most often used three-letter words. Thus, the encrypted e must represent the letter a, and the encrypted t must correspond to the decrypted letter l. Again, insert the newly found letters into the cipher. Now the example shows: ALL THE LwTTLE aHwLmkEr AkE wr THE LArE.
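Rewriting the cipher after every guess is where most pencil mistakes happen, so the substitution step can also be sketched in Python. The function name apply_partial_key and the choice to print deciphered letters in capitals are assumptions made for this illustration only. The sketch first lists the doubled letters, then applies the guesses made so far.

```python
import re

ciphertext = "ett iyp twiitp aywtmkpr ekp wr iyp terp"

def apply_partial_key(text, key):
    """Show known letters as uppercase plaintext; leave unknown ones as lowercase cipher."""
    return "".join(key[ch].upper() if ch in key else ch for ch in text)

# Find letters that appear twice in a row in the cipher.
doubled = re.findall(r"([a-z])\1", ciphertext)
print(doubled)  # ['t', 'i']

# Guesses so far: the repeated word iyp is assumed to be "the".
key = {"i": "t", "y": "h", "p": "e"}
print(apply_partial_key(ciphertext, key))
# ett THE twTTtE aHwtmkEr ekE wr THE terE

# Add the guess that ett is "all".
key.update({"e": "a", "t": "l"})
print(apply_partial_key(ciphertext, key))
# ALL THE LwTTLE aHwLmkEr AkE wr THE LArE
```

Keeping the guesses in one dictionary makes a wrong guess easy to undo: delete the entry and print the text again.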
Faced with the partially decrypted arrangement AkE, in which only the lowercase k is still encoded, consult the list of most common three-letter words again. When using reference lists to decrypt words, also look at letter frequencies. The letter r has a high frequency, as does the word are. Therefore, are is a fitting translation. The partially decoded text LwTTLE leaves minimal room for guesswork; the word must be little. The human mind is your greatest tool in breaking codes, because it can recognize likely words with minimal effort.
There is also a reference list of common two-letter words that is useful when solving codes with frequency analysis. In descending order, of, to, in, it, and is are the two-letter words used most often. Using this list, one can guess that the combination Ir, whose lowercase r is still encoded, is probably in; some of the other words seem likely, but when they are inserted into the cipher they do not produce sensible words elsewhere. Lastly, if you are not sure about a two- or three-letter word, it is all right to move on and decipher it through context clues.
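Checking a partly solved word against a reference list can be sketched in Python as well. Everything named here (the candidates function, the "?" placeholder for an unsolved letter) is hypothetical and chosen only for illustration. The sketch also assumes that a plaintext letter already assigned to one cipher letter cannot be reused for another.

```python
# The common two-letter words listed above, most frequent first.
COMMON_TWO_LETTER_WORDS = ["of", "to", "in", "it", "is"]

def candidates(pattern, used_plain_letters, word_list):
    """Return words fitting a pattern such as 'i?', where '?' is an unsolved letter."""
    matches = []
    for word in word_list:
        if len(word) != len(pattern):
            continue
        # Known letters must match; unsolved ones must not reuse a solved letter.
        fits = all(
            p == w if p != "?" else w not in used_plain_letters
            for p, w in zip(pattern, word)
        )
        if fits:
            matches.append(word)
    return matches

# Plaintext letters found so far: t, h, e, a, l, i, and r.
used = set("thealir")
print(candidates("i?", used, COMMON_TWO_LETTER_WORDS))  # ['in', 'is']
```

The word it drops out because t is already taken. Choosing between in and is still means trying each in the remaining words: with n, the last word becomes lane, while s would give lase.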
The example is now at this point: ALL THE LITTLE aHILmREN ARE IN THE LANE. Using logic, the text aHILmREN, in which only the lowercase a and m are still encoded, can be replaced with children. Often, when a word has one to three letters missing, it is best to guess sensibly and then work until the guess is proven correct or incorrect.
Through all this logic, guesswork, and thorough analysis, the majority of the cipher should be completed. Finally, all the required basics of frequency analysis in code breaking have been used; the fully decoded text can now be read: all the little children are in the lane.
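As a final check, the whole key recovered in this walkthrough can be applied in one pass. This Python sketch assumes the ten letter pairs worked out above and uses the built-in str.maketrans and str.translate; it is a way to verify the answer, not a way to find it.

```python
ciphertext = "ett iyp twiitp aywtmkpr ekp wr iyp terp"

# Cipher letter -> plaintext letter, as deduced step by step above.
key = {
    "i": "t", "y": "h", "p": "e",  # from "the"
    "e": "a", "t": "l",            # from "all"
    "k": "r",                      # from "are"
    "w": "i",                      # from "little"
    "r": "n",                      # from "in"
    "a": "c", "m": "d",            # from "children"
}

# A valid substitution key never maps two cipher letters to the same plaintext letter.
assert len(set(key.values())) == len(key)

plaintext = ciphertext.translate(str.maketrans(key))
print(plaintext)  # all the little children are in the lane
```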
Epistle
2009.05.11
Dear Lucie,
Firstly, I hope all is well with you, Lucie, but, if you would excuse my haste, I have something of importance to express. Not a day ago, the desperate rebels of Saint-Antoine, a mere kilometer from my present residence, broke out in chaos unparalleled by the previous unrest within the district. The revolutionaries stormed the Bastille in a massive cluster, fiercely and savagely craving noble blood. Painting the dusty, dismal streets with that vital fluid, they slew the officers and guards of the prison, as well as the governor himself. Knowing you as I believe I do, I am certain you recognize the significance of this in relation to the rueful doctor, whom you love so greatly. It worries me to consider the likelihood that, upon hearing of a disturbance so close to his former confines, he will return to a frenzied state for a period of time. Aware of the removal and destruction of his single crutch, the shoemaker's bench, I am left to question uneasily what his reaction will be to the situation. Although he has made stellar progress since his confinement, I advise you to divulge this information to him gradually and, moreover, before anyone else does. Lucie, you bind all of us together as friends and family, but foremost, you are the one person, the one thing, the one devotion that has recalled the good doctor to life after many harrowing years. It is you who must address this issue, for it is you he trusts and loves. I do hope he accepts the news well, for after a time I believe it will do him good to know it, regardless of whether he comes to understand this in a week's time or a decade's.
I apologize, as this correspondence is hurried and grave. I long for the time beyond the present, when the revolution has passed and tranquility is restored in the French lands; then I may be able to write a letter of leisure and friendship.

Your truly concerned friend,







Jacques
