Read The Philosophical Breakfast Club Online
Authors: Laura J. Snyder
But Babbage knew that these statistical methods were not, by themselves, enough to break the most difficult type of cipher, a polyalphabetic substitution cipher, in which the cipher alphabet changes during the encryption. The beauty of this method for the one sending the message is that the cryptanalyst loses the power of determining the frequency of letters: in a polyalphabetic substitution cipher, the first letter in a double pair is encrypted using a different cipher alphabet from the second letter in the pair, so in the cipher text no double appears at all. One polyalphabetic cipher had been known for centuries as the “undecipherable cipher.” Never one to shrink from a seemingly impossible task, Babbage threw himself into the attempt to crack it, like a man possessed.
As in the Ninth Bridgewater Treatise, Babbage was determined to demonstrate the power of mathematics. This time he was not using statistics to uncover the divine origin of the universe, but rather to uncover secrets hidden by a cipher considered unbreakable. And, unlike in that contentious and curmudgeonly work, Babbage was now returning to one of the original goals of the Philosophical Breakfast Club: to use scientific method for the public good, rather than for promoting his own fame or the merits of his engines. Babbage tackled a cipher that had been used by the French during the Napoleonic Wars: the Vigenère. Babbage knew that if the British had only had the means to decipher the tactical messages being sent with this cipher, their victory could have come sooner, with less loss of life and less disruption to trade between the nations. And this cipher was now being used by a new enemy of Britain: Russia.
The cipher, known as le chiffre indéchiffrable (the indecipherable cipher), had been invented in 1553 by Giovanni Battista Bellaso, and publicized in 1586 by a young French diplomat named Blaise de Vigenère, by whose name the cipher became known. Vigenère’s cipher was a polyalphabetic substitution cipher utilizing twenty-six cipher alphabets. These are arranged in a “Vigenère square”: a plain-text alphabet followed by the twenty-six different cipher alphabets, each shifted from the previous alphabet by one letter. Line 1 of the square therefore begins with b, line 2 with c, and so on, until line 26 comes back around to begin with a.
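The square is regular enough to be generated rather than copied out. The following is a minimal sketch in Python, assuming a lowercase twenty-six-letter alphabet and the row numbering used in this chapter (line 1 begins with b, line 26 with a); the names ALPHABET and vigenere_square are illustrative only.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: plain a-z, no accents or digits

def vigenere_square():
    """Return the 26 cipher alphabets; row n is the alphabet shifted by n letters."""
    # Row 1 begins with b, row 2 with c, ..., row 26 wraps around to begin with a.
    return [ALPHABET[n % 26:] + ALPHABET[:n % 26] for n in range(1, 27)]

square = vigenere_square()
print("    " + ALPHABET)  # the plain-text alphabet heads the square
for number, row in enumerate(square, start=1):
    print(f"{number:>2}  {row}")
```

To encipher a letter by hand, one finds it in the top line and reads down the same column to the chosen row.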
In the Vigenère cipher, a different line is used to encipher each letter of a plain-text message. So, the first letter of a plain-text three-letter word might be enciphered using row 1 of the Vigenère square, the second letter might be enciphered using row 11, and the third letter enciphered using row 26.
In order to encrypt a message that can be deciphered, there must be an agreed-upon system of switching between rows. This is achieved using a key word. To encrypt a message such as “attack the enemy fortress,” using the key word bacon, the first letter would be encrypted using the alphabet that begins with b (line 1 of the Vigenère square), the second letter using the alphabet that begins with a (line 26), the third letter using the alphabet that begins with c (line 2), the fourth letter using the alphabet that begins with o (line 14), and the fifth using the alphabet that begins with n (line 13), repeating the order of the cipher alphabets on the pattern baconbaconbaconbacon. So, in the case of “attack the enemy fortress,” you would have the following:

Key word:     baconbaconbaconbaconba
Plain text:   attacktheenemyfortress
Cipher text:  btvopltjsroeomsprvfrts
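The key-word procedure amounts to shifting each plain-text letter by the position of the matching key letter in the alphabet (a shifts by nothing, b by one, and so on). A minimal sketch in Python, assuming that spaces and punctuation are simply dropped and that only the letters a to z occur; the function name vigenere_encrypt is illustrative.

```python
def vigenere_encrypt(plain_text, key_word):
    """Encrypt plain_text with a repeating key word (letters a-z only)."""
    # Assumption: anything that is not a letter is discarded before encryption.
    letters = [ch for ch in plain_text.lower() if ch.isalpha()]
    cipher = []
    for i, ch in enumerate(letters):
        # The key letter picks the cipher alphabet: a shifts by 0, b by 1, ...
        shift = ord(key_word[i % len(key_word)]) - ord("a")
        cipher.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
    return "".join(cipher)

print(vigenere_encrypt("attack the enemy fortress", "bacon"))
# prints: btvopltjsroeomsprvfrts
```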
It is easy to see the challenges for the cryptanalyst. In the plain text there are three occurrences of double letters, while in the cipher text there are none, because of the use of different alphabets. For example, the first double, tt, is represented in the cipher text by tv. Nor does frequency analysis on the most common letters work, because the most common letters in the cipher text, such as t, o, and s, each stand for more than one plain-text letter: s, for example, stands for e, f, and s. The Vigenère seems, indeed, to be undecipherable.
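Both difficulties can be checked mechanically. The sketch below, in Python, re-encrypts the example message with the key word bacon, lists the doubled letters in each text, and records which plain-text letters each cipher letter stands for. The helper names encrypt, doubles, and stands_for are illustrative, and the message is assumed to be already stripped of spaces.

```python
from collections import defaultdict

def encrypt(plain, key):
    """Shift each letter by the matching key letter (a = 0, b = 1, ...)."""
    return "".join(
        chr((ord(p) + ord(key[i % len(key)]) - 2 * ord("a")) % 26 + ord("a"))
        for i, p in enumerate(plain)
    )

def doubles(text):
    """List every pair of identical adjacent letters."""
    return [text[i:i + 2] for i in range(len(text) - 1) if text[i] == text[i + 1]]

plain = "attacktheenemyfortress"
cipher = encrypt(plain, "bacon")

# For each cipher letter, collect the plain-text letters it replaced.
stands_for = defaultdict(set)
for p, c in zip(plain, cipher):
    stands_for[c].add(p)

print(doubles(plain))           # ['tt', 'ee', 'ss']
print(doubles(cipher))          # []
print(sorted(stands_for["s"]))  # ['e', 'f', 's']
```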
On August 10, 1854, mere weeks after Babbage’s testimony in the Childe case, the Journal of the Society of the Arts published a letter by John Thwaites, a Bristol dentist, who claimed to have invented a new, unbreakable cipher. “Its uses must be obvious to all,” crowed Thwaites, since there was “not a chance of [its] discovery.” Recognizing right away that Thwaites had “reinvented” the Vigenère cipher, Babbage scolded, in a published letter signed only “C,” that “the cypher in the Journal is a very old one, and to be found in most of the books.”
Thwaites, who had applied for a patent for his “new” cipher, responded indignantly. Within the journal’s pages, Thwaites and Babbage faced off. Thwaites issued a challenge: he gave both the plain text and the cipher text, demanding that “C” find the key to the cipher. Babbage set to work, alongside his son Henry Prevost Babbage, who was home on leave from the Indian army. [27] They discovered that Thwaites had doubly encrypted the passage, using two keys successively. Babbage issued a challenge of his own to Thwaites, daring him to find the key used in Babbage’s encryption of the same passage using the same cipher. Thwaites refused further comment. At around this time he abandoned the attempt to patent the cipher; the patent application bears the comment “void by reason of the patentee having neglected to file a specification in pursuance of the conditions of the letters patent.” [28] Thwaites had apparently been convinced by “C”’s letter or by someone else that his code was identical to the long-known Vigenère.
In the process of finding the key to Thwaites’s message, Babbage invented a general method for deciphering any text encoded by the Vigenère. This has only recently been discovered by a careful examination of notes scattered throughout the collection of Babbage papers held at the British Library. His notes show pages of equations expressing the mathematical relations between the letters of the plain text, the cipher text, and the key text. All the mathematical relations are spelled out, in very elementary terms, as if Babbage were writing not only for himself but for eventual publication in his deciphering book—or for explaining it to another interested party. But Babbage never published this method in full.
Babbage saw that the code was easy to break once the length of the key word was determined, even if the key word itself had not been discovered. Because the key keeps repeating itself, if the periodicity of the key is known, then the cryptanalyst can treat the cipher text as separate occurrences of a simple monoalphabetic code. His method involves looking for sequences of letters that appear more than once in the cipher text. This will always occur when the cipher text is long enough. For example, if the key word is bacon, which has five letters, there are only five possible ways that the word the can be encoded: uhg, tjs, vvr, huf, gie. Since the is a very common word, chances are that in a message several sentences long there will be at least one repeated occurrence of one of the five possible ways of encoding it. When a repeated sequence of letters is found, there are two possible explanations. The most likely is that the same sequence of letters in the plain text has been enciphered using the same parts of the key. It is possible, though much less probable, that two different sequences of letters in the plain text have been enciphered using different parts of the key, leading to the same sequence in the cipher text only by coincidence.
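The five encodings follow from the five places in the key word at which a plain-text word can begin. A minimal sketch in Python, with the illustrative function name encodings, enciphers a word once for each starting position:

```python
def encodings(word, key):
    """Encode word once for each position in the key at which it could start."""
    results = []
    for start in range(len(key)):
        results.append("".join(
            # Shift each letter by the key letter it lines up with (a = 0, b = 1, ...).
            chr((ord(ch) - 97 + ord(key[(start + i) % len(key)]) - 97) % 26 + 97)
            for i, ch in enumerate(word)
        ))
    return results

print(encodings("the", "bacon"))
# ['uhg', 'tjs', 'vvr', 'huf', 'gie']
```

Two occurrences of the word produce the same cipher letters only when they line up with the same part of the key, that is, when they lie a whole number of key lengths apart.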
To determine the length of the key, the cryptanalyst looks for all repeated sequences of letters, and notes the number of letters separating the occurrences of each. He or she can use that distance to determine the possible length of the key, which would be a factor of it. If a sequence is repeated after twenty letters, there are six possibilities: (1) the key is one letter long and recycled twenty times (but then the cipher would be monoalphabetic); (2) the key is two letters long and is recycled ten times in the course of the encryption; (3) the key is four letters long and is recycled five times; (4) the key is five letters long and is recycled four times; (5) the key is ten letters long and is recycled two times; (6) the key is twenty letters long and is used only once.
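The search for repeats and the list of factors can both be sketched in a few lines of Python. The cipher text below is a made-up illustration, not one of Babbage’s: it is the phrase “the enemy cannot attack the fort” enciphered with the key word bacon, chosen so that the word the recurs after twenty letters. The function names repeat_distances and factors are likewise illustrative.

```python
from collections import defaultdict

def repeat_distances(cipher_text, length=3):
    """Map each repeated sequence to the distances between its occurrences."""
    positions = defaultdict(list)
    for i in range(len(cipher_text) - length + 1):
        positions[cipher_text[i:i + length]].append(i)
    return {
        seq: [b - a for a, b in zip(where, where[1:])]
        for seq, where in positions.items()
        if len(where) > 1
    }

def factors(n):
    """All whole numbers that divide n: the candidate key lengths."""
    return [d for d in range(1, n + 1) if n % d == 0]

# Hypothetical example: "theenemycannotattackthefort" under the key word bacon.
cipher_text = "uhgsafmaqnonqhnutcqxuhgtbst"
print(repeat_distances(cipher_text))  # {'uhg': [20]}
print(factors(20))                    # [1, 2, 4, 5, 10, 20]
```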
In a long enough piece of cipher text, there will be more than one repeated sequence of letters. In this case, the cryptanalyst would be able to compare each of the different repeated sequences, in order to find the one possible key length that is shared by all of them. That is, he or she would look for a factor common to the distances between occurrences of all the repeated sequences. If, for example, another sequence of letters is repeated after thirty letters, that would rule out the possibility that the key is four letters long or twenty letters long, because four and twenty are not factors of thirty. That leaves open the possibilities that the key is two letters, five letters, or ten letters long. If yet another sequence is repeated after twenty-five letters, that would rule out a key of two or ten letters, leaving only the possibility of a five-letter key.
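The elimination described here is a search for common factors, which can be sketched in Python with the greatest common divisor. The sketch assumes that every repeat is genuine rather than coincidental, since a single coincidental repeat would wrongly eliminate the true key length; the function name candidate_key_lengths is illustrative.

```python
from functools import reduce
from math import gcd

def candidate_key_lengths(distances):
    """Key lengths (longer than one letter) that divide every distance."""
    common = reduce(gcd, distances)  # the largest length consistent with all repeats
    return [d for d in range(2, common + 1) if common % d == 0]

print(candidate_key_lengths([20]))          # [2, 4, 5, 10, 20]
print(candidate_key_lengths([20, 30]))      # [2, 5, 10]
print(candidate_key_lengths([20, 30, 25]))  # [5]
```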
Once the cryptanalyst knows the length of the key, it is possible to break the cipher using frequency analysis. If the key is five letters long, then there are basically five monoalphabetic substitution ciphers at work. (For example, if the key word is bacon, there is one monoalphabetic cipher that uses the alphabet beginning with b, one that uses the alphabet beginning with a, one beginning with c, one with o, and one with n, repeating every five letters.) Grouping every fifth letter together, the analyst has five “messages,” each encrypted using a one-alphabet substitution, and each piece can then be solved using frequency analysis, by looking for the most frequent letters. As these patterns emerge, the cryptanalyst can begin to make guesses about what the key word is, and use this to solve the rest of the message. Once the whole message is deciphered, the cryptanalyst can very easily determine what each letter of the key is, and then apply the key to any future messages that were encoded using it. [29]
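The grouping step and the final recovery of the key can be sketched in Python as follows. The function names are illustrative, and guess_key_letter is a deliberately crude stand-in for frequency analysis: it assumes that the most frequent letter in a group stands for e, which is only reliable when each group is long, so no result is claimed for it on the short example message.

```python
from collections import Counter

def split_columns(cipher_text, key_length):
    """Group every key_length-th letter: one monoalphabetic 'message' per key letter."""
    return [cipher_text[i::key_length] for i in range(key_length)]

def guess_key_letter(column, assumed_plain="e"):
    """Crude guess: treat the column's most frequent letter as assumed_plain."""
    most_common = Counter(column).most_common(1)[0][0]
    return chr((ord(most_common) - ord(assumed_plain)) % 26 + ord("a"))

def recover_key(plain_text, cipher_text, key_length):
    """With the message deciphered, read off each key letter as cipher minus plain."""
    return "".join(
        chr((ord(c) - ord(p)) % 26 + ord("a"))
        for p, c in zip(plain_text[:key_length], cipher_text[:key_length])
    )

cipher_text = "btvopltjsroeomsprvfrts"
print(split_columns(cipher_text, 5))
# ['blopt', 'tters', 'vjov', 'osmf', 'prsr']
print(recover_key("attacktheenemyfortress", cipher_text, 5))  # bacon
```

On a message of realistic length, guess_key_letter would be applied to each group in turn and the guesses corrected by hand as recognizable words emerge.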
Babbage was the first to develop this method of deciphering the Vigenère cipher, yet the solution is known as the Kasiski examination, or the Kasiski test, because in 1863 Friedrich W. Kasiski, a retired major in the Prussian army, described this method in his pamphlet Die Geheimschriften und die Dechiffrir-Kunst. Although Babbage invented the same method nearly ten years earlier, he never publicized his accomplishment, and he lost the chance to gain fame for it. [30]
Why did Babbage keep secret his success in finding a method of deciphering le chiffre indéchiffrable? In the dispute with Thwaites, he never revealed his identity, hiding forever behind the “C” signature in the pages of the Journal. What he published in those pages was only a brief description of how he broke Thwaites’s cipher, not the general solution to the Vigenère. This incomplete explanation was not even published in a scientific journal, but in the journal of the Society of the Arts, hardly the platform for a groundbreaking achievement. And although he had been sending numerous letters about deciphering during this time to both Herschel and Augustus De Morgan, both of whom were also intrigued with ciphers and codes, in none of the letters that remain extant today did Babbage inform his friends that he had deciphered the chiffre indéchiffrable. Not taking credit for something so impressive is out of character for Babbage, to say the least.