No, the Mysterious Voynich Manuscript Is Not Written in Hebrew

Despite the silly claims of two computer scientists.

February 7, 2018 | Philologos

A page from the Voynich Manuscript. Wikipedia.

Got a question for Philologos? Ask him directly at

If you’re like me, you would have replied two weeks ago to the question “What do you think of the Voynich manuscript?” with the answer “The Voy which manuscript?” I had never heard either of the Polish rare-book dealer Wilfrid Voynich (1865-1930) or of the mysterious manuscript purchased by him in 1912 from the library of the Collegio Romano in Rome; much less did I know that the undeciphered code it is written in has been a famous puzzle of modern cryptography and that Bradley Hauer and Grzegorz Kondrak, two computer scientists from the University of Alberta, announced in a 2016 paper that its encoded language is Hebrew. All of this I’ve learned from the media, which recently came across Hauer and Kondrak’s paper, published in Transactions of the Association for Computational Linguistics, and turned it into a news item.

The parchment of the 240-page manuscript acquired by Voynich, so I’ve also learned, has been dated by radiocarbon testing to sometime between 1404 and 1438. The manuscript’s text, which reads from left to right, is in an unfamiliar alphabet and accompanied by numerous illustrations—some of different plants and herbal preparations, some of astronomical charts, and some, most intriguingly, of nude or semi-nude female figures bathing in pools or tubs and linked to each other by pipe-like conduits.

All of this suggests a work on medieval medicine, astrology, and possibly alchemy whose exact nature cannot be determined and whose author is as unknown as was his motive for encoding it. Was it to protect secrets of which he believed himself to be the sole possessor? To experiment, or amuse himself, with the invention of a secret alphabet and perhaps a secret language? To create an esoteric document that could be sold to the highest bidder as a rare treasure of arcane knowledge? For hundreds of years, this, too, has been anyone’s guess.

And guesses have not been lacking. The earliest on record is attributed to the Holy Roman Emperor Rudolph II (reigned 1576-1612), who ruled his empire from Prague. The manuscript’s owner in the mid-17th century was Jan Marek Marci, the rector of Prague’s Charles University. Marci was informed by Raphael Mnishovsky, the Czech-language tutor of Rudolph’s successor Ferdinand III, that Rudolph had once owned the manuscript and believed its author to be the 13th-century English philosopher, linguist, and alchemist Roger Bacon. This was related by Marci in a letter written in 1665 to the renowned scholar and linguist Athanasius Kircher (1602-1680), whose help he requested in deciphering the manuscript’s text. Whether Kircher, who knew some Hebrew and had spent years trying to decode Egyptian hieroglyphics, complied with Marci’s request is unknown. He did not in any case offer a solution.

Roger Bacon lived over a century too soon to have written the Voynich manuscript, of whose ownership prior to Rudolph nothing is known. Its many proposed authors have also included the English mathematician and astrologer John Dee (1527-1608), who allegedly sold it to Rudolph as a work of Bacon’s; Dee’s associate, the spiritualist Edward Kelley; the Italian engineer, inventor, and cryptographer Giovanni Fontana (1395-1455); the Italian architect Antonio Averlino (1400-1469); the English herbalist and astrologer Anthony Ascham (1517-1559); Mnishovsky himself; and still others. Among the world’s languages held to have been encoded in the document have been Latin, Italian, Flemish, Ukrainian, Hebrew, Arabic, Chinese, and Tibetan. Another theory holds that it encodes no language at all and was written in either an imaginary tongue or pure gibberish.


Now, along have come Hauer and Kondrak and sought, on the basis of computer algorithms, to make the case for Hebrew. I can’t say I have understood everything in their paper, “Decoding Anagrammed Texts Written in an Unknown Language and Script,” but the gist of it is not too difficult to follow. Three basic assumptions underlie it. The first is that the language encrypted in the Voynich manuscript is a real one, to the letters of whose alphabet the manuscript’s characters correspond. The second is that these characters may possibly be limited to consonants, as are those of languages like Hebrew and Arabic. And the third is that the words written in them may sometimes be anagrams, as is the case with some codes. (An anagram is a word whose letters have been scrambled as in, say, “balthape” in place of “alphabet.”)

Guided by these assumptions, Hauer and Kondrak took the text of the UN’s Universal Declaration of Human Rights in 380 different languages. They then analyzed each version for such things as the frequency of different letters, words, and word lengths; the statistical likelihood of certain letters being found in given positions within words; the probability of such letters being preceded or followed by particular other letters; and the possibility of extracting, by “trial decipherment,” coherent utterances from them. Finally, these computations were matched against similar ones made for the Voynich manuscript in order to determine which language correlated with it best.
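For readers who want to see what a letter-frequency comparison can look like in practice, here is a minimal sketch in Python. It is not Hauer and Kondrak's code. It assumes, for illustration only, that each text is reduced to a sorted list of relative symbol frequencies and that two such lists are compared by summing their absolute differences; the function names and the choice of distance measure are my own. Sorting the frequencies means the comparison does not depend on knowing which unknown character stands for which letter.

```python
from collections import Counter


def frequency_profile(text):
    """Relative frequencies of the non-space symbols, most common first."""
    symbols = [ch for ch in text if not ch.isspace()]
    counts = Counter(symbols)
    total = len(symbols)
    return sorted((n / total for n in counts.values()), reverse=True)


def profile_distance(p, q):
    """Sum of absolute differences between two profiles (an assumed measure)."""
    size = max(len(p), len(q))
    # Pad the shorter profile with zeros so alphabets of different sizes compare.
    p = p + [0.0] * (size - len(p))
    q = q + [0.0] * (size - len(q))
    return sum(abs(a - b) for a, b in zip(p, q))


def closest_language(unknown_text, samples):
    """Name of the sample text whose profile is nearest to the unknown text."""
    target = frequency_profile(unknown_text)
    return min(
        samples,
        key=lambda name: profile_distance(target, frequency_profile(samples[name])),
    )
```

A simple letter-for-letter substitution leaves the profile unchanged, which is why a measure of this kind can be applied to a text in an unknown script at all. It also shows how little such a measure proves: any two languages with similarly shaped frequency curves will look alike.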

The results were, to say the least, curious. When using the letter-frequency method, the language Hauer and Kondrak found most similar to the Voynich manuscript’s was Mazatec, a member of the Popolocan family of southern Mexico; Hebrew was not a top contender, although Mozarabic (the mix of Arabic and Romance spoken by Christians in medieval Spain), Italian, and Ladino all were. When, on the other hand, Hauer and Kondrak resorted to the positional method—called by them “the decomposition pattern”—Hebrew beat all 379 other languages handily. And it topped the list again, together with Esperanto, in trial decipherment. Hauer and Kondrak were even able, so they claimed, “after a couple of spelling corrections,” to find a meaningful Hebrew statement from the manuscript’s opening sentence. This was ועשה לה הכהן איש אליו לביתו ועלי אנשיו המצות, rendered for them into English by Google Translate as “She made recommendations to the priest, man of the house, and me and people.”


At this point, it must be said, Hauer and Kondrak’s paper descends into silliness. Quite apart from the unlikelihood of even the most esoteric of manuscripts beginning in such a manner, one can only compliment Google Translate on its ingenuity. She made recommendations to the priest, man of the house, and me and people? Even after “spelling corrections,” the Hebrew words in question mean no such thing. In fact, they mean nothing at all. Translating them without Google’s finessing, one comes up with something like “And he made her the priest each man to himself to his house and on me his people the commandments.” If this was the winning entry in the trial-decipherment round of competition, one can only imagine its rivals.

Moreover, it is hardly surprising that Hebrew should have yielded more actual words than other languages. By designing their algorithms for vowellessness and anagrams, Hauer and Kondrak tilted the playing field in favor of Semitic languages. If one takes, for example, a three-letter English word like “cat,” one can derive only one other word, “act,” by scrambling it—and many three-letter English words yield nothing at all when rearranged. But with a three-letter Hebrew word, the results will be quite different. By adding different vowels to the consonants yod-lamed-daled, for example, one can get yeled, a male child or boy; yalad, “he gave birth”; yeyleyd, “he will give birth”; l’yad, “next to”; and d’li, a bucket. This gives Hebrew a significant competitive advantage, because it will present the cryptographer with many more words to work with, especially if he is willing to take liberties with them like Google Translate.
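The size of this advantage can be illustrated with a small sketch in Python. It is not the authors' algorithm, and it uses English as a stand-in: the tiny word list is hypothetical, and dropping the letters a, e, i, o, u is only a rough imitation of a consonantal script like Hebrew's. The point is simply to count how many dictionary words become acceptable readings of one scrambled string once vowels are ignored.

```python
from collections import defaultdict

VOWELS = set("aeiou")

# Hypothetical miniature dictionary, chosen only for illustration.
WORDS = ["cat", "act", "coat", "cute", "acute", "taco", "dog", "god", "good"]


def anagram_key(word):
    """All letters, sorted: words sharing this key are anagrams of each other."""
    return "".join(sorted(word))


def consonant_anagram_key(word):
    """Consonants only, sorted: vowels are ignored before unscrambling."""
    return "".join(sorted(ch for ch in word if ch not in VOWELS))


def build_index(words, key):
    """Group the words by the key function given."""
    index = defaultdict(set)
    for word in words:
        index[key(word)].add(word)
    return index


full_index = build_index(WORDS, anagram_key)
consonant_index = build_index(WORDS, consonant_anagram_key)

# Readings of "cat" when every letter must be accounted for.
with_vowels = full_index[anagram_key("cat")]
# Readings of "cat" when only the consonants c and t must match.
without_vowels = consonant_index[consonant_anagram_key("cat")]
```

With this list, `with_vowels` holds two words ("cat" and "act"), while `without_vowels` holds six, since "coat," "cute," "acute," and "taco" also reduce to the consonants c and t. The more readings each string allows, the easier it is to assemble something that looks like a sentence.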


This is not to say that Hebrew is an entirely illogical choice for the Voynich manuscript’s encrypted language. Jews throughout the Middle Ages had a keen interest in medicine and astrology, as well as, to a lesser extent, in alchemy, and in the 15th century, with the discovery of the Kabbalah by Christian Hebraists in Renaissance Italy, Hebrew acquired a reputation as a repository of secret wisdom. It is not inconceivable, therefore, that a Jew or Christian Hebraist resorted to encrypted Hebrew in composing a lengthy work on such subjects, whether to keep it from others or, on the contrary, to whet an appetite for reading it that could be satisfied only at a high price.

And yet, though Renaissance Italy is indeed the most probable locale for the Voynich manuscript’s composition, the manuscript is far more likely to have been written, if not in an invented language of its own, then in an encrypted Latin, Italian, or other European tongue. This is especially so because its radiocarbon dating, if accurate, places it a half-century too early to make Hebrew a leading candidate.

It was only in the 1480s, with the Hebraist activities of Pico della Mirandola (1463-1494) and his circle, that Hebrew began to gain recognition among educated Italians, and there is little reason to believe that either a Jew or a Christian would have encoded it in a manuscript before then. No Christian would have done so because no Christian knowing even a small amount of Hebrew seems to have existed before Pico. (There were not many of them in his age, either.) No Jew would have done so because the manuscript, even if authored by a Jew, shows signs of Renaissance influence (in, for example, its nude bathing women) that had not penetrated the Italian Jewish community at an earlier date.

The Voynich manuscript remains a mystery. About all that can definitely be said about it is that it is not written in Esperanto.
