Do All Human Languages Derive From A Single Source?

Some paleolinguists have floated the idea of an original human language they call “Proto-Sapiens.” Is that what our ancestors were speaking when they built the Tower of Babel?

October 21, 2020 | Philologos
About the author: Philologos, the renowned Jewish-language columnist, appears twice a month in Mosaic.

A diorama showing ancient cavemen inside the National Museum of Mongolian History in Ulaanbaatar, Mongolia. Wikimedia.

Got a question for Philologos? Ask him directly at [email protected].

“And all the earth was of one language and one speech. . . . And they said, ‘Come, let us build us a city and a tower whose top may reach unto heaven.’. . . And the Lord said, ‘Behold the people is one and this they begin to do; and now nothing will be restrained from them, which they have imagined to do. Come, let us go down and confound their language, that they may not understand one another’s speech.’”

Thus we read in this week’s Torah portion, the parashah of Noah. Humanity speaks many different languages, it tells us, because God feared that otherwise mankind would pool its resources to reach the heavens and, presumably, contest His rule.

“Many different languages” is an understatement. There are an estimated 6,500 of them in the world today (although quite a few are in danger of extinction), and if we take into account the past existence of numerous no-longer-extant languages, it is probable that many tens of thousands have been spoken in human history. Yet was there ever really “one language and one speech” from which they all came? Or, to rephrase the question: is human speech monogenetic, having its source, as the Bible relates, in a single language, the original one spoken by Homo sapiens, or is it polygenetic and descended from more than one, and perhaps many, primeval languages that evolved independently in different times and places in the early history of mankind?

Given that current arguments among linguists, prehistorians, paleontologists, and cognitive scientists have placed the earliest beginnings of human speech at anywhere from 300,000 to 27 million years ago, this would seem an impossible question to try to answer. In societies lacking widespread literacy (which is to say, all human societies before modern times), the standardizing effects of which slow down rates of linguistic change, languages evolve extremely rapidly. It took only a few hundred years for classical Latin to morph into medieval French, no 8th-century speaker of which would have understood anything said by Julius Caesar. What the languages spoken on earth were like 50,000 years ago, let alone what a single ancestor-tongue might have been like 250,000 to 26,950,000 years before them, is not even faintly imaginable.

And yet linguists not only can seek to reconstruct unknown languages, they do it all the time. Although such endeavors are extremely complicated and usually highly controversial, the theory behind them is elementary. Basically, it consists of taking a series of diverging linguistic lines and following them back in time to a supposed point of common origin at which they converge.

Think of French and Latin. If we had no written Latin literature or records—no knowledge of Latin at all—it would be impossible to determine much about it on the basis of French alone. There would be no way, for example, of knowing from the French word for water, eau, what the Latin word was. But there are other contemporary languages—Italian, Spanish, Portuguese, Catalan, and Romanian, to name the most prominent—whose vocabulary and grammar make it obvious that they descend from the same parent language that French does. Once we know that water is acqua in Italian, agua in Spanish, água in Portuguese, aigua in Catalan, and apă in Romanian, we can reasonably deduce that it must have been something like aqua or agua in Latin (in fact, it was the former), and that it underwent sound changes in French and Romanian whose explanations need to be looked for.

This is a simple example involving a single word. If linguists had to reconstruct the entirety of Latin—its phonetics, vocabulary, verbal system, noun cases, syntax, etc.—on the basis of its daughter languages, the task would be extraordinarily difficult and would involve many guesses and mistakes. Nevertheless, a rough approximation of it might be arrived at.

Now let’s go beyond the Latin or Romance family of languages to a larger grouping, the existence of which was first recognized in the 19th century. Here are some more words for water: German, Wasser; ancient Greek, hydor; Sanskrit, udra; Hittite (a long extinct language once spoken in Turkey), watar; Tocharian (another extinct language spoken in central Asia), war. Taken as a group, all these seem to be cognates, although they also seem to have nothing to do with Latin aqua. But wait a minute: Latin has another word, unda, meaning “wave” (whence French onde, Italian onda, Romanian undă), which is akin to Lithuanian vanduo, “water,” which is in turn like Russian voda, which brings us back to the sound of English “water.” All of these languages belong to what is known as the Indo-European family, whose members, even before the discovery of the Americas, were spread over much of the globe, from Ireland to the Indian subcontinent.

There is no written record, as there is for Latin, of the parent language, known as Proto-Indo-European, from which the Indo-European languages all descended. Yet, thought to have been spoken in Anatolia or the steppes of southern Russia some 5,000 to 6,000 years ago, it can be tentatively reconstructed from its offspring just as Latin could have been—and so indeed it has been by linguists, who agree on most of its essentials. The Proto-Indo-European word for water, they think, was wodr, the plural of which was odens.

Proto-Indo-European, whose existence no one today doubts, was, as we have said, a 19th-century idea. It took a number of 20th-century linguists to go a step further and ask: if there is a family of languages that descends from Latin, and a larger family to which Latin belongs that descends from Proto-Indo-European, can we go back even further in time and identify a still larger family to which Proto-Indo-European belonged? And if so, what was the parent language in this case?

The first to propose that there was such a language, which he labeled Nostratic (from Latin nostra, “ours”), was the Danish linguist Holger Pedersen (1867-1953). In a 1903 article in which he pointed out affinities between the languages of the Indo-European and Turkic families, Pedersen wrote: “Very many language stocks in Asia are without doubt related to the Indo-Germanic [i.e., Indo-European] one; this is perhaps valid for all those languages which have been characterized as Ural-Altaic.” The Ural-Altaic family is a huge one (whose unity is itself controversial) that includes, besides the Turkic tongues, Hungarian, Finnish, Mongolian, and various Siberian languages that spill over into North America in the form of Eskimo-Aleut. Moreover, Pedersen continued, “the Nostratic languages occupy not only a very large area in Europe and Asia but also extend to within Africa; for the Semitic-Hamitic languages are in my view without doubt Nostratic.” The Semitic-Hamitic languages, more commonly known today as Afro-Asiatic, include Hebrew, Arabic, Amharic (the main language of Ethiopia), Berber, and ancient Egyptian.

Subsequently, the Nostratic hypothesis was taken up and developed at great length by other linguists, most notably the Russians Vladislav Illich-Svitych, Sergei Starostin, and Aharon Dolgopolsky (who later settled in Israel and taught at Haifa University) and the Americans Joseph Greenberg and Allan Bomhard. Yet not only is the validity of Nostratic widely questioned on various methodological grounds, but the Nostraticists disagree among themselves on which of the world’s language families to consider Nostratic and which not to. The Kartvelian languages of the Caucasus, the Dravidian languages of southern India, the Cushitic languages of northeast Africa, and even the Na-Dene languages of North America have been embraced by some Nostraticists and rejected by others.

If there ever was such a thing as Nostratic, in which the conjectured word for water is wete, it may have been spoken some 20,000 years ago. But why stop there? Why not look for an even larger and earlier family, of which Nostratic was itself a descendant, that may have been the progenitor of most, or even all, of the languages ever spoken? Why not, in other words, take up the gauntlet of monogenesis and look for the original human language that paleolinguists like the Americans Merritt Ruhlen and John Bengtson have called “Proto-Human” or “Proto-Sapiens”? For “water,” Ruhlen has already hunted down, to take but a few of the examples cited by him, Nyimang kwe and Mangbeto éguo from Africa; Ainu wakka from Japan; Quileute kwaya and Shuswap kwo from North America; Quechua yaku and Wanana ko from South America; even Yareba ogo and Awyu okho from Australia.

One can put all of this down, of course, to sheer coincidence. Examine enough of the world’s 6,500 languages and their predecessors and you’re bound to come up with some that have words for water (and for everything else) with the sounds that you’re looking for. The chances of reliably reconstructing what the workers on the Tower of Babel called for when they were thirsty are not great. If I were God, I wouldn’t worry about everyone learning to speak Proto-Sapiens again. English is enough of a problem.
