News Release

Exploring the past: Computational models shed new light on the evolution of prehistoric languages

Peer-Reviewed Publication

Linguistic Society of America

A new linguistic study sheds light on the nature of languages spoken before the advent of writing, using computational modeling to reconstruct the grammar of the 6,500-7,000-year-old Proto-Indo-European language, which is the ancestor of most languages of Eurasia, including English and Hindi. The model makes it possible to observe evolutionary trends in language over millennia. The article, “Reconstructing the evolution of Indo-European grammar,” authored by Gerd Carling (Lund University) and Chundra Cathcart (University of Zurich), will be published in the September 2021 issue of the scholarly journal Language. A link to the article may be found at:

In the article, Carling & Cathcart use a database of features from 125 different languages of the Indo-European family, including extinct languages such as Sanskrit and Latin. The features cover most of the differences that make the languages difficult to learn, such as differences in word order (“the girl throws the stone” in English or “caitheann an cailín an chloch,” literally “throws the girl the stone,” in Irish), gender (“the apple” in English or “der Apfel” in German), number of cases, number of forms of the verb, or whether languages have prepositions or postpositions (“to the house” in English but “ghar ko,” literally “house-to,” in Hindi). With the aid of methods adopted from computational biology, the authors use known grammars to reconstruct the grammars of unknown prehistoric periods.
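The general idea of inferring an unattested ancestral grammar from attested descendants can be illustrated with a small sketch. This release does not describe the authors' actual model, so the example below substitutes Fitch parsimony, a much simpler classic technique from computational biology, purely for illustration. The function name `fitch`, the family tree, the language labels A to E, and the feature values are all hypothetical and are not taken from the study's database.

```python
# Toy sketch of ancestral state reconstruction using Fitch parsimony.
# NOT the method of the study: the tree, the leaf labels, and the feature
# values below are hypothetical and chosen only to illustrate the idea.

def fitch(tree, leaf_states):
    """Return (candidate states at the root, minimum number of changes).

    tree: a leaf name (str) or a 2-tuple (left_subtree, right_subtree).
    leaf_states: dict mapping each leaf name to its observed feature value.
    """
    if isinstance(tree, str):
        # An attested language: its state is known and costs no changes.
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    shared = left_set & right_set
    if shared:
        # Both branches agree on at least one state: no change needed here.
        return shared, left_cost + right_cost
    # The branches disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical family tree of five attested languages.
tree = ((("A", "B"), "C"), ("D", "E"))

# Hypothetical values of one grammatical feature (adposition type).
adposition = {
    "A": "postposition",
    "B": "postposition",
    "C": "preposition",
    "D": "postposition",
    "E": "postposition",
}

root_states, changes = fitch(tree, adposition)
print(root_states, changes)
```

In this toy case the most economical explanation is that the common ancestor had postpositions and that a single change occurred on the branch leading to language C. A study of the kind described here would run such an inference over many features and languages, typically with a probabilistic model rather than simple parsimony.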

The reconstruction of Indo-European grammar has been the subject of lengthy discussion for over a century. In the 19th century, scholars held the view that the ancient written languages, such as Classical Greek, were most similar to the reconstructed Proto-Indo-European language. The discovery of the archaic but highly dissimilar Hittite language in the early 20th century shifted the focus: scholars came to believe instead that Proto-Indo-European had a structure more similar to that of non-Indo-European languages of Eurasia, such as Basque or the languages of the Caucasus region.

The study confirms that Proto-Indo-European was similar to Classical Greek and Sanskrit, supporting the theory of the 19th-century scholars. However, the study also provides new insights into the mechanisms of language change. Some features of the proto-language were very stable and remained dominant over time. Moreover, features of higher prominence and frequency were less likely to change.

Though this study focused on a single family (Indo-European), the methods used in the paper can be applied to many other language families to reconstruct earlier states of languages and to observe how language evolves over time. The model also forms a basis for predicting future changes in language evolution.

The Linguistic Society of America (LSA) publishes the peer-reviewed journal Language four times per year. The LSA is the largest national professional society representing the field of linguistics. Its mission is to advance the scientific study of language.
