Building A Digital Rosetta Stone
February 12, 2013

Ancient Languages Reconstructed With Digital Rosetta Stone

Michael Harper for — Your Universe Online

A team of researchers has discovered and demonstrated what it is in the Canadian linguistic makeup that makes Canadians say "aboot" instead of "about."

More than just discovering little linguistic tics, the researchers from the University of British Columbia and the University of California, Berkeley, have created a sort of digital Rosetta Stone, capable of reconstructing and reproducing protolanguages. These ancient tongues are the ancestral languages from which modern languages developed.

While the researchers say this new computer system can never replace the work of a talented human linguist, it can be used as an aid: a digital jumping-off point to help linguists begin the often tedious task of piecing together old and uncommon languages. Their work will be published in the Proceedings of the National Academy of Sciences.

In a press statement, lead author and University of British Columbia Assistant Professor of Statistics Alexandre Bouchard-Côté said, “We're hopeful our tool will revolutionize historical linguistics much the same way that statistical analysis and computer power revolutionized the study of evolutionary biology.

“And while our system won't replace the nuanced work of skilled linguists, it could prove valuable by enabling them to increase the number of modern languages they use as the basis for their reconstructions.”

Included with this new computer system is an algorithm capable of estimating how a word may sound in future generations. As part of their work, the researchers also programmed the system with the gradual changes that words undergo over passing generations. The Canadian knack for saying “aboot” rather than “about” (also known as the “Canadian Shift”) is an example of how pronunciation shifts over time.

“It happens in all words with a similar sound,” said Bouchard-Côté, speaking to

By programming these kinds of shifts into the system, the algorithm can suggest how these languages could one day evolve, explaining to linguists how different words could sound in the future.
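To make the idea of a regular shift concrete, here is a minimal Python sketch of one rule applied across a small word list. It is an illustration only, not the researchers' algorithm: the rule, the `apply_shift` helper and the word list are all hypothetical, and ordinary spelling stands in for actual sounds.

```python
import re

# Hypothetical toy rule: "ou" becomes "oo" when followed by t, s or k.
# Spelling is used as a rough stand-in for sound; this is not phonetic data.
RULE = (re.compile(r"ou(?=[tsk])"), "oo")


def apply_shift(word, rule=RULE):
    """Apply one regular sound-change rule to a single word."""
    pattern, replacement = rule
    return pattern.sub(replacement, word)


# The same rule is applied to every word, so all words with the
# matching sound pattern change together and the rest are untouched.
words = ["about", "house", "loud", "out"]
shifted = [apply_shift(w) for w in words]
print(shifted)
```

Because the rule is tied to the sound pattern rather than to individual words, "about", "house" and "out" all change while "loud" does not, which mirrors the point that a shift affects every word with a similar sound.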

The system goes deep into the nuts and bolts of language, analyzing not only the basic sounds but the phonetic units that make up each sound as well. With this much information, Bouchard-Côté says, it can operate with greater precision and at a greater scale than previous systems. The researchers claim it is about 85% accurate in recreating the “painstaking manual reconstructions” normally conducted by trained linguists.
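The article does not say how the 85% figure is calculated. One common way to compare an automated reconstruction with a manual one is edit distance, the number of single-character insertions, deletions and substitutions needed to turn one form into the other. The Python sketch below assumes that measure; the `edit_distance` function and the word pairs are invented placeholders, not data from the study.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


# Invented (automated, manual) pairs, used only to show the bookkeeping.
pairs = [("taka", "taka"), ("pilu", "piru"), ("sona", "sunak")]
distances = [edit_distance(auto, manual) for auto, manual in pairs]

# Count how many automated forms are within one edit of the manual form.
close = sum(d <= 1 for d in distances)
print(distances, close, len(pairs))
```

A score of this kind rewards near misses as well as exact matches, which suits reconstruction work where a single differing sound is a small error.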

To test the system, the researchers reconstructed a set of protolanguages from more than 142,000 word forms in 637 Austronesian languages. These languages include Fijian, Hawaiian and Tongan, and they pose great challenges to linguists. Protolanguages rarely leave written records of their words, which makes verifying the reconstructions difficult. However, reconstructions can sometimes be checked against literary histories or ancient texts about the languages. One notable exception to this rule, notes the press release, is Latin, which was well documented and served as the foundation for many modern languages, such as French, Italian, Portuguese and Spanish.