Ugaritic
July 16, 2010
One book that I have on my bookshelf is
Lost Languages by Andrew Robinson (McGraw-Hill, 2002). I bought my copy, a hardcover first edition, at a used bookstore for a great price. It's a shame that used bookstores will not exist in just a few years, and that
brick-and-mortar bookstores will not exist just a few years thereafter. What will really hurt is when paper books have disappeared. A lot of information is becoming more fragile, since it's now just
magnetic blips,
microscopic optical smudges on plastic, or
fleeting electrons in a piece of silicon. Perhaps all traces of our language and scholarship will have disappeared just a few centuries hence.
One chapter in the Lost Languages book is devoted to the
Phaistos Disk, a fired clay disk about six inches in diameter that's completely covered on both sides with stamped symbols arrayed in a spiral from center to edge. The disk is about 3,500 years old, and it was found in the
Minoan palace of
Phaistos. The language is unknown, although there are similarities to
Linear A and
Linear B. No other examples of the Phaistos script have been found, so it's unlikely that a translation can be made. There's no point of reference. Not so for
Ugaritic, a language that has been automatically deciphered in a computer-assisted comparison to Hebrew.
Both sides of the Phaistos Disk (photo by Maksim).
No, a computer wasn't the first to decipher this dead language. Ugaritic inscriptions were discovered in 1928, and because of this language's similarity to Hebrew, it was manually deciphered in 1932 using common techniques. The manual decipherment was important; otherwise, the computer people wouldn't know whether their program was working. The key to language decipherment is finding
cognates. Cognates are words in different languages that have a common etymological origin. For example, the English "silver" and German "Silber," or the Latin "argentum," French "argent," and Italian "argento."
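As a rough illustration of why cognates are useful, here is a minimal Python sketch that scores how alike two spellings are, using the standard library's difflib.SequenceMatcher. The function name `similarity` and the choice of a plain string-similarity ratio are my own assumptions for illustration. Real cognate identification depends on regular sound correspondences between languages, and look-alike words can be false cognates, so this is only a crude proxy.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 spelling-similarity score (hypothetical helper).

    This is a crude stand-in for cognate detection: it compares letters
    only, and knows nothing about etymology or sound changes.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


if __name__ == "__main__":
    # Word pairs taken from the examples in the text.
    pairs = [
        ("silver", "Silber"),
        ("argentum", "argent"),
        ("argentum", "argento"),
        ("silver", "argent"),  # same meaning, different origin
    ]
    for a, b in pairs:
        print(f"{a:>9} vs {b:<8} similarity = {similarity(a, b):.2f}")
```

The cognate pairs score much higher than the silver/argent pair, which is the kind of signal a decipherment program can exploit when a known relative of the lost language exists.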
A paper describing the computer approach, "A Statistical Model for Lost Language Decipherment," by Benjamin Snyder and Regina Barzilay of MIT, along with Kevin Knight of USC, will be presented at the Annual Meeting of the Association for Computational Linguistics, July 11-16, 2010.[1-2] The authors used a computer model of
Bayesian inference to statistically compare Ugaritic with Hebrew. As they write in their abstract,
"When applied to the ancient Semitic language Ugaritic, the model correctly maps 29 of 30 letters to their Hebrew counterparts, and deduces the correct Hebrew cognate for 60% of the Ugaritic words which have cognates in Hebrew."
Surprisingly, automated computer analysis is not common in attempts to decipher ancient languages. Linguists rely mostly on their intuition, but Snyder, et al. have demonstrated that much of this intuition can be coded into a computer program, especially when a relationship is suspected between the lost language and a known language. A 1999 attempt by others using a
Hidden Markov Model character substitution cipher correctly translated only 29% of the cognates. Unfortunately, only one-third of Ugaritic words have Hebrew cognates, so even 60% is not that good: it amounts to about 20% of all Ugaritic words. It reminds me of
The Dukes of Hazzard television episode in which Boss Hogg offers Sheriff Rosco P. Coltrane a cut of the profits that was "ten percent of ten percent."
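To make the letter-mapping task concrete, here is a toy Python sketch of treating decipherment as a substitution cipher. It is not the Bayesian model of Snyder, et al., nor the Hidden Markov Model approach. It simply tries every letter mapping for a four-letter alphabet and keeps the one that turns the most "unknown" words into words of a known lexicon. The alphabets, the lexicon, the word list, and the function name `decipher` are all invented for illustration.

```python
from itertools import permutations


def decipher(cipher_words, lexicon, cipher_alphabet, known_alphabet):
    """Brute-force the letter mapping that yields the most lexicon hits.

    Hypothetical toy: feasible only for tiny alphabets, since the number
    of candidate mappings grows factorially with alphabet size.
    """
    best_mapping, best_hits = None, -1
    for perm in permutations(known_alphabet, len(cipher_alphabet)):
        mapping = dict(zip(cipher_alphabet, perm))
        # Count how many cipher words become known words under this mapping.
        hits = sum(
            "".join(mapping[ch] for ch in word) in lexicon
            for word in cipher_words
        )
        if hits > best_hits:
            best_mapping, best_hits = mapping, hits
    return best_mapping, best_hits


# Invented toy data: a "known" lexicon and four words in an "unknown" script.
LEXICON = {"ant", "eat", "net", "ten", "tea", "neat", "ante"}
CIPHER_WORDS = ["WYZ", "XWZ", "YXZ", "YXWZ"]

if __name__ == "__main__":
    mapping, hits = decipher(CIPHER_WORDS, LEXICON, "WXYZ", "aent")
    print(mapping, hits)
```

Exhaustive search is hopeless for a real script of about thirty letters, which is one reason statistical models are used instead of enumeration.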
The authors point out that Google's translation program works for only 57 languages (The
Heinz subset?). Using principal features of their approach might allow an extension to thousands of languages. Personally, I'd like a translation of the
Voynich Manuscript.[3]
References:
1. Benjamin Snyder, Regina Barzilay and Kevin Knight, "A Statistical Model for Lost Language Decipherment," To Appear, Annual Meeting of the Association for Computational Linguistics, July 11-16, 2010.
2. Larry Hardesty, "Computer automatically deciphers ancient language," MIT Press Release, June 30, 2010.
3. Voynich Manuscript Web Site.