When Technology and Translation Come Together: Coding the Stories for Curious Minds to Discover

Since 84000 launched our vast Reading Room more than thirteen years ago, the multi-layered database of Buddhist terms, places, people, and other information found in the published texts of the Buddhist Kangyur and Tengyur has continued to grow. It now contains more than four million nodes, or data points, across 350 translations, an average of 1,200 nodes per text.

“Buddhist texts are rich in the sense that each text encodes significant knowledge. For our technical team, the markup process is about illuminating the knowledge that’s embedded in the text. When we identify a phrase, it expresses the three-dimensionality of the text,” said Jeff Wallman, 84000 director of technology. He is referring to the contextual knowledge behind the text, which is revealed by linking keywords to a carefully constructed glossary made up of all the data points. The result is an interwoven knowledge base in which users can easily discover the meaning and context of terms within the texts.

“The extensive encoding of the text is one of the many ways we create authentic translations; we are accurately capturing the translations in all their complexity and beauty,” Wallman said.

Celso Wilkinson, the 84000 publications manager who leads a team of four digital editors, described his work as encoding knowledge. The translators first identify which terms require further explanation; the technical team then encodes every term and links the pieces of data together.

“We also experiment with different ways to show the root text and commentaries. We strive to make the entire corpus organized and accessible for research and exploration,” Celso explained. “We then give the reader options for viewing it directly online, downloading it as a PDF, or reading it on their tablet as an EPUB version.”

A member of 84000 since 2017, Celso also works on the 84000 translation memory project, which gives translators working on a new version of a similar publication a way to review the source text side by side with previous translations, sentence by sentence. Also a translator and practitioner, Celso is continually amazed at the vastness of the Kangyur and Tengyur collections.

“I take joy in seeing how the material in the Kangyur can be surprising and inspiring in unexpected ways,” said Celso. He is most inspired by the narratives found within the texts. “Stories of the Buddha’s life and his students are sprinkled like hidden gems throughout the Kangyur and you never know when you’re going to find some interesting detail.”


Posted: 26 Jun 2024