The first time 24-year-old Omani student Mohd ElShami used an emoji was when he sent a smiley face — one with hearts for eyes — to his mom.
“She just got a smart phone, and I just got an iPhone, and she doesn’t read so easily. I wanted to say thank you to her for sending me food and I knew she would understand that symbol,” said ElShami, who spoke to BuzzFeed News by phone from Muscat, the capital of Oman, where he is studying to be an engineer. “For many of us in the Arab world we send emojis instead of text, even though I love the way the Arabic language appears online!”
ElShami said he had no idea that his country was part of a select group known as the Unicode Consortium, which establishes the code used for all online languages, ranging from Arabic and English to the relatively new language of emojis. Alongside global powerhouses like Google, Facebook, Microsoft and IBM, Oman might seem like an odd addition to the small but intense consortium, a nonprofit Silicon Valley-based group that began in 1988 as a way to unify computer code for alphabets and other scripts. But like the other players involved, Oman wanted a role in shaping how language would be preserved online.
“Oman became involved in Unicode just a couple years ago,” said Mark Davis, president and co-founder of the Unicode Consortium. “Among the things they were very interested in was extending and expanding the capabilities of the Arabic script, and in expanding their role in developing technology in their region.”
Straddling the northern tip of Africa and the southern tip of the Middle East, Oman has earned a reputation as the Switzerland of that region. It has served as the site of secret negotiations between the United States and Iran, as well as for backchannel communications between the long-feuding leaderships of Iran and Saudi Arabia. Flush with money from oil sales and as a shipping hub, Oman has poured resources into institutes dedicated to the preservation of the Arabic language and art.
As a member of the consortium, Oman takes part in discussions on that new dumpling emoji, and on how best to represent characters that can carry vastly different interpretations across languages.
Thomas Milo, a Dutch national with a longtime interest in Cyrillic and Arabic scripts, has represented Oman on the Unicode Consortium since early 2014. He says that while the sultanate of Oman has little interest in emojis, it is highly invested in the way the Arabic script, and particularly the unique fonts used in the Qur’an, is represented online.
A simple piece of code created for one language can often have broad implications for other languages, said Davis. As one of the founders of the consortium, he said, he could never have predicted the emergence of emojis as their own unique form of communication. (Hindsight, he said with a laugh, is 20/20.)
Each language has its own quirks, said Davis. Hebrew and Arabic both require bidirectional support, which allows them to be read from right to left. Arabic also required a special character that could join various characters together. For that purpose, Davis and his team invented the zero-width joiner, or ZWJ, character (pronounced “zwidge”).
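The joining character described above can be seen at work in a minimal Python sketch. It assumes the ZWJ is the invisible code point U+200D and uses it to request a single "family" emoji from three separate ones. The variable names are illustrative, and whether one combined glyph actually appears depends on the font and device doing the rendering.

```python
# Zero-width joiner (ZWJ), code point U+200D: invisible on its own.
ZWJ = "\u200D"

# Three ordinary emoji code points: man, woman, girl.
MAN, WOMAN, GIRL = "\U0001F468", "\U0001F469", "\U0001F467"

# Placing a ZWJ between them asks the renderer for one combined glyph.
family = ZWJ.join([MAN, WOMAN, GIRL])

# Underneath, the string still holds five code points:
# three emoji separated by two joiners.
code_points = [f"U+{ord(ch):04X}" for ch in family]
print(family, code_points)
```

A font without the combined glyph will typically fall back to showing the three emoji side by side, which is why the sequence stays readable on older devices.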
“The ZWJ character started as a mechanism to glue two separate characters together to form a ligature or force a cursive appearance,” said Davis, adding that it was first, and most often, used in Arabic scripts. “It is also now used for producing more complex emoji, as in different family groupings.”
But if Arabic had its challenges, the unique script of the Qur’an was its own minefield. Milo was already involved in developing font technology to render Arabic when an Omani government minister reached out to him regarding some work he had done on the Arabic script through DecoType, which Milo partners with in the Netherlands.
“The Korans that one finds on the web today either deviate from the prescribed spelling or consist of scanned images,” said Milo. “They deviate because not all Koranic Arabic characters are available in Unicode, so unpredictable workarounds are applied.”
While Arabic designers across the Middle East are currently working on creating their own unique typography, Milo’s work aims at standardizing a font that would allow for a fully searchable Qur’an.
“Unicode inadvertently allows that the same visual Koranic characters are encoded in different ways. Let me give you an English example, based on the ‘double you.’ The normal spelling for William is with W. Imagine that the W can be actually encoded as UU. But also as VV. That means that there are 3 ways of encoding William, without a perceivable difference for the reader. As a result, a reliable search is impossible. With Arabic script this happens widely,” said Milo. “What is at stake is to make sure Arabic script culture - for those languages that use Arabic script - is maintained.”
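The search problem Milo describes can be illustrated with a minimal Python sketch. It uses a real pair of Arabic encodings that Unicode defines as canonically equivalent: alef with hamza above as one code point (U+0623), or as a plain alef (U+0627) followed by a combining hamza above (U+0654). The helper name `search_key` is hypothetical, and normalization only covers equivalences Unicode already defines, so it is an analogy for the problem and would not by itself resolve the Koranic cases Milo refers to.

```python
import unicodedata

# Two encodings of the same visible letter, alef with hamza above.
precomposed = "\u0623"       # one code point
decomposed = "\u0627\u0654"  # plain alef + combining hamza above

def search_key(text: str) -> str:
    """Hypothetical helper: fold text to NFC form before comparing."""
    return unicodedata.normalize("NFC", text)

# A raw comparison treats the two spellings as different strings.
raw_match = precomposed == decomposed

# After normalization, both spellings collapse to the same code point.
normalized_match = search_key(precomposed) == search_key(decomposed)

print(raw_match, normalized_match)
```

A search tool that compares normalized keys on both the query and the text finds either spelling. Spellings that Unicode does not declare equivalent remain distinct, which is the gap Milo is pointing at.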
Next month the Unicode Consortium will hold its first-ever workshop in Oman. Milo said that at the workshop, which will be attended largely by students and regional technology experts, “Arabic functions as a metaphor representing scripts from various cultures.”
ElShami and his cousin, when reached by BuzzFeed News, said they hadn’t heard about the workshop but would be excited to attend.
“There is a lot of pride in the Arabic language, and it is important how it will look online for the world to see,” said ElShami. He also said he was interested in submitting proposals for emojis to represent Arabic culture and food, adding that for many of his older family members the characters had become their main language of communication.
“I would love to have an emoji for hummus, so that I could send it to my mom and request that,” said ElShami, whose mother is originally from Damascus. “The only problem is that it might look too much like a poop emoji… only Arabs might understand it!”
Thomas Milo partnered with the ministry, not an Omani cabinet minister's company. In addition, Milo is involved in developing font technology, not scripts.
Sheera Frenkel is a cybersecurity correspondent for BuzzFeed News based in San Francisco. She has reported from Israel, Egypt, Jordan and across the Middle East. Her secure PGP fingerprint is 4A53 A35C 06BE 5339 E9B6 D54E 73A6 0F6A E252 A50F