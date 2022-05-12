Some 10 million people speak Quechua, but for a long time automatic translation of emails and text messages into the most widely used family of indigenous languages in the Americas was virtually impossible.

That changed on Wednesday, when Google added Quechua and other languages to its digital translation service.

The internet giant says new artificial intelligence technology is allowing it to vastly expand the repertoire of languages in Google Translate. The firm added 24 this week, including Quechua and other South American indigenous languages such as Guarani and Aymara. Google also added several widely spoken African and South Asian languages that have been absent from popular tech products.

“We looked at languages with very large underserved populations,” Isaac Caswell, a Google research scientist, told reporters.

The news, announced at the California company’s annual I/O tech show, could be celebrated in many corners of the world. But it will likely also draw criticism from those who have been frustrated by previous tech products that failed to pick up the nuances of their language or culture.

Quechua was the lingua franca of the Inca empire, which stretched from what is now southern Colombia to central Chile. Its status began to decline after the Spanish conquered Peru more than 400 years ago.

The inclusion of Quechua among the languages recognized by Google is a great victory for Quechua-language activists such as Luis Illaccanqui, a Peruvian who created the Qichwa 2.0 website, which includes dictionaries and resources for learning the language.

“It will help give Quechua and Spanish the same status,” said Illaccanqui, who was not involved in the Google project and whose last name means “you are lightning” in Quechua.

Illaccanqui said the translator will also help keep the language alive among a new generation of young people and adolescents “who speak Quechua and Spanish at the same time and are fascinated by social networks.”

Caswell called the news a “very big technological breakthrough” because until recently it was impossible to add languages if researchers couldn’t find a large enough collection of texts online (such as digital books, newspapers or messages spread on social networks) from which their AI systems could learn.

America’s tech giants don’t have a long track record of making their language technology work well outside of richer markets, a problem that has also made it harder for them to spot dangerous disinformation on their platforms. Until this week, Google Translate worked for European languages like Frisian, Maltese, Icelandic and Corsican, each with fewer than a million speakers, but not for East African languages like Oromo and Tigrinya, which have millions of speakers.

The new languages will be available this week. They won’t be understood by Google’s voice assistant yet, which will limit the service to text-to-text translation for now. The company said it is working on voice recognition and other capabilities, such as being able to translate signs by pointing the camera at them.

This will be important for languages that are largely oral, such as Quechua, especially in the medical field, because many Peruvian doctors and nurses who only speak Spanish work in rural areas and “cannot understand patients who speak Quechua mainly,” Illaccanqui pointed out.

“The next frontier or challenge is to work around speech,” said Peruvian Arturo Oncevay, a machine translation researcher at the University of Edinburgh and co-founder of a research coalition that seeks to improve language technology for indigenous languages in the Americas. “The native languages of the American continent are traditionally oral.”

In its announcement, Google warned that the quality of translations in newly included languages “is still a long way off” from that of other languages it already has, such as English, Spanish and German, and stressed that models “will make mistakes and they will exhibit their own biases.” But the company only adds languages if its AI systems meet a certain proficiency threshold, Caswell said.

“If there is a significant number of cases where it is seriously wrong, then we would not include it,” he added. “Even if 90% of the translations are perfect, but 10% are nonsense, that is a bit excessive for us.”

Google said that its products now include 133 languages. The most recent 24 are the largest batch added since the company added 16 in 2010. What made the expansion possible is what Google calls a “zero-shot” or “zero-resource” machine translation model, which learns to translate into another language without ever having seen an example of a translation.

Meta, the parent company of Instagram and Facebook, introduced a similar concept last year called Universal Speech Translator.

Google’s model works by training a “single gigantic AI neural model” on about 100 different languages for which there is a lot of data, and then applying what it learns to hundreds of other languages it doesn’t know, Caswell said.

“Imagine that you are a great polyglot and then you just start reading novels in another language; you can start to piece together what it might mean based on your general language skills,” he noted.

He said the new group ranges from smaller languages like Mizo, spoken in northeast India by about 800,000 people, to more widely spoken languages like Lingala, used by about 45 million people in central Africa.

More than 15 years ago, in 2006, Microsoft received some praise in South America for software that translated the company’s familiar menus and commands into Quechua. But that was before the current wave of AI advances in real-time translation.

Américo Mendoza Mori, a language researcher at Harvard University who speaks Quechua, said that Google’s attention to the language gives it much-needed visibility in places like Peru, where Quechua speakers still lack many public services. The survival of many of these languages “will depend on their use in digital contexts,” he said.

Another linguist, Roberto Zariquiey, said he was skeptical that Google could create an effective language revitalization tool for Quechua, Aymara or Guaraní without the close involvement of community groups in the region.

Languages are closely linked to lives, cultures, ethnic groups and political organizations, said Zariquiey, a linguist at the Pontifical Catholic University of Peru. This should be taken into account, he stated.