Google's chatbot Gemini will temporarily stop generating images of people after the AI tool was found to produce historically inaccurate images. To counter stereotyping in generative AI, Gemini was trained to prioritize diversity, but after complaints from users, the model appeared to have gone too far in this regard.
Gemini's historically inaccurate image generations were shared en masse on X. For example, when asked to generate an "1800s U.S. senator," the AI tool produced images of Native American and Black women, even though the first female senator, who took office in 1922, was a white woman. Other historical prompts yielded an Asian woman as a German soldier in 1940, Black Vikings, and a light-skinned female pope. By emphasizing diversity in the AI model, Google hoped to provide a realistic reflection of the world, but in practice this did not work out well.
Taking the image generation tool offline is a painful moment for Google. The company has been trying to catch up since the rapid rise of ChatGPT, from competitor OpenAI. In response to ChatGPT, Google introduced the AI assistant Bard last year. Bard blundered almost immediately, answering an example question incorrectly.
In February, Google tried to make a positive impression by renaming Bard to Gemini and introducing an updated 1.5 model with new features. Perhaps a bit too hastily, because on Thursday Google was forced to take one of those features, the image generator, offline again. Google responded on X with a short statement: "Gemini's AI image generation generates a wide range of people. That's generally a good thing, because people all over the world use it. In this case it misses the mark."
Diversity issues
It is not known how the errors in Gemini's image generation occurred. Stereotyping and a lack of diversity are common problems in image generation. Last year the American newspaper The Washington Post showed that prompts such as "a productive person" resulted in images of only white men, while a prompt such as "someone in social services" showed only Black people. A prompt is the input you provide for the chatbot to respond to.
According to Margaret Mitchell, a former leader of Google's Ethical AI team, the problem lies in the way diversity issues are addressed. Currently, measures for a more diverse model are often applied only afterwards, after the AI model has been trained. One example is giving priority to darker skin tones. Another example of such a post-hoc solution is adding an ethnic diversity term to the user's prompt after the fact, Mitchell explained to The Washington Post. In that case, a prompt like "portrait of a chef" might change to "portrait of a chef who is Indigenous." These added terms are often pasted in at random.
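The post-hoc prompt augmentation Mitchell describes can be sketched roughly as follows. This is an illustrative assumption, not Google's actual implementation: the term list and function name are invented for the example, but the mechanism (randomly pasting a diversity term onto the user's prompt before it reaches the image model) matches her description.

```python
import random

# Hypothetical diversity terms appended after the fact; the real system's
# list and selection logic are not public.
DIVERSITY_TERMS = ["who is Indigenous", "who is Black", "who is Asian"]

def augment_prompt(prompt: str) -> str:
    """Sketch of a post-hoc fix: paste a randomly chosen diversity
    term onto the user's prompt before image generation."""
    return f"{prompt} {random.choice(DIVERSITY_TERMS)}"

print(augment_prompt("portrait of a chef"))
# e.g. "portrait of a chef who is Indigenous"
```

Because the term is chosen at random and appended regardless of context, a historical prompt like "1800s U.S. senator" gets the same treatment, which is one way such a mechanism could produce the anachronistic results seen with Gemini.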
According to Mitchell, such post-hoc solutions are "the easy and cheap way" to get diversity into chatbots. The real diversity problem lies in the data on which the AI models are trained. "Instead of focusing on these after-the-fact solutions, we should focus on the data. We don't need to have racist systems if we curate the data well from the start."
AI image tools are typically trained on data collected from the internet. That training data is largely limited to the United States and Europe, which yields a Western perspective on the world, Safiya Noble, co-founder of the Center for Critical Internet Inquiry, a research center focused on critical study of the internet and digital technologies, explained to The Washington Post.
To obtain a model without stereotypes, more diversity is often added to the model after the fact. According to Noble, this approach can lead to overcorrection and incorrect image generation, as was seen with Gemini. Google says it is working to resolve the problems so it can continue its pursuit of OpenAI.