As anticipated, the “magic” news was not GPT-5, but the introduction of a new iteration of the GPT-4 language model for the OpenAI chatbot.
OpenAI has launched GPT-4o, a new version of the famous model that powers its main product, ChatGPT.
During a live announcement yesterday, Mira Murati, the company’s CTO, highlighted that the updated model is “noticeably faster” and explained that the “o” in the name stands for “omni”, indicating advances in multimodality and improved “skills in dealing with text, video and audio.”
OpenAI announced that GPT-4o’s features “will be rolled out iteratively,” but its text and image capabilities will be available immediately in ChatGPT and will be accessible to all users for free, with paid users enjoying “up to five times the capacity limits of free users.”
With greater responsiveness and the ability to handle images, foreign languages and emotion recognition, GPT-4o is designed for more fluid and personalized human-machine interactions, while “remembering” past conversations.
Scarlett, is that you?
Sam Altman, CEO of OpenAI, described GPT-4o as “natively multimodal”.
The new model is better at generating content and understanding commands through voice, text or images, and above all it brings new features to ChatGPT’s voice mode.
The chatbot can now act as a voice assistant, responding in real time and observing its surroundings: abilities that immediately recall the film “Her”, in which Scarlett Johansson lends her voice to an AI with similar traits.
In the past, the model’s voice mode was limited: it could respond to only one prompt at a time and worked only from what it could hear. It also had an average latency of 2.8 seconds (GPT-3.5) or 5.4 seconds (GPT-4), because it relied on separate models for audio transcription and speech synthesis.
GPT-4o integrates these processes, allowing for more efficient handling of inputs and outputs: it can respond with a delay of as little as 232 milliseconds between question and answer, which brings it considerably closer to human response times.
During a live demonstration, GPT-4o also showed that it can offer hints on math problems, analyze computer code and interpret emotions from facial expressions.
Despite minor hiccups, such as misinterpreting images and acting without being asked, GPT-4o showed cutting-edge potential.
In terms of safety, OpenAI has built preventive measures into the model’s design, including filtering training data and refining model behavior through post-training.
Additionally, the company worked with external experts to identify and mitigate risks associated with new features, such as voice output.
Delegation for a better world
Before yesterday’s launch, there were conflicting reports about what OpenAI would announce: an AI search engine to compete with Google, Perplexity integration, or even a new and improved model, GPT-5.
In any case, OpenAI acted strategically by releasing this news shortly before Google I/O, the Mountain View giant’s main conference, which is scheduled for today at 7pm and is expected to see the launch of various AI products from the Gemini team.
In a blog post after the live event, Altman reflected on OpenAI’s journey, acknowledging a shift in the company’s vision. While the initial goal was to “create all kinds of benefits for the world,” Altman indicated that the focus is now on making advanced AI models available to developers via paid APIs, allowing third parties to “use them to create all kinds of amazing things that we will all benefit from.”
GPT-4o features will be implemented directly into ChatGPT, with an alpha voice mode soon to be available to ChatGPT Plus subscribers.
In addition to GPT-4o, OpenAI announced improvements to the ChatGPT web interface and the launch of a desktop app for Mac, with a Windows version expected later this year.
Additionally, some features previously reserved for premium subscribers will now be available for free, including the opportunity to access the GPT Store to create and share custom chatbots.