OpenAI has announced that ChatGPT Advanced Voice Mode now supports real-time video analysis. The feature lets users point their device’s camera at their surroundings and ask about what it sees. It can also analyze content displayed on the device’s screen.
The feature was introduced in May during the launch of GPT-4o, an artificial intelligence (AI) model designed to process audio, text and images simultaneously. The model has become the underlying technology of ChatGPT and makes it easier to interpret the intent and intonation of queries, recognize objects, and solve mathematical problems. It also allows the bot to maintain more fluid and natural conversations.
How does the new Advanced Voice Mode with ChatGPT vision work?
To activate Advanced Voice Mode with vision, users must select the voice icon located in the ChatGPT query bar and click on the video camera button. The system will begin capturing video automatically. Users can then point their device’s camera at any object and ask questions about it by voice.
The OpenAI team gave a presentation on the feature’s capabilities. They showed the voice assistant a coffee-making kit and asked for instructions on how to use it. The AI responded in real time with precise directions and some additional recommendations.
The tool is also designed to examine information presented on the screen of a smartphone. ChatGPT can now identify elements of an image, analyze messages, explain configuration manuals, suggest solutions to math problems, and provide details on pre-installed programs. To use this function, simply select the “Share screen” option from the three-dot menu.
The features announced by OpenAI are similar to those planned for Google’s ‘Project Astra’ program. The project includes a series of AI-based conversational features designed to analyze video in real time. The initiative is in the testing phase and is available only to a small group of Android users.
Real-time video analysis in ChatGPT will be gradually rolled out to subscribers of the Plus, Pro and Team plans. The feature will be available in the iOS and Android versions of the chatbot, with restrictions in the European Union, Switzerland, Iceland, Norway and Liechtenstein.
Advanced Voice Mode with vision is part of a series of announcements with which the startup led by Sam Altman plans to close out the year. Among the new features is OpenAI o1, a powerful advanced reasoning model available for a monthly subscription of $200. In addition, the company has launched Canvas, a platform designed to facilitate writing and programming projects, and Sora, an innovative tool capable of generating hyper-realistic multimedia content from text prompts.