March 19, 2024

Companies like OpenAI and Midjourney build chatbots, image generators and other artificial intelligence tools that operate in the digital world. Now, a startup founded by three former OpenAI researchers is using the same development methods to build AI technology that can navigate the physical world.

Covariant, based in Emeryville, California, is creating ways for robots to pick up, move and sort items. Its goal is to help robots understand what is happening around them and decide what they should do next. The technology also gives robots a broad understanding of the English language, allowing people to talk to them as they would to ChatGPT.

The technology, still in development, is not perfect. But it is a sign that the AI systems behind online chatbots and image generators will also power machines in warehouses, on roads and in homes.

Covariant, backed by $222 million in financing, doesn't build robots. It creates the software that powers them.

The AI systems behind chatbots and image generators are called neural networks. By identifying patterns in large amounts of data, these systems can learn to recognize words, sounds and images, or even generate them on their own.

Several companies are building systems that can learn from different types of data at the same time. For example, by analyzing a collection of photographs and the captions that describe them, a system can capture the relationships between the two. It can learn that the word “banana” describes a curved yellow fruit.

OpenAI used that kind of system to build Sora, its new video generator. By analyzing thousands of videos paired with descriptions, the system learned to generate videos when given a description of a scene.

Covariant, founded by Pieter Abbeel, a professor at the University of California, Berkeley, and three of his former students, Peter Chen, Rocky Duan and Tianhao Zhang, used similar techniques to build a system that powers robots. The company helps operate sorting robots in warehouses around the world. It has spent years collecting data, from cameras and sensors, that shows how these robots operate.

By combining that data with the huge amounts of text used to train chatbots, the technology gives a robot the power to handle unexpected situations. The robot knows how to pick up a banana, even if it has never seen one before. If you tell it to “pick up a banana,” it will know what that means. If you tell it to “pick up a yellow fruit,” it will understand that too.

The technology, called RFM, for robotics foundational model, makes mistakes, just as chatbots do. But as companies train this type of system on increasingly large collections of data, researchers believe it will improve rapidly. In the past, engineers typically programmed robots to perform the same precise movement over and over again, and those robots could not cope with unexpected or random situations.

By learning from hundreds of thousands of examples of what happens in the physical world, however, robots can begin to handle the unexpected.

“What's in digital data can be transferred to the real world,” Chen said.