The linguistic abilities of artificial intelligence began to change in 2018, when OpenAI demonstrated that a machine learning model known as a ‘transformer’ could generate surprisingly coherent chunks of text when given an initial string. Computer scientists had spent decades trying to write programs that could handle language in all its complexity and ambiguity. The OpenAI model is known as Generative Pretrained Transformer or, affectionately, GPT. It steadily improved as it was fed increasing amounts of data gleaned from books and the Internet (with whose permission?), until it was able to hold compelling conversations and answer a wide range of questions.
Robots with a plan. What can go wrong?
In early 2022, Hausman and Ichter, then at Google, along with Levine, Finn, and others, showed that large language models (LLMs) could also be the basis of robotic intelligence. Although LLMs cannot interact with the physical world, they contain a lot of information about objects and scenes thanks to the vast scope of their training data. That understanding is imperfect, like that of someone who knows the world only from reading about it, but it can be enough to give robots the ability to devise simple action plans.
Hausman and company hooked up an LLM to a one-armed robot in a mock kitchen at Google headquarters in Mountain View, California, giving it the ability to solve open-ended problems. When the robot was told, “I dropped my Coke on the table,” it used the LLM to come up with a sensible plan of action that included finding and retrieving the can, throwing it in the trash, and getting a sponge to clean up the spill, all without conventional programming.
The team later connected a vision language model, trained on both text and images, to the same robot to improve its ability to understand the world around it. In one experiment, they placed photos of different celebrities nearby and asked the robot to give a can of soda to Taylor Swift. “Taylor didn’t appear in any of the robot’s training data, but vision language models know what she looks like,” Finn explains, her long brown hair framing a wide smile.
That same year, just as ChatGPT was going viral, the team decided to demonstrate the robot at an academic conference in Auckland, New Zealand. They offered attendees the chance to control the robot, which remained in California, by typing whatever commands they wanted. The audience was impressed by the robot’s overall ability to solve problems, but interest was also growing in the broader implications of ChatGPT.
LLMs can help robots communicate, recognize things, and make plans, but robots’ most basic ability to act is limited by a lack of intelligence about the physical world. Knowing how to grasp an oddly shaped object is trivial for humans only because of a deep instinctive understanding of how three-dimensional things behave and how our hands and fingers work. The assembled robotics experts realized that ChatGPT’s extraordinary abilities could translate into something just as impressive in a robot’s physical abilities, if actions instead of words could be captured on a large scale and learned from. “There was an energy in the air,” Finn remembers.
There have been signs that this may work.
In 2023, Quan Vuong, another co-founder of Physical Intelligence, brought together researchers from 21 different institutions to train 22 different robotic arms on a series of tasks using the same transformer model. “In most cases, the new model was better than the one the researchers had developed specifically for their robot,” Finn recalls.
Just as humans learn over the course of their lives, going from feeling around for objects in early childhood to playing the piano a few years later, feeding robots far more training data could reveal extraordinary new abilities.
Expectations of a robotics revolution are also being fueled by the many humanoid robots being promoted by startups like Agility and Figure, as well as big companies like Hyundai and Tesla. The capabilities of these machines remain limited, but remote-controlled demonstrations can make them appear more capable than they are, and their proponents promise big things. Recently, Elon Musk even suggested that humanoid robots could outnumber humans on Earth by 2040.