Interacting with the Real World, AI Will Gain Physical Intelligence


The latest AI models are remarkably human-like in their ability to produce text, audio, and video on demand. So far, however, these algorithms have largely remained confined to the digital world rather than the physical, three-dimensional world in which we live. In fact, whenever we try to apply these models to the real world, even the most sophisticated struggle to perform adequately. Just think, for example, how challenging it has been to build safe and reliable self-driving cars. For all their apparent intelligence, these systems not only fail to understand physics but also tend to hallucinate, perceiving things that are not there, which leads them to make inexplicable mistakes.

This is the year, however, when AI will finally make the leap from the digital world into the physical one we inhabit. Extending AI beyond its digital frontier requires reworking the way machines think, fusing the digital intelligence of AI with the mechanical capability of robots. This is what I call "physical intelligence": a new type of intelligent machine that can understand dynamic environments, cope with unpredictability, and make decisions in real time. Unlike the models used by conventional AI, physical intelligence is grounded in physics; it understands basic principles of the real world, such as cause and effect.

Such capabilities allow physical intelligence models to interact with and adapt to different environments. In my research group at MIT, we develop models of physical intelligence that we call liquid networks. In one experiment, for example, we trained two drones – one driven by a standard AI model and the other by a liquid network – to locate objects in a forest during the summer, using data captured by human pilots. While both drones performed equally well on the task they were trained for, when asked to find objects in different conditions – in the middle of winter or in an urban setting – only the liquid network drone succeeded. This experiment showed us that, unlike traditional AI systems that stop evolving after their initial training phase, liquid networks continue to learn and adapt from experience, just as humans do.
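For readers curious what a liquid network looks like in code, below is a minimal PyTorch sketch of a liquid time-constant cell, following the published update rule in which each neuron's effective time constant depends on its current input. The layer sizes, the sigmoid gating, and the LiquidPolicy wrapper are illustrative choices for this sketch, not the actual architecture flown on our drones.

```python
import torch
import torch.nn as nn

class LiquidCell(nn.Module):
    """One step of a liquid time-constant cell (fused implicit-Euler update)."""
    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # f(x, h) couples the input and current state through a bounded nonlinearity
        self.gate = nn.Linear(input_size + hidden_size, hidden_size)
        # per-neuron base time constant tau and target state A (both learned)
        self.log_tau = nn.Parameter(torch.zeros(hidden_size))
        self.A = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x, h, dt: float = 0.1):
        # x: (batch, input_size), h: (batch, hidden_size)
        f = torch.sigmoid(self.gate(torch.cat([x, h], dim=-1)))
        tau = torch.exp(self.log_tau)
        # the effective time constant 1/(1/tau + f) changes with the input,
        # which is what lets the dynamics keep adapting after training
        return (h + dt * f * self.A) / (1.0 + dt * (1.0 / tau + f))

class LiquidPolicy(nn.Module):
    """Unrolls the cell over a sensor sequence and emits a control command."""
    def __init__(self, input_size=32, hidden_size=64, output_size=4):
        super().__init__()
        self.cell = LiquidCell(input_size, hidden_size)
        self.head = nn.Linear(hidden_size, output_size)

    def forward(self, seq):  # seq: (batch, time, input_size)
        h = torch.zeros(seq.size(0), self.head.in_features)
        for t in range(seq.size(1)):
            h = self.cell(seq[:, t], h)
        return self.head(h)

# Usage: map a short sequence of (already encoded) camera features to 4 control outputs.
policy = LiquidPolicy()
controls = policy(torch.randn(8, 20, 32))  # -> shape (8, 4)
```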

Physical intelligence is also able to interpret and execute complex commands derived from text or images, bridging the gap between digital instructions and real-world execution. For example, in my lab we developed a physically intelligent system that, in under a minute, can iteratively design and then 3D print small robots based on prompts such as "a robot that can walk forward" or "a robot that can hold things".
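To make the idea of iterative design concrete, here is a self-contained toy sketch of such a loop. The two-parameter "design", the scoring function, and the random-search refinement are stand-ins for the generative models, physics simulation, and printing pipeline a real system would use; they are not the lab's actual method.

```python
import random

def simulate_and_score(design, prompt):
    # Stand-in for simulating a candidate robot and scoring it against the prompt;
    # here we simply pretend a walking robot wants a leg length near 1.0 and low mass.
    leg_length, body_mass = design
    return -(leg_length - 1.0) ** 2 - 0.1 * body_mass

def design_robot(prompt, iterations=200, step=0.05):
    # Start from a random design and hill-climb toward a better score.
    best_design = [random.uniform(0.1, 2.0), random.uniform(0.1, 2.0)]
    best_score = simulate_and_score(best_design, prompt)
    for _ in range(iterations):
        candidate = [max(0.05, v + random.gauss(0.0, step)) for v in best_design]
        score = simulate_and_score(candidate, prompt)
        if score > best_score:
            best_design, best_score = candidate, score
    return best_design  # in a real pipeline this design would go to the 3D printer

print(design_robot("a robot that can walk forward"))
```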

Other labs are also making important breakthroughs. For example, the robotics startup Covariant, founded by UC Berkeley researcher Pieter Abbeel, is building chatbot-like interfaces – akin to ChatGPT – that can control robotic arms on command. The company has already secured more than $222 million in funding to develop and deploy sorting robots in warehouses around the world. A team at Carnegie Mellon University has also recently shown that a robot with just one camera and imprecise actuation can perform dynamic and complex maneuvers – including jumping onto obstacles twice its height and across gaps twice its length – using a single neural network trained with reinforcement learning.

If 2023 was the year of text-to-image and 2024 the year of text-to-video, then 2025 will mark the age of physical intelligence, with a new generation of devices – not just robots, but everything from electric grids to smart homes – that can interpret what we tell them and carry out tasks in the real world.


