
Hot robotics startup Physical Intelligence says its new robotic brain can understand tasks it was never taught

Physical Intelligence: A Rising Star in Robotics

Physical Intelligence, the two-year-old San Francisco-based robotics startup that has quietly become one of the Bay Area’s most closely watched AI companies, released new research Thursday showing that its latest model can direct robots to perform tasks they were never explicitly trained to do — a capability that the company’s own researchers say caught them off guard.

Introducing Model π0.7: A Leap Towards General-Purpose Robotics

The new model, called π0.7, represents what the company describes as an early but significant step toward the long-sought goal of a general-purpose robot brain: one that can be pointed at an unfamiliar task, coached in simple language, and actually succeed. If the results hold up to scrutiny, they suggest that robotic AI could be approaching an inflection point like the one large language models hit, where capabilities begin to accumulate in ways that exceed what the underlying data seems to predict.

Compositional Generalization: Breaking Free from Rote Memorization

The central claim behind this approach is compositional generalization – the ability to combine skills learned in different contexts to solve problems the model has never encountered. Until now, the standard approach to training robots has relied largely on rote memorization: collect data on a specific task, train a specialized model on that data, then repeat for each new task. According to Physical Intelligence, π0.7 breaks this pattern.

“Once you cross the threshold where you go from just doing exactly what you’re collecting the data for to remixing things in new ways,” says Sergey Levine, co-founder of Physical Intelligence and a professor specializing in AI for robotics at UC Berkeley, “the capabilities increase more than linearly with the amount of data. This much more favorable scaling property is something we’ve seen in other fields, like language and vision.”

Case Study: The Air Fryer Experiment

The most striking demonstration involves an air fryer that the model had almost never seen in training. When the research team investigated, they found only two relevant episodes in the training set: one in which another robot simply pushed an air fryer closed, and another, from an open-source data set, in which a robot placed a plastic bottle inside a fryer on someone’s instructions. The model had somehow synthesized these fragments, along with broader web-based pre-training data, into a functional understanding of how the appliance works.

“It’s very difficult to determine where knowledge comes from, and where it will succeed or fail,” says Lucy Shi, a Physical Intelligence researcher with a PhD in computer science from Stanford. Still, without any task-specific training, the model made a passable attempt at using the appliance to cook a sweet potato. With step-by-step verbal instructions – essentially, a human walking the robot through the task the way you would explain something to a new employee – it completed the job successfully.

The Implications of Coaching Capabilities

This coaching capability is important because it suggests that robots could be deployed in new environments and improved in real time without additional data collection or model retraining.

The researchers are quick to acknowledge the model’s limits and careful not to get ahead of themselves. In at least one case, they point the finger squarely at their own team.

“Sometimes the failure mode is not from the robot or the model,” says Shi. “It’s our fault. Not being good at prompt engineering.” She describes an early air fryer experiment that produced a 5 percent success rate. After the team spent about half an hour refining how the task was explained to the model, the rate rose to 95 percent, she says.

Image credits: Physical Intelligence

Challenges and Future Prospects

The model is also not yet capable of executing complex, multi-step tasks autonomously from a single high-level command. “You can’t tell it, ‘Hey, go make me some toast,’” Levine says. “But if you walk it through – ‘for the toaster, open this part, press this button, do this’ – then it tends to work pretty well.”

The team also acknowledges that there are no real standardized benchmarks for robotics, which makes external validation of its claims difficult. Instead, the company measured π0.7 against its own previous specialist models – systems designed and trained for individual tasks – and found that the generalist model matched their performance on a range of complex jobs, including brewing coffee, folding laundry, and assembling boxes.

Perhaps the most remarkable thing about the research – if you take the researchers at their word – is not any single demonstration, but the extent to which the results surprised them: people whose job is to know exactly what the training data contains, and therefore what the model should and should not be able to do.

“My experience has always been that when I know the content of the data in depth, I can sort of guess what the model will be able to do,” says Ashwin Balakrishna, a research scientist at Physical Intelligence. “I’m rarely surprised. But the last few months have been the first time I’ve really been surprised. I just bought a random set of gears and asked the robot, ‘Hey, can you spin this gear?’ And it worked.”

Levine recalls the moment researchers first encountered GPT-2 generating a story about unicorns in the Andes. “Where the hell did it learn about unicorns in Peru?” he says. “It’s such a strange combination. And I think seeing that in robotics is really special.”

Naturally, critics will point to an uncomfortable asymmetry here: language models had the entire Internet to learn from, robots do not, and no clever workaround fully closes that gap. But when asked where he expects the skepticism to land, Levine points at something else entirely.

“The criticism that can always be leveled at any demonstration of robotic generalization is that the tasks are rather boring,” he says. “The robot does not do a backflip.” He pushes back on that framing, arguing that the distinction between an impressive robot demo and a robotic system that generalizes is precisely the point. Generalization, he suggests, will always look less dramatic than a carefully choreographed stunt – but it is considerably more useful.

The paper itself uses careful hedging language throughout, describing π0.7 as showing “early signs” of generalization and “first demonstrations” of new capabilities. These are research results, not a deployed product.

When asked directly when a system based on these results might be ready for deployment in the real world, Levine declines to speculate. “I think there’s good reason to be optimistic, and things are certainly progressing more quickly than I expected a few years ago,” he says. “But it’s very difficult for me to answer this question.”

Investments and Future Outlook

Physical Intelligence has raised over $1 billion to date and was recently valued at $5.6 billion. A significant part of the investor enthusiasm around the company can be traced back to Lachy Groom, a co-founder who spent years as one of Silicon Valley’s most high-profile angel investors — backing Figma, Notion, and Ramp, among others — before deciding that Physical Intelligence was the company he was looking for. That pedigree helped the startup attract significant institutional funding, even though it declined to offer investors a timetable for commercialization.

The company is now reportedly in talks for a new funding round that would nearly double that valuation figure to $11 billion. The company declined to comment.

