Physical AI Hits A Data Labeling Wall That Only Cash Can Fix

Physical AI raised $10B+ in 2025, but robots still train on under 5,000 hours of real-world data. Who's funding the race to fix it.

Josipa Majic Predin, Contributor, Forbes | 29 Jun 2026, 13:35

Key Takeaways

Physical AI startups raised over $10 billion in venture funding globally in 2025, a record for the sector.
Robots train on fewer than 5,000 hours of real-world sensorimotor data, orders of magnitude less than large language models, which are trained on trillions of tokens.
Labeling one hour of robot manipulation video costs up to $500, making data collection the costliest component of physical AI development.
Scale AI launched a dedicated physical AI data labeling division in 2025, signing contracts worth hundreds of millions with robotics firms.
Synthetic data companies like Replica AI and Seismic raised over $200 million combined in 2025 to build simulated environments for generating training data.

Physical AI raised over $10 billion in 2025, but the robots at the heart of the sector still train on fewer than 5,000 hours of real-world data — less than a human toddler accumulates in a year. That data scarcity has become the single biggest bottleneck holding back embodied intelligence, and a growing number of investors and startups are betting that only massive cash infusions can break the logjam. The race to label the physical world has begun, and it will determine whether the promise of general-purpose robots becomes reality or remains stuck in labs.

Physical AI — machines that perceive, reason and act in the physical world — encompasses everything from warehouse robots and autonomous vehicles to humanoid assistants. The category attracted more than $10 billion in venture funding in 2025 alone, a record haul that underscores investor conviction that the next AI frontier is not just language but motion. Companies like Figure AI, Skild AI, Covariant and Agility Robotics have each raised hundreds of millions, pushing valuations into the billions. Yet a dirty secret haunts the hype: training data is vanishingly scarce.

Unlike large language models, which feast on the internet's text and image troves, physical AI requires real-world sensorimotor data — torque readings, joint angles, tactile feedback, navigation decisions — that must be painstakingly collected and labeled by humans. A single robot manipulation task might need thousands of demonstrations. Total publicly available physical interaction data is measured in hours, not petabytes. The result is a data labeling wall: the bottleneck between ambition and deployment.

Why now? Several forces converged in 2025. First, foundation model architectures that work for vision and language are being adapted for robotics, creating an immediate hunger for diverse training data. Second, traditional simulation-to-real transfer has proven insufficient: noise, friction and unexpected objects break policies trained purely in silico. Third, a handful of specialized labeling firms and cloud-scraping operations have begun scaling human-in-the-loop pipelines specifically for physical AI, but the price is steep. Labeling a single hour of robot manipulation video can cost as much as $500.

The funding race is now targeting the data problem directly. Scale AI, the data labeling giant valued at $14 billion, launched a physical AI division in early 2025 and has signed contracts worth hundreds of millions with leading robotics startups. Startup Datasaur raised $45 million for a platform that lets robot operators label teleoperation data in real time. At the same time, synthetic data companies such as Replica AI and Seismic are building massive simulated environments specifically designed to generate labeled training data, betting that the wall can be cracked with digital twins.

Industry insiders warn that the current trajectory is unsustainable. "Without a step change in data volume, physical AI will plateau at the toy-robot stage," a senior engineering lead at one of the top three robotics firms told Forbes on background. "We need orders of magnitude more real-world interaction data, and that means investing in fleets of robots that collect data 24/7, plus armies of labelers." The math is sobering: to reach 10 million hours of training data, a scale the article's sources consider comparable to that of language models, the physical AI sector would need to spend roughly $5 billion on labeling alone at a rate of $500 per hour.
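For readers who want to check the arithmetic, here is a minimal back-of-envelope sketch. It uses only figures quoted in this article and assumes a flat $500 per labeled hour, which is a simplification: real rates would likely vary with task, volume and tooling. The constant and function names are illustrative, not drawn from any company's pricing model.

```python
# Back-of-envelope labeling cost, using only figures quoted in the article.
# Assumption: a flat per-hour rate with no volume discounts or tooling gains.
COST_PER_HOUR_USD = 500      # quoted upper-end cost to label one hour of manipulation video
CURRENT_HOURS = 5_000        # quoted ceiling on real-world hours robots train on today
TARGET_HOURS = 10_000_000    # quoted target, described as comparable to language-model scale


def labeling_cost_usd(hours: int, cost_per_hour: int = COST_PER_HOUR_USD) -> int:
    """Total labeling spend in dollars at a flat per-hour rate."""
    return hours * cost_per_hour


target_cost = labeling_cost_usd(TARGET_HOURS)
current_cost = labeling_cost_usd(CURRENT_HOURS)
scale_up = TARGET_HOURS // CURRENT_HOURS

print(f"Labeling {TARGET_HOURS:,} hours: ${target_cost:,}")
print(f"Labeling {CURRENT_HOURS:,} hours: ${current_cost:,}")
print(f"Required increase in data volume: {scale_up:,}x")
```

Under these assumptions the target costs $5 billion to label, against $2.5 million for today's 5,000 hours, a 2,000-fold increase in data volume.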

Looking ahead, expect a wave of consolidation and new financing vehicles dedicated to data infrastructure. Several venture firms are already structuring funds specifically for physical AI data collection, mirroring the early days of autonomous vehicle (AV) sensor fleets. Synthetic data quality is improving rapidly, but real-world verification loops remain mandatory. The next 18 months will determine whether the data labeling wall is a temporary construction delay or a structural limit to how smart robots can become. Unlocking physical intelligence may ultimately hinge less on algorithms and more on the grubby, expensive work of marking up the real world.

Frequently Asked Questions

What is physical AI?

Physical AI refers to machines that can perceive, reason and act in the physical world, including humanoid robots, warehouse automation systems and autonomous vehicles. It combines sensorimotor control with artificial intelligence.

Why is data a bottleneck for physical AI?

Physical AI requires real-world sensorimotor data, such as torque, joint angles and tactile feedback, that must be manually labeled by humans. Unlike text or images, this data is scarce and expensive to capture, with fewer than 5,000 hours of training data available for most robotic systems.

How much funding did physical AI raise in 2025?

Physical AI startups raised over $10 billion in venture funding globally in 2025, making it one of the most heavily funded sectors in AI. Companies like Figure AI, Skild AI and Covariant received significant investments.

Which companies are tackling the data problem?

Scale AI launched a dedicated physical AI division. Other startups include Datasaur, which raised $45 million for real-time labeling, and synthetic data companies like Replica AI and Seismic that generate simulated training environments.

Can synthetic data replace real-world data?

Synthetic data from simulated environments is improving rapidly and helps scale training, but most experts agree real-world verification loops are still required. Physical AI currently needs both simulated and real human-labeled data to progress.

How much does it cost to label robot training data?

Labeling one hour of robot manipulation video can cost up to $500 due to the need for human expertise and equipment. At that price, reaching 10 million hours of training data would cost around $5 billion.

Original source

www.forbes.com
