OpenAI’s recent financial disclosures paint a bleak picture for the industry’s economic model. The company missed internal revenue and user targets whilst projecting losses that could reach $14 billion in 2026.
According to The Wall Street Journal, CFO Sarah Friar has warned internally that ballooning compute costs may outpace incoming revenue, raising questions about the company’s ability to fund future computing contracts.
These headlines are symptoms of a broader structural dysfunction across the sector, where companies inflate valuations to unprecedented levels and buy compute capacity far beyond what they can efficiently deploy. OpenAI’s revenue growth cannot keep pace with the infrastructure costs required to justify its $852 billion valuation.
AI and Financial Inclusion Strategist at io.net.
OpenAI generates revenue primarily through API access fees and ChatGPT subscriptions. But every user query costs GPU time, meaning margins on inference are razor-thin or negative.
Revenue scales with subscribers, yet compute costs scale with usage intensity. Billions go toward training next-generation models that won’t produce returns for months or years.
Yet while these companies burn through capital acquiring GPU capacity they cannot profitably deploy, the hardware itself sits largely unused, processing power that could be advancing healthcare, education or development in markets that need it most.
The idle capacity paradox
The deeper problem is that the GPUs these companies are hoarding aren’t even being put to work. The prevailing narrative from major AI companies is one of scarcity, used to justify premium pricing. But Cast AI’s 2026 State of Kubernetes Optimisation Report, drawing on data from roughly 23,000 clusters across AWS, GCP and Azure, found that 95% of enterprise GPU capacity sits unused. Billions of dollars’ worth of compute, provisioned and paid for, is rarely put to productive work.
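To make the cost of that idle capacity concrete, here is a rough back-of-the-envelope sketch in Python. It takes the 95% unused figure from the report and a normalized price of 1.0 per provisioned GPU-hour, which is an assumption for illustration and not a real price, and works out what each hour of productive work effectively costs. The function name is hypothetical.

```python
def effective_cost_per_used_hour(price_per_provisioned_hour: float, utilization: float) -> float:
    """Cost of one productively used GPU-hour when only a fraction of paid hours do work."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    return price_per_provisioned_hour / utilization


UNUSED_SHARE = 0.95      # share of capacity sitting unused, as cited from the Cast AI report
NORMALIZED_PRICE = 1.0   # assumed unit price per provisioned GPU-hour (illustrative only)

utilization = 1 - UNUSED_SHARE
multiplier = effective_cost_per_used_hour(NORMALIZED_PRICE, utilization)
print(f"Each used GPU-hour costs about {multiplier:.0f}x the provisioned price")
```

Under these assumptions, every productive hour carries roughly twenty times the price of a provisioned hour, because the other nineteen are paid for and left idle.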
This means everyone outside the leading model companies and cloud computing giants is, in a sense, locked out. Startups, scaleups and institutions from London to Lagos cannot access compute at an acceptable price point. The hardware exists, as does the demand, but the current model has no mechanism to connect the two.
Why utilization stays so low
The reasons are structural. Companies provision GPUs for peak demand, the equivalent of building a motorway for rush hour and leaving it empty the other 23 hours of the day.
Then there’s the hoarding problem. In this market, owning GPUs is a signal to investors. Large enterprises keep buying well beyond what they actually need to operate. The hardware becomes a balance sheet asset first and a productive tool second.
Training runs make it worse. AI model training is intensive but periodic. A company might push its GPU cluster hard for a few weeks, then leave it largely idle until the next training cycle. There’s no incentive to let anyone else use that capacity in between. So it just sits there.
The result is that the biggest players hoard hardware they barely use, while everyone else gets priced out. The people this hurts most are the smaller teams who would actually put compute to productive use, developers in Nairobi or São Paulo building applications for markets that big tech has never prioritized. These are the teams that will turn AI into something practical and useful for ordinary people, and they’re locked out by pricing designed to subsidize someone else’s idle infrastructure.
Distributed alternatives
The persistent presence of idle capacity suggests the need for a different approach. Rather than concentrating GPU resources behind corporate walls, distributed compute networks connect underutilized hardware with developers who need it.
The mechanics are straightforward. GPU owners, whether data centers with spare capacity, companies with idle hardware, or individuals with powerful machines, plug into a network that verifies what each machine can do. When a developer needs compute, the network handles the matchmaking. A job requiring fast response times gets routed to high-performance hardware nearby. A batch training job that can run overnight gets spread across cheaper, more distributed machines. The developer doesn’t need to know where the hardware physically sits.
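The matchmaking described above might look something like the following minimal sketch. It is not any particular network's implementation: the `Machine` and `Job` types, the `route` function, the region names and the prices are all hypothetical, and the rule it applies (nearby high-performance hardware for latency-sensitive jobs, cheapest hardware anywhere for batch jobs) is a simplification of the routing the paragraph outlines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Machine:
    machine_id: str
    region: str
    high_performance: bool  # assumed outcome of the network's capability check
    price_per_hour: float   # illustrative units, not real prices


@dataclass(frozen=True)
class Job:
    job_id: str
    region: str
    latency_sensitive: bool
    machines_needed: int = 1


def route(job: Job, machines: list[Machine]) -> list[Machine]:
    """Pick machines for a job, cheapest first among the suitable candidates."""
    if job.latency_sensitive:
        # Fast-response work stays on high-performance hardware in the job's region.
        candidates = [m for m in machines if m.high_performance and m.region == job.region]
    else:
        # Batch work can run anywhere, so every machine is a candidate.
        candidates = list(machines)
    candidates.sort(key=lambda m: m.price_per_hour)
    if len(candidates) < job.machines_needed:
        raise LookupError(f"not enough suitable machines for {job.job_id}")
    return candidates[: job.machines_needed]


pool = [
    Machine("a", "eu-west", True, 3.0),
    Machine("b", "eu-west", False, 1.0),
    Machine("c", "af-east", False, 0.5),
    Machine("d", "sa-east", True, 2.5),
]

chat = Job("chat-inference", "eu-west", latency_sensitive=True)
train = Job("overnight-training", "eu-west", latency_sensitive=False, machines_needed=2)

print([m.machine_id for m in route(chat, pool)])   # ['a']
print([m.machine_id for m in route(train, pool)])  # ['c', 'b']
```

The developer submits a job with its requirements and never names a machine, which is the point of the design: placement is the network's concern.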
The orchestration layer groups compatible hardware into clusters, so a developer can deploy a workload across dozens of different machines as if they were one unified resource. The economics follow naturally. Idle GPUs represent sunk costs that could generate returns if put to work. Developers get access at lower price points. Hardware owners monetize assets that would otherwise sit dormant.
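As a rough illustration of the grouping step, the sketch below buckets machines into clusters of compatible hardware. Treating "compatible" as "same GPU model and memory size" is an assumption made for the example, and the `Gpu` type, `group_into_clusters` function, host names and model names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    host: str
    model: str
    memory_gb: int


def group_into_clusters(gpus: list[Gpu]) -> dict[tuple[str, int], list[str]]:
    """Group hosts by (model, memory) so each group can be offered as one resource."""
    clusters: defaultdict[tuple[str, int], list[str]] = defaultdict(list)
    for gpu in gpus:
        clusters[(gpu.model, gpu.memory_gb)].append(gpu.host)
    return dict(clusters)


inventory = [
    Gpu("host-1", "model-x", 24),
    Gpu("host-2", "model-y", 80),
    Gpu("host-3", "model-x", 24),
]

clusters = group_into_clusters(inventory)
for spec, hosts in clusters.items():
    print(spec, hosts)
```

A real orchestration layer would likely weigh more than model and memory, such as interconnect and location, but the principle is the same: the developer targets a cluster, not individual machines.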
More importantly, distributed networks reduce the systemic risk of depending on a handful of centralized providers. When compute access isn’t tied to one company’s quarterly performance, the supply chain becomes more resilient.
The write-down clock is ticking
The AI industry is due a correction. Companies have loaded their balance sheets with GPU assets priced for full utilization. By Cast AI’s measure, actual utilization is about 5%. At some point, the books have to reflect reality. When that reckoning arrives, compute access for the wider industry contracts with it. The companies that depend on centralized providers will feel it first. The ones already connected to distributed alternatives won’t.
The infrastructure exists. The demand exists. The only thing missing is a model that matches the two without requiring trillion-dollar valuations to function. That model is already being built. The remaining question is how many write-downs it takes before the rest of the industry pays attention.
This article was produced as part of TechRadar Pro Perspectives, our channel to feature the best and brightest minds in the technology industry today.
The views expressed here are those of the author and are not necessarily those of TechRadar Pro or Future plc.