Big Tech’s AI Datacenter Investments Might Be In Big Trouble
Chinese open models like GLM-5.2 and DeepSeek-V4 now rival frontier AI at a fraction of the cost, and that could strand the datacenter bet hyperscalers made.
- Hyperscalers spent over $300 billion on AI datacenters in 2025, with $400 billion projected by 2027, but Chinese open models DeepSeek-V4 and GLM-5.2 now rival frontier AI at 80% lower inference cost.
- DeepSeek-V4 achieves parity with GPT-5 Turbo on MMLU-Pro and MATH-500 using multi-head latent attention, reducing context-window compute by 50%.
- GLM-5.2 activates only 15% of its parameters per token via mixture-of-experts, cutting GPU requirements per query by 60% compared to GPT-5.
- Gartner warns 30% of AI datacenter capacity commissioned through 2026 could become uneconomical if open models gain mainstream enterprise adoption.
- Microsoft, Google, and Amazon are now hosting DeepSeek-V4 and GLM-5.2 on their cloud platforms, signaling internal acknowledgment that the proprietary model advantage is eroding.
Hyperscalers (Amazon, Google, Microsoft, and Meta) collectively committed over $300 billion in 2025 alone to new AI datacenter capacity, betting that proprietary frontier models would drive insatiable demand for compute. Instead, Chinese open models have leapfrogged them on efficiency, delivering comparable benchmark scores on reasoning, code generation, and multimodal tasks while using substantially fewer GPUs per query. DeepSeek-V4, for instance, reportedly costs 80% less to serve than GPT-5 Turbo, according to leaked internal benchmarks, while GLM-5.2 achieves near-identical performance on MMLU-Pro and MATH-500.
This shift matters now because datacenter construction cycles last three to five years, but the underlying demand assumptions are already eroding. Companies that signed 20-year power purchase agreements and ordered billions of dollars’ worth of NVIDIA H100s are waking up to a scenario in which inference workloads shift to cheaper open models that run on far less infrastructure. “It’s like building a fleet of supertankers when the world suddenly discovers sailboats are faster,” said one independent analyst.
The key details are stark. GLM-5.2, developed by Tsinghua University and Zhipu AI, uses a mixture-of-experts architecture that activates only 15% of parameters per token, slashing compute needs. DeepSeek-V4, from the eponymous Hangzhou lab, employs multi-head latent attention to halve context-window costs. Both are released under permissive open licenses, allowing any company to deploy them on commodity hardware. Meanwhile, hyperscaler capital expenditure is projected to hit $400 billion by 2027, much of it sunk in concrete, cooling, and power infrastructure that assumes premium-priced proprietary models remain the default.
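To make the sparse-activation claim concrete, here is a minimal back-of-the-envelope sketch. It assumes the common rule of thumb that a forward pass costs roughly 2 FLOPs per active parameter per token. The total parameter count is a hypothetical placeholder, not a published figure for GLM-5.2; only the 15% activation share comes from the article.

```python
# Rough per-token compute for a dense model vs. a mixture-of-experts model
# of the same total size. Illustrative only.

TOTAL_PARAMS = 1_000_000_000_000  # hypothetical placeholder, NOT a GLM-5.2 figure
ACTIVE_FRACTION = 0.15            # from the article: 15% of parameters per token


def forward_flops_per_token(active_params: float) -> float:
    """Rule-of-thumb estimate: about 2 FLOPs per active parameter per token."""
    return 2.0 * active_params


dense_flops = forward_flops_per_token(TOTAL_PARAMS)
moe_flops = forward_flops_per_token(TOTAL_PARAMS * ACTIVE_FRACTION)

# Share of per-token compute avoided relative to a same-size dense model.
compute_saving = 1.0 - moe_flops / dense_flops

print(f"dense: {dense_flops:.2e} FLOPs/token")
print(f"MoE:   {moe_flops:.2e} FLOPs/token")
print(f"per-token compute saving: {compute_saving:.0%}")
```

Under these assumptions the per-token arithmetic falls in proportion to the active share. GPU count does not necessarily fall by the same amount, because the weights of every expert still have to be held in memory, which is one reason a per-query hardware saving can be smaller than the compute saving.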
Analysis suggests the stranded asset risk is real and underappreciated. Gartner estimates that 30% of AI datacenter capacity built through 2026 could become uneconomical if open model adoption accelerates. The shift also threatens NVIDIA’s monopoly on training silicon: if inference dominates and open models need less compute, demand for flagship GPUs may soften. “The entire stack—from chips to datacenters to cloud margins—is predicated on scarcity of frontier AI,” said a semiconductor strategist. “Open models invert that assumption.”
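The demand side of that argument can be sketched with simple arithmetic. Assuming resource use scales linearly with query volume (a simplification), the function below estimates how much query volume would have to grow to keep a fixed fleet equally busy after per-query resource use falls. The function name is illustrative; the 60% and 80% inputs are the figures reported in the article.

```python
def breakeven_demand_multiple(reduction: float) -> float:
    """Growth factor in query volume needed to keep a fixed fleet equally
    utilized after per-query resource use drops by `reduction` (0 <= r < 1).

    Assumes resource use scales linearly with query volume.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


scenarios = [
    ("60% fewer GPUs per query", 0.60),
    ("80% lower serving cost", 0.80),
]

for label, reduction in scenarios:
    multiple = breakeven_demand_multiple(reduction)
    print(f"{label}: demand must grow about {multiple:.1f}x to break even")
```

In other words, efficiency gains only strand capacity if demand fails to grow by the corresponding multiple, which is the open question behind the stranded-asset estimates.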
Outlook: hyperscalers are quietly diversifying. Microsoft has started offering DeepSeek-V4 on Azure; Google is integrating GLM-5.2 into Vertex AI. But the elephant in the room is whether the $100 billion-plus of datacenter investment already under construction can be repurposed for other workloads. If not, Big Tech may face billions in writedowns by early 2028. Investors should watch the next quarterly earnings calls, where terms like “capacity utilization” and “capital allocation efficiency” will be code for how badly the AI datacenter bet is wobbling.
Frequently Asked Questions
What are DeepSeek-V4 and GLM-5.2?
DeepSeek-V4 and GLM-5.2 are large language models developed in China that achieve performance comparable to frontier models like GPT-5 and Gemini Ultra. They are released under open-source licenses and use architectural innovations such as mixture-of-experts and multi-head latent attention to drastically reduce compute costs.
Why are AI datacenter investments at risk?
Hyperscalers like Amazon, Google, and Microsoft have invested hundreds of billions of dollars in AI datacenters on the assumption that proprietary frontier models require massive compute. If inference shifts to cheaper open models, much of that capacity could become uneconomical, leading to stranded assets and potential writedowns.
How much cheaper are the open models to run?
According to leaked benchmarks, DeepSeek-V4 costs about 80% less per query to serve than GPT-5 Turbo. GLM-5.2 requires 60% fewer GPUs per inference call due to its sparse activation design.
What is a stranded asset?
A stranded asset is an infrastructure investment, such as a datacenter or power purchase agreement, that becomes prematurely uneconomical or underutilized due to changes in technology or demand. For AI datacenters, this could occur if open models reduce the need for expensive, purpose-built facilities.
Which companies are the largest hyperscalers?
The largest hyperscalers include Amazon Web Services, Microsoft Azure, Google Cloud, and Meta. Together they account for the majority of global cloud and AI infrastructure spending.
How are hyperscalers responding?
Hyperscalers are beginning to host open models on their cloud platforms: Microsoft added DeepSeek-V4 to Azure, and Google integrated GLM-5.2 into Vertex AI. They are also exploring how to repurpose existing datacenter capacity for other workloads, such as traditional cloud compute or training models that require less power.
Original source
www.forbes.com