The same HBM pricing power that drove Micron's profits to record highs is now giving its biggest customers a powerful incentive to use less of it.

Qualcomm, Nvidia, Apple, Cerebras and Google are developing chip architectures and algorithms designed to reduce reliance on high-bandwidth memory, as sustained supply tightness pushes costs higher across the AI supply chain.
"HBM supply is tight, it's expensive, and we don't use it," Cerebras Chief Executive Officer Andrew Feldman said in the company's first post-earnings call as a public company, positioning the wafer-scale chip maker's HBM-free design as a competitive advantage.
Micron Technology last week reported that supply had tightened further from three months earlier. The company signed 15 new long-term agreements, most spanning five years with price floors, and extended its shortage outlook beyond 2027. Global memory sales surged 79% in 2024 to $165 billion and are projected to exceed $223 billion in 2025, according to industry data. SK Hynix, the leading HBM producer, has sold out its entire 2026 production and is no longer accepting new orders for major memory products.
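As a rough check on those sales figures, the short Python sketch below backs out the 2023 base implied by a 79% rise to $165 billion and the minimum 2025 growth implied by a projection above $223 billion. The function names are illustrative, not taken from any data provider, and the inputs are simply the rounded numbers quoted above, so the outputs are approximations.

```python
def implied_prior_year(current: float, growth_pct: float) -> float:
    """Back out the prior-year value from a current value and its growth rate."""
    return current / (1 + growth_pct / 100)


def growth_rate_pct(old: float, new: float) -> float:
    """Year-over-year growth from old to new, in percent."""
    return (new / old - 1) * 100


# Rounded figures quoted in the article, in billions of US dollars.
SALES_2024 = 165.0
GROWTH_2024_PCT = 79.0
PROJECTED_2025_FLOOR = 223.0  # "exceed $223 billion", so treated as a lower bound

base_2023 = implied_prior_year(SALES_2024, GROWTH_2024_PCT)
min_growth_2025 = growth_rate_pct(SALES_2024, PROJECTED_2025_FLOOR)

print(f"Implied 2023 sales: about ${base_2023:.1f} billion")
print(f"Implied 2025 growth: at least {min_growth_2025:.1f}%")
```

On these inputs the implied 2023 base is about $92.2 billion and the projected 2025 growth is at least roughly 35.2%, which would mark a slowdown from 2024's pace even as absolute sales keep rising.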
The dynamic creates a paradox for memory makers: the very pricing power generating unprecedented profitability is financing a wave of customer-led innovation aimed at reducing HBM dependency. If successful, these efforts could cap the long-term growth trajectory of a market that has become central to AI infrastructure.
Qualcomm's Bet on a Different Memory Architecture
Qualcomm at its June 2026 Investor Day unveiled a data center platform called Dragonfly built around what it calls high-bandwidth compute, or HBC. Rather than pairing a processor with stacks of HBM connected across an expensive silicon interposer — the standard approach used by Nvidia's H100 and B200 — Qualcomm places its processing cores directly beneath a DRAM stack, collapsing the distance data must travel. The company claims this delivers up to eight times more tokens per watt than traditional GPU configurations and six times the memory bandwidth per watt of HBM-based competitors, while eliminating the interposer entirely.
The architecture arrives as HBM has become one of the most constrained components in the AI supply chain. New memory fabrication facilities require $15 billion to $20 billion in investment and take several years to become operational, meaning shortages could persist through at least 2027, according to industry forecasts. Qualcomm said it has already secured the wafers and memory needed to reach its fiscal 2027 revenue target of $5 billion in data center sales.
Nvidia and Google Take the Software Route
Nvidia, the largest consumer of HBM for its AI accelerators, is also exploring ways to reduce memory demand. The company is adjusting elements of its next-generation Vera Rubin platform to lower overall memory requirements, according to reports. Nvidia's Grace CPU and Vera architecture represent an attempt to optimize the balance between compute and memory in systems where HBM now accounts for a growing share of total bill of materials.
Google in March published research on TurboQuant, a model compression method that significantly reduces AI model memory footprint with minimal impact on performance. The announcement triggered a sharp selloff in Micron shares, which fell nearly a third in a single session before more than doubling from that low as the market reassessed how soon such techniques could meaningfully affect HBM demand. The episode illustrated how sensitive memory stock valuations have become to any technology that threatens HBM's role in AI inference.
The Investment Calculus
For memory makers, the near-term outlook remains exceptionally strong. Micron's new long-term agreements lock in pricing above historical cycle peaks even at their contractual minimums, according to Futurum analyst Rolf Bulk. The company's guidance implies quarterly operating profit exceeding its best-ever full-year revenue from prior cycles.
But the structural risk is building. Apple this week raised prices on multiple Mac and iPad models between product cycles, explicitly citing memory chip cost increases — a move that signals end-user tolerance for higher prices has limits. When the world's largest technology companies are signing five-year supply agreements with price floors, they are simultaneously committing capital to engineering teams tasked with making those agreements less relevant over time.
Qualcomm shares trade at roughly 18 times forward earnings, a discount to Nvidia's 35 times, reflecting the market's skepticism about its ability to break into data centers dominated by incumbents. If its HBC architecture delivers on claimed efficiency gains, the savings in memory procurement alone could justify a re-rating. For Micron, the risk is that today's pricing power is sowing the seeds of tomorrow's demand destruction — a pattern the memory industry has seen before, though never driven by this magnitude of customer-side engineering investment.
This article is for informational purposes only and does not constitute investment advice.