AI Compute Hits Memory Wall: US and Chinese Tech Giants Race to Secure HBM Supply Chains

AI Compute Infrastructure Enters the “Storage-Compute Co-Design” Deep Water Zone: The Memory Supply Chain Emerges as a New Strategic Battleground for U.S. and Chinese Tech Giants
When NVIDIA CEO Jensen Huang stressed, within the space of three days, that “memory shortages will persist for several years,” it was no longer merely a technical warning. It was an authoritative verdict on the evolutionary trajectory of AI infrastructure. NVIDIA recently announced that its new Vera CPU will fully adopt SK Hynix DRAM; at the same time, Huang held a series of meetings with senior Samsung executives, with a formal collaboration announcement imminent. TSMC, AMD, and Microsoft have likewise accelerated joint development efforts with HBM suppliers. These moves are not isolated incidents but clear signals of a long-underestimated and increasingly prominent systemic bottleneck: memory bandwidth and capacity, rather than transistor scaling, are now the primary physical constraints on large-model training and inference efficiency. The cluster of announcements by U.S. and Chinese tech giants about deep collaboration across the memory supply chain signifies that competition in AI hardware has formally shifted from isolated breakthroughs in “chip design and foundry manufacturing” to system-level integration under the “storage-compute convergence” paradigm.
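The claim that bandwidth, not compute, caps efficiency can be made concrete with a roofline-style estimate. The following is a minimal sketch, not a measurement: the accelerator figures (1,000 TFLOP/s peak, 3 TB/s memory bandwidth) and the workload intensity are hypothetical values chosen for illustration, and the function name is an assumption of this example.

```python
def attainable_tflops(peak_tflops: float, mem_bw_tbs: float, flops_per_byte: float) -> float:
    """Roofline bound: throughput is capped by compute or by memory traffic."""
    return min(peak_tflops, mem_bw_tbs * flops_per_byte)

# Hypothetical accelerator, not a real product's specification.
PEAK_TFLOPS = 1000.0   # peak dense compute, TFLOP/s
MEM_BW_TBS = 3.0       # aggregate HBM bandwidth, TB/s

# Arithmetic intensity (FLOP/byte) at which the compute and bandwidth limits meet.
ridge_point = PEAK_TFLOPS / MEM_BW_TBS

# Assumed workload: batch-1 token generation with 16-bit weights,
# about 2 FLOPs per weight and 2 bytes read per weight.
decode_intensity = 2.0 / 2.0

decode_tflops = attainable_tflops(PEAK_TFLOPS, MEM_BW_TBS, decode_intensity)
print(f"ridge point: {ridge_point:.0f} FLOP/byte")
print(f"batch-1 decode: {decode_tflops:.1f} TFLOP/s "
      f"({100 * decode_tflops / PEAK_TFLOPS:.1f}% of peak)")
```

Under these assumed numbers the workload sits far below the ridge point, so adding compute would change nothing while adding memory bandwidth would raise throughput almost linearly.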
Real HBM3 Capacity Shortfall: The Chasm Between Technical Specifications and Supply-Chain Reality
High-Bandwidth Memory (HBM) serves as the “bloodstream” of today’s AI accelerators. Take HBM3 as an example: standard HBM3 delivers roughly 819 GB/s per stack, and its HBM3E extension reaches about 1.2 TB/s, more than 2.5× the bandwidth of prior-generation HBM2E, yet each step up is markedly harder to mass-produce. According to the latest report from TrendForce, global HBM3 production capacity in 2024 can meet only ~65% of demand from AI chips, with over 80% of total capacity concentrated at two Korean firms, SK Hynix and Samsung. This structural imbalance has two direct consequences: first, NVIDIA’s B200 platform has been forced to adjust customer delivery schedules because of HBM3 supply delays; second, cloud service providers’ procurement costs for HBM3 have surged over 40% year-on-year, with some orders carrying premiums of up to 25%. Huang’s remark about a “multi-year shortage” reflects the cumulative effect of several process bottlenecks, including advanced packaging (e.g., through-silicon vias (TSVs) and hybrid bonding), wafer thinning, and low yields in 3D stacking. Even when fabs operate at full capacity, usable HBM3 stack output per 10,000 wafers remains less than one-third that of conventional DRAM. Memory is no longer just a supporting component; it has become the “hard constraint variable” that sets the upper performance limit of AI chips.
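The per-stack bandwidth figures follow from simple arithmetic: per-pin data rate times interface width, divided by eight bits per byte. The sketch below assumes a 1024-bit interface per stack and nominal per-pin rates of 3.6, 6.4, and 9.6 Gb/s for HBM2E, HBM3, and HBM3E; shipping parts vary by vendor and speed bin, so these are illustrative values and the function name is hypothetical.

```python
def stack_bandwidth_gbs(pin_rate_gbps: float, bus_width_bits: int = 1024) -> float:
    """Peak per-stack bandwidth in GB/s: pin rate (Gb/s) x bus width (bits) / 8."""
    return pin_rate_gbps * bus_width_bits / 8

# Assumed nominal per-pin data rates in Gb/s; actual parts differ by vendor and bin.
PIN_RATES_GBPS = {"HBM2E": 3.6, "HBM3": 6.4, "HBM3E": 9.6}

bandwidths = {gen: stack_bandwidth_gbs(rate) for gen, rate in PIN_RATES_GBPS.items()}
for gen, bw in bandwidths.items():
    # Report each generation in absolute terms and relative to HBM2E.
    print(f"{gen}: {bw:.1f} GB/s ({bw / bandwidths['HBM2E']:.2f}x HBM2E)")
```

With these assumptions, a figure near 1.2 TB/s per stack corresponds to a pin rate of about 9.6 Gb/s, while a 6.4 Gb/s pin rate yields roughly 819 GB/s.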
Storage-Compute Co-Design: From “Chips Adapting to Memory” to “Architectures Defining Memory in Reverse”
In the traditional semiconductor division of labor, memory vendors supply standardized products aligned with CPU/GPU specifications. Today, that paradigm is collapsing. The launch of the Vera CPU is highly symbolic: as NVIDIA’s in-house general-purpose CPU and the successor to Grace, its architecture is deeply co-designed with SK Hynix’s HBM3-PIM (Processing-in-Memory) technology, which embeds certain data preprocessing logic directly into the memory controller, reducing memory access latency by 37% and improving energy efficiency by 2.1×. Similarly, AMD’s MI300X already supports an enhanced HBM3 ECC error-correction mode, while Microsoft’s Maia 100 chip features a low-power HBM3 subsystem jointly customized with Micron. Together, these cases reveal an emerging logic: AI chip architectures are actively “defining memory specifications in reverse.” Bandwidth density, error-rate thresholds, thermal design power (TDP), and even physical package dimensions must be locked down jointly with memory vendors at the earliest stages of chip design. This “storage-compute co-design” goes far beyond interface compatibility, targeting fundamental optimizations in system-level energy efficiency and algorithmic convergence speed.
Accelerated Domestic Substitution: YMTC and CXMT Enter a Strategic Window of Opportunity
The global HBM3 supply crunch presents China’s memory manufacturers with an unprecedented window of opportunity that is unlikely to recur. Yangtze Memory Technologies (YMTC) has announced that its YMC 3.0-architecture HBM3 prototype samples have passed preliminary validation by NVIDIA; its innovative “wafer-level hybrid bonding” process enables stacking up to 12 layers. CXMT (ChangXin Memory Technologies), leveraging Phase II of its Hefei fabrication facility, is accelerating volume production of LPDDR5X and HBM2E, and has explicitly designated HBM3 as a top R&D priority for 2025. Notably, a domestic collaborative ecosystem is rapidly taking shape: NAURA’s etching equipment has entered the validation phase at YMTC’s HBM production line; Yongsi Electronics has achieved >92% yield in 2.5D/3D hybrid packaging for HBM3; and Cambricon’s MLU370 chip has begun joint optimization with CXMT’s LPDDR5X modules. On the policy front, momentum continues to build: the Third-Phase National Integrated Circuit Industry Investment Fund explicitly identifies “advanced memory chips and supporting equipment” as a key investment focus. In the short term, this is a capacity catch-up race; in the long term, it is a contest for standard-setting authority in storage-compute co-design. Whichever player first achieves full-stack, self-controlled HBM3 capability will secure the entry point to influence the next generation of AI infrastructure.
Cross-Cycle Investment Logic: Revaluation of “Hidden Champions” in Equipment, Packaging, and Materials
The deep integration of the memory supply chain is giving rise to a new investment thesis. First is the semiconductor equipment segment: HBM3 manufacturing places extreme demands on atomic layer deposition (ALD), high-precision etching, and wafer-bonding tools. NAURA and Toppan Photomasks (Tongjing Technology) have already lifted domestic ALD equipment penetration above 35%, but specialized HBM bonding equipment remains in the early adoption phase, where technological barriers translate directly into valuation upside. Second is advanced packaging: HBM3 requires heterogeneous 2.5D/3D integration. JCET and Tongfu Microelectronics have received AMD certification for Chiplet packaging; a mere 1-percentage-point improvement in their HBM3 packaging yield can save downstream customers hundreds of millions of RMB in costs. Third is critical materials: HBM3 interposers rely on ultra-low-dielectric-constant (“low-k”) organic materials and ultra-high-purity silicon substrates, and products from Shanghai Xinyang Semiconductor and Dinglong Co., Ltd. have already entered small-batch validation. Though these segments do not serve end markets directly, they are the “capillaries” that make storage-compute co-design possible, and their technological breakthroughs offer both high predictability and long-term growth potential.
Conclusion: Memory Is Not a Supporting Actor—It Is the “New Silicon Foundation” of the AI Era
As the AI compute arms race enters its deep-water zone, one thing becomes increasingly clear: Moore’s Law, measured by transistor count, is slowing, while the “memory wall” governing data movement grows ever taller. The collective strategic bets placed by U.S. and Chinese tech giants on the memory supply chain reflect a shared consensus on a paradigm shift in computing: tomorrow’s AI chips will no longer be simple assemblies of “processor + memory,” but storage-compute fused systems fundamentally rearchitected around data flow. Within this transformation, SK Hynix’s and Samsung’s production advantages, YMTC’s and CXMT’s technological breakthroughs, and the steadily growing capabilities of China’s equipment and packaging enterprises together form a strategic network that bears directly on sovereignty over digital-age infrastructure. The “multi-year” memory shortage is, in fact, a golden window for reshaping the global semiconductor value chain. Whoever secures the technological high ground in this wave of storage-compute co-design will hold the key to the AGI era.