
Another major factor is strong structural growth in demand, and this condition is clearly met.
The most straightforward reason is Nvidia’s need to keep upgrading its “token factory” hardware to deliver higher token throughput. This drives extremely fast generational upgrades in both HBM bandwidth and HBM capacity (size), leading to exponential demand growth.
As concluded in the earlier AI semiconductor analysis, token throughput scales with HBM size multiplied by HBM bandwidth, and this product roughly doubles with each new generation. On average, HBM size per GPU is growing more than 40% every year.
This demand curve is rising so steeply that it is very difficult for the supply side to keep up. Even with 14% growth in wafer production and a 9% improvement in memory density, overall DRAM supply growth simply cannot match the pace of AI-driven HBM demand.
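To make the gap concrete, here is a rough sketch that compounds the growth figures quoted above. It assumes all three rates are annual, and that wafer output and density gains multiply into a single bit-supply growth rate; neither assumption is stated in the figures themselves. It also uses the 40% growth in HBM size per GPU as a stand-in for demand, which leaves out growth in GPU unit shipments, so the result is illustrative only.

```python
# Growth rates quoted in the text. Assumption: all are annual and compound.
WAFER_GROWTH = 0.14      # wafer production growth
DENSITY_GROWTH = 0.09    # memory density improvement
HBM_SIZE_GROWTH = 0.40   # "more than 40%" HBM size per GPU, used as a floor


def combined_supply_growth(wafer: float, density: float) -> float:
    """Bit-supply growth, assuming wafer and density gains multiply."""
    return (1 + wafer) * (1 + density) - 1


def demand_to_supply_ratio(years: int, demand: float, supply: float) -> float:
    """Demand index divided by supply index after `years`, both starting at 1."""
    return ((1 + demand) / (1 + supply)) ** years


supply = combined_supply_growth(WAFER_GROWTH, DENSITY_GROWTH)
print(f"combined supply growth: {supply:.1%}")

for years in (1, 3, 5):
    ratio = demand_to_supply_ratio(years, HBM_SIZE_GROWTH, supply)
    print(f"after {years} year(s): demand/supply index = {ratio:.2f}")
```

Under these assumptions the two supply-side figures combine to about 24% growth, well short of 40%, and the shortfall widens each year because both sides compound.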
In the hardware architecture, the attention stage’s KV cache requires both extremely high bandwidth and very large memory capacity. This gives HBM a unique and irreplaceable position. Even if HBM prices rise three to five times, the extra token throughput gained by investing in more HBM is still far more cost-effective than spending the money anywhere else.
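A back-of-the-envelope sketch shows why the KV cache stresses both capacity and bandwidth. It uses the standard sizing rule for a transformer KV cache (one key tensor and one value tensor per layer). The model shape, batch size, context length, and the 3 TB/s bandwidth figure are hypothetical values chosen for illustration, not numbers from this article or from any specific product.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int, bytes_per_value: int = 2) -> int:
    """KV cache size in bytes; the factor 2 covers keys plus values."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value


def min_decode_step_seconds(cache_bytes: int, bandwidth_bytes_per_s: float) -> float:
    """Lower bound on one decode step if the whole cache is read once.

    Ignores weight reads and compute, so real steps take longer.
    """
    return cache_bytes / bandwidth_bytes_per_s


# Hypothetical configuration (assumption, not from the article).
cache = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128,
                       seq_len=32_768, batch=16, bytes_per_value=2)
HYPOTHETICAL_BANDWIDTH = 3e12  # bytes per second, illustrative only

print(f"KV cache: {cache / 2**30:.0f} GiB")
step = min_decode_step_seconds(cache, HYPOTHETICAL_BANDWIDTH)
print(f"minimum time per decode step: {step * 1e3:.1f} ms")
```

In this sketch the cache alone reaches 160 GiB, and every generated token requires streaming it from memory again, so capacity sets how much context and batch fit while bandwidth sets how fast tokens come out.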
Other memory technologies, such as SRAM, HBF, CXL, and PIM, are currently unable to compete directly with HBM in the critical KV cache and attention workloads. There is no viable replacement in sight for at least the next five years, and possibly much longer.
@Bridge Buzz SG


