---
title: "Report: NVIDIA will launch a \"new inference chip\" incorporating Groq LPU design at next month's GTC conference"
description: "NVIDIA's upcoming inference chip system integrates Groq's \"Language Processing Unit\" (LPU) technology, utilizing an architecture that is fundamentally different from traditional GPUs. It is optimized "
type: "news"
locale: "en"
url: "https://longbridge.com/en/news/277271740.md"
published_at: "2026-02-28T03:58:05.000Z"
---

# Report: NVIDIA will launch a "new inference chip" incorporating Groq LPU design at next month's GTC conference

> NVIDIA's upcoming inference chip system integrates Groq's "Language Processing Unit" (LPU) technology, built on an architecture fundamentally different from traditional GPUs. Through broader SRAM integration and 3D stacking, it is optimized specifically for the latency and memory-bandwidth bottlenecks of large-model inference. The new product may be based on the next-generation Feynman architecture and could significantly reduce the energy consumption and cost of AI agents. OpenAI has committed to purchasing and investing $30 billion.

NVIDIA plans to unveil a new inference chip integrating Groq's "Language Processing Unit" (LPU) technology at next month's GTC developer conference, marking the company's accelerated push into inference computing to meet urgent customer demand for efficient, low-cost compute.

According to The Wall Street Journal, the new system, described by NVIDIA CEO Jensen Huang as "something the world has never seen," is designed specifically to accelerate query responses for AI models. Its launch is expected to reshape the AI computing market, directly affecting cloud service providers and enterprises seeking more cost-effective alternatives.
As an early sign of market recognition for this technology, ChatGPT developer OpenAI has agreed to become one of the new processor's largest customers and announced plans to purchase large-scale "dedicated inference capacity" from NVIDIA. The move not only solidifies NVIDIA's core customer base but also sends a clear signal to the market: **the underlying infrastructure supporting autonomous AI agents is shifting from large-scale pre-training to efficient inference.**

Facing fierce competition from Google, Amazon, and numerous startups, NVIDIA is moving beyond its traditional reliance on graphics processing units (GPUs). By introducing new architectures and exploring pure central processing unit (CPU) deployments, the company aims to consolidate its market dominance in the next phase of the AI industry's evolution.

## Integrating LPU Design to Address Large-Model Inference Bottlenecks

As the AI industry shifts from model training to application deployment, inference computing has become the core focus. AI inference consists of two main stages, prefill and decode, and the decode stage of large models is particularly slow because tokens are generated one at a time, each depending on the last. To address this bottleneck, NVIDIA has chosen to push past physical limits through external technology integration.

According to The Wall Street Journal, NVIDIA spent $20 billion at the end of last year to acquire key technology licenses from the startup Groq and brought in an executive team, including founder Jonathan Ross, in a large-scale "core hiring" deal.
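Why decode is bandwidth-bound, and why wide on-chip SRAM helps, can be sketched with a back-of-the-envelope roofline estimate. The sketch below is illustrative only; the model size, weight precision, and bandwidth figures are assumptions for the sake of the arithmetic, not vendor specifications. During decode, each generated token must stream roughly the full set of model weights through memory, so per-token latency is bounded by memory bandwidth rather than raw compute throughput.

```python
# Back-of-the-envelope sketch (illustrative numbers, not vendor specs):
# during decode, each new token streams the model weights through memory,
# so per-token latency is roughly bytes_moved / memory_bandwidth.

def decode_latency_per_token_s(param_count, bytes_per_param, bandwidth_bytes_per_s):
    """Lower-bound latency for one memory-bound decode step."""
    bytes_moved = param_count * bytes_per_param
    return bytes_moved / bandwidth_bytes_per_s

# Hypothetical 70B-parameter model stored in 8-bit weights (~70 GB per token).
params = 70e9
hbm_bw = 3.35e12   # ~3.35 TB/s, ballpark for an HBM-class GPU (assumed)
sram_bw = 80e12    # ~80 TB/s, the order of magnitude cited for on-chip SRAM (assumed)

t_hbm = decode_latency_per_token_s(params, 1, hbm_bw)
t_sram = decode_latency_per_token_s(params, 1, sram_bw)

print(f"HBM-bound:  {t_hbm * 1000:.1f} ms/token (~{1 / t_hbm:.0f} tok/s)")
print(f"SRAM-bound: {t_sram * 1000:.1f} ms/token (~{1 / t_sram:.0f} tok/s)")
```

Under these assumed numbers the SRAM-bound case is faster by exactly the bandwidth ratio, which is the intuition behind SRAM-heavy inference designs: the decode stage gains nothing from extra FLOPs once it is limited by how fast weights can be read.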
**Groq's "Language Processing Unit" (LPU) features an architecture fundamentally different from traditional GPUs and is extremely efficient at inference workloads.** Industry analysts believe **the upcoming product may involve a disruptive next-generation Feynman architecture.** According to an earlier Wallstreetcn article, the Feynman architecture may adopt a more extensive SRAM integration scheme and even deeply integrate the LPU through 3D stacking, specifically targeting the two major inference bottlenecks of latency and memory bandwidth, thereby significantly reducing the energy consumption and cost of running AI agents.

## Expanding Pure CPU Deployment to Provide Diverse Computing Options

While introducing the LPU architecture, NVIDIA is also flexibly adjusting how it uses its traditional processors. NVIDIA's standard practice had been to bundle its Vera CPU with its powerful Rubin GPU in data center servers, but this configuration has proven too costly and energy-inefficient for certain AI agent workloads. Some large enterprise clients have found pure CPU environments more efficient for specific AI tasks.

In line with this trend, NVIDIA announced this month an expanded collaboration with Meta Platforms, conducting its first large-scale pure CPU deployment to support Meta's ad-targeting AI agents. The market views this collaboration as an early window into NVIDIA's strategic shift, **indicating that the company is moving beyond a single GPU sales model and attempting to lock in different segments of the AI market through a diversified hardware portfolio.**

## Market Demand Shifts Gears, Competitive Landscape Continues to Escalate

The evolution of this underlying hardware design is directly driven by the explosion of demand for AI agent applications across the tech industry.
**Many companies building and operating AI agents have found that traditional GPUs are too expensive and not the best choice for running models in practice.**

OpenAI's moves highlight this trend. **In addition to committing to purchase NVIDIA's new systems to improve its rapidly growing Codex tool, OpenAI reached a multi-billion-dollar computing partnership with startup Cerebras last month.** According to Cerebras CEO Andrew Feldman, its inference-focused chips outperform NVIDIA's GPUs in speed. OpenAI has also signed a significant agreement to use Amazon's Trainium chips.

**It is not only startups; major cloud service providers are also accelerating their in-house chip efforts.** Anthropic's Claude Code, widely regarded as a leader in the automated coding market, currently relies primarily on chips designed by Amazon AWS and Alphabet's Google Cloud rather than on NVIDIA's products.

In the face of competitors' encroachment, Jensen Huang emphasized in an interview with wccftech that NVIDIA is transforming from a pure chip supplier into a builder of a complete AI ecosystem spanning semiconductors, data centers, cloud, and applications.
For investors, next month's GTC conference will be a key test of whether NVIDIA can sustain its near-90% market share in the era of inference.

### Related Stocks

- [NVDA.US - NVIDIA](https://longbridge.com/en/quote/NVDA.US.md)
- [OpenAI.NA - OpenAI](https://longbridge.com/en/quote/OpenAI.NA.md)
- [XSD.US - SPDR S&P Semicon](https://longbridge.com/en/quote/XSD.US.md)
- [PSI.US - Invesco Semiconductors ETF](https://longbridge.com/en/quote/PSI.US.md)
- [SOXX.US - iShares Semiconductor ETF](https://longbridge.com/en/quote/SOXX.US.md)
- [SOXL.US - Direxion Semicon Bull 3X](https://longbridge.com/en/quote/SOXL.US.md)
- [NVDL.US - GraniteShares 2x Long NVDA Daily ETF](https://longbridge.com/en/quote/NVDL.US.md)
- [NVDX.US - T-Rex 2X Long NVIDIA Daily Target ETF](https://longbridge.com/en/quote/NVDX.US.md)
- [SMH.US - VanEck Semiconductor ETF](https://longbridge.com/en/quote/SMH.US.md)
- [XLK.US - Spdr Select Tech](https://longbridge.com/en/quote/XLK.US.md)

## Related News & Research

| Title | Description | URL |
|-------|-------------|-----|
| Facing NVIDIA's 75% margins, AMD and peers are under pressure! | NVIDIA's profit margin hit its highest level since the second half of 2024. But rising memory prices, low-price competition from AMD and Alphabet, and downstream cloud companies' failure so far to convert compute into revenue have the market worried about whether NVIDIA can defend its high margins. Jensen Huang responded that NVIDIA's GPUs lead competitors in versatility and energy efficiency, while | [Link](https://longbridge.com/en/news/277046179.md) |
| Under the AI shock, is a new "subprime crisis" coming? | The AI shock is roiling the credit bond market, with a risk-transmission path resembling the subprime crisis emerging: plunging software-sector loans drove the leveraged loan index to its biggest drop in nearly three years, and up to $150 billion of CLO underlying assets face AI disruption risk. For now, the shock remains concentrated in the tech sector and has not become a systemic default wave. | [Link](https://longbridge.com/en/news/277271099.md) |
| No longer betting solely on NVIDIA: Meta spends billions to rent Google TPUs | Meta has reached a multi-billion-dollar deal with Google to rent Google TPUs for developing new AI models, marking progress in diversifying AI compute suppliers. Although Meta still plans to purchase millions of NVIDIA GPUs, the move shows it wants to reduce reliance on a single supplier. Meta has also partnered with AMD, mainly | [Link](https://longbridge.com/en/news/277123562.md) |
| NVIDIA's strong Q4 failed to impress investors: what the wisdom of the crowd reveals | NVIDIA (ticker: NVDA) reported strong fiscal Q4 2026 results, with earnings per share up 82% to $1.62 and revenue up 73% to $68.1 billion. Despite these impressive results, investor sentiment remained cautious, with the stock up only 1%. Analysts | [Link](https://longbridge.com/en/news/277041644.md) |
| Today's stocks: Is NVIDIA showing a false breakout? | NVIDIA (NVDA) stock is experiencing a potential false breakout, hinting at a possible bearish trend ahead. Although reported earnings beat expectations, the stock is trading lower. Technical analysis shows a classic ascending-triangle pattern, usually seen as bullish, but market dynamics could trigger a reversal. Traders should be cautious, as a break above resistance does not guarantee | [Link](https://longbridge.com/en/news/277087607.md) |

---

> **Disclaimer**: This article is for reference only and does not constitute any investment advice.