---
title: "Two $20 Billion Deals: OpenAI and NVIDIA Engage in an 'Inference War'"
type: "News"
locale: "en"
url: "https://longbridge.com/en/news/283208809.md"
description: "Two transactions of approximately $20 billion each have triggered a reshaping of AI computing power: NVIDIA acquires Groq to plug gaps in its inference chip capabilities, while OpenAI makes massive purchases from and invests in Cerebras, participating in its IPO. As AI computing shifts from training to inference, the competition over chip architecture escalates into a struggle for industry control, with inference infrastructure emerging as the core battleground of the new AI war."
datetime: "2026-04-18T03:06:16.000Z"
locales:
  - [zh-CN](https://longbridge.com/zh-CN/news/283208809.md)
  - [en](https://longbridge.com/en/news/283208809.md)
  - [zh-HK](https://longbridge.com/zh-HK/news/283208809.md)
---

# Two $20 Billion Deals: OpenAI and NVIDIA Engage in an 'Inference War'

In December 2025, NVIDIA quietly spent $20 billion to acquire Groq, an AI chip company.

On April 17, 2026, OpenAI announced it would purchase more than $20 billion worth of chips from another AI chip company, Cerebras. On the same day, Cerebras formally filed for an IPO on Nasdaq, targeting a valuation of $35 billion.

**Two sums, nearly identical in size. One is an acquisition; the other is a procurement. One comes from the world's largest AI chip seller; the other from the world's largest AI buyer.**

These are not two independent events but two symmetrical moves in the same war. The battlefield has a name: AI inference.

Most people have not noticed this war. It has no explosions, only rows of financial announcements and technical discussions circulating in Silicon Valley engineering circles. Yet its impact may be more profound than any AI launch event of the past two years, because it is redistributing control over what is almost certain to become the largest tech market in history.
## What Is Inference, and Why "Training" Is No Longer the Keyword of 2026

Before discussing the two $20 billion figures, we must first understand the context: the focus of the AI chip battlefield is shifting.

Training and inference are the two stages of AI computing consumption. Training builds models: feeding massive amounts of data into neural networks so they learn specific capabilities. This process typically occurs once or is updated periodically. Inference uses models: every time a user asks a question and ChatGPT provides an answer, that is one inference request.

In 2023, the bulk of global AI computing expenditure went to training, with inference playing a secondary role. But this ratio is rapidly reversing. According to market research data from Deloitte and CES 2026, inference accounted for 50% of all AI computing expenditure in 2025; by 2026, the proportion is expected to jump to two-thirds. Lenovo CEO Yang Yuanqing put it even more plainly at CES: the structure of AI spending will completely flip from "80% training + 20% inference" to "20% training + 80% inference."

The logic is not complex. Training is a one-time cost, while inference is a recurring cost. GPT-4 was trained once, but it answers questions from hundreds of millions of users daily; every conversation is an inference request. After large-scale deployment, the cumulative consumption of inference far exceeds that of training.

What does this mean? It means the biggest prize in the AI industry is shifting from training chips to inference chips. And the two types of chips require entirely different architectural designs.

## NVIDIA's Dilemma: Chips Designed for Training Are Naturally Poor at Inference

NVIDIA's H100 and H200 are monsters designed for training. Their core advantage is extremely high computational throughput: training requires massive multiplication operations on huge matrices, a task where GPUs excel through multi-core parallel computation.
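The "one-time cost versus recurring cost" logic behind that flip can be sketched as a toy calculation. All numbers below are illustrative assumptions for the sake of the arithmetic, not figures from this article:

```python
# Toy model of the "80/20 flip": training is paid once, while
# inference is paid on every request and accumulates over time.

def spend_shares(train_cost, cost_per_request, requests_per_day, days):
    """Return (training_share, inference_share) of cumulative spend."""
    inference = cost_per_request * requests_per_day * days
    total = train_cost + inference
    return train_cost / total, inference / total

# Hypothetical numbers: a $100M training run, then 100M requests/day
# served at $0.01 of compute per request.
train_share, infer_share = spend_shares(100e6, 0.01, 100e6, 365)
print(f"after 1 year: training {train_share:.1%}, inference {infer_share:.1%}")
# After a year of deployment, inference dominates cumulative spend,
# roughly matching the "20% training + 80% inference" structure.
```

Under these assumptions inference ends the year near 80% of cumulative spend, and its share only grows with time, since the training term in the numerator stays fixed.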
However, the bottleneck in inference is not computation; it is memory bandwidth.

When a user sends a question, the chip must "move" the entire set of model weights from memory to the compute units before generating an answer. This moving process is the true source of inference latency. NVIDIA's GPUs use external High Bandwidth Memory (HBM), so this step inevitably introduces latency. For a service like ChatGPT, which handles tens of millions of requests a day, that latency, multiplied by scale, becomes a genuine performance bottleneck.

OpenAI's internal engineers noticed this problem while optimizing Codex (a code generation tool). They found that regardless of parameter tuning, response speeds were constrained by the architectural limits of NVIDIA GPUs. In other words, NVIDIA's disadvantage in inference is not a matter of effort; it is an architectural issue.

Cerebras' WSE-3 chip took a completely different route. The chip is so large it requires wafer-scale packaging: at 46,225 square millimeters, it is larger than a human palm, integrating 900,000 AI cores and 44GB of ultra-fast SRAM onto a single silicon die. Memory sits directly adjacent to the compute cores, shortening the "moving" distance from centimeters to micrometers. The result: inference speeds 15 to 20 times faster than NVIDIA's H100.

It should be noted that NVIDIA is not sitting idle. Its latest Blackwell (B200) architecture offers four times the inference performance of the H100 and is being deployed at scale. But Blackwell is chasing a moving target: Cerebras is iterating simultaneously, and the chip market now hosts competitors beyond Cerebras alone.

## NVIDIA's $20 Billion: A Confession Behind Its Largest Acquisition in History

On December 24, 2025, NVIDIA announced the largest acquisition in its history. The target was Groq.
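The memory-bandwidth bottleneck described above can be made concrete with a back-of-envelope sketch: when decoding is memory-bound, every generated token requires streaming essentially all model weights from memory, so bandwidth sets a hard ceiling on tokens per second. The model size and bandwidth figures below are rough public numbers used here as assumptions:

```python
# Bandwidth-limited upper bound on token generation speed:
#   tokens/s  <=  memory_bandwidth / weight_bytes
# (ignores compute, batching, and KV-cache traffic; illustration only)

def tokens_per_second_ceiling(params, bytes_per_param, bandwidth_bytes_s):
    """Upper bound on decode speed for a memory-bound single stream."""
    return bandwidth_bytes_s / (params * bytes_per_param)

# Assumption: a 70B-parameter model in fp16 (2 bytes/param) on
# ~3.35 TB/s HBM, roughly an H100's rated memory bandwidth.
hbm_ceiling = tokens_per_second_ceiling(70e9, 2, 3.35e12)   # ~24 tokens/s

# If on-chip SRAM raises effective bandwidth by ~20x, the ceiling
# scales linearly; this is the intuition behind 15-20x speedup claims.
sram_ceiling = tokens_per_second_ceiling(70e9, 2, 20 * 3.35e12)
```

The point of the sketch is that the ceiling depends only on the bandwidth term, not on raw FLOPS, which is why an architecture that moves memory next to the compute cores can outrun a much larger GPU on this workload.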
Groq is a direct competitor to Cerebras, specializing in SRAM-based chips optimized for inference. Its product, called the LPU (Language Processing Unit), was at the time the fastest chip service in public inference-speed benchmarks.

NVIDIA paid $20 billion for Groq's core technology and founding team, including founder Jonathan Ross and several top-tier chip engineers from Google's TPU team. This is NVIDIA's largest acquisition since it bought Mellanox for $6.9 billion in 2019, roughly triple the size.

To many analysts, the message behind this sum matters far more than the amount itself: NVIDIA believes it has a structural gap in inference, and the gap is significant enough to warrant a $20 billion investment to fill it.

If NVIDIA truly believed its GPUs were invincible in inference, it would not need to acquire Groq. This money is essentially a $20 billion technical procurement order: a recognition that embedded SRAM architecture holds genuine advantages in inference scenarios, and that NVIDIA's existing product lines cannot naturally cover them. It bought, at the highest possible price, a technological gap it could not fill on its own.

Of course, NVIDIA's official framing of the acquisition is different: "deep integration with Groq to provide more complete inference solutions." Translated from corporate language: we realized our offerings were insufficient, so we bought someone else's.

## OpenAI's $20 Billion: Buying Chips Is Just the Surface; the Equity Investment Is Key

Now let's return to OpenAI.

In January 2026, OpenAI signed a three-year, $10 billion compute procurement agreement with Cerebras. At the time, media coverage framed it as "OpenAI diversifying its chip suppliers" and treated the matter lightly. But the details revealed on April 17 fundamentally changed the nature of the deal:

First, the procurement amount doubled from $10 billion to $20 billion.
Second, OpenAI will receive warrants for Cerebras shares; as the procurement scale increases, its stake could reach up to 10% of Cerebras' total shares.

Third, OpenAI will also provide $1 billion in funding for Cerebras' data center construction. In other words, OpenAI is helping Cerebras build factories.

Taken together, these three details paint a completely different picture: OpenAI is not merely buying chips; it is incubating a supplier.

This logic has clear precedents in tech history. Around 2006, Apple began working with Samsung on custom chips for its devices. What started as a bulk procurement relationship deepened as Apple took over more of the design, and with the eventual move to its own A-series and M-series chips, supply chain control shifted from Intel and Samsung to Apple itself.

What OpenAI is doing is somewhat similar, but with an important boundary: Apple held chip design rights early on, whereas OpenAI remains a purchaser today. Even after Cerebras goes public, it will continue to develop independently and serve more clients. The endpoint of this path may not be OpenAI fully controlling Cerebras; more likely, the two will form a deeply interdependent ecosystem.

On one hand, by binding Cerebras through $20 billion in purchases and equity stakes, OpenAI secures a continuous supply of non-NVIDIA inference computing power. On the other hand, OpenAI is working with Broadcom to develop its own ASIC chips, expected to enter mass production by the end of 2026. Walking on two legs at once, the destination is computing autonomy.

## What You're Buying When Cerebras Goes Public Today

On April 17, Cerebras officially submitted its Nasdaq IPO application, targeting a valuation of $35 billion and planning to raise $3 billion. That valuation is more than four times the $8.1 billion it was worth in September 2025.
Following a new financing round completed earlier this year, its valuation had already risen to $23 billion; the $35 billion IPO target thus adds a 52% premium on top of that.

Those familiar with Cerebras' history know this is its second attempt at going public. The first, in 2024, was withdrawn because its core customer G42 (a UAE state-backed tech investment group) accounted for 83% to 97% of that year's revenue, and CFIUS intervened on national security grounds, forcing the withdrawal.

This time, G42 has disappeared from the shareholder list, replaced by OpenAI. In other words, Cerebras' structural problem of customer concentration has not been fundamentally resolved: the name of the major client has changed, but the dependence on a single large client remains. Investors must decide whether the new client is better or worse.

From a credit perspective, OpenAI is clearly superior to G42. From a strategic perspective, however, OpenAI is also incubating Cerebras' competitor: if its self-developed ASIC matures, it poses a genuine substitution threat to Cerebras.

To be fair, Cerebras is actively expanding its customer base, and the prospectus is expected to show more diversified revenue sources with improved concentration ratios. But until OpenAI's self-developed chips enter mass production, the question remains open.

Buying Cerebras stock means betting on two things at once: that OpenAI will continue to choose Cerebras, and that OpenAI's self-developed ASIC will not arrive prematurely. Neither is certain.

Of course, the bullish arguments are real: if the inference market grows along forecast trajectories, even a small share of it would translate into substantial absolute numbers for Cerebras. The question is not whether Cerebras has opportunities, but whether the $35 billion price already reflects the most optimistic scenario.

Two $20 billion deals appeared symmetrically between late 2025 and April 2026.
One came from the world's largest AI chip seller, acquiring the technology of an inference-market competitor. The other came from the world's largest AI buyer, incubating a company challenging NVIDIA in the inference market.

NVIDIA's $20 billion was defensive: it paid the highest price to plug a technological gap it could not fill itself. OpenAI's $20 billion was offensive: it is burning cash to build an inference highway independent of NVIDIA, while securing warrants for a toll station along that road.

This war has no gunfire, but capital flows never lie. These two sums tell you more clearly than any AI launch event: control over AI inference infrastructure is being contested. And this market is expected to account for two-thirds of total industry computing expenditure in 2026.

Cerebras' IPO is the bugle call of this war.

### Related Stocks

- [OpenAI.NA](https://longbridge.com/en/quote/OpenAI.NA.md)
- [SOXL.US](https://longbridge.com/en/quote/SOXL.US.md)

## Related News & Research

- [852 Billion Valuation at Risk? OpenAI’s Strategic Shift Faces Probing Questions From Investors](https://longbridge.com/en/news/282660742.md)
- [OpenAI loses 3 top executives as it cuts back on 'side quests'](https://longbridge.com/en/news/283203174.md)
- [OpenAI Just Made Codex a Lot More Dangerous for Anthropic](https://longbridge.com/en/news/283156116.md)
- [Anthropic and OpenAI tighten security as AI models show advanced hacking ability](https://longbridge.com/en/news/282878472.md)
- [OpenAI has bought AI personal finance startup Hiro](https://longbridge.com/en/news/282604103.md)