
TPU vs GPU: Google's chip commercialization accelerates, can NVIDIA's moat hold up?

Although a single TPU is not as powerful as the strongest GPU, Google is leveraging ultra-large-scale clusters and higher cost-effectiveness to challenge NVIDIA's pricing power and market control. The real battleground lies in ecosystems and business models: NVIDIA locks in users with CUDA, while Google opens new entry points with TPU + Gemini. NVIDIA retains a clear advantage in versatility and ecosystem maturity, but as more leading clients begin to "test the waters" with TPUs, any slight loosening will be quickly amplified by the market.
As Google begins trying to sell its self-developed AI chip, the TPU (Tensor Processing Unit), to a broader market, a "chip cold war" that once played out only inside the cloud is moving into the open and posing a substantial challenge to AI chip leader NVIDIA.
A recent article in the tech publication The Information argues that NVIDIA cannot ignore one fact: two of the world's most advanced AI models, those from Google and Anthropic, are developed fully or partially on Google's self-developed TPU chips rather than NVIDIA's GPUs. This reality has prompted Meta, one of NVIDIA's largest clients, to seriously consider using Google's TPUs to develop new models.
This means the TPU has been upgraded from an internal Google tool to an alternative that large AI companies can seriously consider. According to earlier analysis by Morgan Stanley, Google plans to produce more than 3 million TPUs in 2026 and about 5 million in 2027, while NVIDIA's current GPU output is roughly three times Google's TPU output.
Performance Comparison: Single Chip Loses, System Wins?
From a purely computational power perspective, the most advanced TPU (codenamed Ironwood) has about half the floating-point operations per second (FLOPS) of NVIDIA's Blackwell GPU.
However, this does not mean that TPU is at a disadvantage.
The Information states that Google's strategy is to amplify performance advantages through "clustering": thousands of TPUs can be linked together to form a "super pod," providing excellent cost-effectiveness and energy efficiency when training ultra-large models. In contrast, NVIDIA's single system can directly connect a maximum of about 256 GPU chips, although users can expand the scale through additional networking equipment.
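The system-over-chip argument above can be sketched with a toy calculation. All of the per-chip throughput figures, cluster sizes, and scaling efficiencies below are illustrative placeholders, not vendor specifications:

```python
# Toy model: system-level throughput vs. single-chip speed.
# All numbers are hypothetical placeholders, not vendor specs.

def cluster_throughput(per_chip_pflops: float, n_chips: int,
                       scaling_efficiency: float) -> float:
    """Aggregate useful throughput of a cluster in PFLOPS.

    scaling_efficiency models interconnect/communication overhead:
    1.0 = perfect linear scaling, lower = more overhead.
    """
    return per_chip_pflops * n_chips * scaling_efficiency

# Hypothetical: a TPU-like chip with half the per-chip FLOPS of a
# GPU-like chip, but deployed in a much larger tightly coupled pod.
gpu_like = cluster_throughput(per_chip_pflops=2.0, n_chips=256,
                              scaling_efficiency=0.9)
tpu_like = cluster_throughput(per_chip_pflops=1.0, n_chips=4096,
                              scaling_efficiency=0.8)

print(f"GPU-like domain: {gpu_like:.0f} PFLOPS")
print(f"TPU-like pod:    {tpu_like:.0f} PFLOPS")
# Even with weaker chips and worse scaling efficiency, the larger pod
# wins on aggregate throughput once the cluster is big enough.
```

The point of the sketch is that once pods grow past a certain size, per-chip FLOPS stops being the deciding metric; interconnect scale and efficiency dominate.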
In the era of large models, it has become increasingly difficult to simply determine the winner based on "single-chip performance"; system-level design, interconnect capabilities, and energy efficiency are becoming the new core metrics.
Key Difference: Software Ecosystem Remains NVIDIA's Moat
What truly constitutes NVIDIA's "moat" is not just hardware, but the deeply integrated CUDA software ecosystem.
The Information article states that for clients already running AI workloads on NVIDIA's CUDA software stack, renting NVIDIA chips is more cost-effective; developers with the time and resources to rewrite their programs can cut costs by switching to TPUs.
For technically sophisticated TPU clients like Anthropic, Apple, and Meta, the barriers to adopting TPUs are relatively low, since they are adept at writing server-chip software for AI applications. TPUs are particularly cost-efficient when running Google's Gemini model, which is optimized for them.
However, software compatibility remains a major challenge for the TPU. TPUs work smoothly only with specific AI software tools such as TensorFlow, while PyTorch, which most AI researchers use, performs better on GPUs. Several engineers have said that if developers spend time writing custom software to fully exploit GPUs, GPU performance may surpass that of TPUs.
Cost Battle: TPU is Not "Cheap"
In terms of manufacturing costs, TPU and GPU are actually not that different. Ironwood uses a more advanced and expensive process technology than Blackwell, but due to the smaller chip size, more TPUs can be cut from the same wafer, partially offsetting the cost disadvantage.
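The die-size effect can be checked with the standard die-per-wafer approximation. The die areas below are hypothetical, chosen only to illustrate a smaller die versus a larger one, and the formula ignores defect yield:

```python
import math

def dies_per_wafer(wafer_diameter_mm: float, die_area_mm2: float) -> int:
    """Classic die-per-wafer approximation:
    gross dies ~ wafer area / die area, minus an edge-loss term.
    Ignores defect yield and scribe lines."""
    r = wafer_diameter_mm / 2
    return int(math.pi * r ** 2 / die_area_mm2
               - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2))

# Hypothetical die sizes on a standard 300 mm wafer: a smaller
# TPU-like die vs. a larger GPU-like die (illustrative, not actual specs).
small_die = dies_per_wafer(300, 500)
large_die = dies_per_wafer(300, 800)
print(small_die, large_die)
# The smaller die yields noticeably more chips per wafer, which can
# offset a higher per-wafer cost from a more expensive process node.
```

This is why a pricier process node does not automatically mean a pricier chip: per-chip cost is roughly wafer cost divided by good dies per wafer.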
Both use high bandwidth memory (HBM), and in terms of process and packaging, Broadcom plays an extremely critical role—not only participating in packaging design but also providing key IP such as SerDes (high-speed data transmission core technology). Analysts estimate that Broadcom's earnings from the TPU project have reached at least $8 billion.
It is worth noting that NVIDIA's hardware business currently runs a gross margin as high as 63%, while Google Cloud's overall gross margin is only 24%. This also explains why NVIDIA can maintain strong profitability even in a price war.
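The margin gap implies substantial pricing headroom. A simplistic back-of-the-envelope check, treating gross margin as the only variable and ignoring real differences in the two companies' cost structures:

```python
# How far could a seller cut prices before its gross margin falls to a
# competitor's level? Simplistic model: unit cost stays fixed.

def max_price_cut(current_margin: float, floor_margin: float) -> float:
    """Fraction by which price can drop while margin stays >= floor_margin."""
    unit_cost = 1.0 - current_margin           # cost per $1 of current revenue
    floor_price = unit_cost / (1.0 - floor_margin)
    return 1.0 - floor_price

# Figures from the article: NVIDIA hardware gross margin ~63%,
# Google Cloud overall gross margin ~24%.
cut = max_price_cut(0.63, 0.24)
print(f"NVIDIA could cut prices ~{cut:.0%} before its margin falls to 24%")
```

Under this crude model NVIDIA could roughly halve its prices before its gross margin dropped to Google Cloud's level, which is what "room to fight a price war" means in concrete terms.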
Capacity Game: TSMC's "Balancing Act"
In the wafer foundry sector, TSMC does not bet all its capacity on a single customer. Even with extremely high demand from NVIDIA, it is difficult to obtain "unlimited supply." This means that there will always be space in the market for other solutions—including TPU.
According to Morgan Stanley's forecast, Google plans to produce 3 million TPUs in 2026 and 5 million by 2027, possibly even more. NVIDIA's GPU output is currently about three times that of TPUs, and the gap is narrowing.
As supply begins to diversify, customers will naturally be more willing to compare, negotiate, and spread risks.
Commercialization Dilemma: Selling Chips is Harder Than Expected
The Information argues that if Google really wants to sell TPUs at scale, it would need to rebuild almost an entire industry chain, including server manufacturers, distribution networks, and enterprise-grade after-sales support, essentially "replicating NVIDIA."
Moreover, if customers deploy TPUs in their own data centers, Google loses some cloud service revenue (storage, database services, and so on). This means TPUs are unlikely to compete purely on price; Google would instead need to make up the revenue gap through other fees.
In other words, this is not a business where "cheap can win," but a complex strategic choice.
Viewed from a higher level, the TPU's significance to Google goes beyond hardware revenue itself. More importantly, it serves as a bargaining chip in negotiations with NVIDIA, it helps promote Gemini and Google's AI ecosystem, and it gives Google greater autonomy over AI infrastructure. As long as customers are willing to have "one more option," NVIDIA no longer holds absolute pricing power. That may be what Google truly wants.

