Amazon Fell Behind on AI. Can It Stage a Comeback?

When OpenAI took off, it was Microsoft’s moment. With Gemini’s rise, Google Cloud surged. So here is the question: as Anthropic stands out, which cloud vendor will benefit?

Amazon has been a step slow in chips, model R&D, and GPU procurement. Will it finally get its turn in AI? The opportunity window may be opening.

1) What does AWS’s current AI stack look like, given its perceived laggard status?

2) How material is the partnership with Anthropic for Amazon, and how durable is it?

3) What is Amazon’s true in-house chip capability?

4) How far has Amazon come on foundation models, its weakest link?

Dolphin Research takes a closer look at each point.

I. Is Amazon’s AI finally ramping?

1) A closer look at AWS revenue mix

In Q1 2026, Amazon’s AI annualized revenue topped $15 bn, about 10% of AWS revenue. Back in Q1 2024, management gave only a vague indication of a low single-digit billions annualized run rate. Even so, AWS’s ~10% AI mix still trails Azure’s 20%+ by a wide margin.

Before breaking down AI, start with AWS’s core revenue layers. The stack is classically split into IaaS, PaaS, and SaaS/other.

a. IaaS: compute chips, storage (drives), network bandwidth and other basic resources, with EC2 as the flagship. This is entry-level virtualized infrastructure rental, so margins are low.

b. PaaS/middleware: databases, data analytics, orchestration/management, and security tooling. Margins are higher at this layer.

c. SaaS apps & other: first-party or third-party software, plus vertical solutions, IoT, etc. This is the application layer and adjacent services.

AWS has long been IaaS-heavy and still is at ~60%. PaaS has risen to ~30%, while SaaS remains single-digit as a share.

2) AI that cuts across the stack

On top of the three-layer cake sits the ‘+1’ AI revenue line, which cuts across IaaS, PaaS, and SaaS. Market estimates suggest AWS AI revenue has several components.

a. Primarily compute rental (IaaS): unlike traditional compute, AI rentals are generally based on Trainium/Inferentia or Nvidia GPUs. Customers also skew to large AI labs and big tech, who consolidate spend and negotiate hard.

Because AI chips and storage are far more costly, and buyers have stronger bargaining power, cloud providers earn thin margins on AI IaaS, lower than on traditional cloud IaaS.

b. The second major piece is Bedrock — AWS’s MaaS/TaaS (Token-as-a-Service) platform. Instead of renting bare metal, Amazon deploys frontier models and sells model APIs/tokens directly.

c. Other pieces include SageMaker and Amazon Q. SageMaker is a pre-built platform for AI/ML training, debugging, and deployment, enabling self-training or fine-tuning, plus post-training inference.

Amazon Q is an end-user-facing AI agent suite, with Developer, Business, and Connect editions. These target developers, enterprise staff, and customer service teams respectively.

Like Bedrock, neither directly rents bare hardware. They layer services above infrastructure, yielding structurally higher margins.

AWS is growing fast with margins holding up; beyond Anthropic’s usage surge, the core reasons are:

a. AWS’s AI mix is smaller, but MaaS/TaaS is a larger share of that mix. Per SemiAnalysis, Bedrock contributes ~37% of AWS AI revenue, while ~80% of Azure and GCP AI revenue still comes from IaaS-only hardware rental.

b. In absolute MaaS/TaaS revenue, AWS leads at ~$5.5 bn, with Google Cloud slightly below $5 bn and Azure below $2 bn. Amazon’s MaaS/TaaS OPM is ~55%, vs. an overall AWS OPM of just under 38% in Q1 2026.

c. Newer clouds like Oracle and CoreWeave are generally weaker on software. They mostly do ‘bare metal’ rentals with low value-add and low margins, so their MaaS/TaaS revenue is negligible vs. the Big Three.
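
As a quick cross-check, the ~37% Bedrock share and the ~$15 bn AI run rate quoted above can be multiplied out and compared with the ~$5.5 bn MaaS/TaaS figure. The sketch below uses only numbers from this section. The function and variable names are illustrative, and it assumes the run rate, the share, and the ~55% margin all refer to the same period, which the text does not state.

```python
# Figures quoted in the text (approximate, USD bn, annualized).
AWS_AI_RUN_RATE_BN = 15.0    # Q1 2026 AI annualized revenue
BEDROCK_SHARE_OF_AI = 0.37   # estimated Bedrock share of AWS AI revenue
MAAS_OPM = 0.55              # quoted MaaS/TaaS operating margin


def implied_maas_revenue(ai_run_rate_bn: float, maas_share: float) -> float:
    """MaaS/TaaS revenue implied by the AI run rate and the MaaS share."""
    return ai_run_rate_bn * maas_share


def implied_operating_profit(revenue_bn: float, opm: float) -> float:
    """Operating profit implied by a revenue figure and an operating margin."""
    return revenue_bn * opm


maas_revenue_bn = implied_maas_revenue(AWS_AI_RUN_RATE_BN, BEDROCK_SHARE_OF_AI)
maas_profit_bn = implied_operating_profit(maas_revenue_bn, MAAS_OPM)

print(f"Implied MaaS/TaaS revenue: ${maas_revenue_bn:.2f} bn")
print(f"Implied MaaS/TaaS operating profit: ${maas_profit_bn:.2f} bn")
```

The product comes out at about $5.55 bn, in line with the ~$5.5 bn cited for AWS.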

II. Deep ties with AI labs are a key edge

From the compute pillar, we see AWS’s AI composition and why MaaS/TaaS is becoming the main thrust. The crux of MaaS/TaaS competitiveness is twofold.

First, model depth and breadth on the platform — do you have current SOTA models, and enough variety across types and tiers. Second, relative cost and pricing for similar models vs. peer clouds — driven by in-house chips and engineering to lower unit compute cost.

The first maps to in-house model R&D and strong third-party model partnerships. The second maps to chip design and systems engineering to push down cost per token.

Amazon has emphasized a platform approach and underinvested in frontier models, with Nova sitting roughly between Haiku 4.5 and Sonnet 4.5 in capability. It therefore relies heavily on external AI labs to bolster model supply.

In fact, most Bedrock API/token sales are based on third-party models, with Claude as the main driver today. Now that Amazon has struck a deep collaboration with OpenAI, GPT API/token volume on Bedrock will likely rise meaningfully too.

So let’s unpack the Amazon–Anthropic partnership in detail.

1) The Anthropic partnership

Amazon’s first deep AI lab partner was Anthropic. Claude landed on Bedrock in Apr 2023, and the formal tie-up came in Sept 2023, which then evolved across three phases.

a. Sept 2023: In the initial phase, Amazon committed up to $4 bn (funded across three tranches by May 2024). In return, Anthropic named AWS its primary cloud provider, would increasingly use Trainium/Inferentia for training and inference (reportedly shifting from TPU and Nvidia GPUs), and made Claude broadly available on Bedrock.

b. Nov 2024: Amazon added another $4 bn, totaling $8 bn, and the partnership deepened into co-design across chips and models. On hardware, Anthropic began working directly with Annapurna Labs (Amazon’s chip design arm) on Trainium. On software, Claude’s kernels were optimized for Trainium and its instruction set.

During this phase, the two also unveiled Project Rainier — a mega-scale compute campus centered on Trn chips. This will be detailed later.

c. Apr 2026: Amazon invested another $5 bn, totaling $13 bn, and holds rights to invest up to $20 bn more.

Anthropic committed to spend $100 bn over 10 years on AWS and use 5 GW of Trainium capacity, including Trn2 in production and future Trn3 & Trn4.

Notably, 1 GW of compute generally maps to slightly over $10 bn in annual revenue today. The implied revenue per GW under this 5 GW commitment is much lower than that heuristic.

Part of the gap is timing: capacity ramps over time and won’t hit 5 GW on day one. Anthropic also noted it may use non-Trainium compute outside AWS, but the deal still suggests a meaningfully lower all-in cost for Trn vs. Nvidia GPU-based stacks.
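
The gap to the ~$10 bn per GW heuristic can be made concrete with a small calculation. This is a rough sketch using the commitment figures quoted above. The `avg_capacity_fraction` parameter and its 0.5 value are hypothetical, included only to show how a gradual ramp narrows the gap; the deal's actual ramp schedule is not disclosed in the text.

```python
HEURISTIC_REVENUE_PER_GW_BN = 10.0  # rule of thumb quoted in the text


def revenue_per_gw_year(total_spend_bn: float, years: int, capacity_gw: float,
                        avg_capacity_fraction: float = 1.0) -> float:
    """Average annual spend per GW actually in use.

    avg_capacity_fraction is the (assumed) average share of the committed
    capacity that is live over the contract term; 1.0 means all capacity
    is available from day one.
    """
    avg_gw = capacity_gw * avg_capacity_fraction
    return total_spend_bn / years / avg_gw


# Commitment as quoted: $100 bn over 10 years for 5 GW.
flat = revenue_per_gw_year(100.0, 10, 5.0)
# Hypothetical: on average only half of the 5 GW is live over the term.
ramped = revenue_per_gw_year(100.0, 10, 5.0, avg_capacity_fraction=0.5)

print(f"Flat usage:   ${flat:.1f} bn per GW-year "
      f"({flat / HEURISTIC_REVENUE_PER_GW_BN:.0%} of heuristic)")
print(f"Ramped usage: ${ramped:.1f} bn per GW-year "
      f"({ramped / HEURISTIC_REVENUE_PER_GW_BN:.0%} of heuristic)")
```

Even under the hypothetical half-capacity ramp, the implied figure stays well below the heuristic, which is the point the paragraph above makes.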

2) How much revenue does Anthropic drive for AWS?

As Amazon’s most important AI partner, Anthropic contributes in two ways. The larger piece is Anthropic’s own training/inference spend on AWS, and the other is Bedrock’s distribution commissions from Claude API/token sales.

Based on press data, Anthropic’s compute spend was about 1%, 3%, and 8–9% of AWS revenue for 2024, 2025, and Q1 2026, respectively. While modest as a share of total AWS revenue, in Q1 2026 Anthropic likely accounted directly for 80%+ of AWS AI revenue.

For Bedrock’s distribution of Claude, total sales are booked by Anthropic with Bedrock as a channel. AWS recognizes only the commission on that gross, which is smaller in absolute dollars but very high margin given minimal variable cost.

Overall, the vast majority of AWS AI revenue today is directly or indirectly driven by Anthropic. Anthropic’s ARR trajectory therefore provides a strong read-through for AWS AI growth acceleration.

(Note: when customers buy Claude API/tokens on Bedrock, the underlying compute is very likely AWS. That hardware rental is captured as Anthropic’s inference spend within AWS IaaS.)

There is currently no Microsoft–OpenAI-style revenue sharing tied to equity between Amazon and Anthropic. The structure is more straightforwardly commercial.

Bottom line, most AWS AI revenue today is Anthropic-driven, so Anthropic’s ARR momentum is a key lead indicator for AWS. This linkage is unusually tight for now.

As a simple exercise, Anthropic’s ARR growth peaked in Mar–Apr, and MoM growth slowed in May. As of May, ARR was about $45 bn; assuming monthly net adds trend down conservatively, ARR could reach just over $70 bn by end-2026.

Under a simplified assumption that all AWS AI revenue is Anthropic-driven, and non-AI AWS grows ~16% in 2026 (vs. ~14.4% last year), Anthropic could represent ~19% of total AWS revenue in 2026. That would lift AWS’s 2026 revenue growth to 35%+, vs. ~28% in Q1, broadly in line with our prior model.
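
The arithmetic behind this exercise can be sketched as follows. This is not a reconstruction of the authors' model, whose inputs are not disclosed. It takes the ~16% non-AI growth and ~19% AI share from the paragraph above and assumes a 2025 AI share of ~3%, borrowed from the Anthropic compute-spend share quoted earlier under the same "all AI revenue is Anthropic-driven" simplification. All names are illustrative.

```python
def implied_total_growth(ai_share_prev: float, ai_share_next: float,
                         non_ai_growth: float) -> float:
    """Total revenue growth implied by the AI mix in two years.

    With prior-year total revenue normalized to 1:
      non_ai_prev = 1 - ai_share_prev
      non_ai_next = non_ai_prev * (1 + non_ai_growth)
      total_next  = non_ai_next / (1 - ai_share_next)
    """
    non_ai_next = (1.0 - ai_share_prev) * (1.0 + non_ai_growth)
    total_next = non_ai_next / (1.0 - ai_share_next)
    return total_next - 1.0


AI_SHARE_2025 = 0.03       # assumption: Anthropic's ~3% share of 2025 AWS revenue
AI_SHARE_2026 = 0.19       # from the text
NON_AI_GROWTH_2026 = 0.16  # from the text

growth_2026 = implied_total_growth(AI_SHARE_2025, AI_SHARE_2026,
                                   NON_AI_GROWTH_2026)
print(f"Implied 2026 AWS revenue growth: {growth_2026:.1%}")
```

Under these inputs the implied growth lands in the high 30s in percent, consistent with the "35%+" figure. The result is sensitive to the assumed 2025 AI share.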

Microsoft once surged on tight OpenAI ties, then faded as that relationship loosened. The fortunes of model makers and their distributors can be highly correlated.

Will Amazon and Anthropic replay Microsoft’s arc? We do not think so.

Both CSPs were early investors in top labs, with similar dollar checks. But there are key differences.

a. MSFT–OpenAI are more deeply equity-bound: Amazon is thought to own only a single-digit % stake in Anthropic, and has no seat on its governing board.

Thus, the MSFT–OpenAI collaboration rests more on equity ties, visible in MSFT’s revenue share from OpenAI and its prior exclusive API distribution rights. Amazon, by contrast, benefits via commercial terms with Anthropic, not revenue-sharing.

b. Open collaboration, deep technical lock-in: Amazon and Anthropic are primarily bound by technology. Anthropic trains heavily on Trainium-class chips, and its core code is co-optimized with Amazon’s ASICs and toolchains.

Anthropic cannot ‘lift-and-shift’ away without pain; migration costs would be high. Meanwhile, Microsoft’s in-house chip efforts lag and GPT leans on Nvidia’s ecosystem, so OpenAI depends less on MSFT specifically.

3) What Project Rainier tells us

Project Rainier is a mega-scale compute campus built on Amazon’s Trainium to meet Anthropic’s training/inference needs. Two campuses have been announced — New Carlisle and Northern Indiana — with disclosed plans as follows.

a. New Carlisle is the first site, with $11 bn planned capex and 2.2–2.3 GW capacity. Construction began in Sept 2024, with first production in late Oct 2025 (about 500k Trn2 in the initial wave). Per Wells Fargo, Phase 1 reached full availability in early 2026 at ~1.3 GW, implying ~1.7 mn Trn2.

With Trainium 3 mass production planned for mid-2026, Rainier will deploy both Trn2 and Trn3. Another ~0.9–1.0 GW is expected to be added, much of it likely completed during 2026.

b. Northern Indiana was announced in late 2025 with ~2.4 GW capacity and $15 bn planned capex. Public details are limited, but reports indicate construction kicked off in May.

From these data points, several takeaways stand out. First, the combined announced Rainier capacity is ~4.6–4.7 GW, largely Trainium-based, aligning closely with Anthropic’s Apr commitment to use 5 GW of Trainium.

Thus, Rainier’s build cadence is a barometer for the Amazon–Anthropic relationship. Progress here is a leading signal.

Second, Amazon appears to need ~15–16 months to go from zero to a 1+ GW campus, roughly in line with Oracle’s pace (Abilene Phase 2 at ~1 GW in ~15 months).

Third, based on disclosed capex, Rainier’s unit capex is ~$5–6 bn per GW, far below Nvidia’s oft-cited $50 bn per GW framework. The scope of the $11 bn and $15 bn is unclear — whether it covers just datacenter shells and base infra vs. chips/servers too.

So we cannot simply conclude Trainium’s all-in per-GW build cost is 1/10th of Nvidia GPU stacks. But it is reasonable to infer Trainium’s per-GW all-in build cost is materially lower than Nvidia-based systems.

Fourth, Anthropic’s $100 bn/10-yr spend for 5 GW implies per-GW annual revenue well below the ~$10 bn/GW industry heuristic. One external estimate (as of late 2025) pegs New Carlisle’s 2.2 GW at ~$14 bn/year revenue after discounts, implying ~60–65% of the industry’s per-GW revenue norm.

Together, these suggest Trainium lowers build cost for operators and usage cost for customers vs. Nvidia-based stacks. That said, the absolute performance of Trn2 and Trn3 trails: Trn2 is ~60% of H200, and Trn3 is only slightly ahead of H200. Lower pricing is therefore logical.
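
The per-GW figures in the third and fourth takeaways follow from simple division. The sketch below reproduces them from the disclosed capex, capacity, and the external revenue estimate quoted above. The helper name is illustrative, and the unclear scope of the capex numbers (shells only vs. chips and servers) applies equally here.

```python
HEURISTIC_REVENUE_PER_GW_BN = 10.0  # industry rule of thumb quoted in the text


def per_gw_range(amount_bn: float, gw_low: float, gw_high: float):
    """Return (low, high) amount per GW for a capacity range.

    The larger capacity figure gives the lower per-GW value.
    """
    return amount_bn / gw_high, amount_bn / gw_low


# Disclosed plans: New Carlisle $11 bn for 2.2-2.3 GW; Northern Indiana $15 bn for ~2.4 GW.
new_carlisle_capex = per_gw_range(11.0, 2.2, 2.3)
northern_indiana_capex = per_gw_range(15.0, 2.4, 2.4)
combined_capex = per_gw_range(11.0 + 15.0, 2.2 + 2.4, 2.3 + 2.4)

# External estimate: ~$14 bn/year revenue on New Carlisle's 2.2 GW.
new_carlisle_rev_per_gw = 14.0 / 2.2
share_of_norm = new_carlisle_rev_per_gw / HEURISTIC_REVENUE_PER_GW_BN

print(f"New Carlisle capex/GW:     ${new_carlisle_capex[0]:.2f}-{new_carlisle_capex[1]:.2f} bn")
print(f"Northern Indiana capex/GW: ${northern_indiana_capex[0]:.2f} bn")
print(f"Combined capex/GW:         ${combined_capex[0]:.2f}-{combined_capex[1]:.2f} bn")
print(f"New Carlisle revenue/GW:   ${new_carlisle_rev_per_gw:.2f} bn "
      f"({share_of_norm:.0%} of heuristic)")
```

The outputs fall in the ~$5–6 bn per GW capex band and the ~60–65% revenue ratio cited above.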

III. How strong are Amazon’s ASICs?

Compute, chips, and models are the three pillars of AI capability. Amazon’s in-house chips are key to binding Anthropic and establishing a cost edge in cloud, so we review its chip roadmap.

2.1 Amazon’s in-house chip lines and timeline

Amazon’s chip story began with the 2015 acquisition of Annapurna Labs. It now runs four parallel tracks — Nitro (control/storage), Graviton (ARM-based general-purpose CPU), Inferentia (inference ASIC), and Trainium (training & inference ASIC) as follows.

1) Nitro / Nitro SSD: Amazon’s first hardware line launched in 2017. Nitro is not customer-rented compute but dedicated control-plane hardware handling virtualization, network, storage, scheduling, and security to offload overhead and improve efficiency, thereby lowering AWS cost structure.

2) Graviton: ARM-based general-purpose CPUs, with Gen 1 in 2018. Early generations competed on lower price and power draw, handling lighter workloads cost-effectively.

After multiple iterations, Gen 5 (launched late 2025, mass-scale in 2026) is no longer far behind contemporary mainstream x86 parts. Graviton is now widely available to customers and is likely the most deployed Amazon chip today.

3) Inferentia: Initially a dedicated inference ASIC for traditional ML, announced in late 2018, powering search ranking, personalization, and image/speech tasks.

It later targeted LLM inference, but rising inference demands and Trainium’s ability to handle inference mean Inferentia has been partially displaced. Gen 2 launched in late 2022, and there has been no new release since.

4) Trainium: Amazon’s key line in the LLM era. Gen 1 was announced in 2020 (deployed in 2022), originally focused on training traditional ML.

With GenAI becoming mainstream, the line has gone through three generations in five years (Gen 4 is in development) and evolved into a training-and-inference workhorse positioned against Nvidia GPUs and Google TPUs.

In short, Nitro is internal, and Inferentia’s role has narrowed. The focus areas are Trainium and Graviton, which we assess next.

2.2 Performance comparisons

The strategic rationale is to reduce dependency on external hardware and drive better perf/watt and perf/$ via hardware–software co-design, ultimately expanding cloud margins. The best comparison is system-level, not chip-only, so we reference AWS instance data.

1) Graviton delivers strong value

On paper, the table compares Graviton generations with rival CPUs in equivalent AWS instances. Max CPU count indicates parallelism, network bandwidth captures external I/O, and EBS bandwidth reflects storage I/O.

Using ‘M’ (general-purpose) instances at max CPU count, Graviton 5 instance specs are broadly comparable to the latest Intel Xeon 6 and AMD EPYC 5th-Gen already in market. However, Graviton 5 hit GA later (late 2025/2026), so it is roughly one generation behind AMD/Intel in cadence.

Notably, Graviton instances often have stronger I/O. For I/O-optimized variants, Graviton 4 can hit 600 Gbps network and 300 Gbps EBS, while comparable EPYC Gen 4 tops out at 300/50 Gbps (Intel is lower).

On realized performance, Graviton 5 lacks broad public benchmarks, but OpenBenchmarking shows Graviton 4 vs. AMD EPYC 9R14 and Intel Xeon 8488C across common tests. Graviton 4’s average score is similar to Xeon 8488C and ~80% of EPYC 9R14.

At then-current instance pricing (late 2024), perf per $ put Graviton 4 first, ~3.4% above EPYC and ~18% above Xeon. This indicates that despite lower absolute peak vs. AMD’s best, Graviton delivers superior ROI vs. prior-gen flagships in AWS, appealing to value-focused workloads.
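
The two ratios above (relative performance and relative performance per dollar) also imply a relative instance price, which the text does not state directly. The sketch below backs it out. It treats "similar to Xeon 8488C" as exactly 1.0, which is an assumption, and the names are illustrative rather than taken from any benchmark tool.

```python
def implied_price_ratio(perf_ratio: float, perf_per_dollar_ratio: float) -> float:
    """Graviton price relative to a rival instance.

    perf_ratio            = Graviton score / rival score
    perf_per_dollar_ratio = Graviton perf-per-$ / rival perf-per-$
    Since perf-per-$ = perf / price, price_ratio = perf_ratio / perf_per_dollar_ratio.
    """
    return perf_ratio / perf_per_dollar_ratio


# Graviton 4 vs. EPYC 9R14: ~80% of the score, ~3.4% better perf per dollar.
price_vs_epyc = implied_price_ratio(0.80, 1.034)
# Graviton 4 vs. Xeon 8488C: similar score (assumed 1.0), ~18% better perf per dollar.
price_vs_xeon = implied_price_ratio(1.00, 1.18)

print(f"Implied Graviton 4 price vs. EPYC instance: {price_vs_epyc:.1%}")
print(f"Implied Graviton 4 price vs. Xeon instance: {price_vs_xeon:.1%}")
```

Under these inputs, the Graviton 4 instance would be priced roughly a fifth to a quarter below the EPYC instance and about 15% below the Xeon instance, which is where the value advantage comes from despite the lower peak score.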

2) Trainium 4’s potential

Trainium is offered mainly as large UltraServers/clusters for big customers, so public benchmarks, especially for Gen 3/4, are sparse. We rely on disclosed specs and targets.

At the single-chip level, Trn3’s theoretical throughput (compute rate and memory bandwidth) just edges past Nvidia H200 (launched late 2023), but is well below Google TPU v7 (mid-2025). On FP8, Trn3 is roughly 55% of TPU v7.

Trn2, now widely deployed, delivers only ~50–60% of Trn3 on paper, so it is not yet truly competitive for flagship training. It fits inference or smaller-scale training better.

By contrast, Amazon’s stated targets for Trainium 4 would surpass TPU v8 and Nvidia B300, trailing only Rubin-based R100 in FP4. If achieved, Trn4 would leap to a leadership class, potentially winning flagship training/inference from top labs and accelerating AWS growth.

To be clear, these are targets on paper. Trn4 has no firm tape-out date yet.

IV. In-house models: early days, large gap to close

Finally, we review Amazon’s weakest pillar — models. Amazon has launched the Nova family, but it trails SOTA by 1–2 major versions. Key observations follow.

a. Late start, slower cadence: Nova Gen 1 arrived in Dec 2024, with Nova 2 only by late 2025. Amazon’s model effort started late and iterates more slowly.

b. Multimodal track: Nova’s strategy favors breadth over specialization. Beyond the main line, Omni handles text, image, and video, while Sonic supports speech, reflecting a multi-version, multimodal approach.

c. Weaker absolute performance, faster response: Based on Amazon’s published MMLU-Pro and other benchmark results, Nova 2 Lite is roughly on par with Gemini 2.5 Flash or Haiku 4.5, while Nova 2 Pro trades blows with Gemini 2.5 Pro or Sonnet 4.5.

So Nova 2’s top model is only comparable to the prior-gen mid-tier from leading labs. That said, it performs better on OCR and RealKIE for image and structured document recognition.

Amazon also emphasizes Nova’s responsiveness — faster time-to-first-token and higher tokens/sec. Nova 2 Lite and Pro both outpace peers on these latency metrics.

d. Given Nova’s current multimodal stance and capability, Nova is best suited to internal enterprise productivity — rapid processing of simpler, repetitive tasks like contracts/invoices, e-comm image search, and AI customer support.

In other words, Nova likely contributes more to cost leverage than to revenue growth for now. The commercial pull will come more from third-party SOTA on Bedrock.

Summary: Amazon shows signs of moving from laggard back to leader in cloud AI. Its lead in MaaS is the standout, and its chip lineup is comprehensive: still behind Google TPU on current-gen specs, but closing fast.

On models, the gap remains, but tight collaboration with Anthropic is a viable bridge in the medium term. Amazon’s AI stack is far from weak overall and is improving.

Dolphin Research will continue to map other clouds’ AI strategies and capabilities, and will ultimately offer an industry view with stock preferences. Stay tuned.

<End>
