Amazon bets $125B on custom AI chips to cut AWS costs, defend cloud lead

AWS is pushing Trainium3 chips to slash AI inference costs as Microsoft and Google gain ground. The strategy: custom silicon beats Nvidia's premium pricing. Quarterly results Thursday will show if it's working.

Amazon's cloud future hinges on whether its custom AI chips can deliver the compute Nvidia charges premium prices for, at a fraction of the cost.

AWS unveiled Trainium3 in December, claiming 4x training performance over the previous generation. The goal is straightforward: reduce reliance on Nvidia's expensive GPUs while offering customers lower AI workload costs. It's the same playbook Microsoft (Maia 200) and Google (TPUs for Gemini 3) are running.

The stakes show up in the growth numbers. AWS is projected to grow 19.1% this year to $177.78 billion; Azure is tracking 26.1% growth to $120.85 billion; Google Cloud, 32% to $57.19 billion. AWS hit 20.2% growth in Q3, its first quarter above 20% since 2022. Amazon's stock jumped 9.6% on that news, then gave back 8.5% as investors waited for confirmation.

According to AWS VP David Brown, price performance is now the metric customers care about most. "If they can find a chip that allows them to get more performance for fewer dollars, that's a strategic advantage," he told CNBC. Translation: enterprises are optimizing AI spend, and AWS needs to be the cheapest option that works.
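
To make that metric concrete: price performance just means useful work per dollar. Here's a minimal back-of-envelope sketch in Python; every price and throughput number in it is a hypothetical placeholder, not published AWS or Nvidia pricing, since the point is the arithmetic rather than the figures.

```python
# Back-of-envelope price-performance comparison.
# All prices and throughputs are HYPOTHETICAL placeholders,
# not real AWS or Nvidia figures.

instances = {
    # name: (on-demand $/hour, inferences/second) -- both made up
    "gpu-instance": (4.10, 1800),
    "custom-asic":  (1.30, 1100),
}

for name, (dollars_per_hour, inferences_per_sec) in instances.items():
    inferences_per_hour = inferences_per_sec * 3600
    cost_per_million = dollars_per_hour / (inferences_per_hour / 1_000_000)
    print(f"{name}: ${cost_per_million:.2f} per million inferences")
```

In these made-up numbers the ASIC is slower in absolute terms but roughly half the cost per inference, which is exactly the trade Brown is describing: raw speed matters less than dollars per unit of work.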

The custom chip push started with AWS acquiring Annapurna Labs in 2015. Since then: Graviton CPUs, Trainium for training, Inferentia for inference. All designed to run specific workloads more efficiently than general-purpose Nvidia hardware.

Brad Gastwirth, formerly Wedbush's chief technology strategist, puts it plainly: "Nvidia is charging astronomical numbers. If you build something specific for your needs, you can save a tremendous amount of money."

Nvidia CEO Jensen Huang isn't concerned, telling Jim Cramer that Nvidia's versatility addresses "markets much broader than chatbots." He's right that ASICs trade flexibility for efficiency. The question for AWS: is efficiency enough when you're spending $125 billion annually on AI infrastructure?

Q4 results Thursday will show whether the Trainium bet is translating into revenue reacceleration. For CTOs evaluating cloud spend, the takeaway is simple: custom silicon is making AI inference materially cheaper. The hyperscalers building it fastest will win enterprise workloads.

Worth noting: this isn't just about training costs. SageMaker users should be evaluating Inferentia instances against GPU pricing for production inference workloads. The gap is significant, particularly for models that don't require Nvidia's full feature set.
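
A starting point for that evaluation, sketched below, pulls on-demand hourly rates from the AWS Price List API via boto3. The instance types shown (inf2.xlarge for Inferentia2, g5.xlarge for an Nvidia A10G GPU) are illustrative picks, not recommendations, and an hourly rate only becomes a decision metric once you divide it by your own model's measured throughput on each chip.

```python
import json
import boto3

# The Price List API is served from only a few regions; us-east-1 works.
pricing = boto3.client("pricing", region_name="us-east-1")

def on_demand_price(instance_type: str,
                    location: str = "US East (N. Virginia)") -> float:
    """Return the on-demand Linux $/hour for an EC2 instance type."""
    resp = pricing.get_products(
        ServiceCode="AmazonEC2",
        Filters=[
            {"Type": "TERM_MATCH", "Field": "instanceType", "Value": instance_type},
            {"Type": "TERM_MATCH", "Field": "location", "Value": location},
            {"Type": "TERM_MATCH", "Field": "operatingSystem", "Value": "Linux"},
            {"Type": "TERM_MATCH", "Field": "tenancy", "Value": "Shared"},
            {"Type": "TERM_MATCH", "Field": "preInstalledSw", "Value": "NA"},
            {"Type": "TERM_MATCH", "Field": "capacitystatus", "Value": "Used"},
        ],
        MaxResults=1,
    )
    product = json.loads(resp["PriceList"][0])
    # Drill into the nested on-demand term to reach the USD rate.
    term = next(iter(product["terms"]["OnDemand"].values()))
    dimension = next(iter(term["priceDimensions"].values()))
    return float(dimension["pricePerUnit"]["USD"])

# Example instance types -- swap in whatever you're actually benchmarking.
for itype in ("inf2.xlarge", "g5.xlarge"):
    print(f"{itype}: ${on_demand_price(itype):.4f}/hour")
    # Raw $/hour isn't the answer; divide by measured throughput for
    # your model on each instance to get cost per inference.
```

Pair the prices with a short load test of your actual model on each instance type; the cheaper hourly rate doesn't always win once throughput is factored in.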