What is Google's Ironwood AI accelerator chip?
Big Tech's in-house chips are coming to fruition for the age of inference and the era of agents. As their silicon matures, the major clouds are touting their own chips for a new era of functional, useful AI.
Hey Everyone,
Amazon has more or less successfully developed and deployed its own custom AI chips, including Trainium for training large models and Inferentia for inference, to reduce reliance on Nvidia GPUs and gain cost advantages in the cloud. Trainium 2 certainly has some potential.
So what about Google’s Ironwood? Ironwood is Google’s first chip built for the age of inference, announced on April 9th, 2025.
“Age of Inference”
It’s Google’s seventh generation of its custom TPU architecture. The chip, known as Ironwood, was reportedly designed for the emerging needs of Google's most powerful Gemini models, like simulated reasoning, which Google prefers to call "thinking." The company claims this chip represents a major shift that will unlock more powerful agentic AI capabilities. Google calls this the "age of inference."
At Google Cloud Next they announced some interesting things:
Scheduled to launch sometime later this year for Google Cloud customers, Ironwood will come in two configurations: a 256-chip cluster and a 9,216-chip cluster.
Google claims its Cloud is the only AI-optimized platform. I get the sales pitch: Google says Ironwood is the most powerful TPU accelerator the company has ever built, and it can scale to a megacluster of 9,216 liquid-cooled chips linked together by its advanced Inter-Chip Interconnect technology. In this way, users can combine the power of many thousands of Ironwood TPUs to tackle the most demanding AI workloads. Fair enough.
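To put those two pod sizes in perspective, here is a back-of-the-envelope sketch of aggregate peak compute. The per-chip FP8 figure (~4,614 TFLOPs) comes from Google's Ironwood announcement; treat it as an assumption from marketing materials, not an independently measured number.

```python
# Back-of-the-envelope aggregate peak compute for the two announced
# Ironwood pod configurations (256 chips and 9,216 chips).
# PER_CHIP_TFLOPS is Google's announced FP8 figure (an assumption here).
PER_CHIP_TFLOPS = 4614

pods = {"256-chip cluster": 256, "9,216-chip cluster": 9216}

for name, chips in pods.items():
    # 1 exaFLOP = 1,000,000 TFLOPs
    exaflops = chips * PER_CHIP_TFLOPS / 1_000_000
    print(f"{name}: ~{exaflops:.2f} exaFLOPs peak")
```

At the announced per-chip number, the full 9,216-chip pod works out to roughly 42.5 exaFLOPs of peak FP8 compute, which matches the headline figure Google cited at Cloud Next.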