Google unveils TPU v5p pods to accelerate AI training

Need a lot of compute? How does 8,960 TPUs sound?

Google has revealed a performance-optimized version of its Tensor Processing Unit (TPU), dubbed the v5p, designed to cut the time it takes to train large language models.

The chip builds on the TPU v5e, announced earlier this year. But while that part was billed as Google's most "cost-efficient" AI accelerator, the TPU v5p is designed to push more FLOPS and scale to even larger clusters.

Google has relied on its custom TPUs, which are essentially just big matrix math accelerators, for several years now to power the growing number of machine learning features baked into its web products like Gmail, Google Maps, and YouTube. More recently, however, Google has started opening its TPUs up to the public to run AI training and inference jobs.
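For a sense of what that matrix math looks like in practice, here's a minimal sketch in JAX, one of the frameworks Google supports on TPUs. The matrix sizes are illustrative; the point is that the same code compiles onto whatever accelerator is available, TPU included.

```python
# Minimal sketch of the dense matrix math TPUs are built for, in JAX.
# Runs on CPU or GPU too; on a Cloud TPU VM, jax.devices() would list
# TPU devices and XLA would compile the matmul onto the chip's matrix units.
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
a = jax.random.normal(key, (4096, 4096), dtype=jnp.bfloat16)
b = jax.random.normal(key, (4096, 4096), dtype=jnp.bfloat16)

# jit compiles the function once via XLA, then reuses the compiled kernel.
matmul = jax.jit(lambda x, y: x @ y)
c = matmul(a, b)
print(c.shape, c.dtype)  # (4096, 4096) bfloat16
```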

According to Google, the TPU v5p is its most powerful yet, capable of pushing 459 teraFLOPS of bfloat16 performance or 918 teraOPS of Int8. That's backed by 95 GB of high-bandwidth memory capable of shuttling data at 2.76 TB/s.
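A quick back-of-envelope calculation, using only the figures above, shows why that bandwidth number matters as much as the headline FLOPS:

```python
# Back-of-envelope math from the quoted v5p specs; nothing here beyond
# the numbers cited in the article.
PEAK_BF16_FLOPS = 459e12   # 459 teraFLOPS, bfloat16
PEAK_INT8_OPS   = 918e12   # 918 teraOPS, Int8 (exactly 2x the bf16 rate)
HBM_BANDWIDTH   = 2.76e12  # 2.76 TB/s of HBM bandwidth

# Arithmetic intensity needed to stay compute-bound rather than memory-bound:
flops_per_byte = PEAK_BF16_FLOPS / HBM_BANDWIDTH
print(f"~{flops_per_byte:.0f} bf16 FLOPs per HBM byte to saturate the chip")
# ~166 FLOPs/byte: workloads below that ratio are bandwidth-limited, which
# is why dense matmuls, with heavy reuse per byte fetched, suit TPUs so well.
```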

As many as 8,960 v5p accelerators can be coupled together in a single pod using Google's 600 GB/s inter-chip interconnect, allowing customers to train models faster or at greater precision. For reference, that's 35x more chips than a TPU v5e pod and more than twice as many as a TPU v4 pod.
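In JAX terms, spreading a job across a pod's chips looks roughly like the sketch below. This is a simplified, data-parallel illustration; the mesh simply spans whatever devices are visible, whereas a real v5p pod would expose thousands of them.

```python
# Illustrative sketch: data-parallel sharding across TPU chips with JAX.
# A full v5p pod exposes up to 8,960 devices; this mesh just uses whatever
# the runtime can see, so it also runs (trivially) on a single CPU.
import jax
import jax.numpy as jnp
import numpy as np
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = np.array(jax.devices())           # every chip JAX can see
mesh = Mesh(devices, axis_names=("data",))  # 1-D mesh over all devices

# Shard the batch dimension across chips; each device holds one slice, and
# gradient reductions would travel over the inter-chip interconnect.
batch = jnp.ones((len(devices) * 8, 1024))
sharded = jax.device_put(batch, NamedSharding(mesh, P("data", None)))
print(sharded.sharding)
```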

Google claims the new accelerator can train popular large language models, such as OpenAI's 175-billion-parameter GPT-3, 1.9x faster than its older TPU v4 parts using BF16, and up to 2.8x faster if you're willing to drop floating point for 8-bit integer calculations.
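The trade-off behind that 2.8x figure is precision for throughput. A minimal sketch of what the two paths look like in JAX, with a naive symmetric quantization scheme chosen purely for illustration:

```python
# Sketch of the precision trade-off: the same matmul in bf16 vs. Int8.
# The quantization scheme here (simple symmetric scaling) is illustrative,
# not how any particular training stack does it.
import jax
import jax.numpy as jnp

x = jax.random.normal(jax.random.PRNGKey(0), (1024, 1024))

# bfloat16 path: keeps float dynamic range; peaks at 459 TFLOPS on v5p.
y_bf16 = jnp.dot(x.astype(jnp.bfloat16), x.astype(jnp.bfloat16))

# Int8 path: quantize inputs, accumulate in int32; peaks at 918 TOPS.
scale = 127.0 / jnp.max(jnp.abs(x))
x_q = jnp.round(x * scale).astype(jnp.int8)
y_int8 = jax.lax.dot(x_q, x_q, preferred_element_type=jnp.int32)
```

Doubling throughput by halving the bits only pays off if the model tolerates the reduced dynamic range, which is why the BF16 and Int8 speedups are quoted separately.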

This greater performance and scalability does come at a cost. Each TPU v5p accelerator will run you $4.20 an hour, compared to $3.22 an hour for TPU v4 or $1.20 an hour for TPU v5e. So if you're not in a rush to train or refine your model, Google's efficiency-focused v5e chips still offer better bang for your buck.
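The hourly rates only tell half the story, though. Using just the figures quoted above, a rough perf-per-dollar comparison against TPU v4 works out in the v5p's favor:

```python
# Rough perf-per-dollar check using only the figures in this article:
# list prices plus Google's claimed 2.8x Int8 training speedup over TPU v4.
V5P_RATE, V4_RATE = 4.20, 3.22   # $ per chip-hour
SPEEDUP_V5P_VS_V4 = 2.8          # Google's claim, Int8

price_ratio = V5P_RATE / V4_RATE             # ~1.30x more per hour
value_ratio = SPEEDUP_V5P_VS_V4 / price_ratio
print(f"v5p costs {price_ratio:.2f}x more per hour, "
      f"but delivers ~{value_ratio:.2f}x more work per dollar than v4")
# If a job really finishes 2.8x sooner, the higher hourly rate still wins.
# The v5e comparison depends on workload throughput Google didn't quote here.
```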

Along with the new hardware, Google has introduced the concept of an "AI hypercomputer." The cloud provider describes it as a supercomputing architecture built from a tightly integrated system of hardware, software, ML frameworks, and consumption models.

"Traditional methods often tackle demanding AI workloads through piecemeal, component-level enhancements, which can lead to inefficiencies and bottlenecks," Mark Lohmeyer, VP of Google's compute and ML infrastructure division, explained in a blog post Wednesday. "In contrast, AI hypercomputer employs system-level codesign to boost efficiency and productivity across AI training, tuning, and serving."

In other words, a hypercomputer is a system in which every variable, hardware or software, that could introduce performance inefficiencies is controlled and optimized.

Google's new hardware and AI supercomputing architecture debuted alongside Gemini, a multi-modal large language model capable of handling text, images, video, audio, and code.
