
AMD's new AI roadmap spans new Instinct GPUs, networking, software, and rack architectures


Editor's take: In the ever-evolving world of GenAI, important advances are happening across chips, software, models, networking, and systems that combine all these elements. That's what makes it so hard to keep up with the latest AI developments. The difficulty factor becomes even greater if you're a vendor building these kinds of products and working not only to keep up, but to drive those advances forward. Toss in a competitor that's virtually cornered the market – and in the process, grown into one of the world's most valuable companies – and, well, things can appear pretty challenging.

That's the situation AMD found itself in as it entered its latest Advancing AI event. But rather than letting these potential roadblocks deter it, AMD made clear that it is inspired to expand its vision, its range of offerings, and the pace at which it delivers new products.

From unveiling its Instinct MI400 GPU accelerators and next-generation "Vulcano" networking chips to version 7 of its ROCm software and the debut of a new Helios rack architecture, AMD highlighted all the key aspects of AI infrastructure and GenAI-powered solutions. In fact, one of the first takeaways from the event was how far the company's reach now extends across all the critical parts of the AI ecosystem.

AMD Instinct roadmap

As expected, much of the focus was on the official launch of the Instinct MI350X and the higher-wattage, faster-performing MI355X GPUs, which AMD first announced last year. Both are built on a 3nm process, feature up to 288 GB of HBM3E memory, and are available in both liquid-cooled and air-cooled designs.

According to AMD's testing, these chips not only match the performance of Nvidia's Blackwell B200 but surpass it on certain benchmarks. In particular, AMD emphasized gains in inference speed (over 3x faster than the previous generation) and in cost per token (up to 40% more tokens per dollar than the B200, by AMD's measurements).

AMD also provided more details on its next-generation MI400, scheduled for release next year, and even teased the MI500 for 2027. The MI400 will offer up to 432 GB of HBM4 memory, 19.6 TB/sec of memory bandwidth, and 300 GB/sec of scale-out bandwidth – all of which will be important both for running larger models and for assembling the kinds of large rack-scale systems expected to be needed for next-generation LLMs.

Some of the more surprising announcements from the event focused on networking.

First was a discussion of AMD's next-generation Pensando networking silicon and a network interface card it calls the AMD Pensando Pollara 400 AI NIC, which the company claims is the industry's first shipping AI NIC. AMD is a member of the Ultra Ethernet Consortium and, not surprisingly, the Pollara 400 uses the Ultra Ethernet standard. It reportedly offers a 20% improvement in speed and 20x more scaling capacity than competitive cards built on InfiniBand technology.

As with its GPUs, AMD also previewed its next-generation networking chip, codenamed "Vulcano," designed for large AI clusters. When it is released in 2026, it will offer 800G network speeds and up to 8x the scale-out performance for large groups of GPUs.
