Tech News

Meta's new MTIA lineup joins hyperscalers' unified push for dedicated inference chips — companies diversify their AI silicon in an effort to reduce sole reliance on Nvidia

Why This Matters

Meta's introduction of the MTIA lineup signals a strategic move to diversify AI hardware options beyond Nvidia, emphasizing specialized inference chips designed around high memory bandwidth. The development reflects a broader industry push by hyperscalers to reduce reliance on general-purpose GPU acceleration, potentially reshaping the competitive landscape and accelerating innovation in AI infrastructure. For consumers and the tech industry, it points toward more tailored, efficient AI hardware that could improve performance and cost-effectiveness across a range of AI applications.

Key Takeaways

Meta announced four successive generations of its custom Meta Training and Inference Accelerator (MTIA) chips on March 11: the MTIA 300, 400, 450, and 500, all scheduled for deployment over the next two years. Meta described the chips as progressively optimized for AI inference workloads, on the premise that HBM bandwidth is the binding constraint on inference.

Coming two weeks after Meta disclosed a long-term AI infrastructure partnership with AMD, the announcement puts Meta alongside Google, AWS, and Microsoft, each of which has spent the last few years building and scaling custom silicon programs for AI workloads. Will this emerging class of chips put a dent in Nvidia's stranglehold on the AI chip industry?

An inference case against GPUs

In a technical blog post published alongside the announcement, Meta described HBM bandwidth as the most important factor affecting AI inference performance, arguing that mainstream chips built for large-scale pre-training are then applied less cost-effectively to inference workloads.
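A rough roofline estimate illustrates why memory bandwidth, rather than raw compute, tends to cap inference throughput: during autoregressive decoding, every model weight is typically read once per generated token, so token rate is bounded by bandwidth divided by model size. The model size and byte width below are hypothetical, chosen only to show the arithmetic; this is a minimal sketch, not a description of Meta's workloads:

```python
# Back-of-envelope roofline for why HBM bandwidth caps inference throughput.
# Assumes batch size 1 and ignores KV-cache and activation traffic, so every
# weight is streamed from HBM once per token. All model figures are illustrative.

def max_tokens_per_sec(params_billion: float, bytes_per_param: float,
                       hbm_bw_tbs: float) -> float:
    """Upper bound on decode tokens/s for a memory-bound model."""
    model_bytes = params_billion * 1e9 * bytes_per_param
    return hbm_bw_tbs * 1e12 / model_bytes

# A hypothetical 70B-parameter model served in 8-bit weights:
for chip, bw in [("MTIA 300", 6.1), ("MTIA 450", 18.4), ("MTIA 500", 27.6)]:
    print(f"{chip}: ~{max_tokens_per_sec(70, 1, bw):.0f} tokens/s ceiling")
```

Under these assumptions, bandwidth translates linearly into the decode-throughput ceiling, which is consistent with Meta's framing of HBM bandwidth as the binding constraint.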


“We doubled HBM bandwidth from MTIA 400 to 450, making it much higher than that of existing leading commercial products,” it reads. The MTIA 500 then raises HBM bandwidth by a further 50% over the MTIA 450. Both chips are optimized primarily for AI inference but can be applied to other workloads, including training as a secondary use case.

The MTIA 300 is already in production for ranking and recommendations training. Meanwhile, the MTIA 400 — which features a 72-accelerator scale-up domain — has completed lab testing and is on the path to data center deployment. The 450 and 500 are scheduled for mass deployment in early 2027 and later in 2027, respectively.

Across the full 300-to-500 progression, HBM bandwidth increases 4.5 times and peak compute increases 25 times. The MTIA 450's HBM bandwidth already exceeds that of existing leading commercial products; the MTIA 500 adds another 50% on top, along with up to 80% more HBM capacity.

According to Meta, the chips use a modular chiplet architecture that allows the MTIA 400, 450, and 500 to share the same chassis, rack, and network infrastructure. That compatibility means each new chip generation drops into the existing physical footprint without requiring new data center buildouts — the mechanism Meta cited for its roughly six-month development cadence, well faster than the industry's typical one-to-two-year cycle. “More importantly, we have deployed hundreds of thousands of MTIA chips in production, onboarded numerous internal production models, and tested MTIA with large language models (LLMs) like Llama,” the post adds.

MTIA chips

| | MTIA 300 | MTIA 400 | MTIA 450 | MTIA 500 |
|---|---|---|---|---|
| Workload Focus | R&R Training | General | AI Inference | AI Inference |
| Module TDP | 800 W | 1,200 W | 1,400 W | 1,700 W |
| HBM Bandwidth | 6.1 TB/s | 9.2 TB/s | 18.4 TB/s | 27.6 TB/s |
| HBM Capacity | 216 GB | 288 GB | 288 GB | 384-512 GB |
| MX4 Performance | - | 12 PFLOPS | 21 PFLOPS | 30 PFLOPS |
| FP8/MX8 Performance | 1.2 PFLOPS | 6 PFLOPS | 7 PFLOPS | 10 PFLOPS |
| BF16 Performance | 0.6 PFLOPS | 3 PFLOPS | 3.5 PFLOPS | 5 PFLOPS |
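The announced specs let us sanity-check the headline scaling claims. Which precision pair Meta used for the 25x compute figure is an assumption on our part; comparing peak low-precision throughput (the MTIA 300's 1.2 PFLOPS FP8 against the MTIA 500's 30 PFLOPS MX4) matches the stated multiple, as a quick check shows:

```python
# Sanity check of the generational scaling claims against the announced specs.
# Assumption: the "25x compute" figure compares peak low-precision throughput,
# i.e. MTIA 300 FP8 vs. MTIA 500 MX4; Meta does not say which metrics it used.

specs = {
    "MTIA 300": {"hbm_bw_tbs": 6.1, "peak_pflops": 1.2},   # FP8
    "MTIA 450": {"hbm_bw_tbs": 18.4, "peak_pflops": 21},   # MX4
    "MTIA 500": {"hbm_bw_tbs": 27.6, "peak_pflops": 30},   # MX4
}

bw_gain = specs["MTIA 500"]["hbm_bw_tbs"] / specs["MTIA 300"]["hbm_bw_tbs"]
flops_gain = specs["MTIA 500"]["peak_pflops"] / specs["MTIA 300"]["peak_pflops"]
gen_step = specs["MTIA 500"]["hbm_bw_tbs"] / specs["MTIA 450"]["hbm_bw_tbs"]

print(f"HBM bandwidth, 300 -> 500: {bw_gain:.2f}x")        # ~4.52x ("4.5 times")
print(f"Peak compute, 300 -> 500: {flops_gain:.0f}x")      # 25x
print(f"HBM bandwidth, 450 -> 500: +{(gen_step - 1):.0%}") # +50%
```

The 80% capacity figure also checks out against the table: 512 GB on the MTIA 500 versus 288 GB on the 450 is roughly a 78% increase at the top of the range.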
