
Intel and SambaNova team up on heterogeneous AI inference platform — different hardware performs different workloads

Why This Matters

The collaboration between Intel and SambaNova introduces a heterogeneous AI inference platform that leverages specialized hardware for different workload stages, aiming to challenge industry leaders like Nvidia. This approach enhances performance and scalability for AI applications, particularly in coding agents and inference tasks, and underscores Intel's strategy to diversify its AI hardware offerings. Scheduled for release in late 2026, this platform could reshape enterprise AI infrastructure and accelerate AI development cycles.

Key Takeaways

Intel and SambaNova on Wednesday announced a joint production-ready heterogeneous inference architecture that relies on AI accelerators or GPUs for prefill, SambaNova's SN50 reconfigurable dataflow units (RDUs) for decode, and Xeon 6 processors for agentic tools and system orchestration. The platform is designed to address as broad a set of workloads as possible in order to pull market share away from Nvidia and other emerging players.

The heterogeneous inference platform separates inference into distinct stages handled by different silicon: AI GPUs or accelerators ingest long prompts and build key-value (KV) caches; SambaNova's SN50 RDUs decode and generate tokens; and Xeon 6 processors run agent-related operations (e.g., compiling and executing code, and validating outputs) while coordinating and distributing workloads across the hardware.
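Neither company has published an API for the platform, so the sketch below only illustrates the division of labor described above: one backend runs prefill and produces a KV cache, a second consumes that cache to decode tokens, and a CPU-side orchestrator runs agent tooling and ties the stages together. Every class and method name here is invented for illustration.

```python
"""Minimal sketch of a disaggregated inference pipeline.

Hypothetical stand-ins only: none of these names correspond to a real
Intel or SambaNova API.
"""
from dataclasses import dataclass, field


@dataclass
class KVCache:
    # Stand-in for the key-value cache the prefill stage produces.
    prompt: str
    entries: list[str] = field(default_factory=list)


class PrefillAccelerator:
    """Compute-bound stage (GPU/accelerator): ingest the long prompt."""

    def prefill(self, prompt: str) -> KVCache:
        # Real prefill runs attention over every prompt token; here we
        # just record one placeholder entry per whitespace-split token.
        return KVCache(prompt=prompt,
                       entries=[f"kv({tok})" for tok in prompt.split()])


class DecodeRDU:
    """Bandwidth-bound stage (SN50 stand-in): generate output tokens."""

    def decode(self, cache: KVCache, max_tokens: int) -> list[str]:
        # Autoregressive decode reads the cache once per output token.
        return [f"tok{i}" for i in range(max_tokens)]


class XeonOrchestrator:
    """CPU stage: coordinate the pipeline and run agent-side tooling."""

    def __init__(self, prefill: PrefillAccelerator, decode: DecodeRDU):
        self.prefill_stage = prefill
        self.decode_stage = decode

    def run(self, prompt: str, max_tokens: int = 4) -> str:
        cache = self.prefill_stage.prefill(prompt)             # stage 1: GPU
        tokens = self.decode_stage.decode(cache, max_tokens)   # stage 2: RDU
        return self.validate(tokens)                           # stage 3: CPU

    def validate(self, tokens: list[str]) -> str:
        # Where a coding agent would compile, execute, and check output.
        return " ".join(tokens)


if __name__ == "__main__":
    pipeline = XeonOrchestrator(PrefillAccelerator(), DecodeRDU())
    print(pipeline.run("summarize this very long prompt"))
```

In a real deployment the orchestrator would also batch concurrent requests and move KV caches between devices; the point of the sketch is only that each stage maps onto different silicon.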

Splitting the prefill and decode (token generation) stages across different silicon is similar to Nvidia's approach with its Rubin platform, which pairs the prefill-focused Rubin CPX with the heavy-duty Rubin GPU and its HBM4 memory, with the obvious difference that Rubin CPX has yet to reach the market. But, more importantly for Intel, the new platform will rely on its Xeon 6 processors, not on competing offerings.

(Image credit: SambaNova)

The solution is scheduled to be available in the second half of 2026 to enterprises, cloud operators, and sovereign AI programs that want to run scalable inference in general, and coding agents and other agentic workloads in particular, entirely in-house.

According to SambaNova's internal data, Xeon 6 completes LLVM compilation over 50% faster than Arm-based server CPUs and delivers up to 70% higher performance in vector database workloads than competing x86 processors, namely AMD EPYC. These gains are intended to shorten end-to-end development cycles for coding agents and similar applications, the two companies claim.

Perhaps the biggest advantage of the joint production-ready heterogeneous inference architecture is that SambaNova SN50 and Xeon-based servers are drop-in compatible with data centers that can supply 30 kW, a power envelope the vast majority of enterprise data centers already handle.
