[x86] AI Compute Extensions (ACE) Specification

Why This Matters

The x86 AI Compute Extensions (ACE) specification adds hardware support for accelerating machine learning workloads, particularly matrix multiplication. It augments AVX and scalar code with new register state (tile and block scale registers) and with operations that act on that state. The goal is more efficient AI computation directly on x86 processors, with potential performance and energy efficiency gains for AI applications.

Key Takeaways

This document defines x86 extensions for accelerating computation tasks, initially focusing on matrix multiplication kernels and reduced precision data formats important to ML workloads.

The ACE extensions define matrix multiplication primitives that augment AVX and scalar code with new capabilities, adding:

ACE register state, including tile and block scale registers

Data processing operations that consume AVX register input and operate on tile register state

Data move operations that transfer data between ACE register state and AVX registers

State and operations for system management
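To make the idea of a "block scale" concrete, here is a minimal software sketch of a block-scaled dot product, where each group of elements shares one scale factor. It is a plain Python model only: the block size of 4, the scale layout, and the name `block_scaled_dot` are hypothetical choices for illustration and are not taken from the ACE specification.

```python
def block_scaled_dot(a_vals, a_scales, b_vals, b_scales, block=4):
    """Dot product of two block-scaled vectors.

    a_vals / b_vals hold small integers standing in for reduced
    precision elements; a_scales / b_scales hold one scale per block.
    Block size and layout are illustrative assumptions.
    """
    assert len(a_vals) == len(b_vals)
    assert len(a_vals) % block == 0
    total = 0.0
    for blk in range(len(a_vals) // block):
        lo = blk * block
        # Accumulate the raw element products for this block first...
        partial = sum(
            x * y for x, y in zip(a_vals[lo:lo + block], b_vals[lo:lo + block])
        )
        # ...then apply both operands' shared scales once per block.
        total += partial * a_scales[blk] * b_scales[blk]
    return total
```

The point of the layout is that the scale multiplication happens once per block rather than once per element, which is what makes a dedicated per-block scale store attractive in hardware.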

ACE provides tight integration between AVX vectors and ACE tile registers, combining high compute density tile processing operations with the comprehensive data processing features of AVX.
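One way to picture vector inputs feeding tile state is as a rank-1 (outer product) update: two vectors come in, and a 2D accumulator tile is updated in place. The sketch below models that in plain Python. The outer-product formulation, the tile dimensions, and the names `tile_outer_accumulate` and `matmul_via_tile` are assumptions for illustration; they do not describe actual ACE instructions or register sizes.

```python
def tile_outer_accumulate(tile, col_vec, row_vec):
    """In place: tile[i][j] += col_vec[i] * row_vec[j]."""
    for i, a in enumerate(col_vec):
        for j, b in enumerate(row_vec):
            tile[i][j] += a * b


def matmul_via_tile(A, B):
    """Compute A @ B by accumulating one outer product per k step."""
    m, k, n = len(A), len(B), len(B[0])
    tile = [[0.0] * n for _ in range(m)]    # accumulator "tile"
    for p in range(k):
        col = [A[i][p] for i in range(m)]   # column p of A, a "vector" input
        row = B[p]                          # row p of B, a "vector" input
        tile_outer_accumulate(tile, col, row)
    return tile
```

In this model the vector side stays free to do any per-element preparation before each update, while the tile holds the dense accumulation.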

In addition to matrix acceleration, a number of dedicated format convert operations are provided under the AVX10 framework.
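As an example of what a reduced precision format conversion involves, the sketch below converts a float32 value to bfloat16 with round-to-nearest-even, using only Python's `struct` module. bfloat16 is picked here as a familiar format; the summary above does not say which formats the convert operations cover, and the function names are hypothetical.

```python
import struct


def f32_to_bf16_bits(x):
    """Return the 16-bit bfloat16 pattern for x (round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        # NaN: truncate and force a mantissa bit so it stays a NaN.
        return (bits >> 16) | 0x0040
    # Add 0x7FFF plus the lowest kept bit so exact ties round to even.
    bits += 0x7FFF + ((bits >> 16) & 1)
    return (bits >> 16) & 0xFFFF


def bf16_bits_to_f32(h):
    """Widen a bfloat16 bit pattern back to a Python float (exact)."""
    return struct.unpack("<f", struct.pack("<I", (h & 0xFFFF) << 16))[0]
```

bfloat16 keeps float32's 8-bit exponent and drops the low 16 mantissa bits, so widening is a shift and narrowing only needs a rounding decision.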