AI Compute Extensions (ACE) Specification

Why This Matters

The AI Compute Extensions (ACE) specification defines new x86 hardware features for accelerating machine learning workloads, matrix multiplication in particular. It augments existing AVX capabilities with dedicated tile and block scale registers and with operations that work on that state. The aim is more efficient AI model training and inference on x86 hardware as these workloads continue to grow.

Key Takeaways

The specification defines x86 extensions for accelerating computation tasks, initially focusing on matrix multiplication kernels and the reduced-precision data formats important to ML workloads.

The ACE extensions define matrix multiplication primitives that augment AVX and scalar code with new capabilities, adding:

ACE register state, including tile and block scale registers

Data processing operations that consume AVX register input and operate on tile register state

Data move operations to move data between ACE register state and AVX registers

State and operations for system management

ACE provides tight integration between AVX vectors and ACE tile registers, combining high compute density tile processing operations with the comprehensive data processing features of AVX.
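
To make the division of labor above concrete, here is a purely conceptual Python model, not ACE code. It assumes, without confirmation from the text, that a data processing operation behaves like a rank-1 (outer product) update: two vector inputs are multiplied pairwise and accumulated into a two-dimensional tile. The function names, the `scale` parameter (a stand-in for a block scale factor), and the tile shape are all hypothetical.

```python
# Conceptual model only: names, shapes, and semantics here are assumptions,
# not definitions taken from the ACE specification.

def tile_outer_product_accumulate(tile, a_vec, b_vec, scale=1.0):
    """Accumulate scale * outer(a_vec, b_vec) into tile, in place."""
    for i, a in enumerate(a_vec):
        row = tile[i]
        for j, b in enumerate(b_vec):
            row[j] += scale * a * b
    return tile


def matmul_via_tile(a, b):
    """Compute a @ b as a sum of rank-1 updates, one per shared index k."""
    m, k_dim, n = len(a), len(b), len(b[0])
    tile = [[0.0] * n for _ in range(m)]      # stands in for a tile register
    for k in range(k_dim):
        a_col = [a[i][k] for i in range(m)]   # stands in for one vector input
        b_row = b[k]                          # stands in for another vector input
        tile_outer_product_accumulate(tile, a_col, b_row)
    return tile


def read_tile_row(tile, i):
    """Model of a data move: copy one tile row out into a vector."""
    return list(tile[i])
```

In this sketch the accumulator stays in the tile for the whole inner loop and is only copied out row by row at the end, which is the general pattern the text describes: dense accumulation in tile state, with vector code handling loads, stores, and any post-processing.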

In addition to matrix acceleration, a number of dedicated format convert operations are provided under the AVX10 framework.
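
The text does not say which formats the convert operations cover. As a hedged illustration of what a reduced-precision format conversion involves, the sketch below converts float32 to bfloat16 with round-to-nearest-even. bfloat16 is chosen only because it is a familiar ML format; the helper names are hypothetical, and this is not a description of any AVX10 instruction.

```python
import struct


def float32_to_bfloat16_bits(x):
    """Return the 16-bit bfloat16 pattern for x, rounding to nearest even."""
    # Reinterpret the value as its 32-bit IEEE 754 single-precision pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # NaN: all exponent bits set and a nonzero mantissa. Force a quiet NaN so
    # truncation cannot turn it into infinity.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return (bits >> 16) | 0x0040
    # Add 0x7FFF plus the lowest kept bit, then truncate: ties go to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float32(h):
    """Widen a bfloat16 bit pattern back to a Python float (exact)."""
    return struct.unpack("<f", struct.pack("<I", (h & 0xFFFF) << 16))[0]
```

bfloat16 keeps the float32 sign and 8-bit exponent and drops the low 16 mantissa bits, so widening is exact while narrowing has to round. Note that `struct.pack("<f", x)` raises `OverflowError` for finite Python floats outside the float32 range, so callers would need to clamp or handle that case themselves.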