[Codegen][CPU] Add a type-polymorphic generic-scalar MMA fallback. (#24389)

Adds two new `MMAIntrinsic` values, `MMA_GENERIC_SCALAR_1x1x1_REG8` and
`_REG16`, that the data-tiling cost model picks when no element-type-
specific intrinsic on the target supports the matmul's (LHS, RHS, ACC)
types. This intentionally breaks the "an MMAIntrinsic enum value pins
down a specific element-type triple" invariant in exchange for not
having to add one enum value per supported triple. Element types live on
new `DataTiledMMAAttr.{lhs,rhs,acc}_type` parameters, populated by the
cost model only when the chosen intrinsic is one of the polymorphic
variants.

The cost model picks `_REG16` on 64-bit ISAs (x86_64, AArch64, RISC-V)
and `_REG8` on 32-bit ISAs. The numeric suffix is a register budget for
the unroll heuristic: one element of any width occupies one register,
but which architectural register file the lowering ends up in (GPR or
SIMD-scalar lane) is up to LLVM. The budget is encoded in the low byte
of the enum value, so `chooseUnrolling` can read it back.
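
As a rough illustration of the low-byte encoding, the sketch below uses a hypothetical enum whose numeric values are made up. Only the convention that the low byte carries the register budget, and the 64-bit/32-bit selection rule, come from the description above. The `sketch` namespace, `GenericScalarIntrinsic`, `pickGenericScalar` and `registerBudget` are illustrative names, not IREE's API.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical values: the upper bits are an arbitrary tag chosen for this
// example. Only "low byte == register budget" follows the description.
enum class GenericScalarIntrinsic : uint32_t {
  REG8 = 0x0F000008u,
  REG16 = 0x0F000010u,
};

// Reads the register budget back out of the low byte of the enum value.
constexpr unsigned registerBudget(GenericScalarIntrinsic intrinsic) {
  return static_cast<uint32_t>(intrinsic) & 0xFFu;
}

// Selection rule as described: 64-bit ISAs get the 16-register variant,
// 32-bit ISAs get the 8-register variant. Taking the pointer width as the
// input is an assumption of this sketch.
inline GenericScalarIntrinsic pickGenericScalar(unsigned pointerBitWidth) {
  assert(pointerBitWidth == 32 || pointerBitWidth == 64);
  return pointerBitWidth == 64 ? GenericScalarIntrinsic::REG16
                               : GenericScalarIntrinsic::REG8;
}

static_assert(registerBudget(GenericScalarIntrinsic::REG8) == 8, "");
static_assert(registerBudget(GenericScalarIntrinsic::REG16) == 16, "");

}  // namespace sketch
```

Keeping the budget inside the enum value means the unroll heuristic needs no side table mapping intrinsics to budgets.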

Since the intrinsic is 1×1×1, the operand tiles after unrolling by
`intrinsics_m` / `intrinsics_n` / `intrinsics_k` are simple row-major
(M, K) for LHS, (N, K) for RHS and (M, N) for ACC, i.e.
`linalg.mmt4d`-shaped.
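
To make the tile layout concrete, here is a minimal scalar reference of the contraction over row-major (M, K), (N, K) and (M, N) tiles. It is a sketch of the semantics only, not the code IREE generates: `scalarMmtTile` and the `sketch` namespace are hypothetical names, and widening each operand to the accumulator type before multiplying is modeled with a plain `static_cast`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// acc[m][n] += widen(lhs[m][k]) * widen(rhs[n][k]) over row-major tiles.
// The RHS is indexed (N, K), i.e. it is already transposed, which is what
// makes the tile "mmt"-shaped.
template <typename Lhs, typename Rhs, typename Acc>
void scalarMmtTile(const std::vector<Lhs>& lhs, const std::vector<Rhs>& rhs,
                   std::vector<Acc>& acc, std::size_t M, std::size_t N,
                   std::size_t K) {
  assert(lhs.size() == M * K);
  assert(rhs.size() == N * K);
  assert(acc.size() == M * N);
  for (std::size_t m = 0; m < M; ++m) {
    for (std::size_t n = 0; n < N; ++n) {
      for (std::size_t k = 0; k < K; ++k) {
        // Widen both operands to the accumulator type, then multiply-add.
        acc[m * N + n] += static_cast<Acc>(lhs[m * K + k]) *
                          static_cast<Acc>(rhs[n * K + k]);
      }
    }
  }
}

}  // namespace sketch
```

Because the template is generic over the three element types, the same loop nest covers any (LHS, RHS, ACC) triple, which mirrors the type-polymorphic intent of the fallback.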
`DataTiledMMAAttr::buildUnderlyingOperations` therefore short-circuits
the swizzle/distribute pipeline for these intrinsics and emits a single
`vector.contract` directly, with `arith.extf` / `arith.extsi` widening
narrow LHS/RHS to ACC's element type. For sub-byte LHS/RHS types,
`chooseUnrolling` also picks the smallest power-of-two `intrinsics_k`
such that both K*lhsBits and K*rhsBits are whole numbers of bytes (e.g.
K=2 for i4/f4, K=4 for f6, K=8 for i1).
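
The byte-alignment rule for `intrinsics_k` can be sketched as a small search over powers of two. The function name `smallestByteAlignedK` and the `sketch` namespace are hypothetical; the actual logic in `chooseUnrolling` may be structured differently.

```cpp
#include <cassert>

namespace sketch {

// Returns the smallest power-of-two K such that K * lhsBits and K * rhsBits
// are both multiples of 8. The loop stops at K = 8 at the latest, because
// 8 * bits is always a multiple of 8.
inline unsigned smallestByteAlignedK(unsigned lhsBits, unsigned rhsBits) {
  assert(lhsBits > 0 && rhsBits > 0);
  unsigned k = 1;
  while ((k * lhsBits) % 8 != 0 || (k * rhsBits) % 8 != 0) {
    k *= 2;
  }
  return k;
}

}  // namespace sketch
```

For example, a 6-bit type gives 6, 12, then 24 bits at K = 1, 2, 4, so K = 4 is the first value that lands on a byte boundary, matching the f6 case above.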

Progress towards #24323

IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
Release notes are published on GitHub releases.
| Package | Release status |
|---|---|
| GitHub release (stable) | |
| GitHub release (nightly) | |
| iree-base-compiler | |
| iree-base-runtime | |
For more details on the release process, see https://iree.dev/developers/general/release-management/.
| Operating system | Build status |
|---|---|
| Linux | |
| macOS | |
| macOS | |
For the full list of workflows see https://iree.dev/developers/general/github-actions/.
See our website for more information.
Community meeting recordings: IREE YouTube channel
| Date | Title | Recording | Slides |
|---|---|---|---|
| 2025-06-10 | Data-Tiling in IREE: Achieving High Performance Through Compiler Design (AsiaLLVM) | recording | slides |
| 2025-05-17 | Introduction to GPU architecture and IREE's GPU CodeGen Pipeline | recording | slides |
| 2025-02-12 | The Long Tail of AI: SPIR-V in IREE and MLIR (Vulkanised) | recording | slides |
| 2024-10-01 | Unveiling the Inner Workings of IREE: An MLIR-Based Compiler for Diverse Hardware | recording | |
| 2021-06-09 | IREE Runtime Design Tech Talk | recording | slides |
| 2020-08-20 | IREE CodeGen (MLIR Open Design Meeting) | recording | slides |
| 2020-03-18 | Interactive HAL IR Walkthrough | recording | |
| 2020-01-31 | End-to-end MLIR Workflow in IREE (MLIR Open Design Meeting) | recording | slides |
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.