[AMDGPU] Embed prebuilt AMDGPU device binaries (#24498)
The AMDGPU HAL runtime needs a small builtin device library for blit,
dispatch, and timestamp kernels. Building that library through the
normal runtime target required an AMDGPU-capable clang from the
third_party LLVM checkout, making accidental runtime rebuilds extremely
expensive and tying the runtime to the monorepo compiler checkout.
This adds a device/binaries package that embeds checked-in HSACO code
objects for the default generic support set: gfx9-generic, gfx90a,
gfx9-4-generic, gfx10-1-generic, gfx10-3-generic, gfx11-generic, and
gfx12-generic. Default builds consume those blobs and do not require
ROCm or LLVM producer tools.
Default/prebuilt scenarios:
```
iree-bazel-build //runtime/src/iree/hal/drivers/amdgpu/device/binaries:toc
cmake -DIREE_HAL_AMDGPU_DEVICE_BINARY_BUILD_MODE=prebuilt ...
```
Live source-build scenarios for HAL developers:
```
iree-bazel-build \
  --config=amdgpu_device_binaries_source_rocm \
  --repo_env=IREE_HAL_AMDGPU_DEVICE_TOOLCHAIN_ROCM_PATH=/path/to/rocm \
  //runtime/src/iree/hal/drivers/amdgpu/device/binaries:toc

iree-bazel-build \
  --config=amdgpu_device_binaries_source_llvm_project \
  //runtime/src/iree/hal/drivers/amdgpu/device/binaries:toc

cmake \
  -DIREE_HAL_AMDGPU_DEVICE_BINARY_BUILD_MODE=source \
  -DIREE_HAL_AMDGPU_DEVICE_TOOLCHAIN=rocm \
  -DIREE_HAL_AMDGPU_DEVICE_TOOLCHAIN_ROCM_PATH=/path/to/rocm ...
```
Regenerating the checked-in set is explicit:
```
python build_tools/scripts/amdgpu_device_binaries.py \
  --output-dir runtime/src/iree/hal/drivers/amdgpu/device/binaries/prebuilt \
  --rocm-path /path/to/rocm \
  --targets gfx9-generic,gfx90a,gfx9-4-generic,gfx10-1-generic,gfx10-3-generic,gfx11-generic,gfx12-generic
```
The target map is authored once in Python and generates the Bazel,
CMake, and runtime lookup fragments. That keeps the ROCm/TheRock family
vocabulary, exact target map, and generic code-object fallback table
aligned without adding a manifest whose inventory, history, and hashes
would duplicate Git.
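As a rough illustration of the single-source idea, the sketch below keeps one Python dictionary and derives a Bazel list, a CMake list, and a C lookup table from it. Every name here (`TARGET_MAP`, `emit_bazel`, `emit_cmake`, `emit_c_lookup`, `select_code_object`, and the generated variable names) is hypothetical, and the entries are an illustrative subset, not the table this change checks in.

```python
# Hypothetical single-source target map. Each exact gfx target maps to the
# code object expected to serve it. Entries are illustrative only.
TARGET_MAP = {
    "gfx90a": "gfx90a",            # served by its own exact code object
    "gfx942": "gfx9-4-generic",
    "gfx1030": "gfx10-3-generic",
    "gfx1100": "gfx11-generic",
    "gfx1200": "gfx12-generic",
}


def code_objects(target_map):
    """Returns the sorted, de-duplicated code objects that must be embedded."""
    return sorted(set(target_map.values()))


def emit_bazel(target_map):
    """Renders a Starlark list fragment (variable name is an assumption)."""
    items = "".join(f'    "{name}",\n' for name in code_objects(target_map))
    return f"AMDGPU_DEVICE_BINARY_TARGETS = [\n{items}]\n"


def emit_cmake(target_map):
    """Renders a CMake list fragment (variable name is an assumption)."""
    items = "".join(f"  {name}\n" for name in code_objects(target_map))
    return f"set(AMDGPU_DEVICE_BINARY_TARGETS\n{items})\n"


def emit_c_lookup(target_map):
    """Renders a C table of {exact target, code object} rows for the runtime."""
    rows = "".join(
        f'  {{"{exact}", "{blob}"}},\n'
        for exact, blob in sorted(target_map.items())
    )
    return "static const char* const kTargetFallbacks[][2] = {\n" + rows + "};\n"


def select_code_object(target_map, exact_target):
    """Mirrors the runtime lookup: returns the code object name or None."""
    return target_map.get(exact_target)


if __name__ == "__main__":
    print(emit_bazel(TARGET_MAP))
    print(emit_cmake(TARGET_MAP))
    print(emit_c_lookup(TARGET_MAP))
```

Because all three fragments are rendered from the same dictionary, adding or renaming a target is a one-line change and the build files cannot drift from the runtime table.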
The pragmatic tradeoff is accepting checked-in binaries for code that
changes rarely. The current seven HSACO files are about 151 KiB raw,
about 40 KiB as loose Git blobs, and about 27 KiB in a Git pack, so the
repository cost is a few tens of KiB instead of a mandatory LLVM/ROCm
dependency on every runtime build.
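For anyone wanting to re-check figures like these after regenerating the set, here is a minimal sketch that sums raw sizes and approximates loose-object sizes. It relies on a Git loose blob being the zlib-deflated bytes of a `blob <size>\0` header followed by the content. The `*.hsaco` glob, the function names, and compression level 1 are assumptions. Pack sizes depend on Git's packing heuristics and are not modeled.

```python
import zlib
from pathlib import Path


def loose_blob_size(data: bytes, level: int = 1) -> int:
    """Approximates the on-disk size of a Git loose blob for this content.

    The compression level is an assumption; Git's level is configurable.
    """
    header = b"blob %d\x00" % len(data)
    return len(zlib.compress(header + data, level))


def summarize(directory, pattern="*.hsaco"):
    """Returns (raw_bytes, approx_loose_bytes) over files matching pattern."""
    raw = 0
    loose = 0
    for path in sorted(Path(directory).glob(pattern)):
        data = path.read_bytes()
        raw += len(data)
        loose += loose_blob_size(data)
    return raw, loose


if __name__ == "__main__":
    import sys

    raw, loose = summarize(sys.argv[1] if len(sys.argv) > 1 else ".")
    print(f"raw: {raw / 1024:.1f} KiB, loose (approx.): {loose / 1024:.1f} KiB")
```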
This does not add general AMDGPU fatbin support. The builtin runtime
library now has a TOC of target-specific blobs and runtime selection;
CTS, samples, and other normal HAL executables still use the existing
ROCm test target plumbing until the AMDGPU executable format grows real
multi-variant support.

IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
Release notes are published on GitHub releases.
| Package | Release status |
|---|---|
| GitHub release (stable) | |
| GitHub release (nightly) | |
| iree-base-compiler | |
| iree-base-runtime | |
For more details on the release process, see https://iree.dev/developers/general/release-management/.
| Operating system | Build status |
|---|---|
| Linux | |
| macOS | |
| macOS | |
For the full list of workflows see https://iree.dev/developers/general/github-actions/.
See our website for more information.
Community meeting recordings: IREE YouTube channel
| Date | Title | Recording | Slides |
|---|---|---|---|
| 2025-06-10 | Data-Tiling in IREE: Achieving High Performance Through Compiler Design (AsiaLLVM) | recording | slides |
| 2025-05-17 | Introduction to GPU architecture and IREE's GPU CodeGen Pipeline | recording | slides |
| 2025-02-12 | The Long Tail of AI: SPIR-V in IREE and MLIR (Vulkanised) | recording | slides |
| 2024-10-01 | Unveiling the Inner Workings of IREE: An MLIR-Based Compiler for Diverse Hardware | recording | |
| 2021-06-09 | IREE Runtime Design Tech Talk | recording | slides |
| 2020-08-20 | IREE CodeGen (MLIR Open Design Meeting) | recording | slides |
| 2020-03-18 | Interactive HAL IR Walkthrough | recording | |
| 2020-01-31 | End-to-end MLIR Workflow in IREE (MLIR Open Design Meeting) | recording | slides |
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.