| commit | e2648074e757662995828943aac945c5a37598b2 | |
|---|---|---|
| author | Ben Vanik <ben.vanik@gmail.com> | Sat Feb 07 19:44:25 2026 -0800 |
| committer | GitHub <noreply@github.com> | Sat Feb 07 19:44:25 2026 -0800 |
| tree | 78b1fc95f147551b514f79e086286a1f9f273c54 | |
| parent | 23c9c65f045ce784838ad41701bde596df6f453f | |
Adding dynamic parameter scope and key with !util.buffer operands. (#23426)
Changes parameter operations across the dialect stack from compile-time StringAttr constants for scope and key to !util.buffer SSA value operands, enabling dynamic parameter loading where the scope and key are computed at runtime rather than fixed at compile time. Previously, all parameter operations required static string literals for the scope (an optional namespace) and key (the parameter name), preventing use cases like frank-models, where the scope depends on model metadata (architecture, variant, quantization) determined during model initialization and parameter keys may be constructed from loop indices or other runtime values.
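As a rough sketch of the shift (the assembly format shown here is an illustrative assumption, not the op syntax from the change itself), scope and key move from attribute-style literals to SSA operands:

```mlir
// Before (hypothetical syntax): scope and key baked in as string attributes,
// fixed at compile time.
%w0 = flow.parameter.load "model"::"weight0" : tensor<4xf32>

// After: scope and key are runtime !util.buffer values computed by the
// program, e.g. from model metadata.
%w1 = flow.parameter.load %scope::%key : tensor<4xf32>
```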
The Flow dialect adds flow.parameter.load and flow.parameter.write operations with !util.buffer operands for scope (optional) and key. The load op allocates a new tensor and populates it from a parameter archive at the specified byte offset. The write op stores tensor data to a parameter archive and returns the source tensor unchanged to maintain data-flow dependencies.

The Stream dialect adds stream.tensor.parameter.load and stream.tensor.parameter.write as tensor-phase equivalents that retain encoding metadata for size calculation, and updates the existing stream.async.parameter.{load,read,write} operations to accept !util.buffer operands instead of StringAttr attributes. The change from attributes to operands requires adding the AttrSizedOperandSegments trait to handle the optional scope operand.

The IO parameters module updates the io_parameters.{load,gather,scatter} operations to accept !util.buffer operands for the scope and variadic buffer keys.
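A sketch of how the new Flow op might compose with runtime-supplied scope and key buffers (the function name and assembly details are illustrative assumptions):

```mlir
// Hypothetical: the scope and key arrive as runtime !util.buffer values,
// e.g. constructed from model metadata or loop indices by the caller,
// rather than as compile-time string literals.
util.func @load_weight(%scope: !util.buffer, %key: !util.buffer) -> tensor<1024xf32> {
  // Allocates a new tensor and populates it from the parameter archive
  // entry identified by the runtime scope/key pair.
  %t = flow.parameter.load %scope::%key : tensor<1024xf32>
  util.return %t : tensor<1024xf32>
}
```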
Conversion patterns are updated across the stack. FlowToStream converts flow.parameter.load to stream.tensor.parameter.load, forwarding the scope and key operands unchanged. StreamToParams lowers stream parameter operations to io_parameters operations, passing the !util.buffer operands through. ParamsToVM converts io_parameters operations to vm.call ops invoking the parameter module, with !util.buffer operands type-converted to vm.ref<!vm.buffer> through the standard type converter.

Stream transform passes are updated to handle parameters uniformly regardless of whether scope and key are compile-time hoisted resources or runtime-computed buffers:
- ElideTimepoints treats parameter load await results as passthrough when the timepoint is ready.
- PropagateTimepoints inserts awaits for unresolved timepoints in parameter loads, tracking producer-consumer dependencies.
- EncodeTensors handles the tensor variants, inserting encoding metadata queries and validation around loads.
- PackConstants skips hoisting for parameters with a non-constant scope or key, preserving their runtime construction.
- ScheduleAllocation treats parameter reads as device-local External-lifetime allocations.
- SplitParameterEncoder detects when scope/key buffers will be hoisted to module resources during encoding.
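Putting the conversions together, the end-to-end lowering path for a single load might look like the following simplified sketch (pass placement per the description above; exact phase boundaries are an assumption):

```
flow.parameter.load
    |  FlowToStream: scope/key operands forwarded unchanged
    v
stream.tensor.parameter.load
    |  tensor -> async phase (EncodeTensors adds metadata queries/validation)
    v
stream.async.parameter.load
    |  StreamToParams: !util.buffer operands passed through
    v
io_parameters.load
    |  ParamsToVM: !util.buffer type-converted to vm.ref<!vm.buffer>
    v
vm.call into the runtime parameter module
```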
This enables frank-models and similar use cases where parameter
organization depends on model configuration determined at runtime,
allowing a single compiled module to serve multiple model variants by
selecting parameters based on metadata rather than requiring separate
compilation per variant.
Co-authored-by: Claude <noreply@anthropic.com>

IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
Release notes are published on GitHub releases.
| Package | Release status |
|---|---|
| GitHub release (stable) | |
| GitHub release (nightly) | |
| iree-base-compiler | |
| iree-base-runtime | |
For more details on the release process, see https://iree.dev/developers/general/release-management/.
| Operating system | Build status |
|---|---|
| Linux | |
| macOS | |
For the full list of workflows see https://iree.dev/developers/general/github-actions/.
See our website for more information.
Community meeting recordings: IREE YouTube channel
| Date | Title | Recording | Slides |
|---|---|---|---|
| 2025-06-10 | Data-Tiling in IREE: Achieving High Performance Through Compiler Design (AsiaLLVM) | recording | slides |
| 2025-05-17 | Introduction to GPU architecture and IREE's GPU CodeGen Pipeline | recording | slides |
| 2025-02-12 | The Long Tail of AI: SPIR-V in IREE and MLIR (Vulkanised) | recording | slides |
| 2024-10-01 | Unveiling the Inner Workings of IREE: An MLIR-Based Compiler for Diverse Hardware | recording | |
| 2021-06-09 | IREE Runtime Design Tech Talk | recording | slides |
| 2020-08-20 | IREE CodeGen (MLIR Open Design Meeting) | recording | slides |
| 2020-03-18 | Interactive HAL IR Walkthrough | recording | |
| 2020-01-31 | End-to-end MLIR Workflow in IREE (MLIR Open Design Meeting) | recording | slides |
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.