commit	02d145e5a948283df3cd30289fa68fc9ceb602cb
author	Han-Chung Wang <hanhan0912@gmail.com>	Wed Jan 08 19:39:49 2025 -0800
committer	GitHub <noreply@github.com>	Thu Jan 09 03:39:49 2025 +0000
tree	6a37230fa8cb2c4ae1e88451f2c13b50f127c0a6
parent	74f8d3c6b9800b431c42573df523e5d06c9a51d4

[Stream] Implement SpecializeEncodings pass (1/n) (#19502)

There are three major changes in the revision:

- Introduce the `AffinityAnalysisDialectInterface` Stream dialect
interface. It is used to fetch attributes that are defined by other
dialects. In this revision, HAL implements the dialect interface, and it
can return any attribute attached to HAL::ExecutableTarget attributes.
The main idea of the dialect interface is that Stream **does not** need
to depend on HAL to get the layout information.
- Add a `cloneWithLayouts` method to the EncodingAttr. It is used in the
encoding specialization pass, which resolves the layout requirements and
adds them to the `layouts` field. The other optional parameters are
dropped because the layout is already resolved. The result could be a
new Encoding dialect attribute because it just describes the layout.
The stream tensor ops do not need to know the `op_type`,
`element_types` and `operand_index` parameters. They only need the
layout information, and the attribute should implement the interface
method.
- Partially implement the SpecializeEncodings pass. The responsibility
of the pass is large, so I decided to implement it incrementally. This
revision only implements the mechanism for updating the encodings of
stream tensor ops, and only the stream.tensor.sizeof op is supported.
Support for the other stream tensor ops can be added later. The
executable duplication and the update of dispatch ops will be
implemented in subsequent PRs.
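
The relationship between the three changes can be illustrated with a small standalone model. This is only a sketch in plain C++ and not the IREE or MLIR API: `ToyEncoding`, `LayoutProvider`, `ToyTargetLayoutProvider` and `specializeEncoding` are hypothetical names, and layouts are modeled as plain strings. It assumes only what the message states: the Stream side queries layouts through an interface that another component implements, and `cloneWithLayouts` keeps the resolved layouts while dropping the other optional parameters.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an encoding attribute. The field names follow the
// parameters named in the commit message; the types are simplified.
struct ToyEncoding {
  std::optional<std::string> opType;
  std::vector<std::string> elementTypes;
  std::optional<int> operandIndex;
  std::vector<std::string> layouts;

  // Returns a new encoding that carries only the resolved layouts. The other
  // optional parameters are dropped because the layout is already resolved.
  ToyEncoding cloneWithLayouts(std::vector<std::string> resolved) const {
    ToyEncoding result;
    result.layouts = std::move(resolved);
    return result;
  }
};

// Hypothetical interface owned by the querying side (the role Stream plays).
// It knows nothing about who provides the layouts.
class LayoutProvider {
public:
  virtual ~LayoutProvider() = default;
  virtual std::vector<std::string>
  layoutsFor(const std::string &affinity) const = 0;
};

// Hypothetical implementation on the providing side (the role HAL plays).
// It returns whatever layouts are registered for an affinity.
class ToyTargetLayoutProvider : public LayoutProvider {
public:
  explicit ToyTargetLayoutProvider(
      std::map<std::string, std::vector<std::string>> table)
      : table_(std::move(table)) {}

  std::vector<std::string>
  layoutsFor(const std::string &affinity) const override {
    auto it = table_.find(affinity);
    if (it == table_.end())
      return {};
    return it->second;
  }

private:
  std::map<std::string, std::vector<std::string>> table_;
};

// Specialization step: resolve layouts through the interface only, then
// replace the encoding with a layout-only clone.
ToyEncoding specializeEncoding(const ToyEncoding &encoding,
                               const std::string &affinity,
                               const LayoutProvider &provider) {
  return encoding.cloneWithLayouts(provider.layoutsFor(affinity));
}
```

In this model, `specializeEncoding` depends only on the abstract `LayoutProvider`, which mirrors the stated goal that Stream does not need to depend on HAL to get the layout information.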

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>

16 files changed


README.md

IREE: Intermediate Representation Execution Environment

IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.

See our website for project details, user guides, and instructions on building from source.

Project news

2024-05-23: IREE joins the LF AI & Data Foundation as a sandbox-stage project

Project status

Release status

Release notes are published on GitHub releases.

Package	Release status
GitHub release (stable)
GitHub release (nightly)
Python iree-base-compiler
Python iree-base-runtime

Build status

Nightly build status

Operating system	Build status
Linux
macOS
Windows

For the full list of workflows see https://iree.dev/developers/general/github-actions/.

Communication channels

GitHub issues: Feature requests, bugs, and other work tracking
IREE Discord server: Daily development discussions with the core team and collaborators
(New) iree-announce email list: Announcements
(New) iree-technical-discussion email list: General and low-priority discussion
(Legacy) iree-discuss email list: Announcements, general and low-priority discussion

Related project channels

MLIR topic within LLVM Discourse: IREE is enabled by and heavily relies on MLIR. IREE is sometimes referred to in certain MLIR discussions. Useful if you are also interested in MLIR evolution.

Architecture overview

IREE Architecture

See our website for more information.

Presentations and talks

Community meeting recordings: IREE YouTube channel

Date	Title	Recording	Slides
2021-06-09	IREE Runtime Design Tech Talk	recording	slides
2020-08-20	IREE CodeGen (MLIR Open Design Meeting)	recording	slides
2020-03-18	Interactive HAL IR Walkthrough	recording
2020-01-31	End-to-end MLIR Workflow in IREE (MLIR Open Design Meeting)	recording	slides

License

IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.