commit	db62c355db54a57edc424915d0d75ff9ba3d7c73
author	MaheshRavishankar <1663364+MaheshRavishankar@users.noreply.github.com>	Thu Feb 18 21:20:36 2021 -0800
committer	GitHub <noreply@github.com>	Thu Feb 18 21:20:36 2021 -0800
tree	96cf4000a3c57325c9f51bb6e9840b843b387925
parent	7cd207d194ba33903f5c6818944d2ae9716b9a6f

Add compilation path for lowering linalg on tensors op without using tile and distribute. (#4840)

Not all operations in a dispatch region can be lowered through tile
and distribute. Operations like linalg.tensor_reshape cannot be
tiled. These need a separate lowering mechanism. This PR adds such a
mechanism: these operations are put into the dispatch region as is.
It is up to the individual backends to handle them appropriately.

On the GPU side, trivially parallel operations are distributed across
all invocations.
On the CPU side such operations are executed sequentially for
now. This could be modified to execute in parallel too, but the
sequential execution is consistent with existing codegen path.
The PR also adds support for lowering the linalg.tensor_reshape
operation by making LinalgBufferize handle it (further cleanup
related to Issue #4759, with removal of dead code).
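
As a rough illustration of the two execution strategies described above, the following Python sketch applies an element-wise operation either sequentially (the CPU behavior) or by simulating a set of invocations that each take a share of the elements (the GPU behavior). The function names and the cyclic assignment of elements to invocations are assumptions made for illustration; they are not taken from the IREE code base.

```python
from typing import Callable, List


def run_sequential(op: Callable[[float], float], data: List[float]) -> List[float]:
    """CPU-style execution: a single worker visits every element in order."""
    return [op(x) for x in data]


def run_distributed(
    op: Callable[[float], float], data: List[float], num_invocations: int
) -> List[float]:
    """GPU-style execution, simulated on the host.

    Assumption: invocation `i` handles elements i, i + n, i + 2n, ...
    (a cyclic distribution). Because the operation is trivially parallel,
    the order in which invocations run does not affect the result.
    """
    if num_invocations <= 0:
        raise ValueError("num_invocations must be positive")
    out = [0.0] * len(data)
    for invocation_id in range(num_invocations):
        for idx in range(invocation_id, len(data), num_invocations):
            out[idx] = op(data[idx])
    return out
```

Both functions produce the same output for any element-wise `op`; only the assignment of work differs, which is why a backend is free to choose either strategy.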

Instead of relying on ad-hoc pattern matching of the code within the
dispatch region, an explicit flag is used to distinguish between the
Linalg on tensors path and the legacy path.

On CPU, the LinalgTileAndDistribute pass is split, and a new pass,
MaterializeCPULaunchConfigurationPass, is added. The former is now used
only for the legacy path, and the latter for the Linalg on tensors path.
On GPU, the ConvertToGPUPass is used to distribute untiled operations
in the dispatch region to invocations.
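
The flag-based selection on CPU can be pictured with a minimal Python sketch. The pass names come from the description above; the function `select_cpu_pass` and the parameter `use_linalg_on_tensors` are hypothetical names chosen for illustration, and a real pipeline contains many more passes than the single one shown here.

```python
def select_cpu_pass(use_linalg_on_tensors: bool) -> str:
    """Pick the CPU pass from an explicit flag (hypothetical helper).

    The point of the sketch is that the choice depends only on the flag,
    not on inspecting the contents of the dispatch region.
    """
    if use_linalg_on_tensors:
        # Linalg on tensors path.
        return "MaterializeCPULaunchConfigurationPass"
    # Legacy path.
    return "LinalgTileAndDistribute"
```

Keeping the decision in one flag means the two paths can evolve independently, without either one having to recognize the other's IR patterns.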

39 files changed

README.md

IREE: Intermediate Representation Execution Environment

IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler that lowers ML models to a unified IR optimized for real-time mobile/edge inference against heterogeneous hardware accelerators. IREE also provides flexible deployment solutions for the compiled ML models.

Project Status

IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. That said, we welcome any kind of feedback through any of our communication channels!

Communication Channels

GitHub Issues: Preferred for specific technical issues and coordination on upcoming features.
Google IREE Discord Server: The core team and collaborators coordinate daily development here; good for low-latency communication.
Google Groups Email List: Good for general and low-priority discussion.

Related Project Channels

MLIR topic within LLVM Discourse: IREE is enabled by and heavily relies on MLIR, and it is sometimes referred to in MLIR discussions. Useful if you are also interested in MLIR evolution.

Getting Started

Quick Start using Python

Python packages are published on the releases page. See the colab/ directory for examples.

Building from Source

IREE can be built from source using both Bazel and CMake on Windows and Linux. We also have experimental macOS support.

Please see the Getting Started pages on IREE's documentation hub to configure, compile, and run IREE in your favorite development environment!

Documentation and Talks

IREE hosts all its documentation and project status dashboards on GitHub Pages. We are still building up the website; please feel free to create issues for the documentation you'd like to see!

We also have some public talks that explain IREE's concepts and architecture:

2020-03-18: Interactive HAL IR Walkthrough (Ben Vanik and core team) (recording)
2020-01-31: End-to-end MLIR Workflow in IREE (recording and slides)

Architecture and Goals

IREE adopts a holistic approach towards ML model compilation: the IR produced contains both the scheduling logic, required to communicate data dependencies to low-level parallel pipelined hardware/API like Vulkan, and the execution logic, encoding dense computation on the hardware in the form of hardware/API-specific binaries like SPIR-V.

The architecture of IREE is best illustrated by the following picture:

[Figure: IREE Architecture]

Being compilation-based means IREE does not have a traditional runtime that dispatches “ops” to their fat kernel implementations. What IREE provides is a toolbox for different deployment scenarios. It scales from running generated code on a particular API (such as emitting C code calling external DSP kernels), to a HAL (Hardware Abstraction Layer) that allows the same generated code to target multiple APIs (like Vulkan and Direct3D 12), to a full VM allowing runtime model loading for flexible deployment options and heterogeneous execution.

IREE aims to

Support advanced models on mobile/edge devices. Dynamic shapes, dynamic flow control, dynamic multi-model dispatch, streaming models, tree-based search algorithms, and others are all good examples of exciting ML evolution. We are trying to build IREE from the ground up to enable these models and run them efficiently on modern hardware, especially on mobile/edge devices.
Demonstrate MLIR’s ability to develop non-traditional ML compiler backends and runtimes. MLIR enables IREE’s holistic approach of focusing on the math being performed and how that math is scheduled rather than graphs of “ops”.
Embrace standards-based ML via Vulkan. The graphics world is shifting towards favoring modern explicit APIs for performance and predictability, and Vulkan is emerging as the “compatibility” layer. We would love to enable hardware vendors to make ML efficient on their hardware without the need for bespoke runtimes and special access. We would also love to let developers and users utilize all the hardware available on as many platforms as possible.

Roadmap and Milestones

IREE is in the early stages of development and not yet ready for broad adoption. Check out the long-term design roadmap to get a sense of where we're headed.

We plan on a quarterly basis using OKRs. Review our latest objectives to get a sense of what we're up to in the near term.

We use GitHub Projects to track progress on IREE components and specific efforts. We use GitHub Milestones to track the work associated with plans for each quarter.

Build Status

CI System	Build System	Platform	Architecture	Component
Kokoro	Bazel	Linux	x86	Core
Kokoro	CMake & Bazel	Linux	x86-swiftshader	Integrations
Kokoro	CMake & Bazel	Linux	x86-turing	Integrations
Kokoro	CMake	Linux	x86-swiftshader	Core + Bindings
Kokoro	CMake	Linux	x86-turing	Core + Bindings
Kokoro	CMake	Android	arm64-v8a	Runtime (build only)
BuildKite	CMake	Android	arm64-v8a	Runtime

License

IREE is licensed under the terms of the Apache license. See LICENSE for more information.