| commit | eb1ce089a40a9276a71c3250044cb38c315381dc |
|---|---|
| author | Thomas Raoux &lt;thomasraoux@google.com&gt;, Fri Mar 12 23:56:55 2021 +0000 |
| committer | thomasraoux &lt;thomasraoux@google.com&gt;, Fri Mar 12 16:30:32 2021 -0800 |
| tree | b1df7e9de642e7877c3190132819e59a7fc0a013 |
| parent | c9e5da3c6bb615e5bc04472e38c144b42ee694f2 |
| parent | 161f932debd7bc9690656ed99ae21d2deffd8a73 |

Merge main -> google

* 161f932de Enable CUDA tests on Turing platforms (#5091)
* e16759ce1 [CodeGen] Avoid eliminating output store unconditionally (#5073)
* e2fe13e1c Add wasm-micro-runtime submodule and get building with CMake. (#5077)
* 33c4e3946 Bump Tracy submodule to v0.7.6. (#5094)
* 3f6335fe3 Use correct tied operand index when creating DispatchWorkgroupsOp (#5090)
* e5cfa53cb Allow all matmul variants to be a root op (#5065)
* a5a0f1d80 Merge google -> main (#5086)
* 1ef76ba0c Merge pull request #5083 from google/benvanik-flow-passes
* 0668b5552 Configure compiler to generate 'wasm' executables using wasm-llvm-aot. (#5079)
* 6d41b9d36 Create InitTensorOp based on input operand shapes. (#5070)
* bb819dfa7 Shuffling the SIP reflection attrs pass around a bit. This is in preparation f..
* 6b63dac2f Fixing deprecation warning.
* 5c5d2aaa8 Removing old identify dispatch regions. Not yet the new old identify dispatch ..
* 80cdccedf Splitting out input passes from flow passes. Eventually the TOSA/HLO stuff wil..
* a2a464dec Add missing CMake dep (#5081)
* ca29d25d7 [CodeGen] NFC: Rename LinalgTileAndFusePass (#5075)
* f3d7654d9 Add MobileNetV2 to benchmarking targets. (#5064)
* 8aace663a Rename CMake target DYLIB-LLVM-AOT to just LLVM-AOT.
* 98f2135ed Add WASM executable format and produce it with LLVMAOTTarget.
* 2795092a4 Add CMake configuration for WAMR's runtime "iwasm VM core".
* 2b5c7b92c Add third_party/wasm-micro-runtime submodule.

PiperOrigin-RevId: 362617674
IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler that lowers ML models to a unified IR optimized for real-time mobile/edge inference against heterogeneous hardware accelerators. IREE also provides flexible deployment solutions for the compiled ML models.
IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. With that said, we welcome any kind of feedback through any of our communication channels!
Python packages are published on the releases page. See the colab/ directory for examples.
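To give a feel for the workflow those packages enable, here is a minimal compile-and-run sketch. The module layout and API names (`iree.compiler.compile_str`, `iree.runtime.Config`/`SystemContext`, the `"llvm-cpu"` backend and `"local-task"` driver strings) are assumptions based on recent IREE Python releases and may differ between versions; the MLIR snippet uses current upstream dialect syntax.

```python
# A minimal sketch, assuming the iree-compiler and iree-runtime wheels.
# API and backend/driver names here are assumptions from recent releases.
import numpy as np
import iree.compiler as ireec
import iree.runtime as ireert

MLIR_SOURCE = """
func.func @abs(%input : tensor<4xf32>) -> tensor<4xf32> {
  %result = math.absf %input : tensor<4xf32>
  return %result : tensor<4xf32>
}
"""

# Compile the MLIR source to IREE's VM flatbuffer format for the CPU backend.
vmfb = ireec.compile_str(MLIR_SOURCE, target_backends=["llvm-cpu"])

# Load the compiled module into a runtime context and invoke the function.
config = ireert.Config("local-task")
ctx = ireert.SystemContext(config=config)
ctx.add_vm_module(ireert.VmModule.copy_buffer(ctx.instance, vmfb))
result = ctx.modules.module.abs(
    np.array([1.0, -2.0, 3.0, -4.0], dtype=np.float32))
print(np.asarray(result))  # -> [1. 2. 3. 4.]
```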
IREE can be built from source using both Bazel and CMake on Windows and Linux. We also have experimental macOS support.
Please see the Getting Started pages on IREE's documentation hub to configure, compile, and run IREE in your favorite development environment!
IREE hosts all its documentation and project status dashboards on GitHub Pages. We are still building up the website; please feel free to create issues for the documentation you'd like to see!
We also have some public talks that explain IREE's concepts and architecture.
IREE adopts a holistic approach towards ML model compilation: the IR produced contains both the scheduling logic, required to communicate data dependencies to low-level parallel pipelined hardware/API like Vulkan, and the execution logic, encoding dense computation on the hardware in the form of hardware/API-specific binaries like SPIR-V.
The architecture of IREE is best illustrated by the following picture:

*(architecture overview diagram)*
Being compilation-based means IREE does not have a traditional runtime that dispatches “ops” to their fat kernel implementations. What IREE provides is a toolbox for different deployment scenarios. It scales from running generated code on a particular API (such as emitting C code calling external DSP kernels), to a HAL (Hardware Abstraction Layer) that allows the same generated code to target multiple APIs (like Vulkan and Direct3D 12), to a full VM allowing runtime model loading for flexible deployment options and heterogeneous execution.
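To make that retargetability concrete, here is a hedged sketch of compiling one module for several HAL targets and running each artifact through its matching driver. The backend/driver string pairs (`"llvm-cpu"`/`"local-task"`, `"vulkan-spirv"`/`"vulkan"`) are assumptions taken from recent IREE releases, not a stable contract, and the Vulkan entry only works on a machine with a Vulkan-capable device.

```python
# Sketch: one MLIR module, several HAL targets. Each backend produces a
# self-contained artifact (e.g. SPIR-V embedded for Vulkan) that the matching
# runtime driver executes. Backend/driver pairs are assumptions from recent
# IREE releases and may differ across versions.
import iree.compiler as ireec
import iree.runtime as ireert

TARGETS = {
    "llvm-cpu": "local-task",  # CPU codegen, run by the threaded CPU driver
    "vulkan-spirv": "vulkan",  # SPIR-V codegen, run by the Vulkan driver
}

def run_everywhere(mlir_source, entry, *args):
    """Compiles once per backend and invokes `entry` through each driver."""
    results = {}
    for backend, driver in TARGETS.items():
        vmfb = ireec.compile_str(mlir_source, target_backends=[backend])
        ctx = ireert.SystemContext(config=ireert.Config(driver))
        ctx.add_vm_module(ireert.VmModule.copy_buffer(ctx.instance, vmfb))
        results[driver] = getattr(ctx.modules.module, entry)(*args)
    return results
```

The point of the sketch is the indirection: nothing about the source changes per target, only the backend string at compile time and the driver string at load time.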
IREE aims to
IREE is in the early stages of development and not yet ready for broad adoption. Check out the long-term design roadmap to get a sense of where we're headed.
We plan on a quarterly basis using OKRs (Objectives and Key Results). Review our latest objectives to get a sense of what we're up to in the near term.
We use GitHub Projects to track progress on IREE components and specific efforts. We use GitHub Milestones to track the work associated with plans for each quarter.
IREE is licensed under the terms of the Apache 2.0 License. See LICENSE for more information.