Deduplicate flow.executable ops during Flow dialect transformation. (#3702)

This works towards https://github.com/google/iree/issues/1144, but is not generalized beyond `flow.executable`.

Deep equivalence checks are performed between pairs of `flow.executable` ops in a new pass run after outlining dispatch regions. These executables will, by construction, have different symbol names, so some special comparison logic is needed. When an executable is identified as a duplicate, it is deleted and references to it are updated to point to the canonical executable. This cleans up cases where duplicate executables are generated from the same library calls throughout a program, prior to running further compiler passes or dropping down to codegen.
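To make the flow concrete, here is a minimal self-contained sketch of the idea, not the pass itself. It models executables with hypothetical `ToyOp`/`ToyExecutable` structs instead of MLIR ops, and `isEquivalent` and `deduplicate` are illustrative names rather than functions from this change. The comparison skips the symbol name, returns early on the first mismatch, and dispatch sites that named a duplicate are rewritten to the canonical symbol.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an operation inside an executable body.
struct ToyOp {
  std::string name;                   // e.g. "add"
  std::vector<std::string> operands;  // operand names, compared in order
};

// Hypothetical stand-in for a flow.executable op.
struct ToyExecutable {
  std::string symbol;       // unique by construction, so never compared
  std::vector<ToyOp> body;  // flattened body of the executable
};

// Structural comparison that ignores the symbol name and returns early.
bool isEquivalent(const ToyExecutable &lhs, const ToyExecutable &rhs) {
  if (lhs.body.size() != rhs.body.size()) return false;
  for (std::size_t i = 0; i < lhs.body.size(); ++i) {
    if (lhs.body[i].name != rhs.body[i].name) return false;
    if (lhs.body[i].operands != rhs.body[i].operands) return false;
  }
  return true;
}

// Drops duplicate executables, rewrites dispatch sites that referenced them
// to the canonical symbol, and returns the number of executables removed.
std::size_t deduplicate(std::vector<ToyExecutable> &executables,
                        std::vector<std::string> &dispatchSites) {
  std::vector<ToyExecutable> unique;
  std::map<std::string, std::string> replacements;  // duplicate -> canonical
  for (const ToyExecutable &candidate : executables) {
    bool isDuplicate = false;
    for (const ToyExecutable &canonical : unique) {
      if (isEquivalent(candidate, canonical)) {
        replacements[candidate.symbol] = canonical.symbol;
        isDuplicate = true;
        break;
      }
    }
    if (!isDuplicate) unique.push_back(candidate);
  }
  for (std::string &site : dispatchSites) {
    auto it = replacements.find(site);
    if (it != replacements.end()) site = it->second;
  }
  std::size_t removed = executables.size() - unique.size();
  // Holds as long as symbols are unique, which this sketch assumes.
  assert(removed == replacements.size());
  executables = std::move(unique);
  return removed;
}
```

The pairwise loop is quadratic in the number of executables; in this sketch that cost is kept low only by the early returns in `isEquivalent`.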

I iterated on the comparison algorithm quite a bit: first printing ops to strings and checking string equality, then caching hashes of those strings using `getChildAnalysis<>` (see [this documentation](https://mlir.llvm.org/docs/PassManagement/#analysis-management)). The cached-hash version had the correctness and performance traits I wanted and was easy to read, but it was thread-hostile. The comparison code I settled on is still fast even without multithreading (milliseconds on mobilebert in RelWithDebInfo), thanks to efficient structural checks and plenty of places to short circuit and return early. If we find it needs speeding up, we can:
1) Write a `llvm::hash_code` analysis for executables, traversing regions/blocks and using `OperationEquivalence::computeHash()`
2) Query and compare hashes using `getChildAnalysis<>` as a first check (like a bounding box test prior to collision tests)
3) Run the existing deep equivalence check to verify that a hash collision didn't happen
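The three steps above can be sketched outside of MLIR as a hash-then-verify lookup. This is only an illustration under assumptions: `SketchExecutable`, `hashBody`, `deepEquals`, and `findCanonical` are hypothetical names, and the hash combiner is a generic one built on `std::hash`, not `llvm::hash_code` or `OperationEquivalence::computeHash()`.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for an executable: a symbol plus a list of op names.
struct SketchExecutable {
  std::string symbol;
  std::vector<std::string> opNames;
};

// Step 1: hash the body only. The symbol is excluded because it always differs.
std::size_t hashBody(const SketchExecutable &exe) {
  std::size_t seed = exe.opNames.size();
  for (const std::string &name : exe.opNames) {
    seed ^= std::hash<std::string>{}(name) + 0x9e3779b9 + (seed << 6) +
            (seed >> 2);
  }
  return seed;
}

// Step 3: the deep check that guards against hash collisions.
bool deepEquals(const SketchExecutable &a, const SketchExecutable &b) {
  return a.opNames == b.opNames;
}

// Returns, for each executable, the index of its canonical executable.
// An executable that is not a duplicate maps to its own index.
std::vector<std::size_t> findCanonical(
    const std::vector<SketchExecutable> &executables) {
  std::unordered_map<std::size_t, std::vector<std::size_t>> buckets;
  std::vector<std::size_t> canonical(executables.size());
  for (std::size_t i = 0; i < executables.size(); ++i) {
    canonical[i] = i;
    // Step 2: only executables whose hashes match are compared deeply.
    std::vector<std::size_t> &bucket = buckets[hashBody(executables[i])];
    bool found = false;
    for (std::size_t j : bucket) {
      if (deepEquals(executables[i], executables[j])) {
        canonical[i] = j;
        found = true;
        break;
      }
    }
    if (!found) bucket.push_back(i);
  }
  return canonical;
}
```

Because every bucket entry is still confirmed by `deepEquals`, a hash collision costs an extra comparison but cannot merge two different executables.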

The new pass emits [statistics](https://mlir.llvm.org/docs/PassManagement/#pass-statistics), like these for mobilebert (debug build + `-pass-statistics` flag):

```
mlir::iree_compiler::IREE::Flow::DeduplicateExecutablesPass
  (S) 930 total executable(s)     - Number of flow.executable ops before deduplication
  (S) 891 duplicate executable(s) - Number of flow.executable ops removed as duplicates
  (S)  39 unique executable(s)    - Number of flow.executable ops remaining after deduplication
```
6 files changed
tree: 600b387d852954c826750ed70fc6c81b96d44eba
README.md

IREE: Intermediate Representation Execution Environment

IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler that lowers ML models to a unified IR optimized for real-time mobile/edge inference against heterogeneous hardware accelerators. IREE also provides flexible deployment solutions for the compiled ML models.

Project Status

IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. That said, we welcome any kind of feedback through any of our communication channels!

Communication Channels

Related Project Channels

  • MLIR topic within LLVM Discourse: IREE is enabled by and relies heavily on MLIR. IREE is sometimes referred to in MLIR discussions. Useful if you are also interested in MLIR evolution.

Getting Started

For development, IREE supports both Bazel and CMake on Windows and Linux. We are working on enabling macOS support. For deployment, IREE aims to additionally cover Android and iOS.

Please see the Getting Started pages on IREE's documentation hub to configure, compile, and run IREE in your favorite development environment!

Documentation and Talks

IREE hosts all its documentation and project status dashboards on GitHub Pages. We are still building up the website; please feel free to create issues for the documentation you'd like to see!

We also have some public talks that explain IREE's concepts and architecture:

  • 2020-03-18: Interactive HAL IR Walkthrough (Ben Vanik and core team) (recording)
  • 2020-01-31: End-to-end MLIR Workflow in IREE (recording and slides)

Architecture and Goals

IREE adopts a holistic approach towards ML model compilation: the IR produced contains both the scheduling logic, required to communicate data dependencies to low-level parallel pipelined hardware/API like Vulkan, and the execution logic, encoding dense computation on the hardware in the form of hardware/API-specific binaries like SPIR-V.

The architecture of IREE is best illustrated by the following picture:

[Figure: IREE Architecture]

Being compilation-based means IREE does not have a traditional runtime that dispatches “ops” to their fat kernel implementations. What IREE provides is a toolbox for different deployment scenarios. It scales from running generated code on a particular API (such as emitting C code calling external DSP kernels), to a HAL (Hardware Abstraction Layer) that allows the same generated code to target multiple APIs (like Vulkan and Direct3D 12), to a full VM allowing runtime model loading for flexible deployment options and heterogeneous execution.

IREE aims to

  • Support advanced models on mobile/edge devices. Dynamic shapes, dynamic flow control, dynamic multi-model dispatch, streaming models, tree-based search algorithms, and others are all good examples of exciting ML evolution. We are trying to build IREE from the ground up to enable these models and run them efficiently on modern hardware, especially on mobile/edge devices.
  • Demonstrate MLIR’s ability to develop non-traditional ML compiler backends and runtimes. MLIR enables IREE’s holistic approach of focusing on the math being performed and how that math is scheduled rather than graphs of “ops”.
  • Embrace standards-based ML via Vulkan. The graphics world is shifting towards favoring modern explicit APIs for performance and predictability, and Vulkan is emerging as the “compatibility” layer. We would love to allow hardware vendors to make ML efficient on their hardware without the need for bespoke runtimes and special access. We also would love to let developers and users utilize all the hardware available on as many platforms as possible.

Roadmap and Milestones

IREE is in the early stages of development and not yet ready for broad adoption. Check out the long-term design roadmap to get a sense of where we're headed.

We plan on a quarterly basis using OKRs. Review our latest objectives to get a sense of what we're up to in the near term.

We use GitHub Projects to track progress on IREE components and specific efforts. We use GitHub Milestones to track the work associated with plans for each quarter.

Build Status

| CI System | Build System | Platform | Architecture | Component | Status |
|---|---|---|---|---|---|
| Kokoro | Bazel | Linux | x86 | Core | kokoro_status_bazel_linux_x86_core |
| Kokoro | Bazel | Linux | x86 | Bindings | kokoro_status_bazel_linux_x86_bindings |
| Kokoro | Bazel | Linux | x86-swiftshader | Integrations | kokoro_status_bazel_linux_x86-swiftshader_integrations |
| Kokoro | Bazel | Linux | x86-turing | Integrations | kokoro_status_bazel_linux_x86-turing_integrations |
| Kokoro | CMake | Linux | x86-swiftshader | Core + Bindings | kokoro_status_cmake_linux_x86-swiftshader |
| Kokoro | CMake | Linux | x86-turing | Core + Bindings | kokoro_status_cmake_linux_x86-turing |
| Kokoro | CMake | Android | arm64-v8a | Runtime (build only) | kokoro_status_cmake_android_arm64-v8a |
| BuildKite | CMake | Android | arm64-v8a | Runtime | buildkite-status-cmake-android-arm |

License

IREE is licensed under the terms of the Apache license. See LICENSE for more information.