commit b5851565e0bf0197ce50f530a284d3ea2ceb66a3
author: Ben Vanik <benvanik@google.com>  Mon Nov 08 08:36:28 2021 -0800
committer: GitHub <noreply@github.com>  Mon Nov 08 08:36:28 2021 -0800
tree: cfddc820778f833e265d832269b2c479f8db0313
parent: bbb9eab49fbe516b7c5feebcd1b596a5018c03c7
Adding -iree-stream-schedule-execution + -concurrency passes. (#7549)

The passes themselves are rather simple and call into a partitioning routine that performs the real work, with the intent being that we can have many and specify which one to use based on scoped attributes in the IR (kind of like lowering configs in codegen). Today there's just a reference implementation that does a single level of concurrency. The hope is that someone who actually knows how to write a good partitioning algorithm can contribute something better, but it's at least no worse than what we have today and better than simple ML systems that have no concurrency.

Though the passes are similar, they operate at different scopes and will have different partitioning algorithms. I thought about trying to unify them; however, keeping them separate allows us to do things like use a more complex execution partitioning pass while using the same generic concurrency scheduling, including disabling the concurrency scheduling entirely for debugging or for environments where there may be no benefit to such scheduling (single-core execution, etc.). It's easy enough to reason about how they could be unified that I wanted to err on the side of flexibility until we have an owner and at least one or two more algorithms we can use to feel out the shape of things.

A benefit of the independent execution and concurrency partitioning is that debugging either is much simpler (and there's pretty good `-debug` output). Since the concurrency scheduling operates only within the scheduled execution regions, there's no need to worry about host/device interactions or the parent op's CFG.
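To make the "single level of concurrency" idea concrete, here is a minimal sketch (in Python, not IREE's actual C++/MLIR implementation) of the kind of partitioning a reference algorithm might perform: ops whose dependencies have all already been scheduled are grouped into the same concurrent wave. The function name and data layout here are hypothetical, chosen only for illustration.

```python
# Illustrative sketch of single-level concurrency partitioning: group ops
# into "waves" where every op in a wave depends only on ops from earlier
# waves, so all ops within a wave can execute concurrently.
# NOTE: this is a simplified model, not IREE's actual partitioning code.

def schedule_waves(ops, deps):
    """ops: list of op names; deps: dict mapping op -> set of ops it depends on."""
    scheduled = set()   # ops already placed in an earlier wave
    waves = []
    remaining = list(ops)
    while remaining:
        # Every op whose dependencies are all satisfied joins the next wave.
        wave = [op for op in remaining if deps.get(op, set()) <= scheduled]
        if not wave:
            raise ValueError("dependency cycle detected")
        waves.append(wave)
        scheduled.update(wave)
        remaining = [op for op in remaining if op not in wave]
    return waves

# Example: B and C both depend only on A, so they land in the same wave
# and can execute concurrently; D must wait for both.
waves = schedule_waves(
    ["A", "B", "C", "D"],
    {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}},
)
# waves == [["A"], ["B", "C"], ["D"]]
```

A more sophisticated partitioning algorithm could slot into the same pass structure, which is the point of keeping the passes thin wrappers around interchangeable partitioning routines.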
IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. With that said, we welcome any kind of feedback on any of our communication channels!
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.