# IREE Roadmap

## Design

Though many of the core dialects are now far enough along for correctness
testing, most of the features we are most excited to demonstrate are still TODO
and will arrive over the next few quarters. The
[design roadmap](roadmap_design.md) highlights a selection of the upcoming
features.

## Spring/Summer 2020 Focus Areas

IREE is able to run many foundational models, and more are expected to come
online this spring. Much of the work so far has gone into infrastructure and
getting the code into a state that allows rapid parallel development; work is
now ramping up on op coverage and completeness. There's still some core work to
be done on the primary IREE dialects (`flow` and `hal`) before we begin the
burn-down of low-hanging-fruit optimizations, but we're getting close!

### Frontend: Enhanced SavedModel/TF2.0 Support

We are now able to import SavedModels written in the TF2.0 style with resource
variables and some simple uses of TensorList (`tf.TensorArray`, etc.).

### Coverage: XLA HLO Ops

A select few ops, such as ReduceWindow, are not yet implemented. They need to be
plumbed through both the HLO dialect and the IREE lowering process, and then
implemented in the backends. Work is ongoing to complete the remaining ops so
that we can focus on higher-level model usage semantics.

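For readers unfamiliar with ReduceWindow, the following is a minimal sketch of
what the op computes, written in plain Python rather than against any IREE or
XLA API. It assumes a 1-D operand, no padding, and no dilation; the function
name `reduce_window_1d` and its parameters are hypothetical and exist only for
illustration. The real op generalizes this to N-dimensional windows.

```python
from functools import reduce
from typing import Callable, List, Sequence


def reduce_window_1d(
    operand: Sequence[float],
    init_value: float,
    reducer: Callable[[float, float], float],
    window_size: int,
    stride: int = 1,
) -> List[float]:
    """Reduces each window of `operand` to one value (1-D, no padding).

    Hypothetical helper for illustration; not an IREE or XLA function.
    """
    if window_size < 1 or stride < 1:
        raise ValueError("window_size and stride must be positive")
    outputs = []
    # Slide the window across the input. Only windows that fit entirely
    # inside the operand produce an output element.
    for start in range(0, len(operand) - window_size + 1, stride):
        window = operand[start:start + window_size]
        # Fold the window with the reducer, starting from init_value.
        outputs.append(reduce(reducer, window, init_value))
    return outputs


# Max pooling and a sliding sum are both instances of the same op.
max_pooled = reduce_window_1d([1, 3, 2, 5, 4, 6], float("-inf"), max, 2, 2)
sliding_sum = reduce_window_1d([1, 3, 2, 5, 4, 6], 0, lambda a, b: a + b, 3, 1)
print(max_pooled)   # [3, 5, 6]
print(sliding_sum)  # [6, 10, 11, 15]
```

Because pooling layers are commonly expressed through this op, its absence
tends to block whole model families rather than isolated functions.
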
### Scheduler: Dynamic Shapes

Work is underway on dynamic shape support throughout the stack. The tf2xla
effort is adding shape propagation/inference upstream, and we have a decent
amount of glue mostly ready to accept it.

### HAL: Marl CPU Scheduling

We want to plug in [marl](https://github.com/google/marl) to provide
[CPU-side work scheduling](roadmap_design.md#gpu-like-cpu-scheduling) that
matches GPU semantics. This will enable improved CPU utilization and allow us to
verify the approach with benchmarks.

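As a rough illustration of what GPU-like scheduling on a CPU could look like,
the sketch below dispatches a grid of independent workgroups onto a thread pool
and then waits on the returned handles, much as one would wait on a fence. This
is an assumption about the general shape of the idea, not the marl API and not
IREE's HAL: `dispatch`, `double_kernel`, `run_doubling`, and `WORKGROUP_SIZE`
are all hypothetical names, and Python threads stand in for marl's fibers
purely to show the control flow (they will not demonstrate real CPU
utilization gains).

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, List

WORKGROUP_SIZE = 4  # Hypothetical tile size; chosen only for the example.


def dispatch(
    pool: ThreadPoolExecutor,
    kernel: Callable[[int, List[float], List[float]], None],
    workgroup_count: int,
    src: List[float],
    dst: List[float],
) -> List[Future]:
    """Enqueues one task per workgroup and returns handles to wait on."""
    return [pool.submit(kernel, wg, src, dst) for wg in range(workgroup_count)]


def double_kernel(workgroup_id: int, src: List[float], dst: List[float]) -> None:
    # Each workgroup owns a disjoint tile of the buffers, so no locking is
    # needed between workgroups.
    begin = workgroup_id * WORKGROUP_SIZE
    end = min(begin + WORKGROUP_SIZE, len(src))
    for i in range(begin, end):
        dst[i] = src[i] * 2.0


def run_doubling(values: List[float]) -> List[float]:
    dst = [0.0] * len(values)
    # Round up so a partial trailing tile still gets a workgroup.
    workgroup_count = (len(values) + WORKGROUP_SIZE - 1) // WORKGROUP_SIZE
    with ThreadPoolExecutor(max_workers=4) as pool:
        pending = dispatch(pool, double_kernel, workgroup_count, values, dst)
        # Block until every workgroup has finished (the "fence" in this
        # sketch); result() also re-raises any exception from a kernel.
        for handle in pending:
            handle.result()
    return dst


print(run_doubling([1.0, 2.0, 3.0, 4.0, 5.0]))  # [2.0, 4.0, 6.0, 8.0, 10.0]
```

The point of the sketch is the structure: work is expressed as a grid of
independent tiles with explicit synchronization, so the same dispatch shape
could in principle be issued to either a CPU scheduler or a GPU queue.
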
### Codegen: Full Linalg-based Conversion

A large part of the codegen story for both CPU (via LLVM IR) and GPU (via
SPIR-V) relies on the upstream
[Linalg dialect](https://mlir.llvm.org/docs/Dialects/Linalg/) and associated
lowerings. We are contributing here and have partial end-to-end demonstrations
of conversion. By the end of summer we should have fully switched over to this
path, at which point we can remove the index-propagation-based SPIR-V lowering
approach in favor of the more generalized solution.

## Beyond

### HAL: Dawn Implementation

To better engage with the WebGPU and WebML efforts, we will be implementing a
[Dawn](https://dawn.googlesource.com/dawn/) backend that uses the same generated
SPIR-V kernels as the Vulkan backend, which enables us to target Metal, Direct3D
12, and WebGPU. The goal is to get something working (even if suboptimal) so
that we can provide feedback to the various efforts.