# IREE Roadmap

## Design

Though many of the core dialects are now complete enough for correctness
testing, a large majority of the features we are most excited to demonstrate are
still TODO and will land over the next few quarters. You can find a highlighted
set of upcoming features in the [design roadmap](roadmap_design.md).

## Spring/Summer 2020 Focus Areas

IREE is able to run many foundational models, and more are expected to come
online this spring. Much of the work so far has gone into infrastructure and
getting the code into a shape that allows rapid parallel development; work is
now ramping up on op coverage and completeness. There's still some core work to
be done on the primary IREE dialects (`flow` and `hal`) before we begin the
low-hanging fruit optimization burn-down, but we're getting close!

### Frontend: Enhanced SavedModel/TF2.0 Support

We are now able to import SavedModels written in the TF2.0 style with resource
variables and some simple usages of TensorList (`tf.TensorArray`, etc.).
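
As a concrete illustration, the sketch below (plain TensorFlow 2.x APIs only,
with hypothetical names and paths) shows the kind of module this covers: a
`tf.Module` holding a resource variable plus a simple `tf.TensorArray` use,
exported with `tf.saved_model.save`:

```python
import tensorflow as tf

class Counter(tf.Module):
  """A TF2.0-style module with a resource variable and a TensorArray use."""

  def __init__(self):
    super().__init__()
    # Exported as a resource variable in the SavedModel.
    self.total = tf.Variable(0.0)

  @tf.function(input_signature=[tf.TensorSpec([4], tf.float32)])
  def accumulate(self, x):
    # Simple TensorList usage via tf.TensorArray.
    ta = tf.TensorArray(tf.float32, size=4)
    for i in tf.range(4):
      ta = ta.write(i, x[i] * 2.0)
    self.total.assign_add(tf.reduce_sum(ta.stack()))
    return self.total

# Export in the TF2.0 SavedModel format that the importer consumes.
tf.saved_model.save(Counter(), "/tmp/counter_saved_model")
```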

### Coverage: XLA HLO Ops

A select few ops, such as ReduceWindow, are not yet implemented: they need to
be plumbed through the HLO dialect and the IREE lowering process as well as
implemented in the backends. Work is ongoing to complete the remaining ops so
that we can focus on higher-level model usage semantics.
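
Pooling is a common way models hit this gap: a model-level `tf.nn.max_pool2d`,
for example, is lowered by the TF/XLA bridge to an HLO ReduceWindow. A minimal
illustrative reproducer (standard TensorFlow API; the function name is just for
the example):

```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec([1, 16, 16, 3], tf.float32)])
def pool(x):
  # Max pooling lowers to an HLO ReduceWindow (max over 2x2 windows), one of
  # the ops still being plumbed through the IREE lowerings and backends.
  return tf.nn.max_pool2d(x, ksize=2, strides=2, padding="SAME")
```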

### Scheduler: Dynamic Shapes

Progress is underway on dynamic shape support throughout the stack. The tf2xla
effort is adding shape propagation/inference upstream, and we have a decent
amount of glue mostly ready to accept it.
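
To make the target concrete, here is a small sketch (standard TensorFlow API,
illustrative names) of the kind of dynamically shaped function this work is
aimed at; the batch dimension is left as `None`, so it must be carried through
shape propagation/inference rather than baked in at compile time:

```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec([None, 128], tf.float32)])
def row_softmax(x):
  # The leading (batch) dimension is unknown until runtime; the compiler has
  # to propagate this dynamic dimension through lowering rather than
  # specializing on a fixed shape.
  return tf.nn.softmax(x, axis=-1)
```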

### HAL: Marl CPU Scheduling

We want to plug in [marl](https://github.com/google/marl) to provide
[CPU-side work scheduling](roadmap_design.md#gpu-like-cpu-scheduling) that
matches GPU semantics. This will enable improved CPU utilization and allow us to
verify the approach with benchmarks.

### Codegen: Full Linalg-based Conversion

A large part of the codegen story for both CPU (via LLVM IR) and GPU (via
SPIR-V) relies on the upstream
[Linalg dialect](https://mlir.llvm.org/docs/Dialects/Linalg/) and its associated
lowerings. We are contributing there and have partial end-to-end demonstrations
of the conversion. By the end of summer we should be fully switched over to this
path and able to remove the index-propagation-based SPIR-V lowering approach in
favor of the more general solution.

## Beyond

### HAL: Dawn Implementation

To better engage with the WebGPU and WebML efforts, we will be implementing a
[Dawn](https://dawn.googlesource.com/dawn/) backend that uses the same generated
SPIR-V kernels as the Vulkan backend, enabling us to target Metal, Direct3D 12,
and WebGPU. The goal is to get something working in place (even if suboptimal)
so that we can provide feedback to the various efforts.