|  | # IREE Roadmap | 
|  |  | 
|  | ## Design | 
|  |  | 
Though many of the core dialects are now complete enough for correctness testing,
a large majority of the features we are most excited to demonstrate are still
TODO and will be arriving over the next few quarters. You can find a highlighted
set of upcoming features in the [design roadmap](roadmap_design.md).
|  |  | 
|  | ## Spring/Summer 2020 Focus Areas | 
|  |  | 
IREE is able to run many foundational models, and more are expected to come
online this spring. Much of the work so far has gone into infrastructure and
getting the code in a place that allows rapid parallel development; work is now
ramping up on op coverage and completeness. There's still some core work to be
done on the primary IREE dialects (`flow` and `hal`) before beginning the
low-hanging-fruit optimization burn-down, but we're getting close!
|  |  | 
|  | ### Frontend: Enhanced SavedModel/TF2.0 Support | 
|  |  | 
We are now able to import SavedModels written in the TF2.0 style with resource
variables and some simple uses of TensorList (`tf.TensorArray`, etc.).
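As a rough illustration (the module and method names here are hypothetical, not from the IREE test suite), the importer targets TF2.0-style modules like the following, which combine a resource variable with a simple `tf.TensorArray`:

```python
import tempfile

import tensorflow as tf


class Counter(tf.Module):
    def __init__(self):
        # TF2.0-style resource variable: the kind of state the importer supports.
        self.total = tf.Variable(0.0)

    @tf.function(input_signature=[tf.TensorSpec([], tf.float32)])
    def add(self, x):
        self.total.assign_add(x)
        return self.total.read_value()

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def running_sum(self, x):
        # Simple tf.TensorArray (TensorList) usage of the supported kind.
        n = tf.shape(x)[0]
        ta = tf.TensorArray(tf.float32, size=n)
        acc = tf.constant(0.0)
        for i in tf.range(n):
            acc += x[i]
            ta = ta.write(i, acc)
        return ta.stack()


m = Counter()
tf.saved_model.save(m, tempfile.mkdtemp())
```

A module saved this way is what the frontend then imports for compilation.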
|  |  | 
|  | ### Coverage: XLA HLO Ops | 
|  |  | 
A select few ops, such as ReduceWindow, are not yet implemented and need to be
plumbed through both the HLO dialect and the IREE lowering process, as well as
implemented in the backends. Work is ongoing to complete the remaining ops so
that we can focus on higher-level model usage semantics.
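For context, ReduceWindow is the HLO op behind windowed reductions such as pooling: a TF max pool like the sketch below lowers through XLA to a ReduceWindow with a max reduction.

```python
import tensorflow as tf

# A 2x2, stride-2 max pool; under XLA this becomes an HLO ReduceWindow
# (a max reduction applied over sliding windows of the input).
x = tf.reshape(tf.range(16, dtype=tf.float32), [1, 4, 4, 1])
y = tf.nn.max_pool2d(x, ksize=2, strides=2, padding="VALID")
# Spatial result: [[5, 7], [13, 15]]
```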
|  |  | 
|  | ### Scheduler: Dynamic Shapes | 
|  |  | 
Progress is underway on dynamic shape support throughout the stack. The tf2xla
effort is adding shape propagation/inference upstream, and we have a decent
amount of glue mostly ready to accept it.
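Concretely, dynamic shape support means compiling functions whose signatures leave dimensions unknown until runtime. A minimal TF sketch (illustrative only, not an IREE test case) with a dynamic batch dimension:

```python
import tensorflow as tf


# The leading (batch) dimension is unknown at compile time; shape
# inference must propagate the dynamic `?x4` shape through the body.
@tf.function(input_signature=[tf.TensorSpec([None, 4], tf.float32)])
def row_sums(x):
    return tf.reduce_sum(x, axis=1)
```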
|  |  | 
|  | ### HAL: Marl CPU Scheduling | 
|  |  | 
|  | We want to plug in [marl](https://github.com/google/marl) to provide | 
|  | [CPU-side work scheduling](roadmap_design.md#gpu-like-cpu-scheduling) that | 
|  | matches GPU semantics. This will enable improved CPU utilization and allow us to | 
|  | verify the approach with benchmarks. | 
|  |  | 
|  | ### Codegen: Full Linalg-based Conversion | 
|  |  | 
|  | A large part of the codegen story for both CPU (via LLVM IR) and GPU (via | 
|  | SPIR-V) relies on the upstream | 
|  | [Linalg dialect](https://mlir.llvm.org/docs/Dialects/Linalg/) and associated | 
lowerings. We are contributing here and already have partial end-to-end
demonstrations of the conversion. By the end of summer we should be fully
switched over to this path and able to remove the index-propagation-based
SPIR-V lowering approach in favor of the more generalized solution.
|  |  | 
|  | ## Beyond | 
|  |  | 
|  | ### HAL: Dawn Implementation | 
|  |  | 
To better engage with the WebGPU and WebML efforts, we will be implementing a
[Dawn](https://dawn.googlesource.com/dawn/) backend that uses the same generated
SPIR-V kernels as the Vulkan backend, enabling us to also target Metal, Direct3D
12, and WebGPU. The goal is to get something working in place (even if
suboptimal) so that we can provide feedback to those efforts.