# IREE Roadmap

## Design

Though many of the core dialects are now complete enough for correctness
testing, a large majority of the features we are most excited to demonstrate are
still TODO and will land over the next few quarters. You can find a highlighted
set of upcoming features in the [design roadmap](roadmap_design.md).

## Spring/Summer 2020 Focus Areas

IREE is able to run many foundational models, and more are expected to come
online this spring. Much of the work so far has gone into infrastructure and
getting the code into a shape that allows rapid parallel development; work is
now ramping up on op coverage and completeness. There's still some core work to
be done on the primary IREE dialects (`flow` and `hal`) before we begin the
low-hanging fruit optimization burn-down, but we're getting close!

### Frontend: Enhanced SavedModel/TF2.0 Support

We are now able to import SavedModels written in the TF2.0 style with resource
variables and some simple usages of TensorList (`tf.TensorArray`, etc.).
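
As a concrete illustration, the sketch below (plain TensorFlow 2.x APIs only,
with hypothetical names and paths) shows the kind of module this covers: a
`tf.Module` holding a resource variable plus a simple `tf.TensorArray` use,
exported with `tf.saved_model.save`:

```python
import tensorflow as tf

class Counter(tf.Module):
  """A TF2.0-style module with a resource variable and a TensorArray use."""

  def __init__(self):
    super().__init__()
    # Exported as a resource variable in the SavedModel.
    self.total = tf.Variable(0.0)

  @tf.function(input_signature=[tf.TensorSpec([4], tf.float32)])
  def accumulate(self, x):
    # Simple TensorList usage via tf.TensorArray.
    ta = tf.TensorArray(tf.float32, size=4)
    for i in tf.range(4):
      ta = ta.write(i, x[i] * 2.0)
    self.total.assign_add(tf.reduce_sum(ta.stack()))
    return self.total

# Export in the TF2.0 SavedModel format that the importer consumes.
tf.saved_model.save(Counter(), "/tmp/counter_saved_model")
```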

### Coverage: XLA HLO Ops

A select few ops, such as ReduceWindow, are not yet implemented: they need to
be plumbed through the HLO dialect and the IREE lowering process as well as
implemented in the backends. Work is ongoing to complete the remaining ops so
that we can focus on higher-level model usage semantics.
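
Pooling is a common way models hit this gap: a model-level `tf.nn.max_pool2d`,
for example, is lowered by the TF/XLA bridge to an HLO ReduceWindow. A minimal
illustrative reproducer (standard TensorFlow API; the function name is just for
the example):

```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec([1, 16, 16, 3], tf.float32)])
def pool(x):
  # Max pooling lowers to an HLO ReduceWindow (max over 2x2 windows), one of
  # the ops still being plumbed through the IREE lowerings and backends.
  return tf.nn.max_pool2d(x, ksize=2, strides=2, padding="SAME")
```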

### Scheduler: Dynamic Shapes

Progress is underway on dynamic shape support throughout the stack. The tf2xla
effort is adding shape propagation/inference upstream, and we have a decent
amount of glue mostly ready to accept it.
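
To make the target concrete, here is a small sketch (standard TensorFlow API,
illustrative names) of the kind of dynamically shaped function this work is
aimed at; the batch dimension is left as `None`, so it must be carried through
shape propagation/inference rather than baked in at compile time:

```python
import tensorflow as tf

@tf.function(input_signature=[tf.TensorSpec([None, 128], tf.float32)])
def row_softmax(x):
  # The leading (batch) dimension is unknown until runtime; the compiler has
  # to propagate this dynamic dimension through lowering rather than
  # specializing on a fixed shape.
  return tf.nn.softmax(x, axis=-1)
```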

### HAL: Marl CPU Scheduling

We want to plug in [marl](https://github.com/google/marl) to provide
[CPU-side work scheduling](roadmap_design.md#gpu-like-cpu-scheduling) that
matches GPU semantics. This will enable improved CPU utilization and allow us to
verify the approach with benchmarks.

### Codegen: Full Linalg-based Conversion

A large part of the codegen story for both CPU (via LLVM IR) and GPU (via
SPIR-V) relies on the upstream
[Linalg dialect](https://mlir.llvm.org/docs/Dialects/Linalg/) and its associated
lowerings. We are contributing there and have partial end-to-end demonstrations
of the conversion. By the end of summer we should be fully switched over to this
path and able to remove the index-propagation-based SPIR-V lowering approach in
favor of the more general solution.

## Beyond

### HAL: Dawn Implementation

To better engage with the WebGPU and WebML efforts, we will be implementing a
[Dawn](https://dawn.googlesource.com/dawn/) backend that uses the same generated
SPIR-V kernels as the Vulkan backend, enabling us to target Metal, Direct3D 12,
and WebGPU. The goal is to get something working in place (even if suboptimal)
so that we can provide feedback to the various efforts.