Refresh IREE README.md
This commit updates IREE's README.md to reflect updated status.
It also condenses the README.md page, aiming to make it a landing
page that provides pointers to the proper documentation pages.
PiperOrigin-RevId: 305560725
diff --git a/README.md b/README.md
index 9671d12..b583d1e 100644
--- a/README.md
+++ b/README.md
@@ -1,36 +1,112 @@
-# IREE: An Experimental MLIR Execution Environment
+# IREE: Intermediate Representation Execution Environment
-**DISCLAIMER**: This is an early phase project that we hope will graduate into a
-supported form someday, but it is far from ready for everyday use and is made
-available without any support. With that said, feel free to browse the issues
-and reach out on the
-[iree-discuss mailing list](https://groups.google.com/forum/#!forum/iree-discuss).
+IREE (**I**ntermediate **R**epresentation **E**xecution **E**nvironment,
+pronounced as "eerie") is an MLIR-based end-to-end compiler that lowers ML
+models to a unified IR optimized for real-time mobile/edge inference against
+heterogeneous hardware accelerators. IREE also provides flexible deployment
+solutions for the compiled ML models.
-## Table of Contents
+#### Project Status
-- [Contact](#contact)
-- [Build Status](#build-status)
-- [Quickstart](#quickstart)
-- [Project Goals](#project-goals)
-- [Milestones](#milestones)
-- [Status](#status)
-- [Dependencies](#dependencies)
-- [License](#license)
+IREE is still in its early phase. We have settled on the overarching
+infrastructure and are actively improving various software components as well
+as project logistics. It is still quite far from ready for everyday use and is
+made available without any support at the moment. With that said, we welcome
+any kind of feedback on any [communication channels](#communication-channels)!
## Communication Channels
* [GitHub Issues](https://github.com/google/iree/issues): Preferred for
- specific issues and coordination on upcoming features.
+ specific technical issues and coordination on upcoming features.
* [Google IREE Discord Server](https://discord.gg/26P4xW4): The core team and
- collaborators hang out here.
+ collaborators coordinate daily development here; good for low-latency
+ communication.
* [Google Groups Email List](https://groups.google.com/forum/#!forum/iree-discuss):
- General, low-priority discussion.
+ Good for general and low-priority discussion.
-### Related Project Channels
+#### Related Project Channels
* [MLIR topic within LLVM Discourse](https://llvm.discourse.group/c/llvm-project/mlir/31):
- Often, discussions that span IREE and various infrastructure topics will
- fork into topics on this forum.
+ IREE is enabled by and heavily relies on [MLIR](https://mlir.llvm.org), so
+ IREE is sometimes referenced in MLIR discussions on this forum. Useful if you
+ are also interested in the evolution of MLIR.
+
+## Getting Started
+
+For development, IREE supports both Bazel and CMake on Windows and Linux. We are
+working on enabling macOS support. For deployment, IREE aims to additionally
+cover Android and iOS.
+
+Please see the [Getting Started](https://google.github.io/iree/GetStarted) pages
+on IREE's [documentation hub](https://google.github.io/iree) to configure,
+compile, and run IREE in your favorite development environment!
+
+## Documentation and Talks
+
+IREE hosts all its documentation and project status dashboards on
+[GitHub Pages](https://google.github.io/iree). We are still building up the
+website; please feel free to
+[create issues](https://github.com/google/iree/issues) for the documentation
+you'd like to see!
+
+We also have some public talks that explain IREE's concepts and architecture:
+
+* 2020-03-18: Interactive HAL IR Walkthrough (Ben Vanik and core team)
+ ([recording](https://drive.google.com/open?id=1FDrW9wvmiCQsVNSNzTD_V0bYmBl0XTyQ))
+* 2020-01-31: End-to-end MLIR Workflow in IREE
+ ([recording](https://drive.google.com/open?id=1os9FaPodPI59uj7JJI3aXnTzkuttuVkR)
+ and
+ [slides](https://drive.google.com/open?id=1RCQ4ZPQFK9cVgu3IH1e5xbrBcqy7d_cEZ578j84OvYI))
+
+## Architecture and Goals
+
+IREE adopts a _holistic_ approach towards ML model compilation: the IR it
+produces contains both the _scheduling_ logic, required to communicate data
+dependencies to low-level parallel pipelined hardware/APIs like Vulkan, and the
+_execution_ logic, which encodes dense computation on the hardware in the form
+of hardware/API-specific binaries like SPIR-V.
+
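+To make this split concrete, here is a minimal sketch at the Vulkan level (not
+IREE code; all handles are assumed to have been created elsewhere): the
+_execution_ logic is the SPIR-V binary already compiled into `pipeline`, while
+the _scheduling_ logic is the dispatch ordering and the barrier expressing the
+data dependency between the two dispatches.
+
+```c++
+#include <vulkan/vulkan.h>
+
+// Records two dependent compute dispatches. The "execution" logic lives in
+// the SPIR-V baked into `pipeline`; everything recorded here is "scheduling"
+// logic: issue order plus a barrier for the producer->consumer dependency.
+void RecordDependentDispatches(VkCommandBuffer cmd, VkPipeline pipeline) {
+  vkCmdBindPipeline(cmd, VK_PIPELINE_BIND_POINT_COMPUTE, pipeline);
+
+  // Producer: writes an intermediate buffer.
+  vkCmdDispatch(cmd, /*groupCountX=*/64, /*groupCountY=*/1, /*groupCountZ=*/1);
+
+  // Make the producer's writes visible before the consumer starts.
+  VkMemoryBarrier barrier = {};
+  barrier.sType = VK_STRUCTURE_TYPE_MEMORY_BARRIER;
+  barrier.srcAccessMask = VK_ACCESS_SHADER_WRITE_BIT;
+  barrier.dstAccessMask = VK_ACCESS_SHADER_READ_BIT;
+  vkCmdPipelineBarrier(cmd, VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
+                       VK_PIPELINE_STAGE_COMPUTE_SHADER_BIT,
+                       /*dependencyFlags=*/0, 1, &barrier, 0, nullptr, 0,
+                       nullptr);
+
+  // Consumer: reads the intermediate buffer.
+  vkCmdDispatch(cmd, /*groupCountX=*/64, /*groupCountY=*/1, /*groupCountZ=*/1);
+}
+```
+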
+The architecture of IREE is best illustrated by the following picture:
+
+<!-- Figure: IREE architecture diagram -->
+
+Being compilation-based means IREE does not have a traditional runtime that
+dispatches "ops" to their fat kernel implemenations. What IREE provides is a
+toolbox for different deployment scenarios. It scales from running generated
+code on a particular API (such as emitting C code calling external DSP kernels),
+to a HAL (**H**ardware **A**bstraction **L**ayer) that allows the same generated
+code to target multiple APIs (like Vulkan and Direct3D 12), to a full VM
+allowing runtime model loading for flexible deployment options and heterogeneous
+execution.
+
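+As a rough illustration of the HAL layer (hypothetical types and names, not
+IREE's actual API), generated code can be written once against an abstract
+device interface and retargeted by swapping the backend implementation:
+
+```c++
+#include <cstdint>
+
+// Hypothetical HAL-style interface: generated code only talks to `Device`;
+// each backend maps the call onto its native API (Vulkan, Direct3D 12, a
+// CPU thread pool, ...).
+class Device {
+ public:
+  virtual ~Device() = default;
+  // Launches the executable's entry point over a 3D grid of workgroups.
+  virtual void Dispatch(const uint8_t* executable, uint32_t x, uint32_t y,
+                        uint32_t z) = 0;
+};
+
+class VulkanDevice : public Device {
+ public:
+  void Dispatch(const uint8_t* executable, uint32_t x, uint32_t y,
+                uint32_t z) override {
+    // Would build a VkPipeline from the SPIR-V in `executable` and record a
+    // vkCmdDispatch(x, y, z) into a command buffer.
+  }
+};
+
+// The "generated" side stays backend-agnostic: the same call runs anywhere.
+void RunModelStep(Device& device, const uint8_t* executable) {
+  device.Dispatch(executable, /*x=*/64, /*y=*/1, /*z=*/1);
+}
+```
+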
+IREE aims to
+
+* Support advanced models on mobile/edge devices. Dynamic shapes, dynamic flow
+  control, dynamic multi-model dispatch, streaming models, tree-based search
+  algorithms, and others are all good examples of exciting ML evolution. We are
+  trying to build IREE from the ground up to enable these models and run them
+  efficiently on modern hardware, especially on mobile/edge devices.
+* Demonstrate how MLIR can be used to develop non-traditional ML compiler
+  backends and runtimes. MLIR enables IREE's holistic approach of focusing on
+  the math being performed and how that math is scheduled rather than graphs
+  of "ops".
+* Embrace standards-based ML via Vulkan. The graphics world is shifting towards
+  favoring modern explicit APIs for performance and predictability, and Vulkan
+  is emerging as the "compatibility" layer. We would love to let hardware
+  vendors make ML efficient on their hardware without the need for bespoke
+  runtimes and special access. We also would love to let developers and users
+  utilize all the hardware available on as many platforms as possible.
+
+## Roadmap and Milestones
+
+IREE is still in its early stage; we have lots of exciting future plans. Please
+check out the [long-term design roadmap](./docs/roadmap_design.md) and
+[short-term focus areas](./docs/roadmap.md).
+
+We use [GitHub Projects](https://github.com/google/iree/projects) to track
+various IREE components and
+[GitHub Milestones](https://github.com/google/iree/milestones) for major
+features and quarterly plans. Please check them out for updated information.
## Build Status
@@ -40,224 +116,6 @@
Kokoro | Bazel | Linux | [build result](https://storage.googleapis.com/iree-oss-build-badges/bazel/build_result_linux.html)
Kokoro | CMake | Linux | [build result](https://storage.googleapis.com/iree-oss-build-badges/cmake/build_result_linux.html)
-## Quickstart
-
-More Coming soon! Performing full model translation may require a few steps
-(such as ensuring you have a working TensorFlow build), however we'll have
-pre-translated example models that allow independent testing of the runtime
-portions.
-
-* [Getting Started on Windows](docs/getting_started_on_windows.md)
-* [Getting Started on Linux](docs/getting_started_on_linux.md)
-* [Getting Started](docs/getting_started.md)
-
-See also:
-
-* [Using Colab](docs/using_colab.md)
-* [Vulkan and SPIR-V](docs/vulkan_and_spirv.md)
-* [Function ABI](docs/function_abi.md)
-* [MNIST example on IREE](docs/mnist_example.md)
-* [Details of IREE Repository Management](docs/repository_management.md)
-
-## Talks
-
-We occasionally have either productive, recorded meetings or talks and will post
-them here.
-
-* [March 18, 2020: Interactive HAL IR Walkthrough (Ben Vanik and core team)](https://drive.google.com/open?id=1FDrW9wvmiCQsVNSNzTD_V0bYmBl0XTyQ)
-* [Jan 31, 2020: IREE Presentation at MLIR ODM](https://drive.google.com/open?id=1os9FaPodPI59uj7JJI3aXnTzkuttuVkR)
- ([slides](https://drive.google.com/open?id=1RCQ4ZPQFK9cVgu3IH1e5xbrBcqy7d_cEZ578j84OvYI))
-
-## Project Goals
-
-IREE (**I**ntermediate **R**epresentation **E**xecution **E**nvironment,
-pronounced as "eerie") is an experimental compiler backend for
-[MLIR](https://mlir.llvm.org/) that lowers ML models to an IR that is optimized
-for real-time mobile/edge inference against heterogeneous hardware accelerators.
-
-The IR produced contains the sequencing information required to communicate
-pipelined data dependencies and parallelism to low-level hardware APIs like
-Vulkan and embed hardware/API-specific binaries such as SPIR-V or compiled ARM
-code. As the IR is specified against an abstract execution environment there are
-many potential ways to run a compiled model, and one such way is included as an
-example and testbed for runtime optimization experiments.
-
-The included layered runtime scales from generated code for a particular API
-(such as emitting C code calling external DSP kernels), to a HAL (**H**ardware
-**A**bstraction **L**ayer) that allows the same generated code to target
-multiple APIs (like Vulkan and Direct3D 12), to a full VM allowing runtime model
-loading for flexible deployment options and heterogeneous execution. Consider
-both the compiler and the included runtime a toolbox for making it easier - via
-the versatility of MLIR - to take ML models from their source to some varying
-degree of integration with your application.
-
-### Demonstrate MLIR
-
-IREE has been developed alongside MLIR and is used as an example of how
-non-traditional ML compiler backends and runtimes can be built: it focuses more
-on the math being performed and how that math is scheduled rather than graphs of
-"ops" and in some cases allows doing away with a runtime entirely. It seeks to
-show how more holistic approaches that exploit the MLIR framework and its
-various dialects can be both easy to understand and powerful in the
-optimizations to code size, runtime complexity, and performance they enable.
-
-### Demonstrate Advanced Models
-
-By using models with much greater complexity than the usual references (such as
-MobileNet) we want to show how weird things can get when model authors are
-allowed to get creative: dynamic shapes, dynamic flow control, dynamic
-multi-model dispatch (including models that conditionally dispatch other
-models), streaming models, tree-based search algorithms, etc. We are trying to
-build IREE from the ground-up to enable these models and run them efficiently on
-modern hardware. Many of our example models are sequence-to-sequence language
-models from the [Lingvo](https://github.com/tensorflow/lingvo) project
-representing cutting edge speech recognition and translation work.
-
-### Demonstrate ML-as-a-Game-Engine
-
-An observation that has driven the development of IREE is one of ML workloads
-not being much different than traditional game rendering workloads: math is
-performed on buffers with varying levels of concurrency and ordering in a
-pipelined fashion against accelerators designed to make such operations fast. In
-fact, most ML is performed on the same hardware that was designed for games! Our
-approach is to use the compiler to transform ML workloads to ones that look
-eerily _(pun intended)_ similar to what a game performs in per-frame render
-workloads, optimize for low-latency and predictable execution, and integrate
-well into existing systems both for batched and interactive usage. The IREE
-runtime is designed to feel more like game engine middleware than a standalone
-ML inference system, though we still have much work towards that goal. This
-should make it easy to use existing tools for high-performance/low-power
-optimization of GPU workloads, identify driver or system issues introducing
-latency, and help to improve the ecosystem overall.
-
-### Demonstrate Standards-based ML via Vulkan and SPIR-V
-
-With the above observation that ML can look like games from the systems
-perspective it follows that APIs and technologies good for games should probably
-also be good for ML. In many cases we've identified only a few key differences
-that exist and just as extensions have been added and API improvements have been
-made to graphics/compute standards for decades we hope to demonstrate and
-evaluate small, tactical changes that can have large impacts on ML performance
-through these standard APIs. We would love to allow hardware vendors to be able
-to make ML efficient on their hardware without the need for bespoke runtimes and
-special access such that any ML workload produced by any tool runs well. We'd
-consider the IREE experiment a success if what resulted was some worked examples
-that help advance the entire ecosystem!
-
-## Milestones
-
-We are currently just at the starting line, with basic
-[MNIST MLP](https://github.com/keras-team/keras/blob/master/examples/mnist_mlp.py)
-running end-to-end on both a CPU interpreter and Vulkan. As we scale out the
-compiler we will be increasing the complexity of the models that can run and
-demonstrating more of the optimizations we've found useful in previous efforts
-to run them efficiently on edge devices.
-
-A short-term
-[Roadmap](https://github.com/google/iree/blob/master/docs/roadmap.md) is
-available talking about the major areas where are focusing on in addition to the
-more infrastructure-focused work listed below.
-
-We'll be setting up GitHub milestones with issues tracking major feature work we
-are planning. For now, our areas of work are:
-
-* Allocation tracking and aliasing in the compiler
-* Pipelined scheduler in the VM for issuing proper command buffers
-* New CPU interpreter that enables lightweight execution on ARM and x86
-* C code generator and API to demonstrate "runtimeless" mode
-* Quantization using the MLIR quantization framework
-* High-level integration and examples when working with TensorFlow 2.0
-* Switching from IREE's XLA-to-SPIR-V backend to the general MLIR SPIR-V
- backend
-
-Things we are interested in but don't yet have in-progress:
-
-* Ahead-of-time compiled ARM NEON backend (perhaps via
- [SPIRV-LLVM](https://github.com/KhronosGroup/SPIRV-LLVM-Translator/),
- [SPIRV-to-ISPC](https://github.com/GameTechDev/SPIRV-Cross), or some other
- technique)
-* HAL backends for Metal 2 and Direct3D 12
-* Profile-guided optimization support for scheduling feedback
-
-## Current Status
-
-### Documentation
-
-Coming soon :)
-
-### Build System and CI
-
-* We support Bazel for builds of all parts of the project.
-* We also maintain a CMake build for a subset of runtime components designed
- to be used in other systems.
-
-### Code and Style
-
-The project is currently _very_ early and a mix of code written prior to a lot
-of the more recent ergonomics improvements in MLIR and its tablegen. Future
-changes will replace the legacy code style with prettier forms and simplify the
-project structure to make it easier to separate the different components. Some
-entire portions of the code (such as the CPU interpreter) will likely be dropped
-or rewritten. For now, assume churn!
-
-The compiler portions of the code (almost exclusively under `iree/compiler/`)
-follows the LLVM style guide and has the same system requirements as MLIR
-itself. It general requires a more modern C++ compiler.
-
-The runtime portions vary but most are designed to work with C++11 and use
-[Abseil](https://github.com/abseil/abseil-cpp) to bring in future C++14 and
-C++17 features.
-
-### Hardware Support
-
-We are mostly targeting Vulkan and Metal on recent mobile devices and as such
-have limited our usage of hardware features and vendor extensions to those we
-have broad access to there. This is mainly just to keep our focus tight and does
-not preclude usage of features outside the standard sets or for other hardware
-types (in fact, we have a lot of fun ideas for
-`VK_NVX_device_generated_commands` and Metal 2.1's Indirect Command Buffers!).
-
-## Dependencies
-
-NOTE: during the initial open source release we are still cleaning things up. If
-there's weird dependencies/layering that makes life difficult for your
-particular use case please file an issue so we can make sure to fix it.
-
-### Compiler
-
-The compiler has several layers that allow scaling the dependencies required
-based on the source and target formats. In all cases
-[MLIR](https://mlir.llvm.org/) is required and for models not originating from
-TensorFlow (or already in XLA HLO format) it is the only dependency. When
-targeting the IREE Runtime VM and HAL
-[FlatBuffers](https://google.github.io/flatbuffers/) is required for
-serialization. Converting from TensorFlow models requires a dependency on
-TensorFlow (however only those parts required for conversion).
-
-### Runtime VM
-
-The VM providing dynamic model deployment and advanced scheduling behavior
-requires [Abseil](https://github.com/abseil/abseil-cpp) for its common types,
-however contributions are welcome to make it possible to replace Abseil with
-other libraries via shims/forwarding. The core types used by the runtime
-(excluding command line flags and such in tools) are limited to types coming in
-future C++ versions (variant, optional, string_view, etc), cheap types
-(absl::Span), or simple standard containers (absl::InlinedVector).
-[FlatBuffers](https://google.github.io/flatbuffers/) is used to load compiled
-modules.
-
-### Runtime HAL
-
-The HAL and the provided implementations (Vulkan, etc) also use
-[Abseil](https://github.com/abseil/abseil-cpp). Contributions are welcome to
-allow other types to be swapped in. A C99 HAL API is planned for code generation
-targets that will use no dependencies.
-
-### Testing and Tooling
-
-[Swiftshader](https://github.com/google/swiftshader) is used to provide fast
-hardware-independent testing of the Vulkan and SPIR-V portions of the toolchain.
-
## License
IREE is licensed under the terms of the Apache license. See [LICENSE](LICENSE)