commit    9ba88a2089c5de955df1680e39ced9ff7adf90d1
author    Ben Vanik <ben.vanik@gmail.com>  Wed May 17 15:51:08 2023 -0700
committer GitHub <noreply@github.com>  Wed May 17 15:51:08 2023 -0700
tree      5693e6eec36a8564bb634e60487c2377294a1c85
parent    114e5cc9bb77540b2fe664f59e22052416a35e00
Disabling detensoring by default until #6948 can be completed. (#13658)

Detensoring is useful at small scale (a single while-loop with a tensor<i64> iterator, as produced by JAX) but results in pathologically bad device<->host behavior in all other cases. Without improvements to dispatch region formation for small scalar workloads, or a retensoring pass that eliminates the device<->host transfers, we can't have it on by default.

This makes HLO while loops less efficient, as they represent the loop iterator and condition as tensor operations that get performed on device even though the loop itself happens on the host. It makes everything else significantly better, though: in the LLM modules under inspection in #13637 it reduces the number of host<->device round trips from ~500-1000 (depending on the model) to ~3 (all while loop math). We could investigate an HLO-level specialized detensoring of while loops until we can fix #6948.

WebGPU's SPIR-V -> WGSL lowering currently has issues with non-detensored comparisons (#12509), so those `while.mlir` tests have been disabled.
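For context, a minimal sketch of the kind of loop detensoring targets, written with JAX's `jax.lax.while_loop` (names and values here are illustrative, not taken from the modules discussed above). The loop carry is a 0-d tensor, so the lowered HLO expresses the iterator increment and the loop condition as tensor ops, even though the loop control runs on the host:

```python
import jax
import jax.numpy as jnp

def count_to_ten(start):
    # cond and body operate on a 0-d tensor carry (a tensor<i32>/tensor<i64>
    # at the MLIR level); each iteration's `i < 10` check must cross the
    # device<->host boundary unless the compiler detensorizes it.
    cond = lambda i: i < 10
    body = lambda i: i + 1
    return jax.lax.while_loop(cond, body, start)

result = count_to_ten(jnp.int32(0))
print(int(result))  # 10
```

With detensoring enabled, the scalar iterator math can be pulled out of dispatches and run directly on the host, which is why the single-iterator case benefits while larger graphs can regress.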
IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. With that said, we welcome any kind of feedback through any of our communication channels!
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.