commit    76bf2f3d4460ffe482b7fcb379743b0c890d1571
author    Ben Vanik <ben.vanik@gmail.com>  Tue Aug 02 13:17:18 2022 -0700
committer GitHub <noreply@github.com>      Tue Aug 02 13:17:18 2022 -0700
tree      aa9e2381a9a7945e66ce3ca86554bff43b94226d
parent    d46c881b25be32cad3c505798dc7f5680a3f2240
parent    b1688f4991a9e22ac024902ac35ea737efeb03d3
Merge pull request #9754 from iree-org/benvanik-timepoint-to-hal

Adds compiler/runtime support for lowering the asynchronous stream dialect ops into HAL ops, materializing a timeline (today just one, but multiple in the future), and passing through to the runtime HAL module. This allowed the removal of the existing placeholder submit_and_wait op and enables queue-ordered allocations to be implemented in the HAL.

This is likely not the final design, but it unblocks work on coroutines, queue-ordered allocations, WebGPU, and plumbing fences through the user-facing API/native ABI. Future refinements may add overrides that use semaphores instead of fences to avoid fence heap allocations when they are not required; for most single-function classic ML models, once fences are plumbed through the ABI no internal fences are required. The current timeline materialization also strictly orders all invocations, whereas we should be able to elide that ordering when there is no internal program state to protect.

Because the various HAL backends all need work (CUDA/ROCm in particular need massive work), nearly everything is synchronized exactly as it was before, but that synchronization now happens in the IR we emit, and we can selectively start supporting async per target.

Progress on #1285 (just need to put fences on the ABI!).
Progress on #8093 (added yieldable fence waits).
Progress on #9572 (added compiler/runtime glue for queue-ordered allocs).
IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. With that said, we welcome any kind of feedback on any of our communication channels!
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.