Batching parameter load operations and cleaning up gather/scatter. (#15706)

This makes loads look like gathers/scatters and allows us to move the (relatively) tricky concurrency scheduling logic to the runtime. A single load operation can now return any number of parameters with unique storage buffers (hopefully imported/zero-copy) so long as they have matching buffer parameters (which in general they all do). The core logic for scheduling the batched operations is now shared so that load/gather/scatter all go down the same path, meaning that as we add new parameter types and optimize scheduling we only have one code path to tweak. Some minor optimizations have been done to elide batch overhead, but many have been deferred: compared to staging even 10MB of parameters, the overhead in the current profile is in the noise.

The standalone read/write methods were removed to simplify the compiler<->runtime interface and implementations of `iree_io_parameter_provider_t`. Currently the only overhead incurred is an additional queue join barrier that we can optimize away in the future in most cases.

Since load/gather now use the same code path, we shouldn't have correctness issues unique to any particular path and can turn the gather path back on. It has much less overhead in the compiler/vmfb, which otherwise needs to handle independent buffers per parameter. We can eventually optimize the load path to batch device buffer allocations, but the compiler/vmfb still needs to treat each as independent and we won't get savings there. The rule is that the unified memory model should only be used when building a vmfb that targets devices that can do zero-copy loads from memory mapped files - every other case should use discrete.

Progress on #15521. Progress on #15522. Works around several issues in #15674.
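As a rough illustration of the "one shared path" idea described above, here is a minimal Python sketch. It is not the IREE runtime API: `ToyProvider`, `Span`, and `_run_batch` are hypothetical names made up for this example, plain `bytearray`s stand in for device buffers and parameter storage, and the "scheduling" is just a sequential loop.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Span:
    """Hypothetical description of one parameter transfer in a batch."""
    key: str            # parameter name
    param_offset: int   # byte offset within the parameter's storage
    buffer_offset: int  # byte offset within the caller's buffer (gather/scatter)
    length: int         # number of bytes to transfer


# One copy: (source, source offset, destination, destination offset, length).
Copy = Tuple[bytearray, int, bytearray, int, int]


class ToyProvider:
    """Toy stand-in for a parameter provider (not the real IREE interface)."""

    def __init__(self, storage: Dict[str, bytearray]):
        self.storage = storage

    def _run_batch(self, copies: List[Copy]) -> None:
        # The single shared path: load, gather, and scatter all end up here,
        # so any scheduling improvement would only need to be made once.
        for src, src_off, dst, dst_off, n in copies:
            if src_off + n > len(src) or dst_off + n > len(dst):
                raise ValueError("span out of range")
            dst[dst_off:dst_off + n] = src[src_off:src_off + n]

    def load(self, spans: List[Span]) -> List[bytearray]:
        # Load: one fresh buffer per parameter, filled by the shared path.
        buffers = [bytearray(s.length) for s in spans]
        self._run_batch([
            (self.storage[s.key], s.param_offset, buf, 0, s.length)
            for s, buf in zip(spans, buffers)
        ])
        return buffers

    def gather(self, target: bytearray, spans: List[Span]) -> None:
        # Gather: many parameters packed into one caller-provided buffer.
        self._run_batch([
            (self.storage[s.key], s.param_offset, target, s.buffer_offset, s.length)
            for s in spans
        ])

    def scatter(self, source: bytearray, spans: List[Span]) -> None:
        # Scatter: ranges of one caller buffer written back to parameters.
        self._run_batch([
            (source, s.buffer_offset, self.storage[s.key], s.param_offset, s.length)
            for s in spans
        ])
```

In this sketch the three public methods differ only in how they build the list of copies, which is the property the change above is after; the real runtime additionally deals with queues, barriers, and zero-copy imports that are not modeled here.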
IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
IREE is still in its early phase. We have settled down on the overarching infrastructure and are actively improving various software components as well as project logistics. It is still quite far from ready for everyday use and is made available without any support at the moment. With that said, we welcome any kind of feedback on any communication channels!
See our website for more information.
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.