commit    29647b35df4eb71919ee160ed53d64fe6b2ab1ec
author    MaheshRavishankar <1663364+MaheshRavishankar@users.noreply.github.com>  Fri May 19 10:54:42 2023 -0700
committer GitHub <noreply@github.com>  Fri May 19 13:54:42 2023 -0400
tree      a31c75e4087d7e61934c010fa941961ce44ee79d
parent    aa28b4a786d137716866239e3b98d3a2631782e3
CPU ukernels as bitcode (x86-only for now) (#13460)

While #13433 enabled the use of microkernels within codegen backends via the plugin mechanism, here the ukernel code is compiled into a bitcode library. This bitcode library is linked with the generated code at compilation time, and the lowering to LLVM inlines the exact microkernel needed from the library for a particular architecture. To enable this end-to-end flow, the following changes are needed:

- Add an enum attribute to HAL, `IREE::HAL::CallingConvention`, that specifies the calling convention to use for the microkernel call. `Default` leaves the parameters as is; `ParameterStruct` packs all the returns and arguments into a parameter struct to mimic ABIs like https://github.com/openxla/iree/blob/6cf092d022810d4347353b23e5ce2688a166dd67/runtime/src/iree/builtins/ukernel/mmt4d.h#L16
- Add a couple of patterns to the `ConvertToLLVM` pass to handle the lowering of the function definition and the function call, in keeping with the specified ABI.
- Allow `hal.import.fields` to specify `processor_data` and `processor_id` on the ukernel function definition. This generates the code to forward this information to the microkernels (similar to what is done for external calls using the plugin mechanism).
- Propagate the target CPU features from the dispatch's `hal.executable.target` into the microkernel call. This allows the LLVM passes to fold the branching used to pick the right microkernel function and effectively inline it.

Co-authored-by: Benoit Jacob <benoitjacob@google.com>
IREE (Intermediate Representation Execution Environment, pronounced as “eerie”) is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
IREE is still in its early phase. We have settled on the overarching infrastructure and are actively improving various software components as well as project logistics. It is not yet ready for everyday use and is made available without any support at the moment. With that said, we welcome any kind of feedback on any of our communication channels!
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.