CPU ukernels as bitcode (x86-only for now) (#13460)
While #13433 enabled the use of microkernels within codegen backends
through the plugin mechanism, this change compiles the ukernel code into
a bitcode library that is linked with the generated code at compilation
time. The lowering to LLVM then inlines the exact microkernel needed for
a particular architecture from that library. To enable this end-to-end
flow, the following changes are needed:
- Add an enum attribute to HAL, `IREE::HAL::CallingConvention`, that
specifies which calling convention to use for the microkernel call.
`Default` leaves the parameters as is; `ParameterStruct` packs all the
returns and arguments into a parameter struct to mimic ABIs like
https://github.com/openxla/iree/blob/6cf092d022810d4347353b23e5ce2688a166dd67/runtime/src/iree/builtins/ukernel/mmt4d.h#L16
- Add a couple of patterns to the `ConvertToLLVM` pass to handle the
lowering of the function definition and function call in keeping with
the specified ABI.
- Allow `hal.import.fields` on the ukernel function definition to
specify `processor_data` and `processor_id`. This generates the code to
forward this information to the microkernels (similar to what is done
for external calls using the plugin mechanism).
- Propagate the target CPU features in `hal.executable.target` of the
dispatch into the microkernel call. This allows the LLVM passes to walk
through the branching used to pick the right microkernel function and
effectively inline the selected function.
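
As a rough illustration of the two calling conventions described above, the
following C sketch contrasts a `Default`-style entry point with a
`ParameterStruct`-style one. All names here (`example_ukernel_params_t`,
`example_ukernel_default`, `example_ukernel_packed`) and the struct fields are
hypothetical and chosen only for illustration; they are not the actual IREE
ukernel ABI, whose structs are defined in the headers linked above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical parameter struct (NOT the real IREE layout): every argument
// and every return value of the call becomes a field.
typedef struct example_ukernel_params_t {
  const int32_t* lhs;  // argument
  const int32_t* rhs;  // argument
  int32_t* out;        // argument
  int64_t size;        // argument
  int32_t status;      // return value, written by the callee
} example_ukernel_params_t;

// Default-style convention: parameters are passed as is and the result is
// returned directly.
int32_t example_ukernel_default(const int32_t* lhs, const int32_t* rhs,
                                int32_t* out, int64_t size) {
  for (int64_t i = 0; i < size; ++i) out[i] = lhs[i] + rhs[i];
  return 0;
}

// ParameterStruct-style convention: the caller packs arguments and returns
// into one struct and passes a single pointer to it.
void example_ukernel_packed(example_ukernel_params_t* params) {
  assert(params != NULL);
  params->status = example_ukernel_default(params->lhs, params->rhs,
                                           params->out, params->size);
}
```

With the packed form, a caller fills in the struct, calls
`example_ukernel_packed(&params)`, and reads `params.status` afterwards, which
is the shape the lowering has to produce when `ParameterStruct` is requested.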
Co-authored-by: Benoit Jacob <benoitjacob@google.com>
diff --git a/build_tools/bazel/iree_bitcode_library.bzl b/build_tools/bazel/iree_bitcode_library.bzl
index a38fa45..b4f9b38 100644
--- a/build_tools/bazel/iree_bitcode_library.bzl
+++ b/build_tools/bazel/iree_bitcode_library.bzl
@@ -40,23 +40,25 @@
for bitcode_src in srcs:
bitcode_out = "%s_%s.bc" % (name, bitcode_src)
bitcode_files.append(bitcode_out)
+ system_headers = ["immintrin.h"]
native.genrule(
name = "gen_%s" % (bitcode_out),
- srcs = [bitcode_src],
+ srcs = [bitcode_src] + hdrs + [builtin_headers_dep],
outs = [bitcode_out],
cmd = " && ".join([
" ".join([
"$(location %s)" % (clang_tool),
- "-isystem $(BINDIR)/%s" % (builtin_headers_path),
+ "-isystem $(BINDIR)/%s" % builtin_headers_path,
" ".join(copts),
" ".join(["-D%s" % (define) for define in defines]),
+ " ".join(["-I $(BINDIR)/runtime/src"]),
+ " ".join(["-I runtime/src"]),
"-o $(location %s)" % (bitcode_out),
"$(location %s)" % (bitcode_src),
]),
]),
- tools = hdrs + data + [
+ tools = data + [
clang_tool,
- builtin_headers_dep,
],
message = "Compiling %s to %s..." % (bitcode_src, bitcode_out),
output_to_bindir = 1,
@@ -81,3 +83,40 @@
output_to_bindir = 1,
**kwargs
)
+
+def iree_link_bitcode(
+ name,
+ bitcode_files,
+ out = None,
+ link_tool = "@llvm-project//llvm:llvm-link",
+ **kwargs):
+ """Links LLVM bitcode files into a single bitcode library via llvm-link.
+
+ Args:
+ name: Name of the target.
+ bitcode_files: bitcode files to link together.
+ out: output file name (defaults to name.bc).
+ link_tool: llvm-link tool used for linking bitcode files.
+ **kwargs: any additional attributes to pass to the underlying rules.
+ """
+
+ bitcode_files_qualified = [(("//" + native.package_name() + "/" + b) if b.count(":") else b) for b in bitcode_files]
+
+ if not out:
+ out = "%s.bc" % (name)
+ native.genrule(
+ name = name,
+ srcs = bitcode_files_qualified,
+ outs = [out],
+ cmd = " && ".join([
+ " ".join([
+ "$(location %s)" % (link_tool),
+ "-o $(location %s)" % (out),
+ " ".join(["$(locations %s)" % (src) for src in bitcode_files_qualified]),
+ ]),
+ ]),
+ tools = [link_tool],
+ message = "Linking bitcode library %s to %s..." % (name, out),
+ output_to_bindir = 1,
+ **kwargs
+ )