CPU ukernels as bitcode (x86-only for now) (#13460)

While #13433 enabled the use of micro kernels within codegen backends
using the plugin mechanism, this change compiles the ukernel code into a
bitcode library that is linked with the generated code at compilation
time. The lowering to LLVM inlines the exact micro kernel needed from
the micro kernel library for a particular architecture. To enable this
end-to-end flow, the following changes are needed:

- Add an enum attribute to HAL, `IREE::HAL::CallingConvention`, that
allows specifying which calling convention to use for the micro kernel
call. `Default` leaves the params as is; `ParameterStruct` packs all the
returns and arguments into a parameter struct to mimic ABIs like
https://github.com/openxla/iree/blob/6cf092d022810d4347353b23e5ce2688a166dd67/runtime/src/iree/builtins/ukernel/mmt4d.h#L16
- Add a couple of patterns to the `ConvertToLLVM` pass to handle the
lowering of the function definition and function call in keeping with
the specified ABI.
- Allow `hal.import.fields` to specify `processor_data` and
`processor_id` on the ukernel function definition. This generates the
code to forward this information to the micro kernels (similar to what
is done for external calls using the plugin mechanism).
- Propagate the target CPU features in `hal.executable.target` of the
dispatch into the micro kernel call. This allows the LLVM passes to walk
through the branching used to pick the right micro kernel function and
effectively inline the selected function.
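
As a rough illustration of the two calling conventions, here is a
minimal C sketch. All names (`example_dot_default`,
`example_dot_params_t`, `example_dot_param_struct`) and the field layout
are hypothetical and are not part of IREE; the sketch only assumes that
`ParameterStruct` means "one pointer to a struct holding every return
and argument".

```c
#include <stdint.h>

// Hypothetical ukernel, not an IREE API: dot product of two float arrays.

// `Default` convention: arguments and the return value are left as is.
static float example_dot_default(const float* lhs, const float* rhs,
                                 int64_t size) {
  float acc = 0.0f;
  for (int64_t i = 0; i < size; ++i) acc += lhs[i] * rhs[i];
  return acc;
}

// `ParameterStruct` convention: the return value and all arguments are
// packed into one struct, and the callee receives a pointer to it.
// The field order here is an assumption made for illustration.
typedef struct example_dot_params_t {
  float result;      // packed return value, written by the callee
  const float* lhs;  // packed arguments
  const float* rhs;
  int64_t size;
} example_dot_params_t;

static void example_dot_param_struct(example_dot_params_t* params) {
  params->result =
      example_dot_default(params->lhs, params->rhs, params->size);
}
```

A caller using the struct form fills in `lhs`, `rhs` and `size`, passes
`&params`, and reads `params.result` afterwards.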

Co-authored-by: Benoit Jacob <benoitjacob@google.com>
diff --git a/build_tools/bazel/iree_bitcode_library.bzl b/build_tools/bazel/iree_bitcode_library.bzl
index a38fa45..b4f9b38 100644
--- a/build_tools/bazel/iree_bitcode_library.bzl
+++ b/build_tools/bazel/iree_bitcode_library.bzl
@@ -40,23 +40,25 @@
     for bitcode_src in srcs:
         bitcode_out = "%s_%s.bc" % (name, bitcode_src)
         bitcode_files.append(bitcode_out)
+        system_headers = ["immintrin.h"]
         native.genrule(
             name = "gen_%s" % (bitcode_out),
-            srcs = [bitcode_src],
+            srcs = [bitcode_src] + hdrs + [builtin_headers_dep],
             outs = [bitcode_out],
             cmd = " && ".join([
                 " ".join([
                     "$(location %s)" % (clang_tool),
-                    "-isystem $(BINDIR)/%s" % (builtin_headers_path),
+                    "-isystem $(BINDIR)/%s" % builtin_headers_path,
                     " ".join(copts),
                     " ".join(["-D%s" % (define) for define in defines]),
+                    " ".join(["-I $(BINDIR)/runtime/src"]),
+                    " ".join(["-I runtime/src"]),
                     "-o $(location %s)" % (bitcode_out),
                     "$(location %s)" % (bitcode_src),
                 ]),
             ]),
-            tools = hdrs + data + [
+            tools = data + [
                 clang_tool,
-                builtin_headers_dep,
             ],
             message = "Compiling %s to %s..." % (bitcode_src, bitcode_out),
             output_to_bindir = 1,
@@ -81,3 +83,40 @@
         output_to_bindir = 1,
         **kwargs
     )
+
+def iree_link_bitcode(
+        name,
+        bitcode_files,
+        out = None,
+        link_tool = "@llvm-project//llvm:llvm-link",
+        **kwargs):
+    """Links LLVM bitcode files into a single bitcode library via llvm-link.
+
+    Args:
+        name: Name of the target.
+        bitcode_files: bitcode files to link together.
+        out: output file name (defaults to name.bc).
+        link_tool: llvm-link tool used for linking bitcode files.
+        **kwargs: any additional attributes to pass to the underlying rules.
+    """
+
+    bitcode_files_qualified = [(("//" + native.package_name() + "/" + b) if b.count(":") else b) for b in bitcode_files]
+
+    if not out:
+        out = "%s.bc" % (name)
+    native.genrule(
+        name = name,
+        srcs = bitcode_files_qualified,
+        outs = [out],
+        cmd = " && ".join([
+            " ".join([
+                "$(location %s)" % (link_tool),
+                "-o $(location %s)" % (out),
+                " ".join(["$(locations %s)" % (src) for src in bitcode_files_qualified]),
+            ]),
+        ]),
+        tools = [link_tool],
+        message = "Linking bitcode library %s to %s..." % (name, out),
+        output_to_bindir = 1,
+        **kwargs
+    )