Enabling linking in the ROCM/CUDA compiler targets. (#18936)
This does exactly what the LLVMCPU side does, which is bad for compile
time (it serializes LLVM codegen) but much better for runtime. Future
improvements should move LLVM codegen to the linking phase so that it
can happen in parallel, and then perform the linking using LLVM's
linker (each executable turned into a .o and then combined into a .so,
or linked as last-level bitcode if we want the only serialized step to
be bitcode to machine code). This is definitely a compile-time
regression, but we can't keep pessimizing runtime.
diff --git a/experimental/web/sample_static/device_sync.c b/experimental/web/sample_static/device_sync.c
index 3fbe3ee..f072903 100644
--- a/experimental/web/sample_static/device_sync.c
+++ b/experimental/web/sample_static/device_sync.c
@@ -15,7 +15,7 @@
// Register the statically linked executable library.
const iree_hal_executable_library_query_fn_t libraries[] = {
- mnist_linked_llvm_cpu_library_query,
+ mnist_linked_library_query,
};
iree_hal_executable_loader_t* library_loader = NULL;
iree_status_t status = iree_hal_static_library_loader_create(