- 8adae37 [cuda][hip] Add support for semaphore multi wait (#16638) by Lei Zhang · 1 year, 2 months ago
- 9d6d99f faster narrow mmt4d ukernels on x86 (#16655) by Benoit Jacob · 1 year, 2 months ago
- 4f1f055 mmt4d ukernel: use fewer magic macros to generate tile-functions M0-variants (#16645) by Benoit Jacob · 1 year, 2 months ago
- b994b72 Reenable accidentally disabled architecture-specific parts of `mmt4d_test` (#16654) by Benoit Jacob · 1 year, 2 months ago
- f433fd2 Using iree.abi.name consistently for arg/result names. (#16635) by Ben Vanik · 1 year, 2 months ago
- fe5e69a [cuda][hip] Shorten deferred queue worker name (#16642) by Lei Zhang · 1 year, 2 months ago
- 9dfc612 [cuda][hip] Fix worker thread and device host callback synchronization (#16621) by Boian Petkantchin · 1 year, 2 months ago
- f66d7f2 Fix enablement of mmt4d ukernel test cases based on ISA code paths built (#16637) by Benoit Jacob · 1 year, 2 months ago
- 5180ede mmt4d ukernel: simplification in generic tile funcs: stop using a stack array (#16633) by Benoit Jacob · 1 year, 2 months ago
- 8959b90 Make ukernels fallback opt-in and add a `mmt4d_info` ukernel to query the mmt4d implementation. (#16631) by Benoit Jacob · 1 year, 2 months ago
- 6ff9a3d Refactor how llvm-cpu check tests interface with ASan/TSan. (#16452) by Scott Todd · 1 year, 2 months ago
- e6397cb Change ukernels calling convention to default (#16541) by Benoit Jacob · 1 year, 2 months ago
- e991798 Unroll fixed-trip-count loops within mmt4d ukernel tile functions. (#16626) by Benoit Jacob · 1 year, 2 months ago
- 88b1d4d Replace std::iterator with our custom iterator typedefs (#16423) (#16583) by Peyman Barazandeh · 1 year, 2 months ago
- 9dc8ae4 [cuda][hip] Fix launch host func and worker thread state update (#16568) by Lei Zhang · 1 year, 2 months ago
- 862a031 Adding --task_abort_on_failure flag/API. (#16565) by Ben Vanik · 1 year, 2 months ago
- 23f2828 Adding iree-benchmark-executable tool. (#16550) by Ben Vanik · 1 year, 2 months ago
- c15b610 [EmitC] Remove the forked emitter and generate all the code in the conversion pass (#16357) by Simon Camphausen · 1 year, 2 months ago
- d500494 Add s8s4s32 dotprod microkernel (#16473) by mariecwhite · 1 year, 2 months ago
- c3b3d96 Adding hal.device.id queries to HAL devices. (#16495) by Ben Vanik · 1 year, 2 months ago
- 6d293af Retrying try-lock in synchronization_test to avoid arm64 flakes. (#16436) by Ben Vanik · 1 year, 2 months ago
- 4463f8d [python] Enable building of 3.12 wheels on Linux. (#16424) by Stella Laurenzo · 1 year, 2 months ago
- 1f3e907 ukernels: update README.md (#16358) by Benoit Jacob · 1 year, 2 months ago
- d1e1d05 [python] Add a couple more async APIs. (#16419) by Stella Laurenzo · 1 year, 2 months ago
- 00aa173 [hip] Add missing source locations and fix parsing (#16418) by Lei Zhang · 1 year, 2 months ago
- d32609e Add s8s4s32 ukernel for ARM (#16259) by mariecwhite · 1 year, 2 months ago
- c02b89e [cuda][hip] Guard against NULL cleanup callbacks (#16403) by Lei Zhang · 1 year, 2 months ago
- 7c2ec73 Fix a bug in the fastpath of iree_hal_task_semaphore_multi_wait which was doing a spurious wait. (#16404) by Stella Laurenzo · 1 year, 2 months ago
- 60ac333 [python] Add a HalDeviceLoop class for routing runtime events to futures. (#16385) by Stella Laurenzo · 1 year, 2 months ago
- c70bf22 [HAL] Remove pool assert during allocator creation (#16388) by Nithin Meganathan · 1 year, 2 months ago
- 14927d1 Replacing the ancient vm_util with function_io/function_util. (#16351) by Ben Vanik · 1 year, 3 months ago
- 9aabcb3 Add conversions for FP8 types (F8E5M2 and F8E4M3) (#16374) by Benoit Jacob · 1 year, 3 months ago
- 30901f5 Replacing the ancient vm_util with function_io/function_util. by Ben Vanik · 1 year, 3 months ago
- 49f8a61 Adding iree_io_vec_stream_t. by Ben Vanik · 1 year, 3 months ago
- 29a7462 Adding iree_io_stdio_stream_t. by Ben Vanik · 1 year, 3 months ago
- 0a2483a Splitting iree_io_memory_stream_t from iree/io/stream.h. by Ben Vanik · 1 year, 3 months ago
- 9234f42 Add a number of runtime python bindings and refine the HalFence.wait() behavior. (#16371) by Stella Laurenzo · 1 year, 3 months ago
- 87bf971 Fixing implicit casting that caused 4GB fill/copy limits in local-task. (#16364) by Ben Vanik · 1 year, 3 months ago
- 10fd98b Fixes to enable clang-cl compilation of compiler/runtime. (#16299) by Ben Vanik · 1 year, 3 months ago
- 065e04a Adding support for outputting binary files from tooling. (#16291) by Ben Vanik · 1 year, 3 months ago
- 406626b [Vulkan][SPIRV] Introduce `address` vulkan device property (#16282) by Jakub Kuderski · 1 year, 3 months ago
- ef79e51 [doc] Add README in CUDA and Metal HAL drivers directory (#16275) by Lei Zhang · 1 year, 3 months ago
- a8c1e17 [doc] Expose CUDA and Metal HAL driver doc to the website (#16256) by Lei Zhang · 1 year, 3 months ago
- 5d4c0ba Nothing is unreachable (#16261) by Benoit Jacob · 1 year, 3 months ago
- 1c83020 [doc] Update docs about the CUDA HAL driver (#16234) by Lei Zhang · 1 year, 3 months ago
- 41583ca [cuda] NFC: rename cuda2 to cuda (#16232) by Lei Zhang · 1 year, 3 months ago
- cfe865f [cuda] Drop cuda1 HAL implementation code (#16188) by Lei Zhang · 1 year, 3 months ago
- a724043 Adding util inlining policy attr interface and always/never attrs. by Ben Vanik · 1 year, 3 months ago
- 37e65de Freeze all statuses emitted by calls into dynamic modules (#16066) by Quinn Dawkins · 1 year, 3 months ago
- 3b3cef9 [cuda] Switch cuda2 on and cuda1 off by default (#16107) by Lei Zhang · 1 year, 3 months ago
- a2df2cc [cuda] Collect tracing events after command buffer completion (#16158) by Lei Zhang · 1 year, 3 months ago
- dd787a7 Removing trace_replay infra and the libyaml dependency. by Ben Vanik · 1 year, 3 months ago
- c0379f3 Removing python binding support for tracing. by Ben Vanik · 1 year, 3 months ago
- ac94b2a Removing iree-run-trace/iree-benchmark-trace tools. by Ben Vanik · 1 year, 3 months ago
- 7d736b5 Ukernels: simplify the architecture-specific bitcode build. (#16126) by Benoit Jacob · 1 year, 3 months ago
- 91803de Allow specifying multiple --device= flags in tooling. (#16132) by Ben Vanik · 1 year, 3 months ago
- 4b1b8e2 Fixing task worker utilization tracing plot. (#16131) by Ben Vanik · 1 year, 3 months ago
- e3db254 Simplify how mmt4d ukernels deal with the K=0 case. (#16137) by Benoit Jacob · 1 year, 3 months ago
- 17e9529 [spirv][vulkan] Refine device query to be more descriptive (#16101) by Lei Zhang · 1 year, 4 months ago
- 869e505 Disable const-eval for parameters unit test (#16089) by Max191 · 1 year, 4 months ago
- 171e31c [cuda] Move to hal/drivers and wire up BUILD files (#14620) by Lei Zhang · 1 year, 4 months ago
- a7a7ad6 Add vm.buffer.hash and util.buffer.hash ops (#16003) by Quinn Dawkins · 1 year, 4 months ago
- c8ecc1c Reland "[spirv][vulkan] Enable device query generation and execution" (#16075) by Lei Zhang · 1 year, 4 months ago
- 282ab77 Revert "[spirv][vulkan] Enable device query generation and execution" (#16077) by Han-Chung Wang · 1 year, 4 months ago
- 852684a [spirv][vulkan] Enable device query generation and execution (#15977) by Lei Zhang · 1 year, 4 months ago
- b55ba25 Fixing/silencing some warnings that have crept in over time. (#16072) by Ben Vanik · 1 year, 4 months ago
- d21a99c Check for source location resolution function in dynamic modules (#16065) by Quinn Dawkins · 1 year, 4 months ago
- d6dad12 ukernel: unroll the s16u4 VNNI ukernel, and drop the unused N0=16 variant (#16047) by Benoit Jacob · 1 year, 4 months ago
- c35d8e9 Standardizes CMake setup of C directory trees behind a macro. (#16011) by Stella Laurenzo · 1 year, 4 months ago
- 15c306f Build functioning dev packages for IREECompiler and IREERuntime. (#16008) by Stella Laurenzo · 1 year, 4 months ago
- 92df2b4 Make iree.compiler.api.Output.map_memory() retain its backing reference. (#15975) by Stella Laurenzo · 1 year, 4 months ago
- f97aa4d [HIP] Adds support for native executable and cache (#15937) by Nithin Meganathan · 1 year, 4 months ago
- 46d9347 Tweaks to e2e matmul tests (#15930) by bjacob · 1 year, 5 months ago
- f81f361 Removing transfer_range from the HAL device vtable. (#15919) by Ben Vanik · 1 year, 5 months ago
- 80e70ca Replacing hal.ex.shared_device with hal.devices.* ops. (#15916) by Ben Vanik · 1 year, 5 months ago
- 605aca9 [CPU][ArmSME] Add (initial) tiling and lowering pipeline for ArmSME (#15794) by Benjamin Maxwell · 1 year, 5 months ago
- 62c4f98 Replacing cpuinfo on Mac and adding support for E/P cores. (#15891) by Ben Vanik · 1 year, 5 months ago
- fbbccde Moving memory info queries to iree/base/internal/memory.h. (#15882) by Ben Vanik · 1 year, 5 months ago
- 8d01698 [vulkan] Enable initial executable linking (#15802) by Lei Zhang · 1 year, 5 months ago
- 69398bc [python] Add python bindings for creating IRPA files. (#15868) by Stella Laurenzo · 1 year, 5 months ago
- 51a9225 Cleanup parts of the Bazel build and document usage. (#15727) by Scott Todd · 1 year, 5 months ago
- 38878ee Relax NCCL version constraints (#14633) by Boian Petkantchin · 1 year, 5 months ago
- 31a5dcb Fix embedded builds of IREE. (#15761) (#15768) by Thomas Preud'homme · 1 year, 5 months ago
- ac9d6d5 Adding optional exported function declaration string to bytecode modules. (#15782) by Ben Vanik · 1 year, 5 months ago
- 888040b Make metal depends on flags. (#15762) by Rechie Kho · 1 year, 5 months ago
- 3266925 ukernel: add support for s16xs8 data types for ukernel (#15771) by Lun Dong · 1 year, 5 months ago
- 4973ef2 Batching parameter load operations and cleaning up gather/scatter. (#15706) by Ben Vanik · 1 year, 5 months ago
- fce839f Adding IREE parameter archive format and tooling support. (#15670) by Ben Vanik · 1 year, 5 months ago
- 2bb8019 Adding iree_io_stream_t and memory stream implementation. (#15668) by Ben Vanik · 1 year, 5 months ago
- dc6f0cd Adding multiple_modules sample (and fixing bugs). (#15653) by Ben Vanik · 1 year, 5 months ago
- f83ca74 Add newline in parameter help print (#15647) by Quinn Dawkins · 1 year, 5 months ago
- 5b2cb64 Fix intermittent failure - functions with `_try_` in their name may fail spuriously. (#15636) by bjacob · 1 year, 5 months ago
- 916dae9 Removing io_parameters.read/write in favor of gather/scatter. (#15607) by Ben Vanik · 1 year, 5 months ago
- 1012586 Fix MSVC build by disabling AVX-512-BF16 in non-latest MSVC versions. (#15589) by bjacob · 1 year, 6 months ago
- 0908ff8 ukernels: add `bf16 * bf16 -> bf16` optimized tile functions for x86 and arm64. (#15543) by bjacob · 1 year, 6 months ago
- 16e4346 ukernel test improvements (#15542) by bjacob · 1 year, 6 months ago
- 9393f94 Avoid stack allocation for VM->HAL iree_hal_fence_join calls. (#15569) by Ben Vanik · 1 year, 6 months ago
- dc506b8 Fix redundant IREE_UK_STATIC_ASSERT macro definition (#15567) by bjacob · 1 year, 6 months ago
- 4546b95 Simplify ukernel headers now that C++ is out of the picture (#15564) by bjacob · 1 year, 6 months ago
- 199ecee Simplify ukernel headers now that out-of-line asm is out of the picture (#15563) by bjacob · 1 year, 6 months ago