  1. 9902af7 Fix executator_demo main (#14508) by CindyLiu · 1 year, 9 months ago
  2. 73b00a0 In Python API create devices with default collectives channel provider (#14384) by Boian Petkantchin · 1 year, 9 months ago
  3. 3459230 Ukernels: mmt4d paths for the arm64 bf16 extension (#14495) by bjacob · 1 year, 9 months ago
  4. 80d06c1 [CUDA] Export external CUDA buffers to external device allocation (#14491) by Eugene Zhulenev · 1 year, 9 months ago
  5. 7f5857d Ukernels: mmt4d paths for arm64 fp16 extensions (#14490) by bjacob · 1 year, 9 months ago
  6. 38c36c6 Implement Python bindings for async HAL objects. (#14476) by Stella Laurenzo · 1 year, 9 months ago
  7. 1bb26cb [cuda] Implement HAL semaphore using CUevent objects (#14426) by Lei Zhang · 1 year, 9 months ago
  8. 2340900 Add `--iree-llvmcpu-skip-intermediate-roundings`, on by default. (#14478) by bjacob · 1 year, 9 months ago
  9. c2fbeaf ukernels: mmt4d: basic `f16` kernels converting to `f32` on x86-64 and arm64 (#14467) by bjacob · 1 year, 9 months ago
  10. 53e26c4 Move elementwise ukernels into vmvx/, the only user. (#14430) by bjacob · 1 year, 9 months ago
  11. 70d67c2 Clarify which ukernels are used on which backends. (#14455) by bjacob · 1 year, 9 months ago
  12. c003beb Make a couple of things configurable in setup.py scripts. (#14443) by Stella Laurenzo · 1 year, 9 months ago
  13. 280b14b Remove explicit Bazel public visibility from targets. (#14419) by Scott Todd · 1 year, 10 months ago
  14. 90499dd Adding native (non-VMA) Vulkan allocator behind a flag. (#14389) by Ben Vanik · 1 year, 10 months ago
  15. 80fd032 Improved f16/bf16 <-> f32 software conversions. (#14392) by bjacob · 1 year, 10 months ago
  16. b7c942a Make system_api.load_vm_flatbuffer_file mmap the file. (#14333) by Stella Laurenzo · 1 year, 10 months ago
  17. a68cc42 Change nanobind include to quoted form (#14376) by Jacques Pienaar · 1 year, 10 months ago
  18. e1e05d8 Adding iree_hal_vulkan_native_buffer_t. (#14374) by Ben Vanik · 1 year, 10 months ago
  19. d01a83c Make ukernel code inlinable down to arch-specific function pointer selection. (#14283) by bjacob · 1 year, 10 months ago
  20. edf18a5 Implement a timeout for benchmarking through python bindings. (#14261) by Kojo Acquah · 1 year, 10 months ago
  21. ccf886b [HAL] Allow iree_hal_buffer_view_shape to accept NULL out_shape (#14298) by Boian Petkantchin · 1 year, 10 months ago
  22. 23748f6 Add `iree-dump-module` to the python iree-runtime wheel. (#14243) by Scott Todd · 1 year, 10 months ago
  23. a37c817 ARM: detect more CPU features (#14253) by bjacob · 1 year, 10 months ago
  24. fbbd1ee Ukernels: basic support for float16 and bfloat16. (#14239) by bjacob · 1 year, 10 months ago
  25. 2c608e2 Add bfloat16 conversion helpers (#14238) by bjacob · 1 year, 10 months ago
  26. acf0b27 Fix the MacOS FindPython build issue. (#14233) by Stella Laurenzo · 1 year, 10 months ago
  27. abe91fa Port iree.runtime to nanobind. (#14214) by Stella Laurenzo · 1 year, 10 months ago
  28. 8bd48d9 Add nanobind to build requirements. (#14217) by Stella Laurenzo · 1 year, 10 months ago
  29. 1799e24 Add iree-cpuinfo to the python iree-runtime wheel. (#14209) by Stella Laurenzo · 1 year, 10 months ago
  30. e737184 Tag CTS tests with the driver they use (#14170) by Geoffrey Martin-Noble · 1 year, 10 months ago
  31. be24f02 Use Black to format Python files (#14161) by Jakub Kuderski · 1 year, 10 months ago
  32. 84b379e Fix the ukernels system build on older toolchains. (#14155) by bjacob · 1 year, 10 months ago
  33. 05c9b0d Use cuGetProcAddress to load CUDA entry points (#14056) by Trevor Morris · 1 year, 10 months ago
  34. 89a41d9 Adds memory mapping and alignment controls to VmModule construction. (#14153) by Stella Laurenzo · 1 year, 10 months ago
  35. df119bd Test the bring-your-own-LLVM path. (#14035) by bjacob · 1 year, 11 months ago
  36. e9061b3 Revert "Add VmModule.mmap() to Python API. (#14124)" by Stella Laurenzo · 1 year, 11 months ago
  37. 60b0764 Allow defining `IREE_HOST_SIZE_T` to other types. (#14040) by Scott Todd · 1 year, 11 months ago
  38. 3345b76 Add VmModule.mmap() to Python API. (#14124) by Stella Laurenzo · 1 year, 11 months ago
  39. 7ed4f4b Adding a console tracing provider and support for external ones. (#14113) by Ben Vanik · 1 year, 11 months ago
  40. 59e67a7 [metal] Enable compiling to Metal library when possible by Lei Zhang · 2 years, 1 month ago
  41. bb7a83e Create empty files without unnecessary timestamp updates. (#14097) by bjacob · 1 year, 11 months ago
  42. 14308b1 Cleaning up the iree/base/tracing.h header a bit. (#14089) by Ben Vanik · 1 year, 11 months ago
  43. b128e63 Fixing empty trace replay yaml lists for args/results. (#14088) by Ben Vanik · 1 year, 11 months ago
  44. cc43680 Cleaning up the tracing.h mechanism to enable alternative implementations. (#14044) by Ben Vanik · 1 year, 11 months ago
  45. 967ab3b Adding cuda.device :: compute_capability_major/minor queries. (#14033) by Ben Vanik · 1 year, 11 months ago
  46. e4c27f5 Support non-default sets of enabled LLVM CPU targets. (#13983) by bjacob · 1 year, 11 months ago
  47. 1b8e95c Allowing for sync allocations to be deallocated via queue-ordered deallocas. (#14029) by Ben Vanik · 1 year, 11 months ago
  48. 860026f Declare exported headers to Bazel to fix `bazel build --incompatible_no_implicit_file_export` (#13982) by Levon Ter-Grigoryan · 1 year, 11 months ago
  49. 5c0fc03 Adding a fallback for when CUDA memory pools are unsupported. (#14018) by Ben Vanik · 1 year, 11 months ago
  50. 7f10fe2 Adding IREE_HAL_EXTERNAL_BUFFER_TYPE_DEVICE_ALLOCATION. (#14012) by Ben Vanik · 1 year, 11 months ago
  51. 38ae184 _MSC_VER comparison was the wrong way (#14007) by bjacob · 1 year, 11 months ago
  52. d9b9471 Retain the parent channel on a split in iree_hal_nccl_channel_t. (#13977) by Ben Vanik · 1 year, 11 months ago
  53. 8f9e962 Adding support for async memory pool allocations in the CUDA HAL. (#13440) by Ben Vanik · 1 year, 11 months ago
  54. bf3e1a2 Resetting collective batch when the CUDA command buffer arena is set. (#13978) by Ben Vanik · 1 year, 11 months ago
  55. ad321b6 Rollup of HAL/runtime/infra changes for WebGPU HAL. (#13953) by Scott Todd · 1 year, 11 months ago
  56. 571f28a Adding task system utilization tracing. (#13941) by Ben Vanik · 1 year, 11 months ago
  57. 5efe2d7 build cleanups (#13947) by bjacob · 1 year, 11 months ago
  58. 60b623a Improving iree_arena_t/iree_resource_set_t ASAN debugging. (#13939) by Ben Vanik · 1 year, 11 months ago
  59. bf8588e Microkernels: add arm64 bitcode. Test everywhere. (#13846) by bjacob · 1 year, 11 months ago
  60. 446a542 Correct 32bit/64bit separation in ukernel code. (#13878) by bjacob · 1 year, 11 months ago
  61. 295ded2 Drop unused `_none` values in ukernel enums (#13877) by bjacob · 1 year, 11 months ago
  62. 950e172 Deprecate MHLO input conversion pipeline (#13870) by Jakub Kuderski · 1 year, 11 months ago
  63. a5a532a Fix arm64 inline asm, was still referencing hardcoded register as in old out-of-line asm. (#13845) by bjacob · 1 year, 11 months ago
  64. 041b4e8 Separate architecture generic<->specific bitcode (#13825) by bjacob · 1 year, 11 months ago
  65. 67555c0 Drop conditionals and configured headers from the ukernels build (#13834) by bjacob · 1 year, 11 months ago
  66. bd5174b `iree_c_embed_data` improvements (#13814) by bjacob · 1 year, 11 months ago
  67. a8a70fb Add Promise wait API and loop_emscripten wait_* cmds. (#13669) by Scott Todd · 1 year, 11 months ago
  68. c1d499e Use correct pyobject for ref counting in `VmModule` pybindings (#13759) by Kojo Acquah · 1 year, 11 months ago
  69. cd7293e Reimplement ukernel arch-specific code path fallbacks as weak symbols. (#13715) by bjacob · 2 years ago
  70. d42d1d4 set -target, not -march (following up on #13708) (#13709) by bjacob · 2 years ago
  71. bd806c6 More fixes post #13460, #13703. (#13708) by bjacob · 2 years ago
  72. 6928af8 Removing errant printf in NCCL version check. by Ben Vanik · 2 years ago
  73. 81dcabe Print NCCL warning to stderr and add a newline. (#13707) by Stella Laurenzo · 2 years ago
  74. 2496f8d Windows and macOS fixes following #13460. (#13703) by Scott Todd · 2 years ago
  75. 29647b3 CPU ukernels as bitcode (x86-only for now) (#13460) by MaheshRavishankar · 2 years ago
  76. aa28b4a Add missing `inline` keywords to public header functions (#13689) by Niklas Haas · 2 years ago
  77. 90ed2d0 Adding util.cast/!util.object and lowering to vm.cast.* ops. (#13687) by Ben Vanik · 2 years ago
  78. 7016b8c Support mhlo.collective_permute with NCCL (#13502) by Trevor Morris · 2 years ago
  79. 6f81ceb Add module dependencies via python bindings (#13472) by Eugene Zhulenev · 2 years ago
  80. a50bc65 Adding export attribute reflection in native VM modules. (#13617) by Ben Vanik · 2 years ago
  81. f396c05 Moving cached rodata buffers to bytecode modules. (#13616) by Ben Vanik · 2 years ago
  82. 26d9eb8 Removing frame requirement from iree_vm_module_resolve_source_location. (#13618) by Ben Vanik · 2 years ago
  83. 41af5a1 return 0 in ukernels (#13613) by bjacob · 2 years ago
  84. 9e9d709 Fixing vm.switch.* op encoding. (#13611) by Ben Vanik · 2 years ago
  85. e7b8111 Swapping context/params order on CPU import functions. (#13600) by Ben Vanik · 2 years ago
  86. bb21d92 Fix many broken links across code and docs. (#13592) by Scott Todd · 2 years ago
  87. 17bcb02 Adding collective channel splitting to flow/stream/hal. (#13578) by Ben Vanik · 2 years ago
  88. dd977b1 Bumping NCCL to 2.18.1 in order to get ncclCommSplit. (#13569) by Ben Vanik · 2 years ago
  89. ef2bb52 Adding a VM implementation detail around expected import signatures. (#13562) by Ben Vanik · 2 years ago
  90. 03110de Skip command buffer copy/fill/dispatch when they are known no-op. (#13540) by Ben Vanik · 2 years ago
  91. cc0c7a8 Adding vm.round.fXX.even op. (#13525) by Ben Vanik · 2 years ago
  92. 4133b6e Removing VM verifier checks on return registers. (#13511) by Ben Vanik · 2 years ago
  93. 3dc368e Builtin ukernels as system/standalone plugins (#13433) by bjacob · 2 years ago
  94. 9e58489 Cleanup MPI error handling. (#13315) by Calin Cascaval · 2 years ago
  95. e040486 [NCCL] check version first before loading symbols (#13432) by Okwan Kwon · 2 years ago
  96. 8aa35a4 Removing asserts from the exported CUDA device methods. (#13429) by Ben Vanik · 2 years ago
  97. 93781b3 Pass IREE_UK_FLAG_MMT4D_ACCUMULATE_BIT_POS as immediate (#13410) by bjacob · 2 years ago
  98. 75cbdf8 Removing iree_hal_command_buffer_dyn_cast from the HAL. (#13408) by Ben Vanik · 2 years ago
  99. 2fb5a54 Add presubmit check for BUILD.bazel files (#13380) by Tori Baker · 2 years ago
  100. 7520cad ukernel/mmt4d/arm64: convert out-of-line asm to intrinsics and inline asm. (#13383) by bjacob · 2 years ago