- 9902af7 Fix executator_demo main (#14508) by CindyLiu · 1 year, 9 months ago
- 73b00a0 In Python API create devices with default collectives channel provider (#14384) by Boian Petkantchin · 1 year, 9 months ago
- 3459230 Ukernels: mmt4d paths for the arm64 bf16 extension (#14495) by bjacob · 1 year, 9 months ago
- 80d06c1 [CUDA] Export external CUDA buffers to external device allocation (#14491) by Eugene Zhulenev · 1 year, 9 months ago
- 7f5857d Ukernels: mmt4d paths for arm64 fp16 extensions (#14490) by bjacob · 1 year, 9 months ago
- 38c36c6 Implement Python bindings for async HAL objects. (#14476) by Stella Laurenzo · 1 year, 9 months ago
- 1bb26cb [cuda] Implement HAL semaphore using CUevent objects (#14426) by Lei Zhang · 1 year, 9 months ago
- 2340900 Add `--iree-llvmcpu-skip-intermediate-roundings`, on by default. (#14478) by bjacob · 1 year, 9 months ago
- c2fbeaf ukernels: mmt4d: basic `f16` kernels converting to `f32` on x86-64 and arm64 (#14467) by bjacob · 1 year, 9 months ago
- 53e26c4 Move elementwise ukernels into vmvx/, the only user. (#14430) by bjacob · 1 year, 9 months ago
- 70d67c2 Clarify which ukernels are used on which backends. (#14455) by bjacob · 1 year, 9 months ago
- c003beb Make a couple of things configurable in setup.py scripts. (#14443) by Stella Laurenzo · 1 year, 9 months ago
- 280b14b Remove explicit Bazel public visibility from targets. (#14419) by Scott Todd · 1 year, 10 months ago
- 90499dd Adding native (non-VMA) Vulkan allocator behind a flag. (#14389) by Ben Vanik · 1 year, 10 months ago
- 80fd032 Improved f16/bf16 <-> f32 software conversions. (#14392) by bjacob · 1 year, 10 months ago
- b7c942a Make system_api.load_vm_flatbuffer_file mmap the file. (#14333) by Stella Laurenzo · 1 year, 10 months ago
- a68cc42 Change nanobind include to quoted form (#14376) by Jacques Pienaar · 1 year, 10 months ago
- e1e05d8 Adding iree_hal_vulkan_native_buffer_t. (#14374) by Ben Vanik · 1 year, 10 months ago
- d01a83c Make ukernel code inlinable down to arch-specific function pointer selection. (#14283) by bjacob · 1 year, 10 months ago
- edf18a5 Implement a timeout for benchmarking through python bindings. (#14261) by Kojo Acquah · 1 year, 10 months ago
- ccf886b [HAL] Allow iree_hal_buffer_view_shape to accept NULL out_shape (#14298) by Boian Petkantchin · 1 year, 10 months ago
- 23748f6 Add `iree-dump-module` to the python iree-runtime wheel. (#14243) by Scott Todd · 1 year, 10 months ago
- a37c817 ARM: detect more CPU features (#14253) by bjacob · 1 year, 10 months ago
- fbbd1ee Ukernels: basic support for float16 and bfloat16. (#14239) by bjacob · 1 year, 10 months ago
- 2c608e2 Add bfloat16 conversion helpers (#14238) by bjacob · 1 year, 10 months ago
- acf0b27 Fix the MacOS FindPython build issue. (#14233) by Stella Laurenzo · 1 year, 10 months ago
- abe91fa Port iree.runtime to nanobind. (#14214) by Stella Laurenzo · 1 year, 10 months ago
- 8bd48d9 Add nanobind to build requirements. (#14217) by Stella Laurenzo · 1 year, 10 months ago
- 1799e24 Add iree-cpuinfo to the python iree-runtime wheel. (#14209) by Stella Laurenzo · 1 year, 10 months ago
- e737184 Tag CTS tests with the driver they use (#14170) by Geoffrey Martin-Noble · 1 year, 10 months ago
- be24f02 Use Black to format Python files (#14161) by Jakub Kuderski · 1 year, 10 months ago
- 84b379e Fix the ukernels system build on older toolchains. (#14155) by bjacob · 1 year, 10 months ago
- 05c9b0d Use cuGetProcAddress to load CUDA entry points (#14056) by Trevor Morris · 1 year, 10 months ago
- 89a41d9 Adds memory mapping and alignment controls to VmModule construction. (#14153) by Stella Laurenzo · 1 year, 10 months ago
- df119bd Test the bring-your-own-LLVM path. (#14035) by bjacob · 1 year, 11 months ago
- e9061b3 Revert "Add VmModule.mmap() to Python API. (#14124)" by Stella Laurenzo · 1 year, 11 months ago
- 60b0764 Allow defining `IREE_HOST_SIZE_T` to other types. (#14040) by Scott Todd · 1 year, 11 months ago
- 3345b76 Add VmModule.mmap() to Python API. (#14124) by Stella Laurenzo · 1 year, 11 months ago
- 7ed4f4b Adding a console tracing provider and support for external ones. (#14113) by Ben Vanik · 1 year, 11 months ago
- 59e67a7 [metal] Enable compiling to Metal library when possible by Lei Zhang · 2 years, 1 month ago
- bb7a83e Create empty files without unnecessary timestamp updates. (#14097) by bjacob · 1 year, 11 months ago
- 14308b1 Cleaning up the iree/base/tracing.h header a bit. (#14089) by Ben Vanik · 1 year, 11 months ago
- b128e63 Fixing empty trace replay yaml lists for args/results. (#14088) by Ben Vanik · 1 year, 11 months ago
- cc43680 Cleaning up the tracing.h mechanism to enable alternative implementations. (#14044) by Ben Vanik · 1 year, 11 months ago
- 967ab3b Adding cuda.device :: compute_capability_major/minor queries. (#14033) by Ben Vanik · 1 year, 11 months ago
- e4c27f5 Support non-default sets of enabled LLVM CPU targets. (#13983) by bjacob · 1 year, 11 months ago
- 1b8e95c Allowing for sync allocations to be deallocated via queue-ordered deallocas. (#14029) by Ben Vanik · 1 year, 11 months ago
- 860026f Declare exported headers to Bazel to fix `bazel build --incompatible_no_implicit_file_export` (#13982) by Levon Ter-Grigoryan · 1 year, 11 months ago
- 5c0fc03 Adding a fallback for when CUDA memory pools are unsupported. (#14018) by Ben Vanik · 1 year, 11 months ago
- 7f10fe2 Adding IREE_HAL_EXTERNAL_BUFFER_TYPE_DEVICE_ALLOCATION. (#14012) by Ben Vanik · 1 year, 11 months ago
- 38ae184 _MSC_VER comparison was the wrong way (#14007) by bjacob · 1 year, 11 months ago
- d9b9471 Retain the parent channel on a split in iree_hal_nccl_channel_t. (#13977) by Ben Vanik · 1 year, 11 months ago
- 8f9e962 Adding support for async memory pool allocations in the CUDA HAL. (#13440) by Ben Vanik · 1 year, 11 months ago
- bf3e1a2 Resetting collective batch when the CUDA command buffer arena is set. (#13978) by Ben Vanik · 1 year, 11 months ago
- ad321b6 Rollup of HAL/runtime/infra changes for WebGPU HAL. (#13953) by Scott Todd · 1 year, 11 months ago
- 571f28a Adding task system utilization tracing. (#13941) by Ben Vanik · 1 year, 11 months ago
- 5efe2d7 build cleanups (#13947) by bjacob · 1 year, 11 months ago
- 60b623a Improving iree_arena_t/iree_resource_set_t ASAN debugging. (#13939) by Ben Vanik · 1 year, 11 months ago
- bf8588e Microkernels: add arm64 bitcode. Test everywhere. (#13846) by bjacob · 1 year, 11 months ago
- 446a542 Correct 32bit/64bit separation in ukernel code. (#13878) by bjacob · 1 year, 11 months ago
- 295ded2 Drop unused `_none` values in ukernel enums (#13877) by bjacob · 1 year, 11 months ago
- 950e172 Deprecate MHLO input conversion pipeline (#13870) by Jakub Kuderski · 1 year, 11 months ago
- a5a532a Fix arm64 inline asm, was still referencing hardcoded register as in old out-of-line asm. (#13845) by bjacob · 1 year, 11 months ago
- 041b4e8 Separate architecture generic<->specific bitcode (#13825) by bjacob · 1 year, 11 months ago
- 67555c0 Drop conditionals and configured headers from the ukernels build (#13834) by bjacob · 1 year, 11 months ago
- bd5174b `iree_c_embed_data` improvements (#13814) by bjacob · 1 year, 11 months ago
- a8a70fb Add Promise wait API and loop_emscripten wait_* cmds. (#13669) by Scott Todd · 1 year, 11 months ago
- c1d499e Use correct pyobject for ref counting in `VmModule` pybindings (#13759) by Kojo Acquah · 1 year, 11 months ago
- cd7293e Reimplement ukernel arch-specific code path fallbacks as weak symbols. (#13715) by bjacob · 2 years ago
- d42d1d4 set -target, not -march (following up on #13708) (#13709) by bjacob · 2 years ago
- bd806c6 More fixes post #13460, #13703. (#13708) by bjacob · 2 years ago
- 6928af8 Removing errant printf in NCCL version check. by Ben Vanik · 2 years ago
- 81dcabe Print NCCL warning to stderr and add a newline. (#13707) by Stella Laurenzo · 2 years ago
- 2496f8d Windows and macOS fixes following #13460. (#13703) by Scott Todd · 2 years ago
- 29647b3 CPU ukernels as bitcode (x86-only for now) (#13460) by MaheshRavishankar · 2 years ago
- aa28b4a Add missing `inline` keywords to public header functions (#13689) by Niklas Haas · 2 years ago
- 90ed2d0 Adding util.cast/!util.object and lowering to vm.cast.* ops. (#13687) by Ben Vanik · 2 years ago
- 7016b8c Support mhlo.collective_permute with NCCL (#13502) by Trevor Morris · 2 years ago
- 6f81ceb Add module dependencies via python bindings (#13472) by Eugene Zhulenev · 2 years ago
- a50bc65 Adding export attribute reflection in native VM modules. (#13617) by Ben Vanik · 2 years ago
- f396c05 Moving cached rodata buffers to bytecode modules. (#13616) by Ben Vanik · 2 years ago
- 26d9eb8 Removing frame requirement from iree_vm_module_resolve_source_location. (#13618) by Ben Vanik · 2 years ago
- 41af5a1 return 0 in ukernels (#13613) by bjacob · 2 years ago
- 9e9d709 Fixing vm.switch.* op encoding. (#13611) by Ben Vanik · 2 years ago
- e7b8111 Swapping context/params order on CPU import functions. (#13600) by Ben Vanik · 2 years ago
- bb21d92 Fix many broken links across code and docs. (#13592) by Scott Todd · 2 years ago
- 17bcb02 Adding collective channel splitting to flow/stream/hal. (#13578) by Ben Vanik · 2 years ago
- dd977b1 Bumping NCCL to 2.18.1 in order to get ncclCommSplit. (#13569) by Ben Vanik · 2 years ago
- ef2bb52 Adding a VM implementation detail around expected import signatures. (#13562) by Ben Vanik · 2 years ago
- 03110de Skip command buffer copy/fill/dispatch when they are known no-op. (#13540) by Ben Vanik · 2 years ago
- cc0c7a8 Adding vm.round.fXX.even op. (#13525) by Ben Vanik · 2 years ago
- 4133b6e Removing VM verifier checks on return registers. (#13511) by Ben Vanik · 2 years ago
- 3dc368e Builtin ukernels as system/standalone plugins (#13433) by bjacob · 2 years ago
- 9e58489 Cleanup MPI error handling. (#13315) by Calin Cascaval · 2 years ago
- e040486 [NCCL] check version first before loading symbols (#13432) by Okwan Kwon · 2 years ago
- 8aa35a4 Removing asserts from the exported CUDA device methods. (#13429) by Ben Vanik · 2 years ago
- 93781b3 Pass IREE_UK_FLAG_MMT4D_ACCUMULATE_BIT_POS as immediate (#13410) by bjacob · 2 years ago
- 75cbdf8 Removing iree_hal_command_buffer_dyn_cast from the HAL. (#13408) by Ben Vanik · 2 years ago
- 2fb5a54 Add presubmit check for BUILD.bazel files (#13380) by Tori Baker · 2 years ago
- 7520cad ukernel/mmt4d/arm64: convert out-of-line asm to intrinsics and inline asm. (#13383) by bjacob · 2 years ago