  1. 9b47492 [metal] Implement descriptor set and pipeline layout APIs by Lei Zhang · 2 years, 2 months ago
  2. 33df9dd [metal] Use Metal shared event to implement IREE semaphore APIs by Lei Zhang · 2 years, 2 months ago
  3. f9802f0 [metal] Wire up creating devices, allocators, and buffers by Lei Zhang · 2 years, 3 months ago
  4. 8a2330f [metal] Implement Metal allocator and buffer APIs by Lei Zhang · 2 years, 3 months ago
  5. 8f36024 [metal] Add build and registration for a Metal HAL driver by Lei Zhang · 2 years, 3 months ago
  6. bed2763 Clean up code of legacy benchmark suite (#14081) by Jerry Wu · 1 year, 10 months ago
  7. 85eb21b [cuda] Port over tracing utilities and use in NCCL channel (#14063) by Lei Zhang · 1 year, 10 months ago
  8. 85a1a56 [cuda] Port over native executable and its cache (#14062) by Lei Zhang · 1 year, 10 months ago
  9. 330771e [cuda] Port over channel implementation via NCCL (#14059) by Lei Zhang · 1 year, 10 months ago
  10. f4a0bdf Integrate llvm-project at 0258a53521cf and bump dependencies (#14065) by Diego Caballero · 1 year, 10 months ago
  11. 528337e [LLVMCPU] Make LLVMCPUTile prefer using the lowering_config of tiled op. (#13922) by Han-Chung Wang · 1 year, 10 months ago
  12. afdc6f6 Fixing dynamic complex number splats in stream encoding. (#14100) by Ben Vanik · 1 year, 10 months ago
  13. dd5aac1 Add dummy Linalg ops to SSVE tests (#14099) by Andrzej Warzyński · 1 year, 10 months ago
  14. 27448a8 [NFC] Omitting default N=4 from SmallVector and to_vector. (#13933) by Han-Chung Wang · 1 year, 10 months ago
  15. e1ab9aa [ci] Update fetch_cuda_toolkit.py to use 12.1.1 for releases (#14093) by Lei Zhang · 1 year, 10 months ago
  16. 074a12c [LLVMCPU] Make LLVMCPUTileAndFuse use consumer's config if possible. (#13920) by Han-Chung Wang · 1 year, 10 months ago
  17. 239e393 Check `nvidia-bleeding-edge` during image setup (#13961) by Jerry Wu · 1 year, 10 months ago
  18. 29e8ed0 Remove redundant `requires-gpu-nvidia` added in #14039 (#14076) by Geoffrey Martin-Noble · 1 year, 10 months ago
  19. bb7a83e Create empty files without unnecessary timestamp updates. (#14097) by bjacob · 1 year, 10 months ago
  20. 2bb9a54 Add `iree-hal-dump-executable-files-to` meta flag. (#14096) by Scott Todd · 1 year, 10 months ago
  21. bf09487 Stripping jax/tf/mhlo attributes from stablehlo inputs. (#14092) by Ben Vanik · 1 year, 10 months ago
  22. bd80c9a Re-fixing 6888e61c45d117f177e5c0050c83f11be9199c8f MSVC break. by Ben Vanik · 1 year, 10 months ago
  23. c1b41f2 [CI][StableHLO] Auto-advance stablehlo fork (#14087) by Jakub Kuderski · 1 year, 10 months ago
  24. 6888e61 Fixing MSVC build break from #13971. by Ben Vanik · 1 year, 10 months ago
  25. 14308b1 Cleaning up the iree/base/tracing.h header a bit. (#14089) by Ben Vanik · 1 year, 10 months ago
  26. 71fc1f9 Fixing comma-separated lists in iree-run-mlir `--Xcompiler,` args. (#14075) by Ben Vanik · 1 year, 10 months ago
  27. b128e63 Fixing empty trace replay yaml lists for args/results. (#14088) by Ben Vanik · 1 year, 10 months ago
  28. e0d36f9 Fuse iota ops with consumers always. (#14070) by MaheshRavishankar · 1 year, 10 months ago
  29. 389152b Update docs to remove iree folder that no longer exists (#13306) by Tori Baker · 1 year, 10 months ago
  30. c457c30 [GPUCheckResourceUsage] Don't choke on alloc of memref of index (#14001) by qcolombet · 1 year, 10 months ago
  31. f9d1525 Harden the MatmulTensorCore strategy (#13971) by Nicolas Vasilache · 1 year, 10 months ago
  32. 786ab7a Disable bf16-to-f32 promotion pass default (#14078) by Rob Suderman · 1 year, 10 months ago
  33. 75be6d8 Clone complex.* producers into dispatch regions (#14036) by Rob Suderman · 1 year, 10 months ago
  34. 106d68b Allowing complex types as dispatch operands. (#14037) by Ben Vanik · 1 year, 10 months ago
  35. 20186df Shortening TransformDialectStrategies to TransformStrategies. (#14073) by Ben Vanik · 1 year, 10 months ago
  36. 31c2d24 [cuda] Dump more synchronization related attributes (#14074) by Lei Zhang · 1 year, 10 months ago
  37. c104444 Fix the ignored list of markdownlint (#14064) by Jerry Wu · 1 year, 10 months ago
  38. 96d9213 Reword comments for `IREE_BUILD_DOCS`. (#14069) by Scott Todd · 1 year, 10 months ago
  39. 441ae46 [spirv] Dump spirv.module for Metal and WebGPU targets (#14066) by Lei Zhang · 1 year, 10 months ago
  40. e94415b [spirv] Dump spirv.module with dump-executable-intermediates-to= (#14058) by Lei Zhang · 1 year, 10 months ago
  41. 02cac7a [llvmgpu] Target CUDA sm_60 architecture by default (#14057) by Lei Zhang · 1 year, 10 months ago
  42. 0dbf086 Remove MHLO support (#14008) by Jakub Kuderski · 1 year, 10 months ago
  43. e24089d [WebGPU] Async, loop-based invoke and output. (#13962) by Scott Todd · 1 year, 10 months ago
  44. 45a3eb4 [cuda] Port over descriptor set and pipeline layout (#14038) by Lei Zhang · 1 year, 10 months ago
  45. 2dddf02 Correctly tag Vulkan Ampere tests as requiring sm80 (#14039) by Geoffrey Martin-Noble · 1 year, 10 months ago
  46. f71aebc [StableHLO] Make reduce lowering more robust (#14046) by Jakub Kuderski · 1 year, 10 months ago
  47. 23666a6 Add a new benchmark and document steps: Add a new unaligned matmul test that will exercise failsafes to avoid bad configurations (#14052) by Nicolas Vasilache · 1 year, 10 months ago
  48. 88920b4 Drop some patterns from IREE apply_patterns op (#14053) by Matthias Springer · 1 year, 10 months ago
  49. 8357ea4 Use upstreamed ApplyPatternsOp in buildCanonicalizationAndEnablingTransforms (#14026) by Matthias Springer · 1 year, 10 months ago
  50. 807da25 [ConvertToLLVM] Don't choke on alloc of memref of index (#14002) by qcolombet · 1 year, 10 months ago
  51. cc43680 Cleaning up the tracing.h mechanism to enable alternative implementations. (#14044) by Ben Vanik · 1 year, 11 months ago
  52. 651630e [StableHLO] Port reduce canon patterns (#14045) by Jakub Kuderski · 1 year, 11 months ago
  53. fa3a220 Fix bug in TopK pattern matching that sets K as 2nd dim rather than last. (#14041) by NatashaKnk · 1 year, 11 months ago
  54. c4e01e9 [cuda] Wire up basic creating devices, allocators, and buffers (#14011) by Lei Zhang · 1 year, 11 months ago
  55. 967ab3b Adding cuda.device :: compute_capability_major/minor queries. (#14033) by Ben Vanik · 1 year, 11 months ago
  56. 7514d3e Add pattern to map iota->sort->slice to topK (#13972) by NatashaKnk · 1 year, 11 months ago
  57. e4c27f5 Support non-default sets of enabled LLVM CPU targets. (#13983) by bjacob · 1 year, 11 months ago
  58. b3a1ac9 Bump LLVM to llvm/llvm-project@faae4d5d (#14023) by Matthias Springer · 1 year, 11 months ago
  59. 1b8e95c Allowing for sync allocations to be deallocated via queue-ordered deallocas. (#14029) by Ben Vanik · 1 year, 11 months ago
  60. 860026f Declare exported headers to Bazel to fix `bazel build --incompatible_no_implicit_file_export` (#13982) by Levon Ter-Grigoryan · 1 year, 11 months ago
  61. 5c0fc03 Adding a fallback for when CUDA memory pools are unsupported. (#14018) by Ben Vanik · 1 year, 11 months ago
  62. c941946 Exposing device configuration higher in the stack. (#14009) by Ben Vanik · 1 year, 11 months ago
  63. 3bd115a Refresh deployment-configuration website pages. (#14013) by Scott Todd · 1 year, 11 months ago
  64. c12bf50 Harden the PadOp matcher (#14024) by Nicolas Vasilache · 1 year, 11 months ago
  65. c620ab2 Clean up CapturingOpMatcher and derived classes, NFC (#14022) by Oleksandr "Alex" Zinenko · 1 year, 11 months ago
  66. f3d0369 Add codegen strategy for GPU padding (#14000) by Nicolas Vasilache · 1 year, 11 months ago
  67. 7f10fe2 Adding IREE_HAL_EXTERNAL_BUFFER_TYPE_DEVICE_ALLOCATION. (#14012) by Ben Vanik · 1 year, 11 months ago
  68. 0e03852 Test with ASAN in bytecode modules (#14005) by bjacob · 1 year, 11 months ago
  69. 6a87b03 Add tags to website pages. (#14006) by Scott Todd · 1 year, 11 months ago
  70. 38ae184 _MSC_VER comparison was the wrong way (#14007) by bjacob · 1 year, 11 months ago
  71. 975ba03 Adds support for mixed precision NVIDIA A100 Tensor Cores (F32 <= F16 * F16 + F32) (#13857) by Manish Gupta · 1 year, 11 months ago
  72. 6a61d9f [cuda] Dump whether the device has integrated memory (#13986) by Lei Zhang · 1 year, 11 months ago
  73. ef52dbf Restructure website sections and navigation. (#13991) by Scott Todd · 1 year, 11 months ago
  74. d9b9471 Retain the parent channel on a split in iree_hal_nccl_channel_t. (#13977) by Ben Vanik · 1 year, 11 months ago
  75. 8f9e962 Adding support for async memory pool allocations in the CUDA HAL. (#13440) by Ben Vanik · 1 year, 11 months ago
  76. aced620 Lowering flow.tensor.alloca (renamed) to stream.async.alloca. (#13998) by Ben Vanik · 1 year, 11 months ago
  77. 4e48c13 Moving builtins lower in the pipeline and adding option to force. (#13994) by Ben Vanik · 1 year, 11 months ago
  78. 1299742 [cuda] Port over allocator and buffer implementation (#13985) by Lei Zhang · 1 year, 11 months ago
  79. 0dc5de9 Route demotion flag to Input options (#13993) by Rob Suderman · 1 year, 11 months ago
  80. 377d27a Integrate llvm-project at https://github.com/llvm/llvm-project/commit/85b77b13e3bcccffeb84b09365e0ab96565467fa (#13975) by MaheshRavishankar · 1 year, 11 months ago
  81. bf3e1a2 Resetting collective batch when the CUDA command buffer arena is set. (#13978) by Ben Vanik · 1 year, 11 months ago
  82. 74f6a6a Remove redundant newlines in generated benchmark cmake files (#13979) by Jerry Wu · 1 year, 11 months ago
  83. 4c1ceb2 Make reduction/matmul/conv matchers optionally partial (#13981) by Oleksandr "Alex" Zinenko · 1 year, 11 months ago
  84. 4e9b3bd [LLVMCPU] Add pass to enable Armv9 Streaming SVE mode (#13558) by Cullen Rhodes · 1 year, 11 months ago
  85. 0b91c98 Re-enable bf16 native execution for the StableHLO Path (#13976) by Rob Suderman · 1 year, 11 months ago
  86. 815e843 Make IREEDialectsTransforms cmake target export includes (#13980) by Oleksandr "Alex" Zinenko · 1 year, 11 months ago
  87. daabcf7 Unbreak the `byo_llvm.sh` build with `iree_bitcode_library`. (#13968) by bjacob · 1 year, 11 months ago
  88. 21b41db Create `iree-benchmark-import-models-large` for large benchmarks (#13963) by Jerry Wu · 1 year, 11 months ago
  89. 5da8c71 Add `iree-stream-resource-alias-mutable-bindings` flag. (#13965) by Scott Todd · 1 year, 11 months ago
  90. 18a1bac Clean up build_linux_packages.sh to address comments (#13938) by powderluv · 1 year, 11 months ago
  91. 00dd8a3 Update build_tools/python_deploy/build_linux_packages.sh by powderluv · 1 year, 11 months ago
  92. 0f641e6 Add experimental WebGPU HAL backend. (#13952) by Scott Todd · 1 year, 11 months ago
  93. 6d60a12 Add WebGPU sample application and update other web demos. by Scott Todd · 1 year, 11 months ago
  94. e7c2cba Initial WebGPU HAL implementation. by Ben Vanik · 3 years, 9 months ago
  95. 6d6f54b Fix tests/transform_dialect/cuda/ dependencies (#13949) by Levon Ter-Grigoryan · 1 year, 11 months ago
  96. ebea998 [ci] Update CUDA toolkit to v12.1.1 (#13875) by Lei Zhang · 1 year, 11 months ago
  97. b5e45cd [ci] Update NVIDIA driver packages to v530 in docker images (#13912) by Lei Zhang · 1 year, 11 months ago
  98. 508247f Add CODEOWNERS for experimental directories (#13956) by Geoffrey Martin-Noble · 1 year, 11 months ago
  99. 3fa5ad3 [StableHLO] Port Philox rng (#13844) by jvstokes · 1 year, 11 months ago
  100. 686860c [cuda] Dump useful GPU characteristics (#13955) by Lei Zhang · 1 year, 11 months ago