  1. 17bba14 Giving command buffer begin_debug_group/end_debug_group status. (#18998) by Ben Vanik · 4 months ago
  2. fa3a144 Adding iree_hal_device_queue_update and improving queue DMA operations. (#19000) by Ben Vanik · 4 months ago
  3. 2b1a8e7 [LLVMGPU] Make FP8 VMFMA intrinsic discoverable by KernelConfig (#19022) by Stanley Winata · 4 months ago
  4. 1e1a6e3 Adding iree_hal_device_queue_update. by Ben Vanik · 5 months ago
  5. 1444755 [LLVMGPU] Add VMFMA for FP8 to align layouts between chained F8 contractions. (#19020) by Stanley Winata · 4 months ago
  6. f71dd12 Integrate llvm-project@7c69491 (#19008) by Quinn Dawkins · 4 months ago
  7. e7dc6f0 Updating torch-mlir of iree to 6aa46967b69a01a46d56146250978d08e243e75e (#18992) by Chi_Liu · 4 months ago
  8. 9e20e68 [LLVMGPU] Enable IGEMM for convolutions by default (#19006) by Max191 · 4 months ago
  9. a757965 Clean up e2e matmul tests on CDNA3 (#18979) by Benoit Jacob · 4 months ago
  10. a5537bc [LLVMGPU] Teach KernelConfig to set MMA schedules per op in LoweringConfig (#18984) by Stanley Winata · 4 months ago
  11. ec7528c [Codegen][VectorExt] Fix VectorExt ops for 0-d vectors (#18915) by Kunwar Grover · 4 months ago
  12. 632bc11 Making iree_hal_device_queue_execute take zero or one command buffer. by Ben Vanik · 5 months ago
  13. 47feccd Replacing iree_hal_command_buffer_discard_buffer with advise_buffer. by Ben Vanik · 5 months ago
  14. a483e4b Adding flags to fill/update/copy commands and vtabling queue fill/copy. by Ben Vanik · 5 months ago
  15. 9c85e30 [iree.build] Implement iree-compile action. (#18993) by Stella Laurenzo · 5 months ago
  16. c3b1e6e Fixing executable debug info as identified in #18996. (#18997) by Ben Vanik · 5 months ago
  17. 05ec795 Add `InferIntDivisibilityInterface` for `arith.muli`. (#18994) by MaheshRavishankar · 5 months ago
  18. 2f15eeb Integrates/llvm 20241101@e577f14 (#18987) by Bangtian Liu · 5 months ago
  19. 3bb7fd2 Use `IntegerRangeAnalysis` to get bounds of allocation. (#18991) by MaheshRavishankar · 5 months ago
  20. 046a705 [LLVMGPU] Create GPU pipeline option for IGEMM (#18981) by Max191 · 5 months ago
  21. 8bfcc85 Windows CI: update e2e matmul/VMVX test exclusion list (#18990) by Benoit Jacob · 5 months ago
  22. db59070 [GPU] Disable unaligned to intrinsic batch matmul codegen with vector distribute (#18935) by Nirvedh Meshram · 5 months ago
  23. bb542ee [LLVMGPU] Add Virtual MFMA layout that maximizes load through adjusted K-width (#18930) by Stanley Winata · 5 months ago
  24. 20c8347 Fix typo in tile_and_distribute_to_workgroups.mlir test (#18982) by Max191 · 5 months ago
  25. 0077358 [python] Add an iree.build package with API/tooling for program building. (#18630) by Stella Laurenzo · 5 months ago
  26. 8d3faf8 Revert "Propagate reshapes through generics with reduction… (#18968) by Ian Wood · 5 months ago
  27. 124114b Only add cpu-features test suffixes on `llvm-cpu` target backend (#18964) by Benoit Jacob · 5 months ago
  28. 57fb10f [NFC] Cleanups to flow op folders. (#18974) by Ben Vanik · 5 months ago
  29. dc43032 Using standard C++ atomic enums on MSVC. (#18970) by Ben Vanik · 5 months ago
  30. 348dd47 Adding a HAL module debug sink interface. (#18966) by Ben Vanik · 5 months ago
  31. df2b8a4 Integrate llvm-project @f1595ecfdce5387e41826fd72ff930a1a39ae398 (#18897) by Max191 · 5 months ago
  32. b2b2d00 [python] Don't use cache in DeviceArray.to_host if not mappable to host and fix _is_mappable (#18963) by Boian Petkantchin · 5 months ago
  33. b38de27 Adjust `isFusableUsingTileAndFuse` in `SinkReshapes` (#18921) by Ian Wood · 5 months ago
  34. 2ec9017 Improving VM conversion performance. (#18957) by Ben Vanik · 5 months ago
  35. a744285 [Codegen][LLVMGPU] Set global read layouts at linalg level (#18860) by Kunwar Grover · 5 months ago
  36. 12cb042 [Flow] Add pattern to canonicalize away full tensor.insert_slice ops (#18941) by Quinn Dawkins · 5 months ago
  37. 53813e8 [LLVMGPU] Use flat workgroup sizes in vector distribution (#18947) by Kunwar Grover · 5 months ago
  38. 78481a6 Propagate reshapes through generics with reduction iterators (#18857) by Ian Wood · 5 months ago
  39. 14f58e0 [ROCM] Turn on SLP vectorization (#18949) by Nirvedh Meshram · 5 months ago
  40. 554f31f Adding a flag to force indirect command buffers on in non-reusable cases. (#18945) by Ben Vanik · 5 months ago
  41. 1f76cb7 GPU data tiling: reimplement getConcreteMFMALayout (#18953) by Benoit Jacob · 5 months ago
  42. 0bb6d92 Add `ReifyRankedShapedTypeOpInterface` to `hal.interface.binding.subspan` (#18946) by MaheshRavishankar · 5 months ago
  43. 26ba4fd Switching VM's EraseUnusedCallOp pattern to a pass. (#18950) by Ben Vanik · 5 months ago
  44. 15ea0dc [GlobalOpt] Prevent fusing transposed extend in RaiseSpecialOps (#18901) by Cullen Rhodes · 5 months ago
  45. 5fc340d Registering the ROCDL dialect in init_mlir_dialects. (#18944) by Ben Vanik · 5 months ago
  46. d1dd3e3 Add integer range inference to hal.buffer_view.dim and rank ops. (#18943) by Stella Laurenzo · 5 months ago
  47. 49ffdac Enabling linking in the ROCM/CUDA compiler targets. (#18936) by Ben Vanik · 5 months ago
  48. a321be2 Adding 'amdgpu' target device and flatbuffer for HAL executables. (#18933) by Ben Vanik · 5 months ago
  49. 4376117 [GPU] Do not treat pad as a tilable producer for operand promotion (#18918) by Kunwar Grover · 5 months ago
  50. 3cf5b65 [LinalgExt] Implement AggregateOpInterface for AttentionOp (#18890) by Kunwar Grover · 5 months ago
  51. b31b033 Revert "[DispatchCreation] Run preprocessing before..." (#18934) by Ian Wood · 5 months ago
  52. 3eeea7f Work around missing pybind in BYO LLVM build (#18916) by Marius Brehler · 5 months ago
  53. 36caa05 [ROCM] Add flag to enable GlobalISel (#18922) by Quinn Dawkins · 5 months ago
  54. fa752ae [DispatchCreation] Run preprocessing before elementwise fusion (#18920) by Ian Wood · 5 months ago
  55. 3b69679 Enable the MLIR debug actions CL options in the compiler driver. (#18928) by Stella Laurenzo · 5 months ago
  56. f4a5f13 Use workgroup_count_from_slice in Stream builtins (#18924) by Quinn Dawkins · 5 months ago
  57. 67ba1c4 Fixing missing parameters module fork_state vtable entry. by Ben Vanik · 5 months ago
  58. 9d36cfa [Codegen] Don't require full slice to decompose boundary pack and unpack ops (#18906) by Max191 · 5 months ago
  59. e66171a [LinalgExt] Generalize attribute setting for attention decomposition (#18780) by Kunwar Grover · 5 months ago
  60. a041798 [VectorDistribution] Add vector distribution support multi-dim reduction with scalars (#18800) by Bangtian Liu · 5 months ago
  61. 8806173 Revert "[DispatchCreation] Extend multi-use producer fusion" (#18917) by Ian Wood · 5 months ago
  62. f8b8414 Modernizing iree_atomic_*. (#18910) by Ben Vanik · 5 months ago
  63. 4823dc0 Adding HAL semaphore support for statuses-as-failure-payloads. (#18912) by Ben Vanik · 5 months ago
  64. bb7ece7 Minor comment/style tweaks to the null HAL driver. (#18911) by Ben Vanik · 5 months ago
  65. 7f14078 Adding build config for HSA runtime headers. (#18909) by Ben Vanik · 5 months ago
  66. 9731fed Pass to block dynamic dimensions of operands of `iree_linalg_ext.attention`. (#18874) by MaheshRavishankar · 5 months ago
  67. 03c744e [GPU] Support multiple contraction dims in MmaSchedules (#18720) by Max191 · 5 months ago
  68. 0c2c627 [NFC] Update old naming from flow to dispatch creation (#18904) by Ian Wood · 5 months ago
  69. 55c5562 [LLVMGPU][NFC] Create LLVMGPU pass for IGEMM (#18871) by Max191 · 5 months ago
  70. c6b3592 [Dispatch Creation] Bubble up ExtractSliceOp with FillOp when the latter has multiple consumers (#18896) by Nithin Meganathan · 5 months ago
  71. 1aa5825 [LLVMGPU] Combine parallel and reduction padding in LLVMGPUPadAndVectorDistribute (#18771) by Kunwar Grover · 5 months ago
  72. 1fc6e5b Add CDNA3 MFMA BF16 intrinsics. (#18892) by Benoit Jacob · 5 months ago
  73. 3b751a4 [LLVMCPU] Enable tileDispatchUsingForall as default (#18777) by Prashant Kumar · 5 months ago
  74. e96e3c0 [VectorLayout] Fix insertion of new constOp for non dominate issue. (#18894) by Stanley Winata · 5 months ago
  75. aef6e1f [GPU] Bail out in GPUReduceBankConflicts if we have collapse_shape user (#18863) by Nirvedh Meshram · 5 months ago
  76. 8ce8bed Simplifications in e2e matmul tests (#18889) by Benoit Jacob · 5 months ago
  77. 225baf2 Add e2e tests for F8E5M2FNUZ and F8E4M3FNUZ data-tiled MFMA on CDNA3 (#18888) by Benoit Jacob · 5 months ago
  78. 4ad834b Support F8E5M2FNUZ MFMA on CDNA3 (#18887) by Benoit Jacob · 5 months ago
  79. 2291b38 Support 8-bit floats in the compiler. (#18886) by Benoit Jacob · 5 months ago
  80. a762328 Support 8-bit floats in the runtime (#18885) by Benoit Jacob · 5 months ago
  81. abe3f89 Add conversions for 1x1 conv_2d to matmul (#18736) by Ian Wood · 5 months ago
  82. c3fae2f [LLVMGPU] Use forall workgroup distribution in TileAndFuse pipeline (#18565) by Max191 · 5 months ago
  83. 4d20b82 Emit an error when affinity analysis fails. (#18883) by Ben Vanik · 5 months ago
  84. 9f5610d Preserving `nosideeffects` on func.func -> util.func import. (#18882) by Ben Vanik · 5 months ago
  85. e1469b2 [Codegen] Add pass to decompose pack unpack ops at dispatch boundaries (#18852) by Max191 · 5 months ago
  86. 9c5b57a Use FetchContent for both pybind11 and nanobind. (#18872) by Stella Laurenzo · 5 months ago
  87. 00104b5 Allow dynamic dimensions during folding of `tensor.expand_shape/collapse_shape` into `flow.dispatch.tensor.load/store`. (#18873) by MaheshRavishankar · 5 months ago
  88. a400cde [ROCM][NFC] Add option to control SLP vectorization in llvm optimizations (#18865) by Nirvedh Meshram · 5 months ago
  89. 563b3e7 Document new external ONNX model and linalg operator test suites. (#18819) by Scott Todd · 5 months ago
  90. e3f2d47 Bump torch-mlir to 140cad5 and update TorchOnnxToTorch conversion pipeline (#18867) by Vivek Khandelwal · 5 months ago
  91. 81c8b25 [Codegen] Allow multiple reduction dimensions in VectorDistribution (#18868) by Kunwar Grover · 5 months ago
  92. b922a70 GPU data tiling: query the target's list of MMA intrinsics. Add FP8 test. (#18862) by Benoit Jacob · 5 months ago
  93. bb71f7d [Attention] Only clamp attention for low precision types (#18848) by Kunwar Grover · 5 months ago
  94. 4cc6671 [CPU] Limit vectorization tile sizes for SVE (#18846) by Cullen Rhodes · 5 months ago
  95. d0269f3 Integrate llvm-project @864902e9b4d8bc6d3f0852d5c475e3dc97dd8335 (#18843) by Max191 · 5 months ago
  96. 65101c0 [docs] Update iree-compiler install instructions (#18861) by Marius Brehler · 5 months ago
  97. c08362a GPU target parameters for data tiling. (#18839) by Benoit Jacob · 5 months ago
  98. 114a142 [LLVMGPU] Embed mma_intrinsic in to_layout and infer contraction's intrinsic from it. (#18842) by Stanley Winata · 5 months ago
  99. 66342ab Reland #18804 (#18840) by Maksim Levental · 5 months ago
  100. 556c945 [Codegen] Fix bug in IGEMM pass for non conv contractions (#18838) by Max191 · 5 months ago