1. 8ba420d Bump sarisia/actions-status-discord from 1.15.1 to 1.15.2 in the github-actions group (#19744) by dependabot[bot] · 7 weeks ago
  2. 88887e7 Add ubuntu-24.04-arm runtime and runtime_tracing CI jobs. (#19724) by Scott Todd · 7 weeks ago
  3. 4c600ca [Codegen] Refactor RemoveSingleIterationLoop to use ValueBoundsOpInte… (#19678) by Krzysztof Drewniak · 7 weeks ago
  4. 9aae362 [Codegen] Sprinkle in PropagateDispatchSizeBounds passes (#19677) by Krzysztof Drewniak · 7 weeks ago
  5. a64d713 Implement ValueBoundsOpInterface on HAL ID/count ops, util.assume.int (#19676) by Krzysztof Drewniak · 7 weeks ago
  6. db129e5 Update torch-mlir to llvm/torch-mlir@f42c7e4 (#19736) by zjgarvey · 7 weeks ago
  7. be8e3d2 Convert barriers into copies during allocation (#19735) by Rob Suderman · 8 weeks ago
  8. 6052a1d Bump to llvm/llvm-project@3f1486f (#19683) by MaheshRavishankar · 8 weeks ago
  9. b4c7de6 [Python] Enable building Python bindings as editable wheels, document it (#19716) by Krzysztof Drewniak · 8 weeks ago
  10. b08d152 [HAL] Use util.assume.int for memref alignments (#19691) by Krzysztof Drewniak · 8 weeks ago
  11. 4d3f06a [VectorDistribution] Clone vector.step on layout conflict (#19732) by Kunwar Grover · 8 weeks ago
  12. c5bc37f [Util] Fix OptimizeIntArithmetic pattern failure condition (#19731) by Kunwar Grover · 8 weeks ago
  13. e1f010c [Dispatch] Clone chain of ops into dispatch (#19723) by Ian Wood · 8 weeks ago
  14. f31cc72 Update resource placement and transfer for barrier operations (#19725) by Rob Suderman · 8 weeks ago
  15. 75c9e86 [GPU] Avoid fusing slices of already tiled ops (#19404) by Max191 · 8 weeks ago
  16. c1cc4cc [LLVMGPU] Add pass to distribute undistributed copies to threads (#19715) by Quinn Dawkins · 8 weeks ago
  17. dde5992 [Codegen] Allow memref type propagation through collapse_shape (#19400) by Max191 · 8 weeks ago
  18. 36c2353 [Codegen] Push up the extract slice op (#19680) by Prashant Kumar · 8 weeks ago
  19. 08b44e2 [hip] Try again to fix the semaphore busy loop. (#19712) by Andrew Woloszyn · 8 weeks ago
  20. 6f33cd4 [Stream] Specialize encoding for TensorPhaseOp that have result_encoding (#19707) by Han-Chung Wang · 8 weeks ago
  21. 3e8c81c Remove legacy sync path (#19714) by Rob Suderman · 8 weeks ago
  22. f6f6388 [Codegen] Add workgroups reordering to distribute using forall (#19681) by Prashant Kumar · 8 weeks ago
  23. 5ee9b27 Clean up encoding-related code. NFC. (#19717) by Jakub Kuderski · 8 weeks ago
  24. 3032df2 Fix newlines in markdown mermaid.js diagrams. (#19657) by Scott Todd · 8 weeks ago
  25. c285d58 Copy sample code into samples/dynamic_shapes/README.md. (#19699) by Scott Todd · 8 weeks ago
  26. 3c95042 Re-enable MI250 workflows. (#19705) by saienduri · 8 weeks ago
  27. 27e7a90 [DT][Encoding] Use layouts to calculate storage size when it is present. (#19686) by Han-Chung Wang · 8 weeks ago
  28. a953763 Temporarily Disable MI250 workflow due to machine outage (#19702) by Akansha Bansal · 8 weeks ago
  29. c320935 Bump dawidd6/action-download-artifact from 3.1.4 to 7 in the github-actions group (#19692) by dependabot[bot] · 8 weeks ago
  30. 6fd0fd0 [LinalgExt] Implement PartialReductionOpInterface for OnlineAttentionOp (#19684) by Kunwar Grover · 8 weeks ago
  31. 3c963dd Update PyTorch sample notebooks using latest iree-turbine code. (#19658) by Scott Todd · 8 weeks ago
  32. 01c9f14 [LLVMGPUVectorDistribute] Add support for inter-subgroup multi_reduction (#19596) by Manupa Karunaratne · 8 weeks ago
  33. 21b0101 [GPU] Disable prefetching for loops with no computation (#19695) by Nirvedh Meshram · 8 weeks ago
  34. 8d1d867 [GPU] Add thread tile size inference for scatter (#19694) by Quinn Dawkins · 8 weeks ago
  35. 158c636 Revert "Increase default threshold of TileLargeTensor pass (#19671)" (#19693) by Nirvedh Meshram · 8 weeks ago
  36. 3e34e03 Bump the github-actions group with 8 updates (#19689) by dependabot[bot] · 8 weeks ago
  37. 3978ce6 Increase default threshold of TileLargeTensor pass (#19671) by Nirvedh Meshram · 8 weeks ago
  38. 2452b22 [Codegen][GPU] Let integer range optimization narrow GPU computations to i32 (#19473) by Krzysztof Drewniak · 8 weeks ago
  39. 2b29155 Update GH actions with Dependabot (#19663) by Marius Brehler · 8 weeks ago
  40. 9b35412 Run on schedule in iree-org only (#19685) by Marius Brehler · 8 weeks ago
  41. d90c505 Reshape propagation to enable broadcast(transpose) -> attention(q, kt, vt) fusion. (#19661) by MaheshRavishankar · 8 weeks ago
  42. cac7a96 Update IREE test suite to use iree-org/iree-test-suites@c47d13c (#19617) by MaheshRavishankar · 8 weeks ago
  43. 40c19e3 Better support multidevice placement with `stream.async.barrier` (#19651) by Rob Suderman · 8 weeks ago
  44. 88d5f59 Update PkgCI test_amd to use MI300x conductor cluster (#19517) by yamiyysu · 8 weeks ago
  45. ae50c5e [DOCS] Update VectorExt::NestedLayoutAttr docs (#19246) by Manupa Karunaratne · 8 weeks ago
  46. 1441caa Enable macOS Tracy CI build. (#19668) by Scott Todd · 8 weeks ago
  47. a583b25 [GPU] Teach GPUApplyTilingLevel PartialReduction tiling (#19682) by Kunwar Grover · 9 weeks ago
  48. 9f93691 [LLVMGPU] Use LLVMGPUDistribute for small input scatters (#19670) by Quinn Dawkins · 9 weeks ago
  49. f7a2157 Remove Upcasting schedule from TileAndFuse (#19669) by Nirvedh Meshram · 9 weeks ago
  50. 039b8b4 Using tracy::GetQueue instead of the sketchy static variable reference. (#19653) by Ben Vanik · 9 weeks ago
  51. 1d91bec Supporting file descriptors in iree_io_stream_open. (#19665) by Ben Vanik · 9 weeks ago
  52. 106371d Bump torch-mlir to f92c587cb6150e73078f32cf847dc3892be16f93 (#19659) by jinchen · 9 weeks ago
  53. a88555c Add macOS workflow running on M1 (#19656) by Marius Brehler · 9 weeks ago
  54. e64cb12 Increase strictness of global isel use for ROCM (#19247) by Tres · 9 weeks ago
  55. 2aca091 [Codegen][Nearly NFC] Move PropagateDispatchSizeBounds to Common/ (#19650) by Krzysztof Drewniak · 9 weeks ago
  56. 6245db1 [Stream] Attach layouts to tensor ops in encoding specialization pass. (#19649) by Han-Chung Wang · 9 weeks ago
  57. c793f90 [i1] Implement `packed_storage` layout encoding attribute (#19354) by lialan · 9 weeks ago
  58. 801e2c1 Expand runtime_tracing job to include Windows and macOS. (#19655) by Scott Todd · 9 weeks ago
  59. 7d21c5d Revert (2nd) of "Propagate reshapes through generics with reduction" (#19647) by MaheshRavishankar · 9 weeks ago
  60. b3ff1ed Rename `unroll_{m,n,k}` to `intrinsics_{m,n,k}` (#19652) by Benoit Jacob · 9 weeks ago
  61. 6d6bd6e [runtime] Fix runtime tracing compile failure on gcc (#19642) by Ian Wood · 9 weeks ago
  62. bb1c561 Erase all address spaces and get inlined ukernels (#19646) by Benoit Jacob · 9 weeks ago
  63. a7bac5d [Flow] Fix dispatch naming for dynamic shaped fusions (#19439) by Quinn Dawkins · 9 weeks ago
  64. 9055c9d [hip] Fix race in the cleanup of queue read operations. (#19645) by Andrew Woloszyn · 9 weeks ago
  65. 82e37d6 Fix (cross) compiling for 32-bit targets (#19644) by Marius Brehler · 9 weeks ago
  66. 02d145e [Stream] Implement SpecializeEncodings pass (1/n) (#19502) by Han-Chung Wang · 9 weeks ago
  67. 74f8d3c [LinalgExt] Scatter fusion by expansion 3/3 (#19588) by Ian Wood · 9 weeks ago
  68. 2347d9f Supporting (and renaming) IREE_HAL_WHOLE_BUFFER in binding table resolve. (#19640) by Ben Vanik · 9 weeks ago
  69. 126f0ac Add docs for updating release git tags manually. (#19637) by Scott Todd · 9 weeks ago
  70. af416b3 Bump version to 3.2.0 after releasing 3.1.0. (#19638) by Scott Todd · 9 weeks ago
  71. c484058 [GPU] Add barriers when resolving GPUMappedForall to fix race condition (#19635) by Nirvedh Meshram · 9 weeks ago
  72. 9b4906e [DispatchCreation] Drop fusion restriction for stride != 1 conv (#19634) by Quinn Dawkins · 9 weeks ago
  73. c75b686 [GPU][Codegen] Allowing mfma for narrow problem config sizes (#19615) by Zhuoran Yin · 9 weeks ago
  74. 7b9aa28 When dumping intermediates, dump how to reproduce the `.optimized.ll` (#19633) by Benoit Jacob · 9 weeks ago
  75. be75a30 Update minor Python versions used to build packages (#19632) by Marius Brehler · 9 weeks ago
  76. fb21dd6 Adding experimental Tracy API for TLS-less event recording. (#19625) by Ben Vanik · 9 weeks ago
  77. a5c3879 Reapply "Propagate reshapes through generics with reduction… (#18968) by Ian Wood · 9 weeks ago
  78. 80cbf6b [GPU] Add a pass to convert accumulating GEMMs to GEMMs (#19587) by Nirvedh Meshram · 9 weeks ago
  79. 550d88e [GPU] Add lowering configuration logic for scatter (#19624) by Quinn Dawkins · 9 weeks ago
  80. 349026b Add explicit tolerances to SDXL benchmark test times. (#19628) by Scott Todd · 9 weeks ago
  81. 9a83239 [GPU] Add chained reshape support for scf.forall expand destination pattern (#19597) by Nirvedh Meshram · 9 weeks ago
  82. 7047cc3 Rollup of minor runtime fixes/cleanup from the AMDGPU branch. (#19621) by Ben Vanik · 9 weeks ago
  83. aa06523 [NFC] Comment fixes in iree_bitcode_library. by Ben Vanik · 9 weeks ago
  84. 66723e4 Cleaning up null HAL driver options. by Ben Vanik · 9 weeks ago
  85. 2199c1d Adding iree_arena_block_pool_preallocate. by Ben Vanik · 9 weeks ago
  86. ea462c8 Removing some IREE_RETURN_AND_END_ZONE_IF_ERROR usage that was ugly. by Ben Vanik · 9 weeks ago
  87. 4a04c0a Adding minor iree/base/ time, string view, and memory utilities. by Ben Vanik · 9 weeks ago
  88. a8f7a32 Adding iree_hal_queue_affinity_* utilities. by Ben Vanik · 9 weeks ago
  89. c9fb739 Fixing HAL driver CTS test to not assume numerical indices exist. by Ben Vanik · 9 weeks ago
  90. 1ccabe5 Adding COMPILER_TARGET_DEVICE to iree_hal_cts_test_suite. by Ben Vanik · 9 weeks ago
  91. d517661 [runtime][python] Add debug sink to bindings (#19013) by Boian Petkantchin · 9 weeks ago
  92. c97b084 Including the .kd symbol suffix in AMDGPU executables. by Ben Vanik · 9 weeks ago
  93. d224220 Bump LLVM to llvm/llvm-project@21edac25f09faee23015c6a69d95fcbda287efe2 (#19616) by MaheshRavishankar · 9 weeks ago
  94. b245e6b Delete test_models job using SHARK-TestSuite/iree_tests. (#19614) by Scott Todd · 9 weeks ago
  95. 1445cef Set MLIR_LINK_MLIR_DYLIB to not link shared libMLIR (#19613) by Marius Brehler · 9 weeks ago
  96. 340ffbb [LinalgExt] Drop the unit dims on scatter ops 2/3 (#19450) by Ian Wood · 9 weeks ago
  97. 0820f10 [hip] Don't join the status in dispatch_thread. (#19583) by Andrew Woloszyn · 9 weeks ago
  98. cdf24b9 [Dispatch] Two fixes for CollapseDimensionsPass (#19598) by Ian Wood · 9 weeks ago
  99. 763406f [Codegen][Tuner] skip linking based on the default entry point attribute (#19603) by Bangtian Liu · 9 weeks ago
  100. c992d29 [runtime][hip] Fix format errors and conflicting types. (#19607) by Han-Chung Wang · 9 weeks ago