1. 718b4fd [docs][pytorch] Add examples for compiling with external weights. (#18658) by Vinayak Dev · 10 hours ago main
  2. 206c1f2 [Codegen] Allow vectorizing linalg.copy ops on memrefs (#18672) by Quinn Dawkins · 12 hours ago
  3. 903ab0a Integrate LLVM at 9fa55ec3 (#18670) by Benoit Jacob · 12 hours ago
  4. cd48b10 [NFC] Delete dead ops after cloning (#18669) by Ian Wood · 15 hours ago
  5. 7a2705d Bump stablehlo to `f7f8e4e35` and drop LLVM local reverts (#18668) by Benoit Jacob · 17 hours ago
  6. d341128 [ExternalInterfaces] Make fill non-hoistableLeafOp, hoist linalg init operands (#18634) by Max191 · 20 hours ago
  7. 66c3397 [docs] Update Python API docs (#18662) by Marius Brehler · 20 hours ago
  8. 916bb88 Stopped threads from holding a reference to themselves. (#18636) by Andrew Woloszyn · 21 hours ago
  9. 84ac47b [LLVMGPU] Switch LLVMGPUVectorDistribute to use iree_gpu.lowering_config (#18651) by Kunwar Grover · 21 hours ago
  10. 462ecb6 [torch] Materialize all derivable bounds and divisor information in the IR. (#18646) by Stella Laurenzo · 33 hours ago
  11. 8de9856 [LinalgExt] Add Interfaces for implementing fusion support for `iree_linalg_ext.custom_op`. (#18647) by MaheshRavishankar · 2 days ago
  12. 451ef71 [Codegen] Add pass for unrolling annotated for loops (#18641) by Quinn Dawkins · 2 days ago
  13. 9c39a29 [Codegen][GPU] Fix forall hoisting arg use in single trip loops (#18657) by Nirvedh Meshram · 2 days ago
  14. c86b621 Switch build_package to arm-hosted runner. (#18656) by Scott Todd · 2 days ago
  15. 0a71ea5 [Codegen][GPU] Add iree_gpu.multi_mma op to PartitionableLoopsInterface (#18653) by Max191 · 2 days ago
  16. e45c570 Switch linux_arm64_clang workflow to arm-hosted runner. (#18643) by Scott Todd · 2 days ago
  17. 839f7f6 [GPU] Use shared memory for data tiled multi_mma ops (#18625) by Max191 · 2 days ago
  18. 4d31d89 [LinalgExt] Avoid rank reduction in Im2Col lowering pattern (#18637) by Quinn Dawkins · 2 days ago
  19. 0d65b6e Integrate LLVM at `a86e966a` (#18644) by Benoit Jacob · 2 days ago
  20. a7d84f9 [ROCm] Fix known target info for MI300A (#18648) by Jakub Kuderski · 2 days ago
  21. 20a7638 [ROCm] Always require `--iree-hip-target` (#18645) by Jakub Kuderski · 2 days ago
  22. 0b17c72 Add testing for punet model variations. (#18639) by saienduri · 3 days ago
  23. f87ae4e IREE Custom tilable op (`iree_linalg_ext.custom_op`). (#18555) by MaheshRavishankar · 3 days ago
  24. b0ede80 [LLVMGPU] Add KernelConfig for data tiled multi_mma ops (#18623) by Max191 · 3 days ago
  25. 6c47f63 Update references to nod-ai/SHARK-* repository names. (#18592) by Scott Todd · 3 days ago
  26. 5502ca4 [Codegen][GPU] Fix index delinearized index order (#18640) by Quinn Dawkins · 3 days ago
  27. 35dafee [LLVMGPU] Remove more dead code from prefetching pass (#18638) by Jakub Kuderski · 3 days ago
  28. 12c653b Integrate LLVM at `bfde1783` (#18635) by Benoit Jacob · 3 days ago
  29. f5dc573 [DispatchCreation] CollapseDimensions patch (#18424) by Ian Wood · 3 days ago
  30. a9c7ec1 [Util][GPU] Add TiedOpInterface implementation for iree_gpu.multi_mma (#18626) by Max191 · 3 days ago
  31. b7ac442 [GPU][DT] Fix indexing bug in populateOperandOffsetsSizesStrides (#18624) by Max191 · 3 days ago
  32. 9e09115 Simplifications around narrow dimensions in encodings. (#18607) by Benoit Jacob · 5 days ago
  33. 34641dd Integrate LLVM at llvm/llvm-project@68ddd6c80e917b (#18619) by Han-Chung Wang · 6 days ago
  34. ff1b8b0 [Codegen][GPU] Make operand promotion controlled by lowering config (#18576) by Quinn Dawkins · 6 days ago
  35. 66bf9de [VectorExt] Fix to_layout op (#18621) by Ian Wood · 6 days ago
  36. 2e382a7 [docs] Update examples in the PyTorch+IREE guide (#18620) by Vinayak Dev · 6 days ago
  37. b5b4ab7 Bump cpubuilder dockerfile image to newer multi-arch version. (#18558) by Scott Todd · 6 days ago
  38. 8872710 Revert "Removed the iree_thread_join in the cleanup of deferred_work_queue.c" (#18616) by Andrew Woloszyn · 6 days ago
  39. 14728a7 Bump torch-mlir to 9938abf25e1e7526ca7f43a8c49e9078c14fc55c (#18615) by Vivek Khandelwal · 6 days ago
  40. 76c3e61 [CodeGen] Fix the argument replacements in scf.forall op lowering. (#18613) by Han-Chung Wang · 7 days ago
  41. 66d0c31 [DispatchCreation] Disable batch mmt4d fusion as it's not supported by backends (#18611) by Nirvedh Meshram · 7 days ago
  42. 32a44bd [DispatchCreation][NFC] Simplify the logic of dumping CollapseInfo. (#18610) by Han-Chung Wang · 7 days ago
  43. 7db91ce [Codegen][GPU] Change the location of barriers in forall fusion (#18542) by Quinn Dawkins · 7 days ago
  44. c3fa4d0 Fix `task_executor_initialize` in resource exhausted scenario. (#18609) by Stella Laurenzo · 7 days ago
  45. 97896ce Integrate LLVM at llvm/llvm-project@24d707e215a1e2 (#18606) by Han-Chung Wang · 7 days ago
  46. d18064f [COMMON] Select the last compute op that has workgroup tilin… (#18604) by Prashant Kumar · 7 days ago
  47. 5a2dd56 Removed the iree_thread_join in the cleanup of deferred_work_queue.c (#18605) by Andrew Woloszyn · 7 days ago
  48. b13d38b [DT] Collapse matmul_narrow_M/N field into round_dims_to attribute. (#18599) by Han-Chung Wang · 7 days ago
  49. d583958 simplify `DataTiledMMAAttr::buildMmaOperation` (#18597) by Benoit Jacob · 8 days ago
  50. 672ae82 Attach Fusion interface to `linalg.softmax` (#18550) by Ian Wood · 8 days ago
  51. 6634f0f Integrate LLVM at llvm/llvm-project@cebb7c010854e3 (#18596) by Han-Chung Wang · 8 days ago
  52. 0b29f7b [GPU][DT] Add support for GPU data-tiling E2E tests. (#18591) by Han-Chung Wang · 8 days ago
  53. 7290283 Add F16 support for benchmark. (#18580) by erman-gurses · 8 days ago
  54. 3773a48 Lower data-tiled multi_mma to intrinsics. (#18547) by Benoit Jacob · 8 days ago
  55. 129ad45 Fixing cconv printing of function signatures with multiple tuples. (#18595) by Ben Vanik · 8 days ago
  56. 9158a90 GPU data tiling: Refine tile dimensions, more preparation for thread distribution. (#18556) by Benoit Jacob · 9 days ago
  57. b2dd6db [Encoding] Retire original_type field. (#18586) by Han-Chung Wang · 9 days ago
  58. 863ca01 Integrate LLVM at llvm/llvm-project@3fbf6f8bb183ad (#18590) by Han-Chung Wang · 9 days ago
  59. e6cf5bb [GPU][DT] Add support for materializing encoding ops with dynamic shape. (#18585) by Han-Chung Wang · 9 days ago
  60. e19950c Integrate LLVM at llvm/llvm-project@40d6497a97a61e (#18581) by Han-Chung Wang · 10 days ago
  61. ae6e5d3 [EmitC] Fix Windows builds (#18546) by Simon Camphausen · 10 days ago
  62. c0909a4 [gpu] Use clustered gpu.subgroup_reduce for nested layout distribution (#18515) by Andrea Faulds · 10 days ago
  63. 0d9c5a8 [GPU][DT] Add support for materializing tensor.empty and linalg.fill ops (#18563) by Han-Chung Wang · 10 days ago
  64. 9d7eb9f [docs] Fix AMDGPU to target chip mapping (#18584) by Jakub Kuderski · 10 days ago
  65. bddda85 Fix iree-compile command line call (#18583) by Marius Brehler · 10 days ago
  66. 070ec4a Add missing dep in LLVMGPU package to fix Bazel build. (#18582) by Scott Todd · 10 days ago
  67. ac03e05 [NFC][Codegen] Move LLVMCPUDropVectorUnitDims to Common (#18578) by Quinn Dawkins · 10 days ago
  68. 6fd9697 Remove build_tools/docker/ files. (#18566) by Scott Todd · 10 days ago
  69. 51329bf Migrate ci_linux_arm64_clang to new dockerfile. (#18569) by Scott Todd · 10 days ago
  70. b08cf02 [torchmlir-bump] Bump torch-mlir to 99848265c388 (#18579) by Gaurav Shukla · 10 days ago
  71. 328c32a Integrate LLVM at llvm/llvm-project@f264d9a9d56f (#18577) by Prashant Kumar · 10 days ago
  72. eef4623 [LLVMGPU][ROCm] Move kernel annotation before serialization (#18573) by Jakub Kuderski · 10 days ago
  73. 5a6bd8d [Codegen][GPU] Use I64ArrayAttr for tile sizes for simpler printing (#18575) by Quinn Dawkins · 11 days ago
  74. 9ee061d [LinalgExt] Masked Attention Implementation (#18525) by rohan-tan-bhowmik · 11 days ago
  75. 891f438 [Codegen] Add control options in pack unpack decomposition (#18469) by Max191 · 13 days ago
  76. d834aa7 [GPU] Add workgroup/subgroup scope specification to mma attr interface (#18548) by Max191 · 13 days ago
  77. 546d862 Fix experimental/web/ samples after recent changes. (#18567) by Scott Todd · 13 days ago
  78. 914858f [VectorDistribution] Reuse intrinsic layout in chained gemm (#18505) by Kunwar Grover · 13 days ago
  79. 0f15c8d [LLVMGPU][ROCm] Add validation on finalized llvm bitcode (#18552) by Jakub Kuderski · 13 days ago
  80. a5f63cc Move `linux_x64_bazel` job back to running on every commit. (#18560) by Scott Todd · 14 days ago
  81. 9588e7f Drop unused header from CombineBarrierRegions.cpp. (#18559) by Scott Todd · 14 days ago
  82. 782f372 [Codegen] Check for workgroup level tile sizes in workgroup tiling (#18538) by Max191 · 14 days ago
  83. 73ffafb [Codegen][GPU] Add support for bufferizing iree_gpu.barrier_region (#18497) by Quinn Dawkins · 14 days ago
  84. 75d5aab [Codegen][GPU] Add pass to combine adjacent barrier_region ops (#18541) by Quinn Dawkins · 14 days ago
  85. c9eca66 [Codegen][GPU] Allow iree_gpu.barrier_region to take multiple operands/results (#18490) by Quinn Dawkins · 14 days ago
  86. fa44a32 [LLVMGPU] Explicitly set configs for vector distribution pipeline lowering tests (#18553) by Kunwar Grover · 2 weeks ago
  87. 04144f6 [Codegen][GPU] Allow odd workgroup sizes when resolving non-warp foralls (#18549) by Quinn Dawkins · 2 weeks ago
  88. 0636abd Switch build_test_all_bazel to new dockerfile and runners. (#18533) by Scott Todd · 2 weeks ago
  89. 3a62d5c [GPU] Fix out of bounds access in setTileAndFuseLoweringConfig (#18537) by Max191 · 2 weeks ago
  90. 637190e [VectorDistribution] Add LICM to LLVMGPUVectorDistribute pipeline (#18510) by Kunwar Grover · 2 weeks ago
  91. f138e23 [Codegen] Add support for ParallelInsertSliceOp in DPS analysis (#18536) by Max191 · 2 weeks ago
  92. 337d49c [LinalgExt] Use f32 for accumulation for online_attention (#18456) by Kunwar Grover · 2 weeks ago
  93. 30b6374 [GPU][DT][NFC] Cleanup TODOs, styles and simplify logics. (#18544) by Han-Chung Wang · 2 weeks ago
  94. ad8f814 [LLVMGPU] Delete dead code in prefetch pass (#18543) by Jakub Kuderski · 2 weeks ago
  95. 6a44005 [GPU] Use alloca for private memory allocations (#18540) by Jakub Kuderski · 2 weeks ago
  96. 740e301 Preparation for data-tiled `multi_mma` codegen (#18532) by Benoit Jacob · 2 weeks ago
  97. 6fdc30f Start LLVM integrate integrates/llvm-20240917 (#18535) by Prashant Kumar · 2 weeks ago
  98. 7d823d2 [torch] Add dynamic support for `tm_tensor.attention` (#18527) by Rob Suderman · 2 weeks ago
  99. 898a95f Redirect links from GCP to Azure for RISCV/ARM files. (#18531) by Eliasj42 · 2 weeks ago
  100. f86a27d Bump torch-mlir to d6cf718 (#18530) by saienduri · 2 weeks ago