  1. 0a00b01 [LLVMGPU] Landing VectorDistribution pipeline for attention (#17773) by Stanley Winata · 8 months ago
  2. 907b2cd [Codegen] Ensure hoisted extraction replaced by induction var. (#17975) by Stanley Winata · 8 months ago
  3. 0d0b989 Collapse dims when producer is unpack op (#17725) by Ian Wood · 8 months ago
  4. b0512e2 [GPU][VectDist] Refactor multiReducOp lowering to reduce acc at the end. (#17974) by Stanley Winata · 8 months ago
  5. 5b112cb [Codegen] Remove use of designated initializers (#17968) by Benjamin Maxwell · 8 months ago
  6. c058c84 Disable `tsan`, `gcc`, `debug`, and `byo_llvm` jobs. (#17967) by Scott Todd · 8 months ago
  7. 2380a08 [Codegen] Do not consider parallel regions in bufferization analysis (#17757) by Max191 · 8 months ago
  8. 8b83425 [Codegen] Add vector transfer + slice foldings in GenericVectorization (#17613) by Max191 · 8 months ago
  9. 3e3d9da Integrate llvm-project @97c0dbe1ad6dacbcca84e63e9d726b85b65af4fe (#17946) by Avinash Sharma · 8 months ago
  10. 5ea0b21 [Codegen] Add interface tensor reshape foldings to TileAndDistribute (#17758) by Max191 · 8 months ago
  11. 4a13331 Delete some no longer used files under `build_tools/benchmarks/`. (#17952) by Scott Todd · 8 months ago
  12. ce4f0fe Disable build_test_all_arm64 job until we find runners again. (#17964) by Scott Todd · 8 months ago
  13. 71d58a3 Disable cross compilation tests for now (#17961) by Jacques Pienaar · 8 months ago
  14. 2cf4670 [Im2col] Add pass to convert conv_2d ops into GEMM with im2col op (#17956) by Max191 · 8 months ago
  15. 76cad82 [LinalgExt] Retire `LinalgExt::ReverseOp` (#17866) by lialan · 8 months ago
  16. 69900ee [Codegen][GPU] Use bufferization.alloc_tensor for gpu.shuffle_tensor destination (#17940) by Max191 · 8 months ago
  17. cfc79ea [Codegen] Support inferring scalable vector sizes (#17891) by Benjamin Maxwell · 8 months ago
  18. 57361bc [hal][hip] Use stream command buffer in parameter initialization (#17960) by Lei Zhang · 8 months ago
  19. 8b44f61 Enable Python bindings builds/tests in 'runtime' CI builds. (#15878) by Scott Todd · 8 months ago
  20. 3fca22a [python] Read default flags from the IREE_PY_RUNTIME_FLAGS env var. (#17959) by Stella Laurenzo · 8 months ago
  21. 6e9d4cf [Codegen][GPU] Move PackToIntrinsics after workgroup tiling (#17950) by Quinn Dawkins · 8 months ago
  22. 6e23ed1 Updated `linalg_ext.online_attention` for `fp8` support (#17808) by Rob Suderman · 8 months ago
  23. 33e5430 Disable arm64 runtime build while runner availability is low. (#17927) by Scott Todd · 8 months ago
  24. 7def2ab [hip] Use a thread to wait for events instead of hipLaunchHostFunc (#17925) by Andrew Woloszyn · 8 months ago
  25. b85fdf5 Update disk image versions (#17939) by Nancy Yuen · 8 months ago
  26. 364b43d Skip samples/custom_dispatch/cuda/kernels/ on MSVC. (#17934) by Scott Todd · 8 months ago
  27. c628be0 Update new VM disk image versions. (#17936) by Nancy Yuen · 8 months ago
  28. 1fae64b Disable a100 tests and benchmarks. (#17932) by Scott Todd · 8 months ago
  29. 37a3db2 Integrate llvm-project @9372a3b70cf3969dac2d1a14cf41358205944e60 (#17926) by Max191 · 8 months ago
  30. 7ce8c8e [Preprocessing] Add a one-off pattern to fuse attention with transpose. (#17901) by MaheshRavishankar · 8 months ago
  31. 4de493a [CPU] Enable mmt4d ukernels when iree-llvmcpu-enable-ukernels is not set (#17928) by Han-Chung Wang · 9 months ago
  32. 30e2c20 Integrate llvm-project @266a5a9cb9daa96c1eeaebc18e10f5a37d638734 (#17911) by Avinash Sharma · 9 months ago
  33. 3dffadb Pulling misc iree_io_* fixes/cleanups from #15983. (#17914) by Ben Vanik · 9 months ago
  34. dbd2477 Adding `util.cmp.ne` and `#util.null`. (#17913) by Ben Vanik · 9 months ago
  35. ecab8f6 Switch more ci.yml jobs to optional based on paths changed. (#17923) by Scott Todd · 9 months ago
  36. 781be38 Add torch-fuse-quantized-ops pass to the torch-to-iree pipeline (#17908) by zjgarvey · 9 months ago
  37. 6a82eb5 Add F8_16x16x32_F32 support for MFMA (#17792) by Stanley Winata · 9 months ago
  38. c12d066 [Tools] Register ArmSVE and ArmSME dialects (#17887) by Benjamin Maxwell · 9 months ago
  39. 058432d [Flow] Add pass statistics to `ConvertDispatchRegionsToWorkgroups`. (#17900) by MaheshRavishankar · 9 months ago
  40. 6b87a9f Fixing global symbol replacement when referenced by non-load ops. (#17912) by Ben Vanik · 9 months ago
  41. 2eea916 Disable Android benchmarks while device is offline. (#17910) by Scott Todd · 9 months ago
  42. f89a7da [hip][cuda] Fix tracing in deferred streams after a bad merge. (#17909) by Andrew Woloszyn · 9 months ago
  43. 2912a2a [NFC][Flow] Remove use of fusion preprocessing when it isn't a preprocessing (#17899) by MaheshRavishankar · 9 months ago
  44. 0bc1518 Log more context when sdxl benchmark commands fail. (#17907) by Scott Todd · 9 months ago
  45. a56975d Bump torch-mlir to 5e4f00acb13f3f849a05e5ac28ee39307a5fdbff. (#17885) by saienduri · 9 months ago
  46. c322d28 [hip][cuda] Update event allocation and collection. (#17603) by Andrew Woloszyn · 9 months ago
  47. 7dafb0e [LLVMCPU] Populate index to LLVM conversion patterns (#17886) by Benjamin Maxwell · 9 months ago
  48. 5943426 Bump test suite to include punet tests and fix xfail script. (#17897) by Scott Todd · 9 months ago
  49. 9d6b425 [spirv] Push GPU target conversion to before SPIR-V conversion (#17816) by Lei Zhang · 9 months ago
  50. 2ed3f92 Add nop pass to different backend. by Alan Li · 9 months ago
  51. be461bd [LLVMGPU] Support CastTypeToFitMMA on TransformDialect script. (#17884) by Stanley Winata · 9 months ago
  52. 44808e1 Add in-tree special_models test suite using reworked iree-tooling. (#17883) by saienduri · 9 months ago
  53. 4035603 [LLVMGPU][VectorDist] Enable support to distribute vector.transfer_write with non-contiguous dims (#17895) by Stanley Winata · 9 months ago
  54. 02c2000 Revert "[LLVMGPU][ROCm] Add MFMA_F32_16x16x4_F32 instruction" (#17894) by Scott Todd · 9 months ago
  55. 10dfd9d3 [Flow] Improve dispatch name categorization around broadcast/transpose (#17890) by Quinn Dawkins · 9 months ago
  56. d65c6d4 [LLVMGPU][ROCm] Add MFMA_F32_16x16x4_F32 instruction (#17847) by Prashant Kumar · 9 months ago
  57. 6df0372 Integrate llvm-project @56069ab1a35e74d0d8d632121e1891d41cb56a2d (#17862) by Vivian · 9 months ago
  58. f0d24cd [Global opt] add flag to generalize matmul ops (#17877) by Ian Wood · 9 months ago
  59. 05dfe0b Add a Flow specific canonicalizer pass (#17836) by Quinn Dawkins · 9 months ago
  60. f07c96c Round up sdxl golden dispatch sizes and mi250 times by 10%. (#17879) by Scott Todd · 9 months ago
  61. 65a7bd0 Bump goldensize values for sdxl benchmarks. (#17873) by Scott Todd · 9 months ago
  62. 9d2d766 [LinalgExt] Adding IndexingMaps to linalg_ext.attentionOp (#17864) by Stanley Winata · 9 months ago
  63. 20d8308 [GPU] Fix the propagation control function logic. (#17869) by Han-Chung Wang · 9 months ago
  64. 429aafd [Codegen] Improve ROCm-specific LLVM translations (#17742) by Krzysztof Drewniak · 9 months ago
  65. 85e0da6 Adding some CTS variants for indirect command buffers. (#17846) by Ben Vanik · 9 months ago
  66. c1611cd Deflake LoopTest.WaitAnyBlocking and WaitAllBlocking. (#17863) by Scott Todd · 9 months ago
  67. b67fef7 Bump goldentime values for mi250 sdxl benchmarks. (#17860) by Scott Todd · 9 months ago
  68. f7f930d Deflake LoopTest.WaitOneBlocking by increasing timeout. (#17857) by Scott Todd · 9 months ago
  69. 548be86 Deflake ScopeTest.WaitIdleFailure by increasing sleep time. (#17859) by Scott Todd · 9 months ago
  70. 3b2c85b Drop unmaintained transform dialect tests (#17858) by Quinn Dawkins · 9 months ago
  71. 645c966 Disable failing Pixel6/Vulkan linalg_ext reverse test again. (#17861) by Scott Todd · 9 months ago
  72. e794ce8 Simplify tests/e2e/linalg_ext_ops. (#17856) by Scott Todd · 9 months ago
  73. d174e8b Disable failing Pixel6/Vulkan stablehlo_ops tests again. (#17851) by Scott Todd · 9 months ago
  74. 39940cb Switch code of conduct from OpenXLA to LF Projects. (#17853) by Scott Todd · 9 months ago
  75. 534928d [LLVMGPU] Add debug print for contraction problem size. NFC. (#17845) by Jakub Kuderski · 9 months ago
  76. 78c0051 Simplify tests/e2e/stablehlo_ops. (#17843) by Scott Todd · 9 months ago
  77. e000353 Adding indirect command buffer emulation for Metal. (#17849) by Ben Vanik · 9 months ago
  78. 9ac1015 Enable MI300 CI testing. (#17842) by saienduri · 9 months ago
  79. 6f25718 Switching HAL CTS to use TEST_F. (#17844) by Ben Vanik · 9 months ago
  80. 8513e5f Resolving binding references when applying deferred command buffers. (#17840) by Ben Vanik · 9 months ago
  81. b5a12ee Delete some unused files under `build_tools/`. (#17841) by Scott Todd · 9 months ago
  82. dcc8a0d Retaining binding tables in HIP/CUDA action queues. (#17839) by Ben Vanik · 9 months ago
  83. 8b8df59 Integrate llvm-project @4f3c9dabecc6074f8455ca23ba70020d5c556e63 (#17827) by Vivian · 9 months ago
  84. 0e5474b Add macOS runtime build using GitHub-hosted macos-14 runner. (#17835) by Scott Todd · 9 months ago
  85. f8f2996 Retaining binding tables and plumbing indirect cmds in local-task. (#17838) by Ben Vanik · 9 months ago
  86. 94ecb8b [NFC] Modify method for characterizing bit-extension operations to handle characterization of bit-truncation as well. (#17833) by MaheshRavishankar · 9 months ago
  87. 4d204ea [NFC] Rename `FusionOfTensorOps` to `FuseMultiUseElementwiseProducer`. (#17828) by MaheshRavishankar · 9 months ago
  88. 0c90e5e Fixing Metal build break. by Ben Vanik · 9 months ago
  89. e24ea82 [Flow][Global Opt] Fold unit dims of `stream.parameter.named` (#17824) by Ian Wood · 9 months ago
  90. 3f6bf8c `EncodingAttr` to track type of source op (#17756) by lialan · 9 months ago
  91. d2895c2 Actually implement bytecode verifier IREE_VM_VERIFY_REG_ANY. (#17829) by Ben Vanik · 9 months ago
  92. 9ffe473 Making HAL command buffers take buffers as indirect args. (#17730) by Ben Vanik · 9 months ago
  93. 1d1b35f Remove .devcontainer. (#17826) by Stella Laurenzo · 9 months ago
  94. 7a29e8e Switch Linux x86 package builds to use standard runners. (#17822) by Scott Todd · 9 months ago
  95. 96c9bfb Optimizing queries for optional VM functions. (#17823) by Ben Vanik · 9 months ago
  96. 129878f Generalizing task system queuing and supporting callbacks. (#17820) by Ben Vanik · 9 months ago
  97. bb3fb88 [GlobalOpt] Add pattern to raise insert_slice(%x, constant) to tensor.pad (#17814) by Quinn Dawkins · 9 months ago
  98. 68b00aa Fixing task system worker tracy utilization plot on exit. (#17821) by Ben Vanik · 9 months ago
  99. 64a135d [CPU] Add missing BitCast lowering patterns. (#17810) by Han-Chung Wang · 9 months ago
  100. bedd58d Expose llvm dialect in python bindings (#17762) by harsh-nod · 9 months ago