  1. a6a56a9 Add `LinalgFusionInterface` to support fusion for linalg_ext ops (added `scatter` and `reverse`) (#17428) by Ian Wood · 10 months ago
  2. 3d1364e [Codegen][GPU] Add pattern to lower iree_gpu.multi_mma to intrinsics (#17457) by Quinn Dawkins · 10 months ago
  3. ab8f668 Revert "Data tiling: transpose narrow-N into narrow-M" (#17503) by Benoit Jacob · 10 months ago
  4. e33ca89 [LinalgExt] Split TileAndDecomposeAttention (#17468) by Kunwar Grover · 10 months ago
  5. 322d688 [Codegen][GPU] Add pattern to drop lead unit dims of multi_mma ops (#17456) by Quinn Dawkins · 10 months ago
  6. 117cb43 Test 'console' provider in 'tracing' job. (#16454) by Scott Todd · 10 months ago
  7. 16bdaa9 Data tiling: transpose narrow-N into narrow-M (#17446) by lialan · 10 months ago
  8. 6c75aa1 [Codegen][GPU] Allow iree_gpu.tensor_barrier to take vectors (#17479) by Quinn Dawkins · 10 months ago
  9. 1750e2b Integrate LLVM at 855eef2abd81cb8c7543d4748353d5e378fdd4c2 (#17501) by Benoit Jacob · 10 months ago
  10. 051c361 NFC: Make a few loop transformations more accessible (#17489) by Quinn Dawkins · 10 months ago
  11. cad02f9 [Codegen][GPU] Add unrolling pattern for iree_gpu.multi_mma (#17454) by Quinn Dawkins · 10 months ago
  12. 46c6bf5 [CPU] Add support for pack ukernel preparation. (#17472) by Han-Chung Wang · 10 months ago
  13. a842527 [Codegen][GPU] Drop dead PassDetail.h file (#17490) by Quinn Dawkins · 10 months ago
  14. 63dff03 [Codegen][GPU] Add iree_gpu.tensor_barrier op (#17478) by Quinn Dawkins · 10 months ago
  15. 31e1a30 [Codegen][GPU] Add dictionary based lowering config attribute (#17463) by Quinn Dawkins · 10 months ago
  16. 008add9 [CodeGen][NFC] Rename DecomposeBatchMmt4DOps to CPUPrepareUkernels. (#17471) by Han-Chung Wang · 10 months ago
  17. 30e0238 Bump LLVM to llvm/llvm-project@bd3f5a4bd3d9d7ee8ae801c24c5081073b20abd4 (#17470) by MaheshRavishankar · 10 months ago
  18. 9fe159d [LinalgExt] Generalize attention tiling interface implementation (#17408) by Kunwar Grover · 10 months ago
  19. 1316c92 [Codegen] NFC: Move the lowering config to an attribute interface (#17439) by Quinn Dawkins · 10 months ago
  20. 7813fd3 [CPU] Fix a distribution bug and limiting distribution tile sizes. (#17436) by Han-Chung Wang · 11 months ago
  21. d4aa849 [CPU] Add support for pack/unpack ukernels enablement on llvm-cpu path. (#17427) by Han-Chung Wang · 11 months ago
  22. 6c5198d Folding no-op stream.async.update ops away. (#17458) by Ben Vanik · 11 months ago
  23. 006af5d [GPU] Support specifying LLVMGPU backend target features (#17451) by Lei Zhang · 11 months ago
  24. a36773a [Codegen][GPU] Add vectorization pattern for iree_gpu.multi_mma (#17453) by Quinn Dawkins · 11 months ago
  25. f6a38ac [GPU] Thread through a common target description (#17217) by Lei Zhang · 11 months ago
  26. 62a996b [Codegen] Add lane distribution for scf.forall (#17373) by Quinn Dawkins · 11 months ago
  27. 080b1fa [Codegen][GPU] Add a contraction like operation for mma intrinsics (#17374) by Quinn Dawkins · 11 months ago
  28. e0f3c05 [Codegen][GPU] Change iree_gpu.shuffle_tensor to take a region for the read (#17425) by Quinn Dawkins · 11 months ago
  29. a3b74bc [CPU][ArmSME] Update tiling to use all SME accumulators (#16389) by Benjamin Maxwell · 11 months ago
  30. 6d95f8c Integrate LLVM at `74a87548` (clean) (#17423) by Ingo Müller · 11 months ago
  31. 4f8ee51 Moving demotion/promotion passes to input conversion. (#17422) by Ben Vanik · 11 months ago
  32. dece30e [CPU] Do not decompose pack/unpack ops on x86 backends. (#17366) by Prashant Kumar · 11 months ago
  33. f2fcbbf [iree][global] Add conv2d op to demote to bf16 pass (#17410) by Prashant Kumar · 11 months ago
  34. 3b5b70a Integrate LLVM at `1650f1b3` (clean) (#17418) by Ingo Müller · 11 months ago
  35. b4fc0b4 Implementing the f64 VM extension and flipping the flag by default. (#17416) by Ben Vanik · 11 months ago
  36. b716704 Update git-clang-format ref and clang-format version. (#16792) by Scott Todd · 11 months ago
  37. 8fcab13 [Flow] Improve annotation name for conv (#17417) by MaheshRavishankar · 11 months ago
  38. 356e2b7 [Codegen] Add op for flattening warp and thread ids of forall ops (#17368) by Quinn Dawkins · 11 months ago
  39. 90db41a [LLVMGPU] Add Winograd pipeline for LLVMGPU (#17302) by Max191 · 11 months ago
  40. 4021109 [Winograd] Add filtering by annotations for Winograd rewrites (#17332) by Max191 · 11 months ago
  41. 0260947 [GlobalOpt] Simplify the logic used to pick the groups. (#17405) by MaheshRavishankar · 11 months ago
  42. 9a294eb [Winograd] Use output_tile_size for more static output transform tiling (#17200) by Max191 · 11 months ago
  43. 748db31 Fuse Generic Ops Generated by `gather` Lowering (#17341) by Ian Wood · 11 months ago
  44. 428adf2 [LLVMGPU] Add debug prints for vector distribution config (#17404) by Jakub Kuderski · 11 months ago
  45. 2a8d681 [CPU] Remove CPUDoubleTilingPeelingExpert (#17329) by Andrzej Warzyński · 11 months ago
  46. 3bac7ec Add math expand patterns pass (#17395) by jinchen · 11 months ago
  47. 29a12f3 [Preprocessing] Remove `input=none` option from TransposeMatmulPass (#17364) by Benjamin Maxwell · 11 months ago
  48. 8d8d18c [LinalgExt] Simplify Attention unit tests (#17393) by Kunwar Grover · 11 months ago
  49. a8404a8 [LLVMGPU] Preserve config dictionary during MapNestedForallToGpuThreadsOp application (#17381) by Kunwar Grover · 11 months ago
  50. 2ed4778 Integrate LLVM at `a1d43c14d` (+1 revert) (#17380) by Benoit Jacob · 11 months ago
  51. 06eb43d Use coalesce loops (#17314) by MaheshRavishankar · 11 months ago
  52. 4f27e64 Generalize overriding llvm func attr flags in translation info (#17365) by Kunwar Grover · 11 months ago
  53. 2a701d5 [LLVMGPU] Add translation_info config knobs to disable passes (#17340) by Jakub Kuderski · 11 months ago
  54. 45ca23e [CPU] Take native_vector_size into accounts for attention op tiling. (#17349) by Han-Chung Wang · 11 months ago
  55. 3625c60 Revert "Add math expand patterns pass" (#17367) by Scott Todd · 11 months ago
  56. d657082 [LLVMGPU] Switch GPU passes to tablegen definitions. NFC. (#17361) by Jakub Kuderski · 11 months ago
  57. a9ca8e6 Add math expand patterns pass (#17324) by jinchen · 11 months ago
  58. 47a5f99 [Codegen][GPU] Move MFMA/WMMA constructors to interface method (#17356) by Quinn Dawkins · 11 months ago
  59. d2dd9e2 Replacing hal.tensor.export storage for hal.tensor.alias. (#17339) by Ben Vanik · 11 months ago
  60. 5337bd7 [Codegen] Add pattern for hoisting scf.forall from scf.for (#17312) by Quinn Dawkins · 11 months ago
  61. a3b7e12 Integrate both llvm-project@2083e97e (+1 :leftwards_arrow_with_hook:, +1 :cherries:) and torch-mlir@bce800a3 (#17330) by Benoit Jacob · 11 months ago
  62. c81496c [NFC][LinalgExt] Rename op functions from outdated naming conventions (#17333) by Max191 · 11 months ago
  63. 7baef75 [CPU] Add new attribute to control peeling (#17231) by Andrzej Warzyński · 11 months ago
  64. 035da66 [NFC] Fixing stray space and unneeded modules in some lit tests. (#17338) by Ben Vanik · 11 months ago
  65. fc3561c [NFC][LinalgExt] Move tiling tests and implementations from IR to Transforms (#17216) by Max191 · 11 months ago
  66. 8ae8aaf [Preprocessing] Skip skinny matmuls during PadToIntrinsics. (#17323) by Stanley Winata · 11 months ago
  67. 4f2f8cf [LLVMGPU] Fix MMA schedule validation for unaligned shapes (#17317) by Max191 · 11 months ago
  68. 2d5c811 [LLVMGPU] Remove non-useful vector_distribution_pipeline_test test (#17318) by Max191 · 11 months ago
  69. afb986e [LLVMGPU] Remove duplicate shared memory bank conflict pass (#17322) by Jakub Kuderski · 11 months ago
  70. 07d4fe6 [CPU] Integrate i8mm patterns from upstream (#17007) by Kojo Acquah · 11 months ago
  71. 25fd8a3 Fixing VM extui i1->i64 and adding extsi i1->i32/i64. (#17311) by Ben Vanik · 11 months ago
  72. 3d23684 [GPU] Introduce PadAndVectorDistribution lowering strategy. (#17234) by Han-Chung Wang · 11 months ago
  73. 273abbb [NFC] Cleaning up VM conversion patterns. (#17307) by Ben Vanik · 11 months ago
  74. 3277c21 [Codegen] Add pattern for lowering iree_gpu.shuffle_tensor (#17269) by Quinn Dawkins · 11 months ago
  75. b7aa3b7 [Codegen] Add an op for fusing forall ops (#17279) by Quinn Dawkins · 11 months ago
  76. 3ca0a49 Moving OutlineConstantsPass to flow and adding parameter support. (#17303) by Ben Vanik · 11 months ago
  77. 6e76f28 Removing affinity from stream.resource.size. by Ben Vanik · 11 months ago
  78. 1496270 Preserving affinities with hal.tensor.import/export lowerings. by Ben Vanik · 11 months ago
  79. ad79bc7 Attaching AffinityOpInterface to common early-phase ops. by Ben Vanik · 11 months ago
  80. 112fad0 Avoid folding globals with different dialect attrs. by Ben Vanik · 11 months ago
  81. d4e1924 [Preprocessing] Remove global transpose matmul option (#17300) by Benjamin Maxwell · 11 months ago
  82. 4c9cb3c [Codegen] Add iree_gpu.shuffle_tensor op (#17257) by Quinn Dawkins · 11 months ago
  83. d5e479d [LLVMGPU] Turn on the vector distribution pipeline by default (#17291) by Quinn Dawkins · 11 months ago
  84. 792f14d Move `EncodingAttr` and related ops from `LinalgExt` to a new `Encoding` dialect (#17277) by Benoit Jacob · 11 months ago
  85. 30aa5e5 [Winograd] Fix element type bug for Conv2DToWinograd with promotion (#17268) by Max191 · 11 months ago
  86. efed94f [Codegen] Add op for copying tensor operands (#17256) by Quinn Dawkins · 11 months ago
  87. 71a9945 Adding `--iree-opt-import-parameters=` and cleaning up export. (#16828) by Ben Vanik · 11 months ago
  88. e4ec8f1 [LLVMGPU] Follow-up to fix a bug in LLVMGPUPromoteMatmulToFitMMA pass. (#17264) by Han-Chung Wang · 11 months ago
  89. 8547374 [CPU] Limit unrolling factors for generic ops. (#17227) by Han-Chung Wang · 11 months ago
  90. 463fed4 Support JitGlobals on inline and outline flow dispatches. (#17259) by Ben Vanik · 11 months ago
  91. e6d8aa7 [LLVMGPU] Introduce a pass that pad matmul to fit mma shapes. (#17225) by Han-Chung Wang · 11 months ago
  92. 8f6ecc5 Hide ExecutableVariantOp from TargetBackend pipeline factory methods. (#17255) by Ben Vanik · 11 months ago
  93. e633d07 [GPU] Add option for no gpu.block_dim in GPUDistributeScfFor (#17214) by Max191 · 11 months ago
  94. 34449df [NFC] Refactoring MeshToFlow pass out from patterns. (#17245) by Ben Vanik · 11 months ago
  95. e15968f Misc cleanups to flow dialect files. (#17243) by Ben Vanik · 11 months ago
  96. 7e2dd20 [Winograd] Generate winograd.filter_transform op in ConvertConv2DToWinograd (#17106) by Max191 · 11 months ago
  97. f4a7df4 [Codegen] Make amdgpu_distribute_vectors return a handle (#17239) by Quinn Dawkins · 11 months ago
  98. 6233f4f Fixes to passes for custom dispatch to work with bf16 type (#17242) by Dave Liddell · 11 months ago
  99. f54a861 [VectorDistribution] Add distribution for scalar broadcast (#17248) by Kunwar Grover · 11 months ago
  100. 9719aa2 [cpu][codegen] Fix crash in case of complex dtype (#17247) by Prashant Kumar · 11 months ago