  1. d5c6370 Make `iree_gpu.value_barrier` accept multiple operands (and return multiple results) (#18192) by Kunwar Grover · 7 months ago
  2. aeda149 [InputConversion] Switch to tablegen pass generation (#18245) by Marius Brehler · 7 months ago
  3. 10ba28d [Codegen][GPU] Add kernel config for LLVMGPUTileAndFuse (#17791) by Quinn Dawkins · 8 months ago
  4. 7cf3fc6 [Codegen][GPU] Fix allocation space in iree_gpu.shuffle_tensor lowering (#18250) by Quinn Dawkins · 8 months ago
  5. a884b93 Bump LLVM to llvm/llvm-project@ddda37a (#18258) by Stanley Winata · 8 months ago
  6. 0e152d2 [Codegen] Add `DeviceMappingAttr` that maps to workgroup IDs. (#18264) by MaheshRavishankar · 8 months ago
  7. 40258db [CodeGen][DT] Make the TypeConverter carry targetAttr info. (#18242) by Han-Chung Wang · 8 months ago
  8. d25712c [VMVX] Switch to tablegen pass generation (#18248) by Marius Brehler · 8 months ago
  9. 8a1d78b [Codegen][CPU] Enable scalable transfer lowerings (#18170) by Benjamin Maxwell · 8 months ago
  10. 551cd54 [TOSA] Switch to tablegen pass generation (#18227) by Marius Brehler · 8 months ago
  11. 878a99b [torch] Switch to tablegen pass generation (#18226) by Marius Brehler · 8 months ago
  12. 41f1f49 [Codegen] Add a pass option to control input -> dest pattern (#18240) by Quinn Dawkins · 8 months ago
  13. 78f54c2 [Codegen][GPU] Add a pass for basic distribution verification (#18236) by Quinn Dawkins · 8 months ago
  14. 66ed138 [CPU] Make VectorPreProcStrategy consider undefined behaviors (#18146) by lialan · 8 months ago
  15. b144e90 [test] Check depthwise conv is vectorized in test (NFC) (#18225) by Benjamin Maxwell · 8 months ago
  16. 53a7bc4 Replace `iree_compiler::VscaleRange` with `vector::VscaleRange` (NFC) (#18218) by Benjamin Maxwell · 8 months ago
  17. 300af39 [codegen] Add max_workgroup_counts to TargetWgpAttr (#17771) by Krzysztof Drewniak · 8 months ago
  18. 7d60397 [LinalgExt] Switch to new pass generation tablegen definitions. (#18216) by Han-Chung Wang · 8 months ago
  19. fe638b0 [Codegen][CPU] Eliminate all-true vector masks after vectorization (#18190) by Benjamin Maxwell · 8 months ago
  20. c71fe1a [WGSL][NFC] Switch to new pass generation tablegen definitions. (#18215) by Han-Chung Wang · 8 months ago
  21. 7cac1b2 [SPIRV] Switch to new pass generation tablegen definitions. (#18214) by Han-Chung Wang · 8 months ago
  22. a72e78b [LLVMGPU] Switch to new pass generation tablegen definitions. (#18213) by Han-Chung Wang · 8 months ago
  23. 0c2f51b [LLVMGPU] Drop WorkgroupSpecializationPass (#18212) by Nirvedh Meshram · 8 months ago
  24. 868f41e [ROCM] fix layout for WMMA_F16_16x16x16_F16 intrinsic (#18206) by Nirvedh Meshram · 8 months ago
  25. 08583d5 Bump LLVM to llvm/llvm-project@6b7afaa9db8f (#18197) by Stanley Winata · 8 months ago
  26. b297d5b [Codegen][GPU] Add bank conflict reduction pass to TileAndFuse (#18204) by Quinn Dawkins · 8 months ago
  27. 2ea9b14 [Codegen] Add support for memref.expand_shape to propagation util (#18202) by Quinn Dawkins · 8 months ago
  28. 9c951ca [Flow] Generalize horizontal contraction fusion to cover more cases. (#17880) by MaheshRavishankar · 8 months ago
  29. 7812c77 [Codegen][GPU] Add support for all other intrinsics to TileAndFuse (#18179) by Quinn Dawkins · 8 months ago
  30. 3901e62 [GPU][NFC] Update the comment of intrinsic format. (#18194) by Han-Chung Wang · 8 months ago
  31. ad2f0f8 [LLVMCPU] Add option `onlyFuseProducerInputOperands` to tileRootFuseConsumerProducer Pass (#18114) by Prashant Kumar · 8 months ago
  32. 6ac6be6 [GlobalOpt] Improve unary elementwise propagation to consider broadcasted operands (#17903) by Quinn Dawkins · 8 months ago
  33. 8dc6820 Adding simplified HAL dispatch methods. (#18189) by Ben Vanik · 8 months ago
  34. 3483893 [CodeGen][Common] Switch to new pass generation tablegen definitions. (#18166) by Han-Chung Wang · 8 months ago
  35. 49198a9 [EmitC] Remove unused code from builders (#18191) by Simon Camphausen · 8 months ago
  36. 67b0e25 Remove leftovers from the old CppEmitter (#18175) by Marius Brehler · 8 months ago
  37. b06bf6a [Codegen] Query `#iree_gpu.target` for shared memory limit (#18184) by Nithin Meganathan · 8 months ago
  38. 5a48912 [GPU] Add check for contractionOpInterface in setMatmulLoweringConfig (#18178) by Max191 · 8 months ago
  39. ab12a4e [compiler][python] Make target_backends optional (#18151) by Boian Petkantchin · 8 months ago
  40. f0e8cda [Codegen][IGEMM] Add new pass for IGEMM transformation with reshape propagation (#18161) by Max191 · 8 months ago
  41. 1fddcd6 [Codegen][CPU] Add MaterializeEncoding conversions for parallel generic ops (#18071) by Max191 · 8 months ago
  42. 50f18f1 [NFC][Encoding] Outline encodings in lit tests (#18165) by Max191 · 8 months ago
  43. df3d588 Erase shape_assertion ops (#18167) by Jacques Pienaar · 8 months ago
  44. 6f88125 [Codegen] Lower `hal.interface.workgroup.size` in GPU codegen (#18145) by Nithin Meganathan · 8 months ago
  45. 2695fe9 [GlobalOpt] Switch to new pass generation tablegen definitions. (#18163) by Han-Chung Wang · 8 months ago
  46. 8545650 [MLIR][EmitC] Remove struct related macros from ops_emitc.h (#18081) by Simon Camphausen · 8 months ago
  47. 4bea50e [VMVX] Switch to new pass generation tablegen definitions (#18149) by Han-Chung Wang · 8 months ago
  48. 050a449 [CPU] Switch to new pass generation tablegen definitions (#18132) by Han-Chung Wang · 8 months ago
  49. 7ab66ff [Codegen][GPU] Move conversion to multi_mma to PackToIntrinsics (#18141) by Quinn Dawkins · 8 months ago
  50. 643f719 Add canonicalization pass for torch import (#18150) by Rob Suderman · 8 months ago
  51. e9e24f8 [GPU] Follow the official naming convention for WMMA attributes. (#18147) by Han-Chung Wang · 8 months ago
  52. 235e110 [Codegen][GPU] Add pass to expand multi_mma op shapes to intrinsic layout (#18139) by Max191 · 8 months ago
  53. 352e05f Integrate LLVM at llvm/llvm-project@f7b2c2e4 (#18143) by Han-Chung Wang · 8 months ago
  54. e341692 [Flow] Add pattern to canonicalize consecutive pads (#17878) by Quinn Dawkins · 8 months ago
  55. c067270 [Flow] Fix error in CollapseDimensionsPass (#18128) by Ian Wood · 8 months ago
  56. de679c9 Creating reusable command buffers in stream->hal lowering. (#18100) by Ben Vanik · 8 months ago
  57. cc5566c [stream] SinkAwaitToFirstConsumer could break domination (#18131) by Rob Suderman · 8 months ago
  58. 4716f68 [Codegen][DT] Remove tensor.pad logics entirely from materialization. (#18130) by Han-Chung Wang · 8 months ago
  59. 18e86ab [Codegen][GPU] Add tiling interface implementation for iree_gpu.multi_mma (#17984) by Quinn Dawkins · 8 months ago
  60. b76f89c [Codegen][GPU] Add producer fusion pattern to loop fusion and hoisting pass (#18118) by Quinn Dawkins · 8 months ago
  61. 82012e6 [GPU][NFC] Follow the official convention to define mfma/wmma attributes (#18127) by Han-Chung Wang · 8 months ago
  62. 71f1e20 Revert "Optimize `fp8` `linalg_ext.attention` by rework Q@K scaling" (#18112) by Stanley Winata · 8 months ago
  63. 5d8362c [Codegen][GPU] Add kernel config for LLVMGPUTileAndFuse for targeting mma (#18105) by Quinn Dawkins · 8 months ago
  64. 3d8ebc1 Integrate LLVM at llvm/llvm-project@b0329206 (#18102) by Han-Chung Wang · 8 months ago
  65. 98a9ca2 [Codegen] Support dynamic/scalable sizes when folding insert_slice into xfer_write (#17963) by Benjamin Maxwell · 8 months ago
  66. 1c50edd [LLVMGPU] Support i8 MFMA intrinsics in GPUTileAndFuse pipeline (#18104) by Max191 · 8 months ago
  67. 3a29039 [LLVMGPU] Remove redundant vector distribution tests (#18116) by Kunwar Grover · 8 months ago
  68. b324f2a [VectorExt] Teach vectorization to to_layout (#18092) by Kunwar Grover · 8 months ago
  69. e22b78d [GlobalOpt] Improve reshape/empty cleanup in transpose propagation (#17905) by Quinn Dawkins · 8 months ago
  70. 95fb6cb [LLVMCPU] Fix test (#18113) by Prashant Kumar · 8 months ago
  71. 113fae8 [LLVMCPU] Tile root and fuse consumer producer pass (#17804) by Prashant Kumar · 8 months ago
  72. 4a1f619 [Codegen][GPU] Add pass to unroll to native mma widths (#18101) by Quinn Dawkins · 8 months ago
  73. 2193406 Attaching pipeline layout to hal.interface.binding.subspan & co. (#18098) by Ben Vanik · 8 months ago
  74. 345e655 [EmitC][NFC] Use builder with default arguments for opaque_call ops (#17600) by Simon Camphausen · 8 months ago
  75. ba9ea85 [LLVMGPU] Add im2col pipeline for convolution codegen (#18086) by Max191 · 8 months ago
  76. 74790bd Fix assert syntax for macOS compiler builds. by Scott Todd · 8 months ago
  77. cddcd5b [Stream] Teach allocation of encoding to take bcast map into account (#18046) by Max191 · 8 months ago
  78. e9ee5fa [VectorExt] Move VectorExt from llvm-external-projects to Codegen/Dialect (#18082) by Kunwar Grover · 8 months ago
  79. d850037 [Flow] Extend CollapseDimensionsPass (#17993) by Ian Wood · 8 months ago
  80. 8ea1e23 [DT] Move SetEncoding from GlobalOptimization to Flow (#18054) by Max191 · 8 months ago
  81. 8f0909c [VectorDistribution] Split layout configuration and distribution (#18065) by Kunwar Grover · 8 months ago
  82. dcae2e7 Integrate LLVM at `e7630a0d` (#18078) by Benoit Jacob · 8 months ago
  83. 887c276 Fix `ConvertToNVVM` and drop local revert in `third_party/llvm-project` (#18064) by Benoit Jacob · 8 months ago
  84. 55e8228 [HAL] Remove the CPU dependency from HAL. (#18053) by Han-Chung Wang · 8 months ago
  85. e792c32 Improving global folding and IPO for immutable globals. (#18066) by Ben Vanik · 8 months ago
  86. d8d1407 [GPU] Fix offsets calculation formula in MultiMmaOp distribution. (#18055) by Han-Chung Wang · 8 months ago
  87. 91433fc Reland non-splat `tensor.from_elements` to `flow` (#18049) by Rob Suderman · 8 months ago
  88. 9f9f37b Disable `demoteF64ToF32` for JitGlobals (#18059) by Rob Suderman · 8 months ago
  89. 35e6c7c ConvertToLLVM depends on AffineDialect (#18062) by Benoit Jacob · 8 months ago
  90. 388ebd2 [VectorDistribution] Use to_layout to set anchors for LLVMGPUVectorDistribute pass (#18044) by Kunwar Grover · 8 months ago
  91. 2c53b4a Optimize `fp8` `linalg_ext.attention` by rework Q@K scaling (#18031) by Rob Suderman · 8 months ago
  92. 6c45bef [runtime][HIP] Retire ROCm HAL backend (#17029) by Nithin Meganathan · 8 months ago
  93. cb5f32d [VectorDistribution] Use to_layout operation to set anchors instead of attributes (#18028) by Kunwar Grover · 8 months ago
  94. 4c0a18a [DT] Retire UpperBoundTileSizeOp op and relevant passes. (#18045) by Han-Chung Wang · 8 months ago
  95. 18c183f [VectorDistribution] Replace layout_resolution with to_layout (#18027) by Kunwar Grover · 8 months ago
  96. 998ed49 [VMVX] Add support for arith.maxnumf and arith.minnumf lowering. (#18033) by Han-Chung Wang · 8 months ago
  97. 2c638f3 Adding `once` attribute to `stream.cmd.execute`. (#18043) by Ben Vanik · 8 months ago
  98. 742fd24 [Codegen][GPU] Add derived thread config implementation for im2col op (#17954) by Max191 · 8 months ago
  99. e7fad81 [Encoding] Add an optional bcast_map attribute to EncodingAttr. (#18032) by Max191 · 8 months ago
  100. 7361340 Adding `hal.device.memoize` op. (#17938) by Ben Vanik · 8 months ago