  1. b07d60c Plumb through support for controlling subgroup size in CodeGen (#11388) by Lei Zhang · 2 years, 5 months ago
  2. 895cf38 [spirv] Add support for Winograd output op (#11409) by harsh-nod · 2 years, 5 months ago
  3. 7649620 Set maximum number of threads in the thread block for CUDA (#11387) by Guray Ozen · 2 years, 5 months ago
  4. 08aea95 Enable fusion for unpack + elementwise ops. (#11403) by Han-Chung Wang · 2 years, 5 months ago
  5. 16dbf3b [spirv] Change illegal configuration test to use user control (#11398) by Lei Zhang · 2 years, 5 months ago
  6. 42a4dc7 [spirv] Add support for Winograd input op (#11375) by harsh-nod · 2 years, 5 months ago
  7. e82cbe6 NFC: Separate setting translation info and dispatch configuration (#11400) by Lei Zhang · 2 years, 5 months ago
  8. 884e361 NFC: Add better error message. (#11401) by MaheshRavishankar · 2 years, 5 months ago
  9. a303b95 Do not try to fuse with ops that are cloned into dispatch regions anyway. (#11399) by MaheshRavishankar · 2 years, 5 months ago
  10. 5d8b054 [WebGPU] Push constants Storage i32 -> Uniform vector<4xi32>. (#11392) by Scott Todd · 2 years, 5 months ago
  11. 4ccfe79 [mlir][gpu] Pack and unpack to enable f16 and int8 warp reduce. (#11349) by Stanley Winata · 2 years, 5 months ago
  12. d110d1b Adopt the new memref lowering process (#11261) by qcolombet · 2 years, 5 months ago
  13. 0169788 Support distribution to more than 3 dimension at the workgroup level (#11385) by Thomas · 2 years, 5 months ago
  14. e986cdb Integrate at llvm/llvm-project@61aed52c and bump dependencies (#11393) by Thomas · 2 years, 5 months ago
  15. 8b19b3c Adopt the new memref lowering process by Quentin Colombet · 2 years, 5 months ago
  16. 503ce22 test maxntid by Guray Ozen · 2 years, 5 months ago
  17. e5a213a Cherry-pick some SPIR-V related MLIR commits (#11384) by Lei Zhang · 2 years, 5 months ago
  18. 9a22a37 [SPIRV] Add bank conflict reduction to cooperative matrix pipeline (#11386) by Quinn Dawkins · 2 years, 5 months ago
  19. 214847a Set maximum number of threads in the thread block for CUDA target by Guray Ozen · 2 years, 5 months ago
  20. 3bc3548 [spirv] Use a placeholder pointer type when creating resources (#11382) by Lei Zhang · 2 years, 5 months ago
  21. 129ae96 Enable fusion for elementwise Linalg op + pack op (#11374) by Han-Chung Wang · 2 years, 5 months ago
  22. 2f4225d [Flow] Add support for linalg.conv2d_nchw_fchw in img2col (#11369) by Quinn Dawkins · 2 years, 5 months ago
  23. 3236f51 Port the AArch64 tile size selection from the old `matmul-to-mmt4d` pass. (#11366) by bjacob · 2 years, 5 months ago
  24. 25f6bf2 Add winograd output op to LinalgExt (#11361) by harsh-nod · 2 years, 5 months ago
  25. 2046e25 Revert "Enable fusion for elementwise Linalg op + pack op" (#11372) by Han-Chung Wang · 2 years, 5 months ago
  26. 5260015 Enable fusion for elementwise Linalg op + pack op (#11284) by Han-Chung Wang · 2 years, 5 months ago
  27. 296d545 Enable GPU pipeline schedule with shared memory stores in stage 0 (#11125) by Quinn Dawkins · 2 years, 5 months ago
  28. b576330 Integrate at llvm/llvm-project@bf15f1e4 and bump dependencies (#11341) by Thomas · 2 years, 5 months ago
  29. 91b3086 Move CUDA llvm optimization to the new pass manager (#11348) by Thomas · 2 years, 5 months ago
  30. 634ca1a Add a test that compiles softmax under aggressive fusion. (#11362) by MaheshRavishankar · 2 years, 5 months ago
  31. 16ab7a6 Encode the matmul type triple in `TensorEncoding` (#11355) by bjacob · 2 years, 5 months ago
  32. 2137aaf Adding asynchronous module import support and samples. (#11326) by Ben Vanik · 2 years, 5 months ago
  33. df1c04b [NFC] Switch to use upstream linalg::getPrunedAttributeList method. (#11332) by Han-Chung Wang · 2 years, 5 months ago
  34. 061bed2 Adding SubrangeOperandOpInterface to better fold util.buffer.subspan. (#11340) by Ben Vanik · 2 years, 5 months ago
  35. 7efcd0a Fixing the --compile-to= flag and adding a test of all phases. (#11345) by Ben Vanik · 2 years, 5 months ago
  36. 9745585 Add expansion bubble-up to apply_patterns transform op (#11344) by Oleksandr "Alex" Zinenko · 2 years, 5 months ago
  37. 5db30c6 Letting FusionOfTensorOps apply patterns forever. (#11336) by Ben Vanik · 2 years, 5 months ago
  38. adadd03 Adding IREE_LINK_COMPILER_SHARED_LIBRARY cmake flag. (#11335) by Ben Vanik · 2 years, 5 months ago
  39. a229532 Cherry-pick llvm/llvm-project@c0321edc (#11337) by Thomas · 2 years, 5 months ago
  40. 86d7269 Use upstream SCF tile-and-fuse in CPU codegen (#10770) by Jerry Wu · 2 years, 5 months ago
  41. 54a5acf Make elementwise parallel operations use in-place updates only for non-vectorizable ops (#10648) by MaheshRavishankar · 2 years, 5 months ago
  42. 0135230 [GPU] Reduce the number of warp shuffles needed for reduction (#11319) by Thomas · 2 years, 5 months ago
  43. 45bd550 Add winograd input transform op to LinalgExt (#11228) by harsh-nod · 2 years, 5 months ago
  44. b8447d9 Always vectorize LinAlg ops for VMVX backend. (#11323) by Han-Chung Wang · 2 years, 5 months ago
  45. f433b19 Adding initial support for coarse-fences imports. by Ben Vanik · 2 years, 5 months ago
  46. 3792421 Adding ElideDeviceQueueBarrierOp pattern. by Ben Vanik · 2 years, 5 months ago
  47. 51d4dc6 EncodingInfo may depend on the shape, not just on the element type (#11300) by bjacob · 2 years, 5 months ago
  48. b2eda39 Consistently query `ExecutableTargetAttr` (#11291) by bjacob · 2 years, 5 months ago
  49. 79d9a3c [spirv] Do not introduce 2-D vectors in i64 emulation (#11316) by Jakub Kuderski · 2 years, 5 months ago
  50. df0e1ba [vm] Do not define all arithmetic ops as `Pure` (#11190) by Jakub Kuderski · 2 years, 5 months ago
  51. 38c5fc3 Cherry-pick llvm/llvm-project@e672f512 & llvm/llvm-project@234d2e27 (#11302) by Lei Zhang · 2 years, 5 months ago
  52. 7de28e3 [spirv] Allow per-vendor cooperative matrix tiling controls (#11306) by Lei Zhang · 2 years, 5 months ago
  53. c302444 Guard a test-only dependency with IREE_BUILD_TESTS. by Stella Laurenzo · 2 years, 5 months ago
  54. b8c1b79 Implement a compiler embedding API and shared library stub/loader. (#11294) by Stella Laurenzo · 2 years, 5 months ago
  55. dbbbae5 Guard test customization only if target exists. by Stella Laurenzo · 2 years, 5 months ago
  56. 7723971 Move the api-test binary to the project wide binary directory. (#11298) by Stella Laurenzo · 2 years, 5 months ago
  57. 7ed278e Disable packaging of the (removed) MHLO extension. (#11299) by Stella Laurenzo · 2 years, 5 months ago
  58. b07e252 Fix macos compiler dylib build. (#11296) by Stella Laurenzo · 2 years, 5 months ago
  59. b35c97f Allow encoding info to depend on elem type and target info. (#11290) by bjacob · 2 years, 5 months ago
  60. 7ec09ce Create explicit compiler C API libraries and enable dynamic linking on all platforms (#11285) by Stella Laurenzo · 2 years, 5 months ago
  61. 1814bc5 Handle `tensor.extract` of i1 types in type legalization. (#11274) by MaheshRavishankar · 2 years, 5 months ago
  62. 16d7747 Enable tile+distribute+vectorize for unpack ops. (#11232) by Han-Chung Wang · 2 years, 5 months ago
  63. 51dc5af Pass #hal.descriptor_type via MemRef memory space to backends (#11227) by Lei Zhang · 2 years, 5 months ago
  64. 323305c Integrate llvm-project at 5cfc22cafe3f and bump dependencies (#11275) by Kojo Acquah · 2 years, 5 months ago
  65. 0ba9a41 Fix setting of workgroup size for `tile_to_foreach_thread_and_workgroup_count_region` (#10673) by Guray Ozen · 2 years, 5 months ago
  66. e609a08 Add debug flags to IREEBufferizeOp (#11256) by Matthias Springer · 2 years, 5 months ago
  67. dbd3ed5 Use upstream SCF tiling in CPU codegen (#11268) by Jerry Wu · 2 years, 5 months ago
  68. a2b5a38 Change the attributes and APIs around simplify-extract-strided-metadata by Quentin Colombet · 2 years, 5 months ago
  69. a1f8536 Integrate llvm-project at 65644125beb7 and bump dependencies (#11265) by Kojo Acquah · 2 years, 5 months ago
  70. ae0e293 Option to ignore dynamic dims in stack allocation check (#11254) by Jerry Wu · 2 years, 5 months ago
  71. a30cc08 Cherry-pick llvm/llvm-project@9bb63374 (#11259) by Lei Zhang · 2 years, 5 months ago
  72. 67cd4eb [vm] Pre-commit VM op speculation tests. NFC. (#11262) by Jakub Kuderski · 2 years, 5 months ago
  73. 4c0554a Minor fixes to enable compilation of e2e matmul tests. (#11258) by MaheshRavishankar · 2 years, 5 months ago
  74. 28457be NFC: Minor refactoring of pass in Flow to separate optional pre-processing passes. (#11260) by MaheshRavishankar · 2 years, 5 months ago
  75. d41e81d typo by Guray Ozen · 2 years, 5 months ago
  76. c57fbc7 Fix triplication Add comments to vecadd2's transform dialect codegen spec by Guray Ozen · 2 years, 5 months ago
  77. d93c263 Integrate llvm-project at 119cef40d18c and bump dependencies. (#11245) by Kojo Acquah · 2 years, 5 months ago
  78. 404e306 Use upstream MLIR's getMappingId method by Guray Ozen · 2 years, 5 months ago
  79. 8941a06 Fix setting of workgroup size for `tile_to_foreach_thread_and_workgroup_count_region` by Guray Ozen · 2 years, 5 months ago
  80. bf1b081 Flatten uniform buffers to static 1-D MemRefs (#11248) by Lei Zhang · 2 years, 5 months ago
  81. 8e861c7 [spirv] Add lowering config verifier for CooperativeMatrix pipeline (#11226) by yzhang93 · 2 years, 5 months ago
  82. a3af902 Address use-def violations arising due to index computation when rewriting tensor.load/tensor.store ops. (#11224) by MaheshRavishankar · 2 years, 5 months ago
  83. 38c1dec Decouple `GPUThreadMappingAttr` from GPU transform dialect by Guray Ozen · 2 years, 5 months ago
  84. 078d1de Set vectorSize back to 4 for non-divisible by 4 dimensions (#11210) by Guray Ozen · 2 years, 5 months ago
  85. 1b95a2f [NFC] Generalize reduction vector size calculation (#11213) by Thomas · 2 years, 5 months ago
  86. ed20752 [GPU] Prevent race condition when writing to shared memory in reduction (#11214) by Thomas · 2 years, 5 months ago
  87. 8aa999d Revert adding string attribute for the transform strategy name. (#11212) by Thomas · 2 years, 5 months ago
  88. 3e2c4f3 Set vectorSize back to 4 for non-divisible by 4 dimensions by Guray Ozen · 2 years, 5 months ago
  89. 20048f7 Convert HAL::DescriptorTypeAttr to proper attributes (#11202) by Lei Zhang · 2 years, 5 months ago
  90. 952aeca Reworking `stream.async.dispatch` to allow for in-place allocation/concurrency scheduling. (#11173) by Ben Vanik · 2 years, 5 months ago
  91. 5703ac8 Adding skeleton subresource hazard detection for concurrency scheduling. by Ben Vanik · 2 years, 6 months ago
  92. 7804ebb Adding `iree-stream-emplace-allocations` pass. by Ben Vanik · 2 years, 6 months ago
  93. 7b2e4ef Add kernel config matcher for transform dialect reduction (#11180) by Thomas · 2 years, 5 months ago
  94. 76afaf2 Cherry pick D134306 and D138085 (#10853) by MaheshRavishankar · 2 years, 5 months ago
  95. 681e627 Adding resource ranges to stream.async.dispatch. by Ben Vanik · 2 years, 6 months ago
  96. 4f5e9ac Running cleanup between usage refinement and scheduling. by Ben Vanik · 2 years, 6 months ago
  97. 3ac9c6f Improving -debug performance of stream passes. by Ben Vanik · 2 years, 6 months ago
  98. 36df628 Sinking stream.async.alloca to consumers like stream.async.splat. by Ben Vanik · 2 years, 6 months ago
  99. 077168f [LLVMCPU] Enable vectorization for pack ops. (#11178) by Han-Chung Wang · 2 years, 5 months ago
  100. 541d1f3 Rename iree-llvmcpu-lower-executable-target -> iree-llvmcpu-lower-exe… (#11150) by Nicolas Vasilache · 2 years, 5 months ago