1. 6ff9a3d Refactor how llvm-cpu check tests interface with ASan/TSan. (#16452) by Scott Todd · 1 year, 2 months ago
  2. 7e1ebd8 [CodeGen] Move Linalg patterns and filters from LinalgExt to Codegen/ (#16619) by Han-Chung Wang · 1 year, 2 months ago
  3. 7e1468c Delete the ParameterStruct calling convention (#16542) by Benoit Jacob · 1 year, 2 months ago
  4. aea4dde [CPU] Introduce dummy cache-level tiling in mmt4d pipeline (#16578) by Diego Caballero · 1 year, 2 months ago
  5. e6397cb Change ukernels calling convention to default (#16541) by Benoit Jacob · 1 year, 2 months ago
  6. e991798 Unroll fixed-trip-count loops within mmt4d ukernel tile functions. (#16626) by Benoit Jacob · 1 year, 2 months ago
  7. 96a09d9 Delete experimental/cpu_ukernel (#16540) by Benoit Jacob · 1 year, 2 months ago
  8. 03bc749 [Flow] Improve flow-break/trace match failure debug output (#16625) by Quinn Dawkins · 1 year, 2 months ago
  9. 9eff861 Fix `iree-import-onnx` operation/module usage. (#16622) by Scott Todd · 1 year, 2 months ago
  10. 4c97ab6 [TransformExtentions][NFC] Retire SimplePatternRewriter (#16620) by Han-Chung Wang · 1 year, 2 months ago
  11. b08f430 Bump StableHLO to 0264c4d64c82ae74a54b85d274eec5084c2c0abf (#16561) by Julian Walker · 1 year, 2 months ago
  12. 890b070 Forking off device methods from TargetBackend->TargetDevice. (#16591) by Ben Vanik · 1 year, 2 months ago
  13. b2250d8 [rocdl] Adjust heuristic seeds for matmuls (#16590) by Kunwar Grover · 1 year, 2 months ago
  14. 8237d9a [Flow] Avoid fusion of dequantization-like ops with producers (#16610) by Quinn Dawkins · 1 year, 2 months ago
  15. 24bf0ac [hip] Optionally enable graph command buffer and tests (#16604) by Lei Zhang · 1 year, 2 months ago
  16. 42232b5 Integrate llvm/llvm-project@4df364bc93af (#16609) by Han-Chung Wang · 1 year, 2 months ago
  17. db2db0a [LinalgExt][NFC] Using underscore as separate symbol for MLIR files. (#16614) by Han-Chung Wang · 1 year, 2 months ago
  18. df784ce Update CODEOWNERS for LinalgExt dialect. (#16612) by Han-Chung Wang · 1 year, 2 months ago
  19. 32113b0 [LinalgExt] Re-implement split reduction with walk-based manner. (#16594) by Han-Chung Wang · 1 year, 2 months ago
  20. 76cbaac Update bazel to 6.5.0 (#16603) by Jerry Wu · 1 year, 2 months ago
  21. 021b41c [Codegen][GPU] Fix multi-dim warp reduction (#16602) by Quinn Dawkins · 1 year, 2 months ago
  22. 42f1675 Run external test suite tests in pkgci. (#16589) by Scott Todd · 1 year, 2 months ago
  23. da1e547 Integrate llvm/llvm-project@80cff273906b (#16597) by Quinn Dawkins · 1 year, 2 months ago
  24. 88b1d4d Replace std::iterator with our custom iterator typedefs (#16423) (#16583) by Peyman Barazandeh · 1 year, 2 months ago
  25. 0c2552f [CI] Run ArmSME tests under emulator as part of `build_test_all_arm64` (#16331) by Benjamin Maxwell · 1 year, 2 months ago
  26. d7de68a [matmul] Add transpose B matrix coverage for CDNA3 (#16558) by Lei Zhang · 1 year, 2 months ago
  27. 09deadf [rocdl] Register some MI210 (gfx90a) supported mfma cases (#16592) by Lei Zhang · 1 year, 2 months ago
  28. 4b1a4e2 Typing IREE::HAL::DeviceTargetAttr executable targets. (#16588) by Ben Vanik · 1 year, 2 months ago
  29. a0febbe [LinalgExt] Delete ForallOpToAsyncRewriter declaration. (#16587) by Han-Chung Wang · 1 year, 2 months ago
  30. eeda5ca Renaming WebGPU to WebGPU-SPIRV (ala Metal-SPIRV). (#16586) by Ben Vanik · 1 year, 2 months ago
  31. adeb538 [Flow] Allow element-wise fusion of multi-reduction ops (#16503) by Max191 · 1 year, 2 months ago
  32. 01c4c57 [CPU] Add a specialized pipeline for LinalgExt::AttentionOp. (#16577) by Han-Chung Wang · 1 year, 2 months ago
  33. 6b995b9 [Codegen][ROCDL] Extend mfma pipeline to support a few more matmul variants (#16582) by Quinn Dawkins · 1 year, 2 months ago
  34. 09eaac0 [Flow] Loosen restrictions on body ops of dequant-like ops (#16449) by Max191 · 1 year, 2 months ago
  35. db677a8 [Codegen][ROCDL] Add support for nhwc convolution with mfma (#16579) by Quinn Dawkins · 1 year, 2 months ago
  36. 9dc8ae4 [cuda][hip] Fix launch host func and worker thread state update (#16568) by Lei Zhang · 1 year, 2 months ago
  37. baeffa7 [Codegen][GPU] Add pass to generalize named convolution ops (#16575) by Quinn Dawkins · 1 year, 2 months ago
  38. bb68472 Drop double print from translate executables pipeline failures. (#16576) by Scott Todd · 1 year, 2 months ago
  39. 66246a3 Fix Python build on Windows after Transform dialect API change. (#16574) by Scott Todd · 1 year, 2 months ago
  40. c730000 [ROCM] Use translation info to store waves-per-eu (#16573) by Quinn Dawkins · 1 year, 2 months ago
  41. 9be693e Run all CI jobs on LLVM integrate PRs. (#16492) by Scott Todd · 1 year, 2 months ago
  42. 7e1e7b0 Integrate llvm/llvm-project@c2042c3bc823 (#16567) by Quinn Dawkins · 1 year, 2 months ago
  43. 000a233 [Codegen] Register AMDGPU transform ops to transform interpreter (#16570) by Kunwar Grover · 1 year, 2 months ago
  44. 37d60f1 [hip] Enable stablehlo/tosa op e2e tests (#16466) by Lei Zhang · 1 year, 2 months ago
  45. 1489584 Adding iree-benchmark-executable help text for CUDA. by Ben Vanik · 1 year, 2 months ago
  46. 6596531 Reland "[LLVMGPU] Add basic lowering pipeline without tiling and distribution" (#16566) by Lei Zhang · 1 year, 2 months ago
  47. 862a031 Adding --task_abort_on_failure flag/API. (#16565) by Ben Vanik · 1 year, 2 months ago
  48. e692e65 [gpu] Retain lowering config when generalizing named ops (#16563) by Lei Zhang · 1 year, 2 months ago
  49. b4f31f8 Add `index` as a legal torch type. (#16560) by Stella Laurenzo · 1 year, 2 months ago
  50. b3419bf Revert "[LinalgExt] Do not decompose attention op with manual analysis." (#16559) by Han-Chung Wang · 1 year, 2 months ago
  51. 23f2828 Adding iree-benchmark-executable tool. (#16550) by Ben Vanik · 1 year, 2 months ago
  52. 4a613b6 NFC: Unify registration of Util interfaces on external ops (#16553) by Quinn Dawkins · 1 year, 2 months ago
  53. 0c8547e Making MaterializeInterfaces anchor on dispatch site device targets. (#16536) by Ben Vanik · 1 year, 2 months ago
  54. 0d5bf27 Refresh stale compiler/README.md. (#16555) by Scott Todd · 1 year, 2 months ago
  55. c087bfb Add support for using PDL to replicate the functionality in MLP sample that uses Transform dialect. (#16453) by MaheshRavishankar · 1 year, 2 months ago
  56. 11089e8 [Codegen] Add missing barrier to GPUVectorAlloc (#16551) by Quinn Dawkins · 1 year, 2 months ago
  57. 8ec45ca [rocdl] Fix mfma accumulator base vector type (#16549) by Lei Zhang · 1 year, 2 months ago
  58. c15b610 [EmitC] Remove the forked emitter and generate all the code in the conversion pass (#16357) by Simon Camphausen · 1 year, 2 months ago
  59. 1d7fb8e Fixing implicit double -> float conversion warnings. (#16547) by Ben Vanik · 1 year, 2 months ago
  60. e9e2d7d [CPU] i4 DT enablement post-commit feedback (#16545) by Diego Caballero · 1 year, 2 months ago
  61. cf3903c [rocdl] Add e2e matmul test for cdna3 matrix core (#16510) by Lei Zhang · 1 year, 2 months ago
  62. d500494 Add s8s4s32 dotprod microkernel (#16473) by mariecwhite · 1 year, 2 months ago
  63. fb1151a [Codegen][GPU] Add support for distributing broadcasts with nested layouts (#16532) by Quinn Dawkins · 1 year, 2 months ago
  64. 7b8fa75 [torch] Implement async and mutability programming model. (#16486) by Stella Laurenzo · 1 year, 2 months ago
  65. 4f6390c Revert "[LLVMGPU] Add basic lowering pipeline without tiling and distribution" (#16543) by jinchen · 1 year, 2 months ago
  66. 8b477dc [LinalgExt] Drop unused LinalgExt transform ops. (#16537) by Han-Chung Wang · 1 year, 2 months ago
  67. d4b6d74 [VectorExt] Add support for projecting nested layouts (#16528) by Quinn Dawkins · 1 year, 2 months ago
  68. 0353d12 [rocdl] Allow upcast accumulator to use matrix core (#16527) by Lei Zhang · 1 year, 2 months ago
  69. 7881ed9 [LLVMGPU] Add basic lowering pipeline without tiling and distribution (#16500) by jinchen · 1 year, 2 months ago
  70. 599a3d1 [CPU] Enable DT for [i8, i4 -> i32/f32] mmt4d (#16302) by Diego Caballero · 1 year, 2 months ago
  71. 4182c40 [rocdl] Use a full slice of vector values to avoid anchors (#16534) by Quinn Dawkins · 1 year, 2 months ago
  72. e9b3b33 [CPU][NFC] Fix check rules in materialize encoding test (#16530) by Diego Caballero · 1 year, 2 months ago
  73. 2fe2975 Collapse LinalgExt into the main source tree (#16407) by Han-Chung Wang · 1 year, 2 months ago
  74. 135e34f Moving MaterializeInterfaces' spooky action at a distance around a little. (#16521) by Ben Vanik · 1 year, 2 months ago
  75. 5572254 Upgrade Github runner to v2.313 (#16531) by Jerry Wu · 1 year, 2 months ago
  76. 5d8907e Improve controls inside iree_e2e_matmul_test (#16526) by Lei Zhang · 1 year, 2 months ago
  77. 946375c [LinalgExt] Do not decompose attention op with manual analysis. (#16525) by Han-Chung Wang · 1 year, 2 months ago
  78. 783adb5 [StableHLO] Add SelectOp to GenericTypeConvert (#16523) by Balaji V. Iyer · 1 year, 2 months ago
  79. 588f580 Adding `iree-hal-prune-executables` pass. (#16517) by Ben Vanik · 1 year, 2 months ago
  80. 37f92d2 [CPU] Register a bufferization pipeline. (#16524) by Han-Chung Wang · 1 year, 2 months ago
  81. 91a6bf7 [VectorExt] Add custom parser/printer to elide identity orderings (#16522) by Quinn Dawkins · 1 year, 2 months ago
  82. 566d6c2 [Preprocessing] Add pre-configured pass pipeline for conv transpose (#16520) by Quinn Dawkins · 1 year, 2 months ago
  83. 40b9a66 [GlobalOpt] Add raising pattern for float extensions into certain named op (#16512) by Quinn Dawkins · 1 year, 2 months ago
  84. 8b0651f [Codegen][GPU] Fix id calculation for nested layouts (#16516) by Quinn Dawkins · 1 year, 2 months ago
  85. 908ef84 [Preprocessing] Add a pass to convert convolutions to channels last (#16446) by Quinn Dawkins · 1 year, 2 months ago
  86. 560563f Update excluded test names for Windows failures. (#16514) by Scott Todd · 1 year, 2 months ago
  87. 8b2605e Add `GlobalLoopInvariantCodeMotionPass` to hoist constant `tensor.pack` from loops (#16362) by Jerry Wu · 1 year, 2 months ago
  88. a5aa1b4 Reworking HAL executable lookup/ordinal resolution. (#16508) by Ben Vanik · 1 year, 2 months ago
  89. 4237053 NFC: Make e2e matmul test names consistent (#16511) by Lei Zhang · 1 year, 2 months ago
  90. 4a80ee3 [LLVMGPU] Nuke logic that is trying to simplify thread id arithmetic (#16507) by Quinn Dawkins · 1 year, 2 months ago
  91. ede9135 [rocdl] Fix kernel config when there is no valid intrinsic (#16509) by Quinn Dawkins · 1 year, 2 months ago
  92. 4ee81fa [rocdl] Permute to create nested layout matching contraction (#16496) by Lei Zhang · 1 year, 2 months ago
  93. 5cdd6ec Limit release index API requests to the first 1000 releases. (#16506) by Scott Todd · 1 year, 2 months ago
  94. 884d2dc [HIP] Add device cast to fix build error (#16505) by Nithin Meganathan · 1 year, 2 months ago
  95. c3b3d96 Adding hal.device.id queries to HAL devices. (#16495) by Ben Vanik · 1 year, 2 months ago
  96. dbb43c8 [Codegen] Avoid setting anchors for reads used directly by contractions (#16499) by Quinn Dawkins · 1 year, 2 months ago
  97. c2afb6e [ROCM] Add supported intrinsics for gfx942 (#16498) by Quinn Dawkins · 1 year, 2 months ago
  98. fadc018 Removing the use of the legacy_sync hack from all but ROCM. (#16493) by Ben Vanik · 1 year, 2 months ago
  99. 5eebb91 Removing unused hal.descriptor_set_layout.lookup op. (#16494) by Ben Vanik · 1 year, 2 months ago
  100. 218a5e6 Added support for i4 Const-eval for Tensors (#16321) by Balaji V. Iyer · 1 year, 2 months ago