  1. 46326ef Integrate LLVM at llvm/llvm-project@abfac56 (#16710) by Jakub Kuderski · 1 year, 1 month ago
  2. 4691fc5 Use subgroup size when doing shuffles (#16698) by harsh-nod · 1 year, 1 month ago
  3. 7da8af6 Cleanup a few docs/references to old paths in HAL/Target. (#16716) by Scott Todd · 1 year, 1 month ago
  4. a283044 [hip] Make graph command buffer as default for initialization (#16707) by Lei Zhang · 1 year, 1 month ago
  5. de02c1d Disable external test suite on ROCm while flaky. (#16705) by Scott Todd · 1 year, 1 month ago
  6. c2a3245 Convert LLVMCPU compiler target to a plugin. (#16704) by Scott Todd · 1 year, 1 month ago
  7. b5af996 Prefer broadcasting RHS over LHS in AVX-512 multiply-accumulate instructions (#16709) by Benoit Jacob · 1 year, 1 month ago
  8. 47dfaa5 If negative tolerance in numerical test, then emit error (#16694) by James Newling · 1 year, 1 month ago
  9. 7884dc8 Revert adding unit dim folding to GlobalOps (#16708) by Max191 · 1 year, 1 month ago
  10. f513fe2 Add timeouts to pytest cases in pkgci/iree_tests. (#16703) by Scott Todd · 1 year, 1 month ago
  11. 0545746 [hip] Mark device local + host visible as low performance (#16701) by Lei Zhang · 1 year, 1 month ago
  12. b027da4 Convert VulkanSPIRV compiler target into a plugin. (#16699) by Scott Todd · 1 year, 1 month ago
  13. a86b8bf [Preprocessing] Change nesting of FoldUnitExtentDims (#16697) by Max191 · 1 year, 1 month ago
  14. 2f1d32d [linalg] Add the lowering of quantized_batch_matmul op. (#16615) by Prashant Kumar · 1 year, 1 month ago
  15. 6d03805 [CPU] Centralize pipeline lowering options and apply them consistently. (#16690) by Han-Chung Wang · 1 year, 1 month ago
  16. c344e26 Cleanup compiler plugin directory and include paths. (#16691) by Scott Todd · 1 year, 1 month ago
  17. c87eafe Update external test suite version pin and XFAIL sets. (#16675) by Scott Todd · 1 year, 1 month ago
  18. bb9409f [VectorDistribution] Emit diagnostics for invalid layouts (#16688) by Jakub Kuderski · 1 year, 1 month ago
  19. c07d110 [GlobalOpt][Flow] Add GlobalOp folding to FoldUnitExtentDims (#16611) by Max191 · 1 year, 1 month ago
  20. e612e91 Use createOrFold for linalg_ext dim queries. (#16685) by Ben Vanik · 1 year, 1 month ago
  21. f812ce2 Drop lists of VulkanSPIRV flags. (#16680) by Scott Todd · 1 year, 1 month ago
  22. 3bdb45b Use correct 'webgpu-spirv' flag name in samples. (#16681) by Scott Todd · 1 year, 1 month ago
  23. 890ae99 Making the ElideAsyncCopiesPass support stream.async.slice. (#16667) by Ben Vanik · 1 year, 1 month ago
  24. f34e534 Replace k with m by mariecwhite · 1 year, 2 months ago
  25. 4d3c93f Add missing macros to dotprod ukernel by mariecwhite · 1 year, 2 months ago
  26. 7782a41 [Codegen][GPU] NFC: Move SPIRVCreateFastSlowPath to Common/GPU (#16669) by Quinn Dawkins · 1 year, 2 months ago
  27. 94f64fa Integrate llvm/llvm-project@c54064de80e93494d1d44550b56ce8f2f3cf9c4b (#16652) by Max191 · 1 year, 2 months ago
  28. 9ce2044 Always upload benchmark artifacts even if the job failed (#16668) by Jerry Wu · 1 year, 2 months ago
  29. 21c67e2 bump torch-mlir to e48fe4588631e7a37a2899f9d4cd5c4cbc967481 (#16607) by Daniel Garvey · 1 year, 2 months ago
  30. 7171014 [Codegen][ROCDL] Replace custom generalization pass with upstream one (#16662) by Quinn Dawkins · 1 year, 2 months ago
  31. 77758bd [rocm] Port optional symbols support for hipGetDeviceProperties (#16661) by Lei Zhang · 1 year, 2 months ago
  32. 6209806 [ROCm] Set option to preload kernel arguments (#16659) by Jakub Kuderski · 1 year, 2 months ago
  33. 57ac339 [hip] Enable stream command buffer dispatch tracing (#16641) by Lei Zhang · 1 year, 2 months ago
  34. 8adae37 [cuda][hip] Add support for semaphore multi wait (#16638) by Lei Zhang · 1 year, 2 months ago
  35. 95d7465 Refining emplace allocations to handle dynamic shapes. (#16656) by Ben Vanik · 1 year, 2 months ago
  36. 9d6d99f faster narrow mmt4d ukernels on x86 (#16655) by Benoit Jacob · 1 year, 2 months ago
  37. 20ed89a Upgrade Github runner to 2.314.1 (#16658) by Jerry Wu · 1 year, 2 months ago
  38. 382e783 [CPU] Enable vector.interleave lowering (#16562) by Diego Caballero · 1 year, 2 months ago
  39. 4f1f055 mmt4d ukernel: use fewer magic macros to generate tile-functions M0-variants (#16645) by Benoit Jacob · 1 year, 2 months ago
  40. b994b72 Reenable accidentally disabled architecture-specific parts of `mmt4d_test` (#16654) by Benoit Jacob · 1 year, 2 months ago
  41. 01e87b6 [VectorDistribution] Enable layout resolution lowering for LayoutAttr (#16653) by Kunwar Grover · 1 year, 2 months ago
  42. c41e1aa [docs] Document how to use the dynamic ASan runtime (#16613) by Boian Petkantchin · 1 year, 2 months ago
  43. f433fd2 Using iree.abi.name consistently for arg/result names. (#16635) by Ben Vanik · 1 year, 2 months ago
  44. 2ba3d5c [hip] Drop unnecessary __HIP_PLATFORM_HCC__ definition (#16644) by Lei Zhang · 1 year, 2 months ago
  45. eda28bf [hip][rocm] Fix hipGetDeviceProperties usage after ROCm 6.0 (#16643) by Lei Zhang · 1 year, 2 months ago
  46. fe5e69a [cuda][hip] Shorten deferred queue worker name (#16642) by Lei Zhang · 1 year, 2 months ago
  47. 047211b [LLVMGPU] Fix allocation space for thread local allocations (#16640) by Quinn Dawkins · 1 year, 2 months ago
  48. 9dfc612 [cuda][hip] Fix worker thread and device host callback synchronization (#16621) by Boian Petkantchin · 1 year, 2 months ago
  49. 00c9fc2 Add option to toggle CMAKE_POSITION_INDEPENDENT_CODE (#16311) by Gyungmin Myung · 1 year, 2 months ago
  50. f66d7f2 Fix enablement of mmt4d ukernel test cases based on ISA code paths built (#16637) by Benoit Jacob · 1 year, 2 months ago
  51. 5218ef2 [gpu] Improve vector distribution error reporting (#16636) by Lei Zhang · 1 year, 2 months ago
  52. 5180ede mmt4d ukernel: simplification in generic tile funcs: stop using a stack array (#16633) by Benoit Jacob · 1 year, 2 months ago
  53. 62da59b [Codegen][rocdl] Generalize all contraction ops before folding unit dims (#16632) by Quinn Dawkins · 1 year, 2 months ago
  54. 8959b90 Make ukernels fallback opt-in and add a `mmt4d_info` ukernel to query the mmt4d implementation. (#16631) by Benoit Jacob · 1 year, 2 months ago
  55. af65386 [GlobalOpt] Propagate transposes through unary elementwise ops (#16623) by Quinn Dawkins · 1 year, 2 months ago
  56. e2d73ec [rocm] Backport and adjust some HIP allocator and buffer changes (#16627) by Lei Zhang · 1 year, 2 months ago
  57. 71f87af [Codegen][GPU] Add support for transpose distribution with nested layouts (#16630) by Quinn Dawkins · 1 year, 2 months ago
  58. 6ff9a3d Refactor how llvm-cpu check tests interface with ASan/TSan. (#16452) by Scott Todd · 1 year, 2 months ago
  59. 7e1ebd8 [CodeGen] Move Linalg patterns and filters from LinalgExt to Codegen/ (#16619) by Han-Chung Wang · 1 year, 2 months ago
  60. 7e1468c Delete the ParameterStruct calling convention (#16542) by Benoit Jacob · 1 year, 2 months ago
  61. aea4dde [CPU] Introduce dummy cache-level tiling in mmt4d pipeline (#16578) by Diego Caballero · 1 year, 2 months ago
  62. e6397cb Change ukernels calling convention to default (#16541) by Benoit Jacob · 1 year, 2 months ago
  63. e991798 Unroll fixed-trip-count loops within mmt4d ukernel tile functions. (#16626) by Benoit Jacob · 1 year, 2 months ago
  64. 96a09d9 Delete experimental/cpu_ukernel (#16540) by Benoit Jacob · 1 year, 2 months ago
  65. 03bc749 [Flow] Improve flow-break/trace match failure debug output (#16625) by Quinn Dawkins · 1 year, 2 months ago
  66. 9eff861 Fix `iree-import-onnx` operation/module usage. (#16622) by Scott Todd · 1 year, 2 months ago
  67. 4c97ab6 [TransformExtentions][NFC] Retire SimplePatternRewriter (#16620) by Han-Chung Wang · 1 year, 2 months ago
  68. b08f430 Bump StableHLO to 0264c4d64c82ae74a54b85d274eec5084c2c0abf (#16561) by Julian Walker · 1 year, 2 months ago
  69. 890b070 Forking off device methods from TargetBackend->TargetDevice. (#16591) by Ben Vanik · 1 year, 2 months ago
  70. b2250d8 [rocdl] Adjust heuristic seeds for matmuls (#16590) by Kunwar Grover · 1 year, 2 months ago
  71. 8237d9a [Flow] Avoid fusion of dequantization-like ops with producers (#16610) by Quinn Dawkins · 1 year, 2 months ago
  72. 24bf0ac [hip] Optionally enable graph command buffer and tests (#16604) by Lei Zhang · 1 year, 2 months ago
  73. 42232b5 Integrate llvm/llvm-project@4df364bc93af (#16609) by Han-Chung Wang · 1 year, 2 months ago
  74. db2db0a [LinalgExt][NFC] Using underscore as separate symbol for MLIR files. (#16614) by Han-Chung Wang · 1 year, 2 months ago
  75. df784ce Update CODEOWNERS for LinalgExt dialect. (#16612) by Han-Chung Wang · 1 year, 2 months ago
  76. 32113b0 [LinalgExt] Re-implement split reduction with walk-based manner. (#16594) by Han-Chung Wang · 1 year, 2 months ago
  77. 76cbaac Update bazel to 6.5.0 (#16603) by Jerry Wu · 1 year, 2 months ago
  78. 021b41c [Codegen][GPU] Fix multi-dim warp reduction (#16602) by Quinn Dawkins · 1 year, 2 months ago
  79. 42f1675 Run external test suite tests in pkgci. (#16589) by Scott Todd · 1 year, 2 months ago
  80. da1e547 Integrate llvm/llvm-project@80cff273906b (#16597) by Quinn Dawkins · 1 year, 2 months ago
  81. 88b1d4d Replace std::iterator with our custom iterator typedefs (#16423) (#16583) by Peyman Barazandeh · 1 year, 2 months ago
  82. 0c2552f [CI] Run ArmSME tests under emulator as part of `build_test_all_arm64` (#16331) by Benjamin Maxwell · 1 year, 2 months ago
  83. d7de68a [matmul] Add transpose B matrix coverage for CDNA3 (#16558) by Lei Zhang · 1 year, 2 months ago
  84. 09deadf [rocdl] Register some MI210 (gfx90a) supported mfma cases (#16592) by Lei Zhang · 1 year, 2 months ago
  85. 4b1a4e2 Typing IREE::HAL::DeviceTargetAttr executable targets. (#16588) by Ben Vanik · 1 year, 2 months ago
  86. a0febbe [LinalgExt] Delete ForallOpToAsyncRewriter declaration. (#16587) by Han-Chung Wang · 1 year, 2 months ago
  87. eeda5ca Renaming WebGPU to WebGPU-SPIRV (ala Metal-SPIRV). (#16586) by Ben Vanik · 1 year, 2 months ago
  88. adeb538 [Flow] Allow element-wise fusion of multi-reduction ops (#16503) by Max191 · 1 year, 2 months ago
  89. 01c4c57 [CPU] Add a specialized pipeline for LinalgExt::AttentionOp. (#16577) by Han-Chung Wang · 1 year, 2 months ago
  90. 6b995b9 [Codegen][ROCDL] Extend mfma pipeline to support a few more matmul variants (#16582) by Quinn Dawkins · 1 year, 2 months ago
  91. 09eaac0 [Flow] Loosen restrictions on body ops of dequant-like ops (#16449) by Max191 · 1 year, 2 months ago
  92. db677a8 [Codegen][ROCDL] Add support for nhwc convolution with mfma (#16579) by Quinn Dawkins · 1 year, 2 months ago
  93. 9dc8ae4 [cuda][hip] Fix launch host func and worker thread state update (#16568) by Lei Zhang · 1 year, 2 months ago
  94. baeffa7 [Codegen][GPU] Add pass to generalize named convolution ops (#16575) by Quinn Dawkins · 1 year, 2 months ago
  95. bb68472 Drop double print from translate executables pipeline failures. (#16576) by Scott Todd · 1 year, 2 months ago
  96. 66246a3 Fix Python build on Windows after Transform dialect API change. (#16574) by Scott Todd · 1 year, 2 months ago
  97. c730000 [ROCM] Use translation info to store waves-per-eu (#16573) by Quinn Dawkins · 1 year, 2 months ago
  98. 9be693e Run all CI jobs on LLVM integrate PRs. (#16492) by Scott Todd · 1 year, 2 months ago
  99. 7e1e7b0 Integrate llvm/llvm-project@c2042c3bc823 (#16567) by Quinn Dawkins · 1 year, 2 months ago
  100. 000a233 [Codegen] Register AMDGPU transform ops to transform interpretor (#16570) by Kunwar Grover · 1 year, 2 months ago