- 6ff9a3d Refactor how llvm-cpu check tests interface with ASan/TSan. (#16452) by Scott Todd · 1 year, 2 months ago
- 7e1ebd8 [CodeGen] Move Linalg patterns and filters from LinalgExt to Codegen/ (#16619) by Han-Chung Wang · 1 year, 2 months ago
- 7e1468c Delete the ParameterStruct calling convention (#16542) by Benoit Jacob · 1 year, 2 months ago
- aea4dde [CPU] Introduce dummy cache-level tiling in mmt4d pipeline (#16578) by Diego Caballero · 1 year, 2 months ago
- e6397cb Change ukernels calling convention to default (#16541) by Benoit Jacob · 1 year, 2 months ago
- e991798 Unroll fixed-trip-count loops within mmt4d ukernel tile functions. (#16626) by Benoit Jacob · 1 year, 2 months ago
- 96a09d9 Delete experimental/cpu_ukernel (#16540) by Benoit Jacob · 1 year, 2 months ago
- 03bc749 [Flow] Improve flow-break/trace match failure debug output (#16625) by Quinn Dawkins · 1 year, 2 months ago
- 9eff861 Fix `iree-import-onnx` operation/module usage. (#16622) by Scott Todd · 1 year, 2 months ago
- 4c97ab6 [TransformExtentions][NFC] Retire SimplePatternRewriter (#16620) by Han-Chung Wang · 1 year, 2 months ago
- b08f430 Bump StableHLO to 0264c4d64c82ae74a54b85d274eec5084c2c0abf (#16561) by Julian Walker · 1 year, 2 months ago
- 890b070 Forking off device methods from TargetBackend->TargetDevice. (#16591) by Ben Vanik · 1 year, 2 months ago
- b2250d8 [rocdl] Adjust heuristic seeds for matmuls (#16590) by Kunwar Grover · 1 year, 2 months ago
- 8237d9a [Flow] Avoid fusion of dequantization-like ops with producers (#16610) by Quinn Dawkins · 1 year, 2 months ago
- 24bf0ac [hip] Optionally enable graph command buffer and tests (#16604) by Lei Zhang · 1 year, 2 months ago
- 42232b5 Integrate llvm/llvm-project@4df364bc93af (#16609) by Han-Chung Wang · 1 year, 2 months ago
- db2db0a [LinalgExt][NFC] Using underscore as separate symbol for MLIR files. (#16614) by Han-Chung Wang · 1 year, 2 months ago
- df784ce Update CODEOWNERS for LinalgExt dialect. (#16612) by Han-Chung Wang · 1 year, 2 months ago
- 32113b0 [LinalgExt] Re-implement split reduction with walk-based manner. (#16594) by Han-Chung Wang · 1 year, 2 months ago
- 76cbaac Update bazel to 6.5.0 (#16603) by Jerry Wu · 1 year, 2 months ago
- 021b41c [Codegen][GPU] Fix multi-dim warp reduction (#16602) by Quinn Dawkins · 1 year, 2 months ago
- 42f1675 Run external test suite tests in pkgci. (#16589) by Scott Todd · 1 year, 2 months ago
- da1e547 Integrate llvm/llvm-project@80cff273906b (#16597) by Quinn Dawkins · 1 year, 2 months ago
- 88b1d4d Replace std::iterator with our custom iterator typedefs (#16423) (#16583) by Peyman Barazandeh · 1 year, 2 months ago
- 0c2552f [CI] Run ArmSME tests under emulator as part of `build_test_all_arm64` (#16331) by Benjamin Maxwell · 1 year, 2 months ago
- d7de68a [matmul] Add transpose B matrix coverage for CDNA3 (#16558) by Lei Zhang · 1 year, 2 months ago
- 09deadf [rocdl] Register some MI210 (gfx90a) supported mfma cases (#16592) by Lei Zhang · 1 year, 2 months ago
- 4b1a4e2 Typing IREE::HAL::DeviceTargetAttr executable targets. (#16588) by Ben Vanik · 1 year, 2 months ago
- a0febbe [LinalgExt] Delete ForallOpToAsyncRewriter declaration. (#16587) by Han-Chung Wang · 1 year, 2 months ago
- eeda5ca Renaming WebGPU to WebGPU-SPIRV (ala Metal-SPIRV). (#16586) by Ben Vanik · 1 year, 2 months ago
- adeb538 [Flow] Allow element-wise fusion of multi-reduction ops (#16503) by Max191 · 1 year, 2 months ago
- 01c4c57 [CPU] Add a specialized pipeline for LinalgExt::AttentionOp. (#16577) by Han-Chung Wang · 1 year, 2 months ago
- 6b995b9 [Codegen][ROCDL] Extend mfma pipeline to support a few more matmul variants (#16582) by Quinn Dawkins · 1 year, 2 months ago
- 09eaac0 [Flow] Loosen restrictions on body ops of dequant-like ops (#16449) by Max191 · 1 year, 2 months ago
- db677a8 [Codegen][ROCDL] Add support for nhwc convolution with mfma (#16579) by Quinn Dawkins · 1 year, 2 months ago
- 9dc8ae4 [cuda][hip] Fix launch host func and worker thread state update (#16568) by Lei Zhang · 1 year, 2 months ago
- baeffa7 [Codegen][GPU] Add pass to generalize named convolution ops (#16575) by Quinn Dawkins · 1 year, 2 months ago
- bb68472 Drop double print from translate executables pipeline failures. (#16576) by Scott Todd · 1 year, 2 months ago
- 66246a3 Fix Python build on Windows after Transform dialect API change. (#16574) by Scott Todd · 1 year, 2 months ago
- c730000 [ROCM] Use translation info to store waves-per-eu (#16573) by Quinn Dawkins · 1 year, 2 months ago
- 9be693e Run all CI jobs on LLVM integrate PRs. (#16492) by Scott Todd · 1 year, 2 months ago
- 7e1e7b0 Integrate llvm/llvm-project@c2042c3bc823 (#16567) by Quinn Dawkins · 1 year, 2 months ago
- 000a233 [Codegen] Register AMDGPU transform ops to transform interpretor (#16570) by Kunwar Grover · 1 year, 2 months ago
- 37d60f1 [hip] Enable stablehlo/tosa op e2e tests (#16466) by Lei Zhang · 1 year, 2 months ago
- 1489584 Adding iree-benchmark-executable help text for CUDA. by Ben Vanik · 1 year, 2 months ago
- 6596531 Reland "[LLVMGPU] Add basic lowering pipeline without tiling and distribution" (#16566) by Lei Zhang · 1 year, 2 months ago
- 862a031 Adding --task_abort_on_failure flag/API. (#16565) by Ben Vanik · 1 year, 2 months ago
- e692e65 [gpu] Retain lowering config when generalizing named ops (#16563) by Lei Zhang · 1 year, 2 months ago
- b4f31f8 Add `index` as a legal torch type. (#16560) by Stella Laurenzo · 1 year, 2 months ago
- b3419bf Revert "[LinalgExt] Do not decompose attention op with manual analysis." (#16559) by Han-Chung Wang · 1 year, 2 months ago
- 23f2828 Adding iree-benchmark-executable tool. (#16550) by Ben Vanik · 1 year, 2 months ago
- 4a613b6 NFC: Unify registration of Util interfaces on external ops (#16553) by Quinn Dawkins · 1 year, 2 months ago
- 0c8547e Making MaterializeInterfaces anchor on dispatch site device targets. (#16536) by Ben Vanik · 1 year, 2 months ago
- 0d5bf27 Refresh stale compiler/README.md. (#16555) by Scott Todd · 1 year, 2 months ago
- c087bfb Add support for using PDL to replicate the functionality in MLP sample that uses Transform dialect. (#16453) by MaheshRavishankar · 1 year, 2 months ago
- 11089e8 [Codegen] Add missing barrier to GPUVectorAlloc (#16551) by Quinn Dawkins · 1 year, 2 months ago
- 8ec45ca [rocdl] Fix mfma accumulator base vector type (#16549) by Lei Zhang · 1 year, 2 months ago
- c15b610 [EmitC] Remove the forked emitter and generate all the code in the conversion pass (#16357) by Simon Camphausen · 1 year, 2 months ago
- 1d7fb8e Fixing implicit double -> float conversion warnings. (#16547) by Ben Vanik · 1 year, 2 months ago
- e9e2d7d [CPU] i4 DT enablement post-commit feedback (#16545) by Diego Caballero · 1 year, 2 months ago
- cf3903c [rocdl] Add e2e matmul test for cdna3 matrix core (#16510) by Lei Zhang · 1 year, 2 months ago
- d500494 Add s8s4s32 dotprod microkernel (#16473) by mariecwhite · 1 year, 2 months ago
- fb1151a [Codegen][GPU] Add support for distributing broadcasts with nested layouts (#16532) by Quinn Dawkins · 1 year, 2 months ago
- 7b8fa75 [torch] Implement async and mutability programming model. (#16486) by Stella Laurenzo · 1 year, 2 months ago
- 4f6390c Revert "[LLVMGPU] Add basic lowering pipeline without tiling and distribution" (#16543) by jinchen · 1 year, 2 months ago
- 8b477dc [LinalgExt] Drop unused LinalgExt transform ops. (#16537) by Han-Chung Wang · 1 year, 2 months ago
- d4b6d74 [VectorExt] Add support for projecting nested layouts (#16528) by Quinn Dawkins · 1 year, 2 months ago
- 0353d12 [rocdl] Allow upcast accumulator to use matrix core (#16527) by Lei Zhang · 1 year, 2 months ago
- 7881ed9 [LLVMGPU] Add basic lowering pipeline without tiling and distribution (#16500) by jinchen · 1 year, 2 months ago
- 599a3d1 [CPU] Enable DT for [i8, i4 -> i32/f32] mmt4d (#16302) by Diego Caballero · 1 year, 2 months ago
- 4182c40 [rocdl] Use a full slice of vector values to avoid anchors (#16534) by Quinn Dawkins · 1 year, 2 months ago
- e9b3b33 [CPU][NFC] Fix check rules in materialize encoding test (#16530) by Diego Caballero · 1 year, 2 months ago
- 2fe2975 Collapse LinalgExt into the main source tree (#16407) by Han-Chung Wang · 1 year, 2 months ago
- 135e34f Moving MaterializeInterfaces' spooky action at a distance around a little. (#16521) by Ben Vanik · 1 year, 2 months ago
- 5572254 Upgrade Github runner to v2.313 (#16531) by Jerry Wu · 1 year, 2 months ago
- 5d8907e Improve controls inside iree_e2e_matmul_test (#16526) by Lei Zhang · 1 year, 2 months ago
- 946375c [LinalgExt] Do not decompose attention op with manual analysis. (#16525) by Han-Chung Wang · 1 year, 2 months ago
- 783adb5 [StableHLO] Add SelectOp to GenericTypeConvert (#16523) by Balaji V. Iyer · 1 year, 2 months ago
- 588f580 Adding `iree-hal-prune-executables` pass. (#16517) by Ben Vanik · 1 year, 2 months ago
- 37f92d2 [CPU] Register a bufferization pipeline. (#16524) by Han-Chung Wang · 1 year, 2 months ago
- 91a6bf7 [VectorExt] Add custom parser/printer to elide identity orderings (#16522) by Quinn Dawkins · 1 year, 2 months ago
- 566d6c2 [Preprocessing] Add pre-configured pass pipeline for conv transpose (#16520) by Quinn Dawkins · 1 year, 2 months ago
- 40b9a66 [GlobalOpt] Add raising pattern for float extensions into certain named op (#16512) by Quinn Dawkins · 1 year, 2 months ago
- 8b0651f [Codegen][GPU] Fix id calculation for nested layouts (#16516) by Quinn Dawkins · 1 year, 2 months ago
- 908ef84 [Preprocessing] Add a pass to convert convolutions to channels last (#16446) by Quinn Dawkins · 1 year, 2 months ago
- 560563f Update excluded test names for Windows failures. (#16514) by Scott Todd · 1 year, 2 months ago
- 8b2605e Add `GlobalLoopInvariantCodeMotionPass` to hoist constant `tensor.pack` from loops (#16362) by Jerry Wu · 1 year, 2 months ago
- a5aa1b4 Reworking HAL executable lookup/ordinal resolution. (#16508) by Ben Vanik · 1 year, 2 months ago
- 4237053 NFC: Make e2e matmul test names consistent (#16511) by Lei Zhang · 1 year, 2 months ago
- 4a80ee3 [LLVMGPU] Nuke logic that is trying to simplify thread id arithmetic (#16507) by Quinn Dawkins · 1 year, 2 months ago
- ede9135 [rocdl] Fix kernel config when there is no valid intrinsic (#16509) by Quinn Dawkins · 1 year, 2 months ago
- 4ee81fa [rocdl] Permute to create nested layout matching contraction (#16496) by Lei Zhang · 1 year, 2 months ago
- 5cdd6ec Limit release index API requests to the first 1000 releases. (#16506) by Scott Todd · 1 year, 2 months ago
- 884d2dc [HIP] Add device cast to fix build error (#16505) by Nithin Meganathan · 1 year, 2 months ago
- c3b3d96 Adding hal.device.id queries to HAL devices. (#16495) by Ben Vanik · 1 year, 2 months ago
- dbb43c8 [Codegen] Avoid setting anchors for reads used directly by contractions (#16499) by Quinn Dawkins · 1 year, 2 months ago
- c2afb6e [ROCM] Add supported intrinsics for gfx942 (#16498) by Quinn Dawkins · 1 year, 2 months ago
- fadc018 Removing the use of the legacy_sync hack from all but ROCM. (#16493) by Ben Vanik · 1 year, 2 months ago
- 5eebb91 Removing unused hal.descriptor_set_layout.lookup op. (#16494) by Ben Vanik · 1 year, 2 months ago
- 218a5e6 Added support for i4 Const-eval for Tensors (#16321) by Balaji V. Iyer · 1 year, 2 months ago