- 9b47492 [metal] Implement descriptor set and pipeline layout APIs by Lei Zhang · 2 years, 2 months ago
- 33df9dd [metal] Use Metal shared event to implement IREE semaphore APIs by Lei Zhang · 2 years, 2 months ago
- f9802f0 [metal] Wire up creating devices, allocators, and buffers by Lei Zhang · 2 years, 3 months ago
- 8a2330f [metal] Implement Metal allocator and buffer APIs by Lei Zhang · 2 years, 3 months ago
- 8f36024 [metal] Add build and registration for a Metal HAL driver by Lei Zhang · 2 years, 3 months ago
- bed2763 Clean up code of legacy benchmark suite (#14081) by Jerry Wu · 1 year, 10 months ago
- 85eb21b [cuda] Port over tracing utilities and use in NCCL channel (#14063) by Lei Zhang · 1 year, 10 months ago
- 85a1a56 [cuda] Port over native executable and its cache (#14062) by Lei Zhang · 1 year, 10 months ago
- 330771e [cuda] Port over channel implementation via NCCL (#14059) by Lei Zhang · 1 year, 10 months ago
- f4a0bdf Integrate llvm-project at 0258a53521cf and bump dependencies (#14065) by Diego Caballero · 1 year, 10 months ago
- 528337e [LLVMCPU] Make LLVMCPUTile prefer using the lowering_config of tiled op. (#13922) by Han-Chung Wang · 1 year, 10 months ago
- afdc6f6 Fixing dynamic complex number splats in stream encoding. (#14100) by Ben Vanik · 1 year, 10 months ago
- dd5aac1 Add dummy Linalg ops to SSVE tests (#14099) by Andrzej Warzyński · 1 year, 10 months ago
- 27448a8 [NFC] Omitting default N=4 from SmallVector and to_vector. (#13933) by Han-Chung Wang · 1 year, 10 months ago
- e1ab9aa [ci] Update fetch_cuda_toolkit.py to use 12.1.1 for releases (#14093) by Lei Zhang · 1 year, 10 months ago
- 074a12c [LLVMCPU] Make LLVMCPUTileAndFuse use consumer's config if possible. (#13920) by Han-Chung Wang · 1 year, 10 months ago
- 239e393 Check `nvidia-bleeding-edge` during image setup (#13961) by Jerry Wu · 1 year, 10 months ago
- 29e8ed0 Remove redundant `requires-gpu-nvidia` added in #14039 (#14076) by Geoffrey Martin-Noble · 1 year, 10 months ago
- bb7a83e Create empty files without unnecessary timestamp updates. (#14097) by bjacob · 1 year, 10 months ago
- 2bb9a54 Add `iree-hal-dump-executable-files-to` meta flag. (#14096) by Scott Todd · 1 year, 10 months ago
- bf09487 Stripping jax/tf/mhlo attributes from stablehlo inputs. (#14092) by Ben Vanik · 1 year, 10 months ago
- bd80c9a Re-fixing 6888e61c45d117f177e5c0050c83f11be9199c8f MSVC break. by Ben Vanik · 1 year, 10 months ago
- c1b41f2 [CI][StableHLO] Auto-advance stablehlo fork (#14087) by Jakub Kuderski · 1 year, 10 months ago
- 6888e61 Fixing MSVC build break from #13971. by Ben Vanik · 1 year, 10 months ago
- 14308b1 Cleaning up the iree/base/tracing.h header a bit. (#14089) by Ben Vanik · 1 year, 10 months ago
- 71fc1f9 Fixing comma-separated lists in iree-run-mlir `--Xcompiler,` args. (#14075) by Ben Vanik · 1 year, 10 months ago
- b128e63 Fixing empty trace replay yaml lists for args/results. (#14088) by Ben Vanik · 1 year, 10 months ago
- e0d36f9 Fuse iota ops with consumers always. (#14070) by MaheshRavishankar · 1 year, 10 months ago
- 389152b Update docs to remove iree folder that no longer exists (#13306) by Tori Baker · 1 year, 10 months ago
- c457c30 [GPUCheckResourceUsage] Don't choke on alloc of memref of index (#14001) by qcolombet · 1 year, 10 months ago
- f9d1525 Harden the MatmulTensorCore strategy (#13971) by Nicolas Vasilache · 1 year, 10 months ago
- 786ab7a Disable bf16-to-f32 promotion pass default (#14078) by Rob Suderman · 1 year, 10 months ago
- 75be6d8 Clone complex.* producers into dispatch regions (#14036) by Rob Suderman · 1 year, 10 months ago
- 106d68b Allowing complex types as dispatch operands. (#14037) by Ben Vanik · 1 year, 10 months ago
- 20186df Shortening TransformDialectStrategies to TransformStrategies. (#14073) by Ben Vanik · 1 year, 10 months ago
- 31c2d24 [cuda] Dump more synchronization related attributes (#14074) by Lei Zhang · 1 year, 10 months ago
- c104444 Fix the ignored list of markdownlint (#14064) by Jerry Wu · 1 year, 10 months ago
- 96d9213 Reword comments for `IREE_BUILD_DOCS`. (#14069) by Scott Todd · 1 year, 10 months ago
- 441ae46 [spirv] Dump spirv.module for Metal and WebGPU targets (#14066) by Lei Zhang · 1 year, 10 months ago
- e94415b [spirv] Dump spirv.module with dump-executable-intermediates-to= (#14058) by Lei Zhang · 1 year, 10 months ago
- 02cac7a [llvmgpu] Target CUDA sm_60 architecture by default (#14057) by Lei Zhang · 1 year, 10 months ago
- 0dbf086 Remove MHLO support (#14008) by Jakub Kuderski · 1 year, 10 months ago
- e24089d [WebGPU] Async, loop-based invoke and output. (#13962) by Scott Todd · 1 year, 10 months ago
- 45a3eb4 [cuda] Port over descriptor set and pipeline layout (#14038) by Lei Zhang · 1 year, 10 months ago
- 2dddf02 Correctly tag Vulkan Ampere tests as requiring sm80 (#14039) by Geoffrey Martin-Noble · 1 year, 10 months ago
- f71aebc [StableHLO] Make reduce lowering more robust (#14046) by Jakub Kuderski · 1 year, 10 months ago
- 23666a6 Add a new benchmark and document steps: Add a new unaligned matmul test that will exercise failsafes to avoid bad configurations (#14052) by Nicolas Vasilache · 1 year, 10 months ago
- 88920b4 Drop some patterns from IREE apply_patterns op (#14053) by Matthias Springer · 1 year, 10 months ago
- 8357ea4 Use upstreamed ApplyPatternsOp in buildCanonicalizationAndEnablingTransforms (#14026) by Matthias Springer · 1 year, 10 months ago
- 807da25 [ConvertToLLVM] Don't choke on alloc of memref of index (#14002) by qcolombet · 1 year, 10 months ago
- cc43680 Cleaning up the tracing.h mechanism to enable alternative implementations. (#14044) by Ben Vanik · 1 year, 11 months ago
- 651630e [StableHLO] Port reduce canon patterns (#14045) by Jakub Kuderski · 1 year, 11 months ago
- fa3a220 Fix bug in TopK pattern matching that sets K as 2nd dim rather than last. (#14041) by NatashaKnk · 1 year, 11 months ago
- c4e01e9 [cuda] Wire up basic creating devices, allocators, and buffers (#14011) by Lei Zhang · 1 year, 11 months ago
- 967ab3b Adding cuda.device :: compute_capability_major/minor queries. (#14033) by Ben Vanik · 1 year, 11 months ago
- 7514d3e Add pattern to map iota->sort->slice to topK (#13972) by NatashaKnk · 1 year, 11 months ago
- e4c27f5 Support non-default sets of enabled LLVM CPU targets. (#13983) by bjacob · 1 year, 11 months ago
- b3a1ac9 Bump LLVM to llvm/llvm-project@faae4d5d (#14023) by Matthias Springer · 1 year, 11 months ago
- 1b8e95c Allowing for sync allocations to be deallocated via queue-ordered deallocas. (#14029) by Ben Vanik · 1 year, 11 months ago
- 860026f Declare exported headers to Bazel to fix `bazel build --incompatible_no_implicit_file_export` (#13982) by Levon Ter-Grigoryan · 1 year, 11 months ago
- 5c0fc03 Adding a fallback for when CUDA memory pools are unsupported. (#14018) by Ben Vanik · 1 year, 11 months ago
- c941946 Exposing device configuration higher in the stack. (#14009) by Ben Vanik · 1 year, 11 months ago
- 3bd115a Refresh deployment-configuration website pages. (#14013) by Scott Todd · 1 year, 11 months ago
- c12bf50 Harden the PadOp matcher (#14024) by Nicolas Vasilache · 1 year, 11 months ago
- c620ab2 Clean up CapturingOpMatcher and derived classes, NFC (#14022) by Oleksandr "Alex" Zinenko · 1 year, 11 months ago
- f3d0369 Add codegen strategy for GPU padding (#14000) by Nicolas Vasilache · 1 year, 11 months ago
- 7f10fe2 Adding IREE_HAL_EXTERNAL_BUFFER_TYPE_DEVICE_ALLOCATION. (#14012) by Ben Vanik · 1 year, 11 months ago
- 0e03852 Test with ASAN in bytecode modules (#14005) by bjacob · 1 year, 11 months ago
- 6a87b03 Add tags to website pages. (#14006) by Scott Todd · 1 year, 11 months ago
- 38ae184 _MSC_VER comparison was the wrong way (#14007) by bjacob · 1 year, 11 months ago
- 975ba03 Adds support for mixed precision NVIDIA A100 Tensor Cores (F32 <= F16 * F16 + F32) (#13857) by Manish Gupta · 1 year, 11 months ago
- 6a61d9f [cuda] Dump whether the device has integrated memory (#13986) by Lei Zhang · 1 year, 11 months ago
- ef52dbf Restructure website sections and navigation. (#13991) by Scott Todd · 1 year, 11 months ago
- d9b9471 Retain the parent channel on a split in iree_hal_nccl_channel_t. (#13977) by Ben Vanik · 1 year, 11 months ago
- 8f9e962 Adding support for async memory pool allocations in the CUDA HAL. (#13440) by Ben Vanik · 1 year, 11 months ago
- aced620 Lowering flow.tensor.alloca (renamed) to stream.async.alloca. (#13998) by Ben Vanik · 1 year, 11 months ago
- 4e48c13 Moving builtins lower in the pipeline and adding option to force. (#13994) by Ben Vanik · 1 year, 11 months ago
- 1299742 [cuda] Port over allocator and buffer implementation (#13985) by Lei Zhang · 1 year, 11 months ago
- 0dc5de9 Route demotion flag to Input options (#13993) by Rob Suderman · 1 year, 11 months ago
- 377d27a Integrate llvm-project at https://github.com/llvm/llvm-project/commit/85b77b13e3bcccffeb84b09365e0ab96565467fa (#13975) by MaheshRavishankar · 1 year, 11 months ago
- bf3e1a2 Resetting collective batch when the CUDA command buffer arena is set. (#13978) by Ben Vanik · 1 year, 11 months ago
- 74f6a6a Remove redundant newlines in generated benchmark cmake files (#13979) by Jerry Wu · 1 year, 11 months ago
- 4c1ceb2 Make reduction/matmul/conv matchers optionally partial (#13981) by Oleksandr "Alex" Zinenko · 1 year, 11 months ago
- 4e9b3bd [LLVMCPU] Add pass to enable Armv9 Streaming SVE mode (#13558) by Cullen Rhodes · 1 year, 11 months ago
- 0b91c98 Re-enable bf16 native execution for the StableHLO Path (#13976) by Rob Suderman · 1 year, 11 months ago
- 815e843 Make IREEDialectsTransforms cmake target export includes (#13980) by Oleksandr "Alex" Zinenko · 1 year, 11 months ago
- daabcf7 Unbreak the `byo_llvm.sh` build with `iree_bitcode_library`. (#13968) by bjacob · 1 year, 11 months ago
- 21b41db Create `iree-benchmark-import-models-large` for large benchmarks (#13963) by Jerry Wu · 1 year, 11 months ago
- 5da8c71 Add `iree-stream-resource-alias-mutable-bindings` flag. (#13965) by Scott Todd · 1 year, 11 months ago
- 18a1bac Clean up build_linux_packages.sh to address comments (#13938) by powderluv · 1 year, 11 months ago
- 00dd8a3 Update build_tools/python_deploy/build_linux_packages.sh by powderluv · 1 year, 11 months ago
- 0f641e6 Add experimental WebGPU HAL backend. (#13952) by Scott Todd · 1 year, 11 months ago
- 6d60a12 Add WebGPU sample application and update other web demos. by Scott Todd · 1 year, 11 months ago
- e7c2cba Initial WebGPU HAL implementation. by Ben Vanik · 3 years, 9 months ago
- 6d6f54b Fix tests/transform_dialect/cuda/ dependencies (#13949) by Levon Ter-Grigoryan · 1 year, 11 months ago
- ebea998 [ci] Update CUDA toolkit to v12.1.1 (#13875) by Lei Zhang · 1 year, 11 months ago
- b5e45cd [ci] Update NVIDIA driver packages to v530 in docker images (#13912) by Lei Zhang · 1 year, 11 months ago
- 508247f Add CODEOWNERS for experimental directories (#13956) by Geoffrey Martin-Noble · 1 year, 11 months ago
- 3fa5ad3 [StableHLO] Port Philox rng (#13844) by jvstokes · 1 year, 11 months ago
- 686860c [cuda] Dump useful GPU characteristics (#13955) by Lei Zhang · 1 year, 11 months ago