- 45a3eb4 [cuda] Port over descriptor set and pipeline layout (#14038) by Lei Zhang · 1 year, 10 months ago
- 2dddf02 Correctly tag Vulkan Ampere tests as requiring sm80 (#14039) by Geoffrey Martin-Noble · 1 year, 10 months ago
- f71aebc [StableHLO] Make reduce lowering more robust (#14046) by Jakub Kuderski · 1 year, 10 months ago
- 23666a6 Add a new benchmark and document steps: Add a new unaligned matmul test that will exercise failsafes to avoid bad configurations (#14052) by Nicolas Vasilache · 1 year, 10 months ago
- 88920b4 Drop some patterns from IREE apply_patterns op (#14053) by Matthias Springer · 1 year, 10 months ago
- 8357ea4 Use upstreamed ApplyPatternsOp in buildCanonicalizationAndEnablingTransforms (#14026) by Matthias Springer · 1 year, 10 months ago
- 807da25 [ConvertToLLVM] Don't choke on alloc of memref of index (#14002) by qcolombet · 1 year, 10 months ago
- cc43680 Cleaning up the tracing.h mechanism to enable alternative implementations. (#14044) by Ben Vanik · 1 year, 10 months ago
- 651630e [StableHLO] Port reduce canon patterns (#14045) by Jakub Kuderski · 1 year, 10 months ago
- fa3a220 Fix bug in TopK pattern matching that sets K as 2nd dim rather than last. (#14041) by NatashaKnk · 1 year, 10 months ago
- c4e01e9 [cuda] Wire up basic creating devices, allocators, and buffers (#14011) by Lei Zhang · 1 year, 10 months ago
- 967ab3b Adding cuda.device :: compute_capability_major/minor queries. (#14033) by Ben Vanik · 1 year, 10 months ago
- 7514d3e Add pattern to map iota->sort->slice to topK (#13972) by NatashaKnk · 1 year, 10 months ago
- e4c27f5 Support non-default sets of enabled LLVM CPU targets. (#13983) by bjacob · 1 year, 10 months ago
- b3a1ac9 Bump LLVM to llvm/llvm-project@faae4d5d (#14023) by Matthias Springer · 1 year, 10 months ago
- 1b8e95c Allowing for sync allocations to be deallocated via queue-ordered deallocas. (#14029) by Ben Vanik · 1 year, 10 months ago
- 860026f Declare exported headers to Bazel to fix `bazel build --incompatible_no_implicit_file_export` (#13982) by Levon Ter-Grigoryan · 1 year, 10 months ago
- 5c0fc03 Adding a fallback for when CUDA memory pools are unsupported. (#14018) by Ben Vanik · 1 year, 10 months ago
- c941946 Exposing device configuration higher in the stack. (#14009) by Ben Vanik · 1 year, 10 months ago
- 3bd115a Refresh deployment-configuration website pages. (#14013) by Scott Todd · 1 year, 10 months ago
- c12bf50 Harden the PadOp matcher (#14024) by Nicolas Vasilache · 1 year, 10 months ago
- c620ab2 Clean up CapturingOpMatcher and derived classes, NFC (#14022) by Oleksandr "Alex" Zinenko · 1 year, 10 months ago
- f3d0369 Add codegen strategy for GPU padding (#14000) by Nicolas Vasilache · 1 year, 10 months ago
- 7f10fe2 Adding IREE_HAL_EXTERNAL_BUFFER_TYPE_DEVICE_ALLOCATION. (#14012) by Ben Vanik · 1 year, 10 months ago
- 0e03852 Test with ASAN in bytecode modules (#14005) by bjacob · 1 year, 10 months ago
- 6a87b03 Add tags to website pages. (#14006) by Scott Todd · 1 year, 10 months ago
- 38ae184 _MSC_VER comparison was the wrong way (#14007) by bjacob · 1 year, 10 months ago
- 975ba03 Adds support for mixed precision NVIDIA A100 Tensor Cores (F32 <= F16 * F16 + F32) (#13857) by Manish Gupta · 1 year, 10 months ago
- 6a61d9f [cuda] Dump whether the device has integrated memory (#13986) by Lei Zhang · 1 year, 10 months ago
- ef52dbf Restructure website sections and navigation. (#13991) by Scott Todd · 1 year, 10 months ago
- d9b9471 Retain the parent channel on a split in iree_hal_nccl_channel_t. (#13977) by Ben Vanik · 1 year, 10 months ago
- 8f9e962 Adding support for async memory pool allocations in the CUDA HAL. (#13440) by Ben Vanik · 1 year, 10 months ago
- aced620 Lowering flow.tensor.alloca (renamed) to stream.async.alloca. (#13998) by Ben Vanik · 1 year, 10 months ago
- 4e48c13 Moving builtins lower in the pipeline and adding option to force. (#13994) by Ben Vanik · 1 year, 10 months ago
- 1299742 [cuda] Port over allocator and buffer implementation (#13985) by Lei Zhang · 1 year, 10 months ago
- 0dc5de9 Route demotion flag to Input options (#13993) by Rob Suderman · 1 year, 10 months ago
- 377d27a Integrate llvm-project at https://github.com/llvm/llvm-project/commit/85b77b13e3bcccffeb84b09365e0ab96565467fa (#13975) by MaheshRavishankar · 1 year, 10 months ago
- bf3e1a2 Resetting collective batch when the CUDA command buffer arena is set. (#13978) by Ben Vanik · 1 year, 10 months ago
- 74f6a6a Remove redundant newlines in generated benchmark cmake files (#13979) by Jerry Wu · 1 year, 10 months ago
- 4c1ceb2 Make reduction/matmul/conv matchers optionally partial (#13981) by Oleksandr "Alex" Zinenko · 1 year, 10 months ago
- 4e9b3bd [LLVMCPU] Add pass to enable Armv9 Streaming SVE mode (#13558) by Cullen Rhodes · 1 year, 10 months ago
- 0b91c98 Re-enable bf16 native execution for the StableHLO Path (#13976) by Rob Suderman · 1 year, 10 months ago
- 815e843 Make IREEDialectsTransforms cmake target export includes (#13980) by Oleksandr "Alex" Zinenko · 1 year, 10 months ago
- daabcf7 Unbreak the `byo_llvm.sh` build with `iree_bitcode_library`. (#13968) by bjacob · 1 year, 10 months ago
- 21b41db Create `iree-benchmark-import-models-large` for large benchmarks (#13963) by Jerry Wu · 1 year, 10 months ago
- 5da8c71 Add `iree-stream-resource-alias-mutable-bindings` flag. (#13965) by Scott Todd · 1 year, 10 months ago
- 18a1bac Clean up build_linux_packages.sh to address comments (#13938) by powderluv · 1 year, 10 months ago
- 00dd8a3 Update build_tools/python_deploy/build_linux_packages.sh by powderluv · 1 year, 10 months ago
- 0f641e6 Add experimental WebGPU HAL backend. (#13952) by Scott Todd · 1 year, 10 months ago
- 6d60a12 Add WebGPU sample application and update other web demos. by Scott Todd · 1 year, 10 months ago
- e7c2cba Initial WebGPU HAL implementation. by Ben Vanik · 3 years, 9 months ago
- 6d6f54b Fix tests/transform_dialect/cuda/ dependencies (#13949) by Levon Ter-Grigoryan · 1 year, 10 months ago
- ebea998 [ci] Update CUDA toolkit to v12.1.1 (#13875) by Lei Zhang · 1 year, 10 months ago
- b5e45cd [ci] Update NVIDIA driver packages to v530 in docker images (#13912) by Lei Zhang · 1 year, 10 months ago
- 508247f Add CODEOWNERS for experimental directories (#13956) by Geoffrey Martin-Noble · 1 year, 10 months ago
- 3fa5ad3 [StableHLO] Port Philox rng (#13844) by jvstokes · 1 year, 10 months ago
- 686860c [cuda] Dump useful GPU characteristics (#13955) by Lei Zhang · 1 year, 10 months ago
- 6ab4570 [cuda] NFC: Split files for CUDA and NCCL dynamic symbols (#13954) by Lei Zhang · 1 year, 10 months ago
- ad321b6 Rollup of HAL/runtime/infra changes for WebGPU HAL. (#13953) by Scott Todd · 1 year, 10 months ago
- 5c38bcc [cuda] Implement basics for a CUDA HAL driver rewrite (#13942) by Lei Zhang · 1 year, 10 months ago
- 2544efe Rename long-running to large in benchmark suite and workflows (#13914) by Jerry Wu · 1 year, 10 months ago
- fc93e91 Add LinalgExt TypePropagation pattern that handles i1 inputs/outputs (#13936) by NatashaKnk · 1 year, 10 months ago
- 571f28a Adding task system utilization tracing. (#13941) by Ben Vanik · 1 year, 10 months ago
- df71589 [StableHLO] Migrate samples to StableHLO (#13916) by Jakub Kuderski · 1 year, 10 months ago
- 4a7980f Lower stablehlo.custom_call @TopK to chlo.top_k (#13937) by Rob Suderman · 1 year, 10 months ago
- 37edc2f Let `iree_add_all_subdirs` be a macro (#13948) by bjacob · 1 year, 10 months ago
- 5efe2d7 build cleanups (#13947) by bjacob · 1 year, 10 months ago
- a1125b8 Add a simple GPU barrier removal transform op (#13886) by Oleksandr "Alex" Zinenko · 1 year, 10 months ago
- 7400c85 Remove accidentally added TEST_FILE (#13940) by Jerry Wu · 1 year, 10 months ago
- 60b623a Improving iree_arena_t/iree_resource_set_t ASAN debugging. (#13939) by Ben Vanik · 1 year, 10 months ago
- 82577da Clean up build_linux_packages.sh to address comments by Anush Elangovan · 1 year, 10 months ago
- 3eeb659 Bring your own bitcode (#13930) by bjacob · 1 year, 10 months ago
- d01e2d1 Adding constant storage size estimate to stream statistics. (#13885) by Ben Vanik · 1 year, 10 months ago
- e791af1 Add AArch64 builds and move to manylinux_2_28 (#13831) by powderluv · 1 year, 10 months ago
- f1a356e Remove compiler/Codegen/Sandbox from CODEOWNERS (#13935) by Han-Chung Wang · 1 year, 10 months ago
- 5b0c36e [NFC] Switch to_vector(map_range(...)) to map_to_vector (#13931) by Han-Chung Wang · 1 year, 10 months ago
- efb045e [StableHLO] Migrate e2e models to StableHLO (#13929) by Jakub Kuderski · 1 year, 10 months ago
- f2bf82d Integrate llvm-project at dc63b35b0223 (#13919) by Han-Chung Wang · 1 year, 10 months ago
- 224caae [StableHLO] Migrate regression tests and microbenchmarks to StableHLO (#13928) by Jakub Kuderski · 1 year, 10 months ago
- 7f40b0e Update nvidia driver on host image to 530 (#13918) by Jerry Wu · 1 year, 10 months ago
- 812eac9 [StableHLO] Migrate vulkan-specific tests to StableHLO (#13925) by Jakub Kuderski · 1 year, 10 months ago
- d30cc47 legalize ui32 for collective ops (#13911) by Okwan Kwon · 1 year, 10 months ago
- e23561d Improvements to target CPU features variants in e2e tests (#13915) by bjacob · 1 year, 10 months ago
- 1038648 [TransformStrategies] Add support for aligned and partially aligned matmul (#13541) by Quinn Dawkins · 1 year, 10 months ago
- 7b1451b Generate model definitions from batch sizes (#13879) by Jerry Wu · 1 year, 10 months ago
- f84d8a8 Add --iree-codegen-linalg-max-constant-fold-elements= flag. (#13909) by Ben Vanik · 1 year, 10 months ago
- cd958f2 [ci] Update base docker image to use Ubuntu 20.04 (#13907) by Lei Zhang · 1 year, 10 months ago
- f4f8e91 CMake simplifications around `iree_check_test` and `iree_trace_runner_test` (#13889) by bjacob · 1 year, 10 months ago
- c1123a5 Integrate llvm-project at 217709cbae34 (#13903) by Jerry Wu · 1 year, 10 months ago
- dc9bd09 Loosen widen to work between differing types (int vs fp) (#13900) by Rob Suderman · 1 year, 10 months ago
- c91f2bf Add *-long preset options to benchmark-extra PR trailer (#13884) by Jerry Wu · 1 year, 10 months ago
- 1ca3171 Fix mhlo.all_reduce and stablehlo.reduce for uint (#13899) by Rob Suderman · 1 year, 10 months ago
- 4a9e22f [StableHLO] Add pass to convert from MHLO to StableHLO (#13896) by Jakub Kuderski · 1 year, 10 months ago
- 9043e05 Fix windows e2e tests (#13895) by bjacob · 1 year, 10 months ago
- 412b1df Add auto input pipeline conversion detection to python bindings (#13892) by Jakub Kuderski · 1 year, 10 months ago
- dae6bac Constant propagate extf involving vector<bf16> post arith conversion (#13772) by Rob Suderman · 1 year, 10 months ago
- 50b55e3 [gpu] Enable fusing input producers after tiling reduction loops (#13806) by Lei Zhang · 1 year, 10 months ago
- 0d81062 Fixes and simplifications to CPU-data handling (#13881) by bjacob · 1 year, 10 months ago
- 9cf4f91 Integrate llvm-project at b9e328fd9113 (#13883) by Han-Chung Wang · 1 year, 10 months ago
- 6dd687d Cleanup workarounds for Python < 3.8 (#13882) by Jerry Wu · 1 year, 10 months ago