  1. 45a3eb4 [cuda] Port over descriptor set and pipeline layout (#14038) by Lei Zhang · 1 year, 10 months ago
  2. 2dddf02 Correctly tag Vulkan Ampere tests as requiring sm80 (#14039) by Geoffrey Martin-Noble · 1 year, 10 months ago
  3. f71aebc [StableHLO] Make reduce lowering more robust (#14046) by Jakub Kuderski · 1 year, 10 months ago
  4. 23666a6 Add a new benchmark and document steps: Add a new unaligned matmul test that will exercise failsafes to avoid bad configurations (#14052) by Nicolas Vasilache · 1 year, 10 months ago
  5. 88920b4 Drop some patterns from IREE apply_patterns op (#14053) by Matthias Springer · 1 year, 10 months ago
  6. 8357ea4 Use upstreamed ApplyPatternsOp in buildCanonicalizationAndEnablingTransforms (#14026) by Matthias Springer · 1 year, 10 months ago
  7. 807da25 [ConvertToLLVM] Don't choke on alloc of memref of index (#14002) by qcolombet · 1 year, 10 months ago
  8. cc43680 Cleaning up the tracing.h mechanism to enable alternative implementations. (#14044) by Ben Vanik · 1 year, 10 months ago
  9. 651630e [StableHLO] Port reduce canon patterns (#14045) by Jakub Kuderski · 1 year, 10 months ago
  10. fa3a220 Fix bug in TopK pattern matching that sets K as 2nd dim rather than last. (#14041) by NatashaKnk · 1 year, 10 months ago
  11. c4e01e9 [cuda] Wire up basic creating devices, allocators, and buffers (#14011) by Lei Zhang · 1 year, 10 months ago
  12. 967ab3b Adding cuda.device :: compute_capability_major/minor queries. (#14033) by Ben Vanik · 1 year, 10 months ago
  13. 7514d3e Add pattern to map iota->sort->slice to topK (#13972) by NatashaKnk · 1 year, 10 months ago
  14. e4c27f5 Support non-default sets of enabled LLVM CPU targets. (#13983) by bjacob · 1 year, 10 months ago
  15. b3a1ac9 Bump LLVM to llvm/llvm-project@faae4d5d (#14023) by Matthias Springer · 1 year, 10 months ago
  16. 1b8e95c Allowing for sync allocations to be deallocated via queue-ordered deallocas. (#14029) by Ben Vanik · 1 year, 10 months ago
  17. 860026f Declare exported headers to Bazel to fix `bazel build --incompatible_no_implicit_file_export` (#13982) by Levon Ter-Grigoryan · 1 year, 10 months ago
  18. 5c0fc03 Adding a fallback for when CUDA memory pools are unsupported. (#14018) by Ben Vanik · 1 year, 10 months ago
  19. c941946 Exposing device configuration higher in the stack. (#14009) by Ben Vanik · 1 year, 10 months ago
  20. 3bd115a Refresh deployment-configuration website pages. (#14013) by Scott Todd · 1 year, 10 months ago
  21. c12bf50 Harden the PadOp matcher (#14024) by Nicolas Vasilache · 1 year, 10 months ago
  22. c620ab2 Clean up CapturingOpMatcher and derived classes, NFC (#14022) by Oleksandr "Alex" Zinenko · 1 year, 10 months ago
  23. f3d0369 Add codegen strategy for GPU padding (#14000) by Nicolas Vasilache · 1 year, 10 months ago
  24. 7f10fe2 Adding IREE_HAL_EXTERNAL_BUFFER_TYPE_DEVICE_ALLOCATION. (#14012) by Ben Vanik · 1 year, 10 months ago
  25. 0e03852 Test with ASAN in bytecode modules (#14005) by bjacob · 1 year, 10 months ago
  26. 6a87b03 Add tags to website pages. (#14006) by Scott Todd · 1 year, 10 months ago
  27. 38ae184 _MSC_VER comparison was the wrong way (#14007) by bjacob · 1 year, 10 months ago
  28. 975ba03 Adds support for mixed precision NVIDIA A100 Tensor Cores (F32 <= F16 * F16 + F32) (#13857) by Manish Gupta · 1 year, 10 months ago
  29. 6a61d9f [cuda] Dump whether the device has integrated memory (#13986) by Lei Zhang · 1 year, 10 months ago
  30. ef52dbf Restructure website sections and navigation. (#13991) by Scott Todd · 1 year, 10 months ago
  31. d9b9471 Retain the parent channel on a split in iree_hal_nccl_channel_t. (#13977) by Ben Vanik · 1 year, 10 months ago
  32. 8f9e962 Adding support for async memory pool allocations in the CUDA HAL. (#13440) by Ben Vanik · 1 year, 10 months ago
  33. aced620 Lowering flow.tensor.alloca (renamed) to stream.async.alloca. (#13998) by Ben Vanik · 1 year, 10 months ago
  34. 4e48c13 Moving builtins lower in the pipeline and adding option to force. (#13994) by Ben Vanik · 1 year, 10 months ago
  35. 1299742 [cuda] Port over allocator and buffer implementation (#13985) by Lei Zhang · 1 year, 10 months ago
  36. 0dc5de9 Route demotion flag to Input options (#13993) by Rob Suderman · 1 year, 10 months ago
  37. 377d27a Integrate llvm-project at https://github.com/llvm/llvm-project/commit/85b77b13e3bcccffeb84b09365e0ab96565467fa (#13975) by MaheshRavishankar · 1 year, 10 months ago
  38. bf3e1a2 Resetting collective batch when the CUDA command buffer arena is set. (#13978) by Ben Vanik · 1 year, 10 months ago
  39. 74f6a6a Remove redundant newlines in generated benchmark cmake files (#13979) by Jerry Wu · 1 year, 10 months ago
  40. 4c1ceb2 Make reduction/matmul/conv matchers optionally partial (#13981) by Oleksandr "Alex" Zinenko · 1 year, 10 months ago
  41. 4e9b3bd [LLVMCPU] Add pass to enable Armv9 Streaming SVE mode (#13558) by Cullen Rhodes · 1 year, 10 months ago
  42. 0b91c98 Re-enable bf16 native execution for the StableHLO Path (#13976) by Rob Suderman · 1 year, 10 months ago
  43. 815e843 Make IREEDialectsTransforms cmake target export includes (#13980) by Oleksandr "Alex" Zinenko · 1 year, 10 months ago
  44. daabcf7 Unbreak the `byo_llvm.sh` build with `iree_bitcode_library`. (#13968) by bjacob · 1 year, 10 months ago
  45. 21b41db Create `iree-benchmark-import-models-large` for large benchmarks (#13963) by Jerry Wu · 1 year, 10 months ago
  46. 5da8c71 Add `iree-stream-resource-alias-mutable-bindings` flag. (#13965) by Scott Todd · 1 year, 10 months ago
  47. 18a1bac Clean up build_linux_packages.sh to address comments (#13938) by powderluv · 1 year, 10 months ago
  48. 00dd8a3 Update build_tools/python_deploy/build_linux_packages.sh by powderluv · 1 year, 10 months ago
  49. 0f641e6 Add experimental WebGPU HAL backend. (#13952) by Scott Todd · 1 year, 10 months ago
  50. 6d60a12 Add WebGPU sample application and update other web demos. by Scott Todd · 1 year, 10 months ago
  51. e7c2cba Initial WebGPU HAL implementation. by Ben Vanik · 3 years, 9 months ago
  52. 6d6f54b Fix tests/transform_dialect/cuda/ dependencies (#13949) by Levon Ter-Grigoryan · 1 year, 10 months ago
  53. ebea998 [ci] Update CUDA toolkit to v12.1.1 (#13875) by Lei Zhang · 1 year, 10 months ago
  54. b5e45cd [ci] Update NVIDIA driver packages to v530 in docker images (#13912) by Lei Zhang · 1 year, 10 months ago
  55. 508247f Add CODEOWNERS for experimental directories (#13956) by Geoffrey Martin-Noble · 1 year, 10 months ago
  56. 3fa5ad3 [StableHLO] Port Philox rng (#13844) by jvstokes · 1 year, 10 months ago
  57. 686860c [cuda] Dump useful GPU characteristics (#13955) by Lei Zhang · 1 year, 10 months ago
  58. 6ab4570 [cuda] NFC: Split files for CUDA and NCCL dynamic symbols (#13954) by Lei Zhang · 1 year, 10 months ago
  59. ad321b6 Rollup of HAL/runtime/infra changes for WebGPU HAL. (#13953) by Scott Todd · 1 year, 10 months ago
  60. 5c38bcc [cuda] Implement basics for a CUDA HAL driver rewrite (#13942) by Lei Zhang · 1 year, 10 months ago
  61. 2544efe Rename long-running to large in benchmark suite and workflows (#13914) by Jerry Wu · 1 year, 10 months ago
  62. fc93e91 Add LinalgExt TypePropagation pattern that handles i1 inputs/outputs (#13936) by NatashaKnk · 1 year, 10 months ago
  63. 571f28a Adding task system utilization tracing. (#13941) by Ben Vanik · 1 year, 10 months ago
  64. df71589 [StableHLO] Migrate samples to StableHLO (#13916) by Jakub Kuderski · 1 year, 10 months ago
  65. 4a7980f Lower stablehlo.custom_call @TopK to chlo.top_k (#13937) by Rob Suderman · 1 year, 10 months ago
  66. 37edc2f Let `iree_add_all_subdirs` be a macro (#13948) by bjacob · 1 year, 10 months ago
  67. 5efe2d7 build cleanups (#13947) by bjacob · 1 year, 10 months ago
  68. a1125b8 Add a simple GPU barrier removal transform op (#13886) by Oleksandr "Alex" Zinenko · 1 year, 10 months ago
  69. 7400c85 Remove accidentally added TEST_FILE (#13940) by Jerry Wu · 1 year, 10 months ago
  70. 60b623a Improving iree_arena_t/iree_resource_set_t ASAN debugging. (#13939) by Ben Vanik · 1 year, 10 months ago
  71. 82577da Clean up build_linux_packages.sh to address comments by Anush Elangovan · 1 year, 10 months ago
  72. 3eeb659 Bring your own bitcode (#13930) by bjacob · 1 year, 10 months ago
  73. d01e2d1 Adding constant storage size estimate to stream statistics. (#13885) by Ben Vanik · 1 year, 10 months ago
  74. e791af1 Add AArch64 builds and move to manylinux_2_28 (#13831) by powderluv · 1 year, 10 months ago
  75. f1a356e Remove compiler/Codegen/Sandbox from CODEOWNERS (#13935) by Han-Chung Wang · 1 year, 10 months ago
  76. 5b0c36e [NFC] Switch to_vector(map_range(...)) to map_to_vector (#13931) by Han-Chung Wang · 1 year, 10 months ago
  77. efb045e [StableHLO] Migrate e2e models to StableHLO (#13929) by Jakub Kuderski · 1 year, 10 months ago
  78. f2bf82d Integrate llvm-project at dc63b35b0223 (#13919) by Han-Chung Wang · 1 year, 10 months ago
  79. 224caae [StableHLO] Migrate regression tests and microbenchmarks to StableHLO (#13928) by Jakub Kuderski · 1 year, 10 months ago
  80. 7f40b0e Update nvidia driver on host image to 530 (#13918) by Jerry Wu · 1 year, 10 months ago
  81. 812eac9 [StableHLO] Migrate vulkan-specific tests to StableHLO (#13925) by Jakub Kuderski · 1 year, 10 months ago
  82. d30cc47 legalize ui32 for collective ops (#13911) by Okwan Kwon · 1 year, 10 months ago
  83. e23561d Improvements to target CPU features variants in e2e tests (#13915) by bjacob · 1 year, 10 months ago
  84. 1038648 [TransformStrategies] Add support for aligned and partially aligned matmul (#13541) by Quinn Dawkins · 1 year, 10 months ago
  85. 7b1451b Generate model definitions from batch sizes (#13879) by Jerry Wu · 1 year, 10 months ago
  86. f84d8a8 Add --iree-codegen-linalg-max-constant-fold-elements= flag. (#13909) by Ben Vanik · 1 year, 10 months ago
  87. cd958f2 [ci] Update base docker image to use Ubuntu 20.04 (#13907) by Lei Zhang · 1 year, 10 months ago
  88. f4f8e91 CMake simplifications around `iree_check_test` and `iree_trace_runner_test` (#13889) by bjacob · 1 year, 10 months ago
  89. c1123a5 Integrate llvm-project at 217709cbae34 (#13903) by Jerry Wu · 1 year, 10 months ago
  90. dc9bd09 Loosen widen to work between differing types (int vs fp) (#13900) by Rob Suderman · 1 year, 10 months ago
  91. c91f2bf Add *-long preset options to benchmark-extra PR trailer (#13884) by Jerry Wu · 1 year, 10 months ago
  92. 1ca3171 Fix mhlo.all_reduce and stablehlo.reduce for uint (#13899) by Rob Suderman · 1 year, 10 months ago
  93. 4a9e22f [StableHLO] Add pass to convert from MHLO to StableHLO (#13896) by Jakub Kuderski · 1 year, 10 months ago
  94. 9043e05 Fix windows e2e tests (#13895) by bjacob · 1 year, 10 months ago
  95. 412b1df Add auto input pipeline conversion detection to python bindings (#13892) by Jakub Kuderski · 1 year, 10 months ago
  96. dae6bac Constant propagate extf involving vector<bf16> post arith conversion (#13772) by Rob Suderman · 1 year, 10 months ago
  97. 50b55e3 [gpu] Enable fusing input producers after tiling reduction loops (#13806) by Lei Zhang · 1 year, 10 months ago
  98. 0d81062 Fixes and simplifications to CPU-data handling (#13881) by bjacob · 1 year, 10 months ago
  99. 9cf4f91 Integrate llvm-project at b9e328fd9113 (#13883) by Han-Chung Wang · 1 year, 10 months ago
  100. 6dd687d Cleanup workarounds for Python < 3.8 (#13882) by Jerry Wu · 1 year, 10 months ago