  1. d71c147 Refresh website branding. (#16151) by Scott Todd · 1 year, 2 months ago
  2. c859e29 Fix web and Colab sample CI builds. (#16155) by Scott Todd · 1 year, 2 months ago
  3. 51c30ab e2e microkernel pipeline + argmax ukernel on ROCM backend. (#15943) by Stanley Winata · 1 year, 2 months ago
  4. ddccda0 [HIP] Add macro for HIP build deps update (#16123) by Nithin Meganathan · 1 year, 2 months ago
  5. 171e31c [cuda] Move to hal/drivers and wire up BUILD files (#14620) by Lei Zhang · 1 year, 3 months ago
  6. 74d1f01 [cuda] Break cyclic retain between device and device event pool (#16088) by Lei Zhang · 1 year, 3 months ago
  7. 381a16c [cuda] Fix deadlock when advancing deferred queue in driver thread (#15673) by Lei Zhang · 1 year, 3 months ago
  8. 182a8f3 [HIP] Adds graph command buffer & descriptor set and pipeline layout (#15910) by Nithin Meganathan · 1 year, 3 months ago
  9. b92ceb4 Remove SYSTEM scope from transitive includes. (#16018) by Stella Laurenzo · 1 year, 3 months ago
  10. f97aa4d [HIP] Adds support for native executable and cache (#15937) by Nithin Meganathan · 1 year, 3 months ago
  11. b5f1a83 Fix experimental/web[gpu] builds after HAL changes. (#15925) by Scott Todd · 1 year, 3 months ago
  12. f81f361 Removing transfer_range from the HAL device vtable. (#15919) by Ben Vanik · 1 year, 3 months ago
  13. 6e811ff [HIP] Adds support for creating devices (#15887) by Nithin Meganathan · 1 year, 3 months ago
  14. 62c4f98 Replacing cpuinfo on Mac and adding support for E/P cores. (#15891) by Ben Vanik · 1 year, 3 months ago
  15. 5744284 [cuda] Remove redundant memory initialization during device creation (#15899) by Nithin Meganathan · 1 year, 3 months ago
  16. bb91946 [HIP] Adds buffer and allocator implementations (#15791) by Nithin Meganathan · 1 year, 4 months ago
  17. 38878ee Relax NCCL version constraints (#14633) by Boian Petkantchin · 1 year, 4 months ago
  18. ef51eb6 [HIP] Adds basics to implement HIP HAL driver (#15506) by Nithin Meganathan · 1 year, 4 months ago
  19. e799ae9 Adjust pkgci logging. (#15609) by Scott Todd · 1 year, 4 months ago
  20. 2eda767 Migrate tests and benchmarks from `--iree-llvmcpu-enable-microkernels` to `--iree-llvmcpu-enable-ukernels` (#15584) by bjacob · 1 year, 4 months ago
  21. 643b467 [cuda] Add command-line option to drop legacy sync mode (#15582) by Lei Zhang · 1 year, 4 months ago
  22. 522fac0 [cuda] Avoid sorting descriptors in stream command buffer (#15437) by Lei Zhang · 1 year, 4 months ago
  23. f2e3260 Create oneshot stream command buffer in pending_queue_actions by Lei Zhang · 1 year, 5 months ago
  24. 618c835 [cuda] Port over CUDA stream-based command buffer impl by Lei Zhang · 1 year, 5 months ago
  25. e3671a5 NFC: Rename cuda to cuda2 by Lei Zhang · 1 year, 5 months ago
  26. a02ff0e NFC: Copy over existing stream command buffer impl by Lei Zhang · 1 year, 5 months ago
  27. 573f5e9 Merge docs/developers into docs/website/. (#15396) by Scott Todd · 1 year, 5 months ago
  28. 6bbdb72 [cuda] Mark event related APIs as unimplemented (#15382) by Lei Zhang · 1 year, 5 months ago
  29. fd9cd2f Fix some minspec/optional feature bitrot. (#15378) by Stella Laurenzo · 1 year, 5 months ago
  30. 5223596 [cuda] Support building node DAG in graph command buffer (#14857) by Eugene Zhulenev · 1 year, 5 months ago
  31. 2706526 [ROCM] add device path and use it to setup device (#15234) by nirvedhmeshram · 1 year, 5 months ago
  32. 7b92a6d [cuda] Avoid sorting when composing kernel arguments (#15325) by Lei Zhang · 1 year, 5 months ago
  33. 8c34b97 Use custom iree.dev domain in links to documentation site. (#15036) by Scott Todd · 1 year, 5 months ago
  34. 63381a8 Switching external resources to be device-local only. (#14016) by Ben Vanik · 1 year, 5 months ago
  35. 82611a9 Making execution region results queue-ordered allocas. (#15149) by Ben Vanik · 1 year, 5 months ago
  36. cda49ca [rocm] Print GPU information when dumping device info (#15230) by Lei Zhang · 1 year, 5 months ago
  37. ebdb098 [experimental][regression] Add ROCM Regression test. (#14861) by Stanley Winata · 1 year, 5 months ago
  38. e023ea7 [rocm] Bundle HIP headers into a submodule and use that by default. (#15186) by Stella Laurenzo · 1 year, 5 months ago
  39. dd26475 [PkgCI] Add recipe for correctness on CPU (#15131) by Kunwar Grover · 1 year, 6 months ago
  40. 9e9aff0 [PkgCI] Add llama2_7b_i4 recipe for correctness testing on cuda (#15113) by Kunwar Grover · 1 year, 6 months ago
  41. c64b31f [PkgCI] Add tqdm bar while downloading artifacts (#15112) by Kunwar Grover · 1 year, 6 months ago
  42. 98d4f18 [PkgCI] Add llama2 recipe for NVIDIA A100 (#15093) by Kunwar Grover · 1 year, 6 months ago
  43. 3add457 Adding iree_io_file_handle_t placeholder. (#15101) by Ben Vanik · 1 year, 6 months ago
  44. ad64ecc [experimental][ROCM] Add shared memory support on ROCM RT and Target. (#15097) by Stanley Winata · 1 year, 6 months ago
  45. 1a63564 Refactor IREECodegenAttrs to use typed array parameters (#15032) by Benjamin Maxwell · 1 year, 6 months ago
  46. 04259d0 Integrate llvm-project at f66cd9e9556a53142a26a5c21a72e21f1579217c (#14980) by Stella Laurenzo · 1 year, 6 months ago
  47. dc503e2 Experimental distributed Python API (#14641) by Boian Petkantchin · 1 year, 7 months ago
  48. de4f4e5 update llama artifact (#14915) by Daniel Garvey · 1 year, 7 months ago
  49. 647a52e [cuda] Fix event_pool reference counting (#14900) by Eugene Zhulenev · 1 year, 7 months ago
  50. babd4d9 [cuda] Optimize device signal to host wait synchronization (#14876) by Lei Zhang · 1 year, 7 months ago
  51. 6e04816 [cuda] Fix segfault caused by CUevent outliving CUdevice (#14875) by Lei Zhang · 1 year, 7 months ago
  52. 15406a4 [experimental][rocm] Added tracing for rocm backend. (#14852) by Stanley Winata · 1 year, 7 months ago
  53. 585e5ca Move demotion passes to GlobalOptimization. (#14815) by Stella Laurenzo · 1 year, 7 months ago
  54. 82be925 Adding iree_hal_device_profiling_flush. (#14829) by Ben Vanik · 1 year, 7 months ago
  55. 85a5425 [cuda2] Fix device wait even leak in event semaphore (#14825) by Eugene Zhulenev · 1 year, 7 months ago
  56. eba7eac [cuda] Fix include to include the header file (#14813) by Lei Zhang · 1 year, 7 months ago
  57. 7d5f1d5 [cuda] Initialize resource immediately after allocation (#14812) by Lei Zhang · 1 year, 7 months ago
  58. 10f9e61 [cuda] Remove if ok status check nesting when possible (#14811) by Lei Zhang · 1 year, 7 months ago
  59. b76b6df Initial commit of package based CI and regression testing. (#14793) by Stella Laurenzo · 1 year, 7 months ago
  60. d016fac Cleanup references to Buildkite. (#14748) by Scott Todd · 1 year, 7 months ago
  61. f726c87 Build sample_webgpu in build_test_test_samples. (#14690) by Scott Todd · 1 year, 7 months ago
  62. 0056918 [ROCm] add CMake options to specify additional compile arguments for conformance tests by Boian Petkantchin · 1 year, 7 months ago
  63. 83e9839 [ROCm] Fix creation/destruction of HAL executable by Boian Petkantchin · 1 year, 7 months ago
  64. 1ad7390 [ROCm] fix driver name in conformance tests by Boian Petkantchin · 1 year, 8 months ago
  65. 5b72a02 Fixing Metal/ROCM buffer mapping validation. (#14699) by Ben Vanik · 1 year, 7 months ago
  66. 42b983c Removing initial_data from iree_hal_allocator_allocate_buffer. (#14674) by Ben Vanik · 1 year, 7 months ago
  67. f022d29 Reworking constant upload with a HAL file API. (#14665) by Ben Vanik · 1 year, 7 months ago
  68. e44ba91 A few fixes for compiling experimental/cuda2. (#14667) by Scott Todd · 1 year, 7 months ago
  69. 038d2ba Work around scalar support bug in webgpu compilation. (#14629) by Scott Todd · 1 year, 8 months ago
  70. d1d03cb [metal] Move to hal/drivers and default build for Apple silicon (#14129) by Lei Zhang · 1 year, 8 months ago
  71. b41df2f [metal] Retain iree_hal_device_t in command buffers (#14588) by Lei Zhang · 1 year, 8 months ago
  72. 0c58388 [cuda] Use proper symbols to fix compilation errors (#14523) by Lei Zhang · 1 year, 8 months ago
  73. 80d06c1 [CUDA] Export external CUDA buffers to external device allocation (#14491) by Eugene Zhulenev · 1 year, 8 months ago
  74. 8fe8689 [cuda] NFC: Remove no-op semaphore implementation (#14484) by Lei Zhang · 1 year, 8 months ago
  75. 1bb26cb [cuda] Implement HAL semaphore using CUevent objects (#14426) by Lei Zhang · 1 year, 8 months ago
  76. 280b14b Remove explicit Bazel public visibility from targets. (#14419) by Scott Todd · 1 year, 8 months ago
  77. dad3f33 [cuda] Enable various HAL CTS and e2e single-op tests (#14327) by Lei Zhang · 1 year, 9 months ago
  78. 69a5481 [cuda] Port over existing graph command buffer impl (#14326) by Lei Zhang · 1 year, 9 months ago
  79. 04beef2 [cuda] Port over existing semaphore impl (#14325) by Lei Zhang · 1 year, 9 months ago
  80. fc521f9 Improve buffer handling in WebGPU sample. (#14163) by Scott Todd · 1 year, 9 months ago
  81. 69eb9ca Remove the ukernels standalone plugin (#14339) by bjacob · 1 year, 9 months ago
  82. 5d96935 Fix warnings that `${arch}-unknown-unknown-eabi-elf` was not a correct triple (#14340) by bjacob · 1 year, 9 months ago
  83. fe87604 [metal] NFC: Simplify file names by dropping `metal_` prefix (#14270) by Lei Zhang · 1 year, 9 months ago
  84. e737184 Tag CTS tests with the driver they use (#14170) by Geoffrey Martin-Noble · 1 year, 9 months ago
  85. be24f02 Use Black to format Python files (#14161) by Jakub Kuderski · 1 year, 9 months ago
  86. 05c9b0d Use cuGetProcAddress to load CUDA entry points (#14056) by Trevor Morris · 1 year, 9 months ago
  87. df119bd Test the bring-your-own-LLVM path. (#14035) by bjacob · 1 year, 9 months ago
  88. 60b0764 Allow defining `IREE_HOST_SIZE_T` to other types. (#14040) by Scott Todd · 1 year, 9 months ago
  89. 028acfb [metal] Improve error handling in command buffer create/destroy by Lei Zhang · 1 year, 10 months ago
  90. 7c82a3d [metal] Avoid resource set leak in queue execution by Lei Zhang · 1 year, 10 months ago
  91. 3097b3a [metal] Use pipeline layout to query set and binding count by Lei Zhang · 1 year, 10 months ago
  92. 9c384b6 [metal] Unify pipeline object creation in MTLLibrary and source paths by Lei Zhang · 1 year, 10 months ago
  93. f7d9642 [metal] Use the last command buffer for semaphore signaling by Lei Zhang · 1 year, 10 months ago
  94. 1fa5da6 [metal] Use one resource set to handle queue execution resources by Lei Zhang · 1 year, 10 months ago
  95. eba9f5a [metal] Manage staging buffer refcount in command buffer lifetime by Lei Zhang · 1 year, 10 months ago
  96. d7fb981 [metal] Use the kernel layout to query push constant count by Lei Zhang · 1 year, 10 months ago
  97. e8679ad [metal] NFC: Make code in buffer fill less branchy by Lei Zhang · 1 year, 10 months ago
  98. ec93093 [metal] Cache a command buffer descriptor in device to deduplicate by Lei Zhang · 1 year, 10 months ago
  99. 02c14aa [metal] Improve order in device creation by Lei Zhang · 1 year, 10 months ago
  100. 52a8d0c [metal] Return early with IREE macro to flatten status check by Lei Zhang · 1 year, 10 months ago