1. e737184 Tag CTS tests with the driver they use (#14170) by Geoffrey Martin-Noble · 1 year, 9 months ago
  2. be24f02 Use Black to format Python files (#14161) by Jakub Kuderski · 1 year, 9 months ago
  3. 05c9b0d Use cuGetProcAddress to load CUDA entry points (#14056) by Trevor Morris · 1 year, 9 months ago
  4. df119bd Test the bring-your-own-LLVM path. (#14035) by bjacob · 1 year, 9 months ago
  5. 60b0764 Allow defining `IREE_HOST_SIZE_T` to other types. (#14040) by Scott Todd · 1 year, 9 months ago
  6. 028acfb [metal] Improve error handling in command buffer create/destroy by Lei Zhang · 1 year, 10 months ago
  7. 7c82a3d [metal] Avoid resource set leak in queue execution by Lei Zhang · 1 year, 10 months ago
  8. 3097b3a [metal] Use pipeline layout to query set and binding count by Lei Zhang · 1 year, 10 months ago
  9. 9c384b6 [metal] Unify pipeline object creation in MTLLibrary and source paths by Lei Zhang · 1 year, 10 months ago
  10. f7d9642 [metal] Use the last command buffer for semaphore signaling by Lei Zhang · 1 year, 10 months ago
  11. 1fa5da6 [metal] Use one resource set to handle queue execution resources by Lei Zhang · 1 year, 10 months ago
  12. eba9f5a [metal] Manage staging buffer refcount in command buffer lifetime by Lei Zhang · 1 year, 10 months ago
  13. d7fb981 [metal] Use the kernel layout to query push constant count by Lei Zhang · 1 year, 10 months ago
  14. e8679ad [metal] NFC: Make code in buffer fill less branchy by Lei Zhang · 1 year, 10 months ago
  15. ec93093 [metal] Cache a command buffer descriptor in device to deduplicate by Lei Zhang · 1 year, 10 months ago
  16. 02c14aa [metal] Improve order in device creation by Lei Zhang · 1 year, 10 months ago
  17. 52a8d0c [metal] Return early with IREE macro to flatten status check by Lei Zhang · 1 year, 10 months ago
  18. 4faf549 [metal] Check and return failure earlier in buffer allocation by Lei Zhang · 1 year, 10 months ago
  19. bba889c [metal] Keep track of queue in buffer construction for macOS by Lei Zhang · 1 year, 10 months ago
  20. 54818a3 [metal] Order host_allocator and const-ify various query APIs by Lei Zhang · 1 year, 10 months ago
  21. 6f32048 [metal] Use string view and byte span for compilation functions by Lei Zhang · 1 year, 10 months ago
  22. d90e80f [metal] Use separate lists for different descriptor sets by Lei Zhang · 1 year, 11 months ago
  23. d4aef98 [metal] NFC: Create struct for state-related command buffer fields by Lei Zhang · 1 year, 11 months ago
  24. e32cfa6 [metal] Add some TODOs for expected changes in command buffer by Lei Zhang · 1 year, 11 months ago
  25. aba08b4 [metal] Add technical details README file by Lei Zhang · 1 year, 11 months ago
  26. f598fd2 [metal] Use staging buffer for argument buffers and update sources by Lei Zhang · 1 year, 11 months ago
  27. 0ec791e [metal] Construct argument buffer at dispatch recording time by Lei Zhang · 1 year, 11 months ago
  28. 4307ba2 [metal] Drop some unnecessary error checks by Lei Zhang · 1 year, 11 months ago
  29. c0ad0ea [metal] Switch to use command segments for recording command buffer by Lei Zhang · 1 year, 11 months ago
  30. ffb40b1 [metal] Signal and wait MTLEvent for execution only barriers by Lei Zhang · 1 year, 11 months ago
  31. 30d52cd [metal] Use MTLEvent for synchronizing when switching encoders by Lei Zhang · 1 year, 11 months ago
  32. 2412d43 [metal] Implement buffer invalidate range for managed storage by Lei Zhang · 1 year, 11 months ago
  33. 4ce8551 [metal] Tidy up buffer compatibility and storage mode management by Lei Zhang · 1 year, 11 months ago
  34. ab5fed3 [metal] Use iree_status_t annotation for compute pipeline errors by Lei Zhang · 2 years ago
  35. c4d9cda [metal] Use resource set to manage wait/signal semaphores by Lei Zhang · 2 years ago
  36. a56531f [metal] Return iree_ok_status for device queue flush by Lei Zhang · 2 years ago
  37. f231e81 [metal] Drop IREE HAL event via Metal fence implementation by Lei Zhang · 2 years ago
  38. ab848fc [metal] Enable real async execution on GPU by Lei Zhang · 2 years ago
  39. 59e67a7 [metal] Enable compiling to Metal library when possible by Lei Zhang · 2 years ago
  40. 6e93f00 [metal] Retain MTLSharedEvent in wait/signal command buffers by Lei Zhang · 2 years ago
  41. 696145c [metal] Upload initial data for non-shared storage mode buffers by Lei Zhang · 2 years ago
  42. 0adc461 [metal] Use MTLFence to synchronize encoders in command buffer by Lei Zhang · 2 years, 1 month ago
  43. 3f64a11 [metal] Add option for strong resource reference in command buffers by Lei Zhang · 2 years, 1 month ago
  44. a658aa5 [metal] Specify dispatch queue QoS class as USER_INITIATED by Lei Zhang · 2 years, 1 month ago
  45. 6c2477b [metal] Add option to enable resource hazard tracking by Lei Zhang · 2 years, 1 month ago
  46. 64588d6 [metal] Support profiling via GPU frame captures by Lei Zhang · 2 years, 1 month ago
  47. 1787641 [metal] Add option to allow serial command dispatch for debugging by Lei Zhang · 2 years, 1 month ago
  48. 77da1d2 [metal] Enable passing StableHLO/TOSA op end-to-end tests by Lei Zhang · 2 years, 1 month ago
  49. 59c4699 [metal] Add builtin executable for copy unaligned buffers by Lei Zhang · 2 years, 1 month ago
  50. 91bd464 [metal] Use semaphore for implementing queue alloca/dealloca by Lei Zhang · 2 years, 1 month ago
  51. ecd5573 [metal] Delete explicit CTS test lists given all are passing now by Lei Zhang · 2 years, 1 month ago
  52. 182db9d [metal] Add builtin executables for polyfilling buffer fills by Lei Zhang · 2 years, 1 month ago
  53. c7d0346 [metal] Implement IREE event APIs with Metal fence by Lei Zhang · 2 years, 1 month ago
  54. 063e1fa [metal] Implement command buffer fill/copy/update buffer cases by Lei Zhang · 2 years, 1 month ago
  55. 30f6a39 [metal] Support push constants in compilers and runtime by Lei Zhang · 2 years, 1 month ago
  56. df1e9a2 [metal] Support one-shot command buffer dispatch and barrier APIs by Lei Zhang · 2 years, 1 month ago
  57. ca2f738 [metal] Add device parameters to driver/device creation APIs by Lei Zhang · 2 years, 1 month ago
  58. a1defb4 [metal] Support loading executables and compiling kernels by Lei Zhang · 2 years, 1 month ago
  59. 9b47492 [metal] Implement descriptor set and pipeline layout APIs by Lei Zhang · 2 years, 1 month ago
  60. 33df9dd [metal] Use Metal shared event to implement IREE semaphore APIs by Lei Zhang · 2 years, 2 months ago
  61. f9802f0 [metal] Wire up creating devices, allocators, and buffers by Lei Zhang · 2 years, 2 months ago
  62. 8a2330f [metal] Implement Metal allocator and buffer APIs by Lei Zhang · 2 years, 2 months ago
  63. 8f36024 [metal] Add build and registration for a Metal HAL driver by Lei Zhang · 2 years, 2 months ago
  64. 85eb21b [cuda] Port over tracing utilities and use in NCCL channel (#14063) by Lei Zhang · 1 year, 10 months ago
  65. 85a1a56 [cuda] Port over native executable and its cache (#14062) by Lei Zhang · 1 year, 10 months ago
  66. 330771e [cuda] Port over channel implementation via NCCL (#14059) by Lei Zhang · 1 year, 10 months ago
  67. 14308b1 Cleaning up the iree/base/tracing.h header a bit. (#14089) by Ben Vanik · 1 year, 10 months ago
  68. 31c2d24 [cuda] Dump more synchronization related attributes (#14074) by Lei Zhang · 1 year, 10 months ago
  69. e24089d [WebGPU] Async, loop-based invoke and output. (#13962) by Scott Todd · 1 year, 10 months ago
  70. 45a3eb4 [cuda] Port over descriptor set and pipeline layout (#14038) by Lei Zhang · 1 year, 10 months ago
  71. cc43680 Cleaning up the tracing.h mechanism to enable alternative implementations. (#14044) by Ben Vanik · 1 year, 10 months ago
  72. c4e01e9 [cuda] Wire up basic creating devices, allocators, and buffers (#14011) by Lei Zhang · 1 year, 10 months ago
  73. e4c27f5 Support non-default sets of enabled LLVM CPU targets. (#13983) by bjacob · 1 year, 10 months ago
  74. 1b8e95c Allowing for sync allocations to be deallocated via queue-ordered deallocas. (#14029) by Ben Vanik · 1 year, 10 months ago
  75. 5c0fc03 Adding a fallback for when CUDA memory pools are unsupported. (#14018) by Ben Vanik · 1 year, 10 months ago
  76. 7f10fe2 Adding IREE_HAL_EXTERNAL_BUFFER_TYPE_DEVICE_ALLOCATION. (#14012) by Ben Vanik · 1 year, 10 months ago
  77. 975ba03 Adds support for mixed precision NVIDIA A100 Tensor Cores (F32 <= F16 * F16 + F32) (#13857) by Manish Gupta · 1 year, 10 months ago
  78. 6a61d9f [cuda] Dump whether the device has integrated memory (#13986) by Lei Zhang · 1 year, 10 months ago
  79. 1299742 [cuda] Port over allocator and buffer implementation (#13985) by Lei Zhang · 1 year, 10 months ago
  80. 5da8c71 Add `iree-stream-resource-alias-mutable-bindings` flag. (#13965) by Scott Todd · 1 year, 10 months ago
  81. 6d60a12 Add WebGPU sample application and update other web demos. by Scott Todd · 1 year, 10 months ago
  82. e7c2cba Initial WebGPU HAL implementation. by Ben Vanik · 3 years, 8 months ago
  83. 508247f Add CODEOWNERS for experimental directories (#13956) by Geoffrey Martin-Noble · 1 year, 10 months ago
  84. 686860c [cuda] Dump useful GPU characteristics (#13955) by Lei Zhang · 1 year, 10 months ago
  85. 6ab4570 [cuda] NFC: Split files for CUDA and NCCL dynamic symbols (#13954) by Lei Zhang · 1 year, 10 months ago
  86. 5c38bcc [cuda] Implement basics for a CUDA HAL driver rewrite (#13942) by Lei Zhang · 1 year, 10 months ago
  87. df71589 [StableHLO] Migrate samples to StableHLO (#13916) by Jakub Kuderski · 1 year, 10 months ago
  88. 950e172 Deprecate MHLO input conversion pipeline (#13870) by Jakub Kuderski · 1 year, 10 months ago
  89. 42be722 fix the `cpu_ukernel` standalone plugin: was missing `weak.c` (#13850) by bjacob · 1 year, 10 months ago
  90. 67555c0 Drop conditionals and configured headers from the ukernels build (#13834) by bjacob · 1 year, 10 months ago
  91. 29647b3 CPU ukernels as bitcode (x86-only for now) (#13460) by MaheshRavishankar · 1 year, 10 months ago
  92. c279dbd Initial IREE Dispatch Profiler workflow (#13585) by Julian Walker · 1 year, 10 months ago
  93. e7b8111 Swapping context/params order on CPU import functions. (#13600) by Ben Vanik · 1 year, 11 months ago
  94. 615fddf Let ukernel plugin entry points alias the underlying ukernel (#13597) by bjacob · 1 year, 11 months ago
  95. bb21d92 Fix many broken links across code and docs. (#13592) by Scott Todd · 1 year, 11 months ago
  96. a1d7e66 Finish removing `iree-tools-xla` Python package. (#13563) by Scott Todd · 1 year, 11 months ago
  97. e2aa9f2 [IREE dispatch profiler] Separate iree build folder and generated folder (#13536) by Manish Gupta · 1 year, 11 months ago
  98. cb31cbd Adding Batch Matmul and Matmul with Split-K to IREE Dispatch Profiler (#13396) by Manish Gupta · 1 year, 11 months ago
  99. 3dc368e Builtin ukernels as system/standalone plugins (#13433) by bjacob · 1 year, 11 months ago
  100. 75cbdf8 Removing iree_hal_command_buffer_dyn_cast from the HAL. (#13408) by Ben Vanik · 1 year, 11 months ago