- e737184 Tag CTS tests with the driver they use (#14170) by Geoffrey Martin-Noble · 1 year, 9 months ago
- be24f02 Use Black to format Python files (#14161) by Jakub Kuderski · 1 year, 9 months ago
- 05c9b0d Use cuGetProcAddress to load CUDA entry points (#14056) by Trevor Morris · 1 year, 9 months ago
- df119bd Test the bring-your-own-LLVM path. (#14035) by bjacob · 1 year, 9 months ago
- 60b0764 Allow defining `IREE_HOST_SIZE_T` to other types. (#14040) by Scott Todd · 1 year, 9 months ago
- 028acfb [metal] Improve error handling in command buffer create/destroy by Lei Zhang · 1 year, 10 months ago
- 7c82a3d [metal] Avoid resource set leak in queue execution by Lei Zhang · 1 year, 10 months ago
- 3097b3a [metal] Use pipeline layout to query set and binding count by Lei Zhang · 1 year, 10 months ago
- 9c384b6 [metal] Unify pipeline object creation in MTLLibrary and source paths by Lei Zhang · 1 year, 10 months ago
- f7d9642 [metal] Use the last command buffer for semaphore signaling by Lei Zhang · 1 year, 10 months ago
- 1fa5da6 [metal] Use one resource set to handle queue execution resources by Lei Zhang · 1 year, 10 months ago
- eba9f5a [metal] Manage staging buffer refcount in command buffer lifetime by Lei Zhang · 1 year, 10 months ago
- d7fb981 [metal] Use the kernel layout to query push constant count by Lei Zhang · 1 year, 10 months ago
- e8679ad [metal] NFC: Make code in buffer fill less branchy by Lei Zhang · 1 year, 10 months ago
- ec93093 [metal] Cache a command buffer descriptor in device to deduplicate by Lei Zhang · 1 year, 10 months ago
- 02c14aa [metal] Improve order in device creation by Lei Zhang · 1 year, 10 months ago
- 52a8d0c [metal] Return early with IREE macro to flatten status check by Lei Zhang · 1 year, 10 months ago
- 4faf549 [metal] Check and return failure earlier in buffer allocation by Lei Zhang · 1 year, 10 months ago
- bba889c [metal] Keep track of queue in buffer construction for macOS by Lei Zhang · 1 year, 10 months ago
- 54818a3 [metal] Order host_allocator and const-ify various query APIs by Lei Zhang · 1 year, 10 months ago
- 6f32048 [metal] Use string view and byte span for compilation functions by Lei Zhang · 1 year, 10 months ago
- d90e80f [metal] Use separate lists for different descriptor sets by Lei Zhang · 1 year, 11 months ago
- d4aef98 [metal] NFC: Create struct for state-related command buffer fields by Lei Zhang · 1 year, 11 months ago
- e32cfa6 [metal] Add some TODOs for expected changes in command buffer by Lei Zhang · 1 year, 11 months ago
- aba08b4 [metal] Add technical details README file by Lei Zhang · 1 year, 11 months ago
- f598fd2 [metal] Use staging buffer for argument buffers and update sources by Lei Zhang · 1 year, 11 months ago
- 0ec791e [metal] Construct argument buffer at dispatch recording time by Lei Zhang · 1 year, 11 months ago
- 4307ba2 [metal] Drop some unnecessary error checks by Lei Zhang · 1 year, 11 months ago
- c0ad0ea [metal] Switch to use command segments for recording command buffer by Lei Zhang · 1 year, 11 months ago
- ffb40b1 [metal] Signal and wait MTLEvent for execution only barriers by Lei Zhang · 1 year, 11 months ago
- 30d52cd [metal] Use MTLEvent for synchronizing when switching encoders by Lei Zhang · 1 year, 11 months ago
- 2412d43 [metal] Implement buffer invalidate range for managed storage by Lei Zhang · 1 year, 11 months ago
- 4ce8551 [metal] Tidy up buffer compatibility and storage mode management by Lei Zhang · 1 year, 11 months ago
- ab5fed3 [metal] Use iree_status_t annotation for compute pipeline errors by Lei Zhang · 2 years ago
- c4d9cda [metal] Use resource set to manage wait/signal semaphores by Lei Zhang · 2 years ago
- a56531f [metal] Return iree_ok_status for device queue flush by Lei Zhang · 2 years ago
- f231e81 [metal] Drop IREE HAL event via Metal fence implementation by Lei Zhang · 2 years ago
- ab848fc [metal] Enable real async execution on GPU by Lei Zhang · 2 years ago
- 59e67a7 [metal] Enable compiling to Metal library when possible by Lei Zhang · 2 years ago
- 6e93f00 [metal] Retain MTLSharedEvent in wait/signal command buffers by Lei Zhang · 2 years ago
- 696145c [metal] Upload initial data for non-shared storage mode buffers by Lei Zhang · 2 years ago
- 0adc461 [metal] Use MTLFence to synchronize encoders in command buffer by Lei Zhang · 2 years, 1 month ago
- 3f64a11 [metal] Add option for strong resource reference in command buffers by Lei Zhang · 2 years, 1 month ago
- a658aa5 [metal] Specify dispatch queue QoS class as USER_INITIATED by Lei Zhang · 2 years, 1 month ago
- 6c2477b [metal] Add option to enable resource hazard tracking by Lei Zhang · 2 years, 1 month ago
- 64588d6 [metal] Support profiling via GPU frame captures by Lei Zhang · 2 years, 1 month ago
- 1787641 [metal] Add option to allow serial command dispatch for debugging by Lei Zhang · 2 years, 1 month ago
- 77da1d2 [metal] Enable passing StableHLO/TOSA op end-to-end tests by Lei Zhang · 2 years, 1 month ago
- 59c4699 [metal] Add builtin executable for copy unaligned buffers by Lei Zhang · 2 years, 1 month ago
- 91bd464 [metal] Use semaphore for implementing queue alloca/dealloca by Lei Zhang · 2 years, 1 month ago
- ecd5573 [metal] Delete explicit CTS test lists given all are passing now by Lei Zhang · 2 years, 1 month ago
- 182db9d [metal] Add builtin executables for polyfilling buffer fills by Lei Zhang · 2 years, 1 month ago
- c7d0346 [metal] Implement IREE event APIs with Metal fence by Lei Zhang · 2 years, 1 month ago
- 063e1fa [metal] Implement command buffer fill/copy/update buffer cases by Lei Zhang · 2 years, 1 month ago
- 30f6a39 [metal] Support push constants in compilers and runtime by Lei Zhang · 2 years, 1 month ago
- df1e9a2 [metal] Support one-shot command buffer dispatch and barrier APIs by Lei Zhang · 2 years, 1 month ago
- ca2f738 [metal] Add device parameters to driver/device creation APIs by Lei Zhang · 2 years, 1 month ago
- a1defb4 [metal] Support loading executables and compiling kernels by Lei Zhang · 2 years, 1 month ago
- 9b47492 [metal] Implement descriptor set and pipeline layout APIs by Lei Zhang · 2 years, 1 month ago
- 33df9dd [metal] Use Metal shared event to implement IREE semaphore APIs by Lei Zhang · 2 years, 2 months ago
- f9802f0 [metal] Wire up creating devices, allocators, and buffers by Lei Zhang · 2 years, 2 months ago
- 8a2330f [metal] Implement Metal allocator and buffer APIs by Lei Zhang · 2 years, 2 months ago
- 8f36024 [metal] Add build and registration for a Metal HAL driver by Lei Zhang · 2 years, 2 months ago
- 85eb21b [cuda] Port over tracing utilities and use in NCCL channel (#14063) by Lei Zhang · 1 year, 10 months ago
- 85a1a56 [cuda] Port over native executable and its cache (#14062) by Lei Zhang · 1 year, 10 months ago
- 330771e [cuda] Port over channel implementation via NCCL (#14059) by Lei Zhang · 1 year, 10 months ago
- 14308b1 Cleaning up the iree/base/tracing.h header a bit. (#14089) by Ben Vanik · 1 year, 10 months ago
- 31c2d24 [cuda] Dump more synchronization related attributes (#14074) by Lei Zhang · 1 year, 10 months ago
- e24089d [WebGPU] Async, loop-based invoke and output. (#13962) by Scott Todd · 1 year, 10 months ago
- 45a3eb4 [cuda] Port over descriptor set and pipeline layout (#14038) by Lei Zhang · 1 year, 10 months ago
- cc43680 Cleaning up the tracing.h mechanism to enable alternative implementations. (#14044) by Ben Vanik · 1 year, 10 months ago
- c4e01e9 [cuda] Wire up basic creating devices, allocators, and buffers (#14011) by Lei Zhang · 1 year, 10 months ago
- e4c27f5 Support non-default sets of enabled LLVM CPU targets. (#13983) by bjacob · 1 year, 10 months ago
- 1b8e95c Allowing for sync allocations to be deallocated via queue-ordered deallocas. (#14029) by Ben Vanik · 1 year, 10 months ago
- 5c0fc03 Adding a fallback for when CUDA memory pools are unsupported. (#14018) by Ben Vanik · 1 year, 10 months ago
- 7f10fe2 Adding IREE_HAL_EXTERNAL_BUFFER_TYPE_DEVICE_ALLOCATION. (#14012) by Ben Vanik · 1 year, 10 months ago
- 975ba03 Adds support for mixed precision NVIDIA A100 Tensor Cores (F32 <= F16 * F16 + F32) (#13857) by Manish Gupta · 1 year, 10 months ago
- 6a61d9f [cuda] Dump whether the device has integrated memory (#13986) by Lei Zhang · 1 year, 10 months ago
- 1299742 [cuda] Port over allocator and buffer implementation (#13985) by Lei Zhang · 1 year, 10 months ago
- 5da8c71 Add `iree-stream-resource-alias-mutable-bindings` flag. (#13965) by Scott Todd · 1 year, 10 months ago
- 6d60a12 Add WebGPU sample application and update other web demos. by Scott Todd · 1 year, 10 months ago
- e7c2cba Initial WebGPU HAL implementation. by Ben Vanik · 3 years, 8 months ago
- 508247f Add CODEOWNERS for experimental directories (#13956) by Geoffrey Martin-Noble · 1 year, 10 months ago
- 686860c [cuda] Dump useful GPU characteristics (#13955) by Lei Zhang · 1 year, 10 months ago
- 6ab4570 [cuda] NFC: Split files for CUDA and NCCL dynamic symbols (#13954) by Lei Zhang · 1 year, 10 months ago
- 5c38bcc [cuda] Implement basics for a CUDA HAL driver rewrite (#13942) by Lei Zhang · 1 year, 10 months ago
- df71589 [StableHLO] Migrate samples to StableHLO (#13916) by Jakub Kuderski · 1 year, 10 months ago
- 950e172 Deprecate MHLO input conversion pipeline (#13870) by Jakub Kuderski · 1 year, 10 months ago
- 42be722 fix the `cpu_ukernel` standalone plugin: was missing `weak.c` (#13850) by bjacob · 1 year, 10 months ago
- 67555c0 Drop conditionals and configured headers from the ukernels build (#13834) by bjacob · 1 year, 10 months ago
- 29647b3 CPU ukernels as bitcode (x86-only for now) (#13460) by MaheshRavishankar · 1 year, 10 months ago
- c279dbd Initial IREE Dispatch Profiler workflow (#13585) by Julian Walker · 1 year, 10 months ago
- e7b8111 Swapping context/params order on CPU import functions. (#13600) by Ben Vanik · 1 year, 11 months ago
- 615fddf Let ukernel plugin entry points alias the underlying ukernel (#13597) by bjacob · 1 year, 11 months ago
- bb21d92 Fix many broken links across code and docs. (#13592) by Scott Todd · 1 year, 11 months ago
- a1d7e66 Finish removing `iree-tools-xla` Python package. (#13563) by Scott Todd · 1 year, 11 months ago
- e2aa9f2 [IREE dispatch profiler] Separate iree build folder and generated folder (#13536) by Manish Gupta · 1 year, 11 months ago
- cb31cbd Adding Batch Matmul and Matmul with Split-K to IREE Dispatch Profiler (#13396) by Manish Gupta · 1 year, 11 months ago
- 3dc368e Builtin ukernels as system/standalone plugins (#13433) by bjacob · 1 year, 11 months ago
- 75cbdf8 Removing iree_hal_command_buffer_dyn_cast from the HAL. (#13408) by Ben Vanik · 1 year, 11 months ago