- d71c147 Refresh website branding. (#16151) by Scott Todd · 1 year, 2 months ago
- c859e29 Fix web and Colab sample CI builds. (#16155) by Scott Todd · 1 year, 2 months ago
- 51c30ab e2e microkernel pipeline + argmax ukernel on ROCM backend. (#15943) by Stanley Winata · 1 year, 2 months ago
- ddccda0 [HIP] Add macro for HIP build deps update (#16123) by Nithin Meganathan · 1 year, 2 months ago
- 171e31c [cuda] Move to hal/drivers and wire up BUILD files (#14620) by Lei Zhang · 1 year, 3 months ago
- 74d1f01 [cuda] Break cyclic retain between device and device event pool (#16088) by Lei Zhang · 1 year, 3 months ago
- 381a16c [cuda] Fix deadlock when advancing deferred queue in driver thread (#15673) by Lei Zhang · 1 year, 3 months ago
- 182a8f3 [HIP] Adds graph command buffer & descriptor set and pipeline layout (#15910) by Nithin Meganathan · 1 year, 3 months ago
- b92ceb4 Remove SYSTEM scope from transitive includes. (#16018) by Stella Laurenzo · 1 year, 3 months ago
- f97aa4d [HIP] Adds support for native executable and cache (#15937) by Nithin Meganathan · 1 year, 3 months ago
- b5f1a83 Fix experimental/web[gpu] builds after HAL changes. (#15925) by Scott Todd · 1 year, 3 months ago
- f81f361 Removing transfer_range from the HAL device vtable. (#15919) by Ben Vanik · 1 year, 3 months ago
- 6e811ff [HIP] Adds support for creating devices (#15887) by Nithin Meganathan · 1 year, 3 months ago
- 62c4f98 Replacing cpuinfo on Mac and adding support for E/P cores. (#15891) by Ben Vanik · 1 year, 3 months ago
- 5744284 [cuda] Remove redundant memory initialization during device creation (#15899) by Nithin Meganathan · 1 year, 3 months ago
- bb91946 [HIP] Adds buffer and allocator implementations (#15791) by Nithin Meganathan · 1 year, 4 months ago
- 38878ee Relax NCCL version constraints (#14633) by Boian Petkantchin · 1 year, 4 months ago
- ef51eb6 [HIP] Adds basics to implement HIP HAL driver (#15506) by Nithin Meganathan · 1 year, 4 months ago
- e799ae9 Adjust pkgci logging. (#15609) by Scott Todd · 1 year, 4 months ago
- 2eda767 Migrate tests and benchmarks from `--iree-llvmcpu-enable-microkernels` to `--iree-llvmcpu-enable-ukernels` (#15584) by bjacob · 1 year, 4 months ago
- 643b467 [cuda] Add command-line option to drop legacy sync mode (#15582) by Lei Zhang · 1 year, 4 months ago
- 522fac0 [cuda] Avoid sorting descriptors in stream command buffer (#15437) by Lei Zhang · 1 year, 4 months ago
- f2e3260 Create oneshot stream command buffer in pending_queue_actions by Lei Zhang · 1 year, 5 months ago
- 618c835 [cuda] Port over CUDA stream-based command buffer impl by Lei Zhang · 1 year, 5 months ago
- e3671a5 NFC: Rename cuda to cuda2 by Lei Zhang · 1 year, 5 months ago
- a02ff0e NFC: Copy over existing stream command buffer impl by Lei Zhang · 1 year, 5 months ago
- 573f5e9 Merge docs/developers into docs/website/. (#15396) by Scott Todd · 1 year, 5 months ago
- 6bbdb72 [cuda] Mark event related APIs as unimplemented (#15382) by Lei Zhang · 1 year, 5 months ago
- fd9cd2f Fix some minspec/optional feature bitrot. (#15378) by Stella Laurenzo · 1 year, 5 months ago
- 5223596 [cuda] Support building node DAG in graph command buffer (#14857) by Eugene Zhulenev · 1 year, 5 months ago
- 2706526 [ROCM] add device path and use it to setup device (#15234) by nirvedhmeshram · 1 year, 5 months ago
- 7b92a6d [cuda] Avoid sorting when composing kernel arguments (#15325) by Lei Zhang · 1 year, 5 months ago
- 8c34b97 Use custom iree.dev domain in links to documentation site. (#15036) by Scott Todd · 1 year, 5 months ago
- 63381a8 Switching external resources to be device-local only. (#14016) by Ben Vanik · 1 year, 5 months ago
- 82611a9 Making execution region results queue-ordered allocas. (#15149) by Ben Vanik · 1 year, 5 months ago
- cda49ca [rocm] Print GPU information when dumping device info (#15230) by Lei Zhang · 1 year, 5 months ago
- ebdb098 [experimental][regression] Add ROCM Regression test. (#14861) by Stanley Winata · 1 year, 5 months ago
- e023ea7 [rocm] Bundle HIP headers into a submodule and use that by default. (#15186) by Stella Laurenzo · 1 year, 5 months ago
- dd26475 [PkgCI] Add recipe for correctness on CPU (#15131) by Kunwar Grover · 1 year, 6 months ago
- 9e9aff0 [PkgCI] Add llama2_7b_i4 recipe for correctness testing on cuda (#15113) by Kunwar Grover · 1 year, 6 months ago
- c64b31f [PkgCI] Add tqdm bar while downloading artifacts (#15112) by Kunwar Grover · 1 year, 6 months ago
- 98d4f18 [PkgCI] Add llama2 recipe for NVIDIA A100 (#15093) by Kunwar Grover · 1 year, 6 months ago
- 3add457 Adding iree_io_file_handle_t placeholder. (#15101) by Ben Vanik · 1 year, 6 months ago
- ad64ecc [experimental][ROCM] Add shared memory support on ROCM RT and Target. (#15097) by Stanley Winata · 1 year, 6 months ago
- 1a63564 Refactor IREECodegenAttrs to use typed array parameters (#15032) by Benjamin Maxwell · 1 year, 6 months ago
- 04259d0 Integrate llvm-project at f66cd9e9556a53142a26a5c21a72e21f1579217c (#14980) by Stella Laurenzo · 1 year, 6 months ago
- dc503e2 Experimental distributed Python API (#14641) by Boian Petkantchin · 1 year, 7 months ago
- de4f4e5 update llama artifact (#14915) by Daniel Garvey · 1 year, 7 months ago
- 647a52e [cuda] Fix event_pool reference counting (#14900) by Eugene Zhulenev · 1 year, 7 months ago
- babd4d9 [cuda] Optimize device signal to host wait synchronization (#14876) by Lei Zhang · 1 year, 7 months ago
- 6e04816 [cuda] Fix segfault caused by CUevent outliving CUdevice (#14875) by Lei Zhang · 1 year, 7 months ago
- 15406a4 [experimental][rocm] Added tracing for rocm backend. (#14852) by Stanley Winata · 1 year, 7 months ago
- 585e5ca Move demotion passes to GlobalOptimization. (#14815) by Stella Laurenzo · 1 year, 7 months ago
- 82be925 Adding iree_hal_device_profiling_flush. (#14829) by Ben Vanik · 1 year, 7 months ago
- 85a5425 [cuda2] Fix device wait event leak in event semaphore (#14825) by Eugene Zhulenev · 1 year, 7 months ago
- eba7eac [cuda] Fix include to include the header file (#14813) by Lei Zhang · 1 year, 7 months ago
- 7d5f1d5 [cuda] Initialize resource immediately after allocation (#14812) by Lei Zhang · 1 year, 7 months ago
- 10f9e61 [cuda] Remove if ok status check nesting when possible (#14811) by Lei Zhang · 1 year, 7 months ago
- b76b6df Initial commit of package based CI and regression testing. (#14793) by Stella Laurenzo · 1 year, 7 months ago
- d016fac Cleanup references to Buildkite. (#14748) by Scott Todd · 1 year, 7 months ago
- f726c87 Build sample_webgpu in build_test_test_samples. (#14690) by Scott Todd · 1 year, 7 months ago
- 0056918 [ROCm] add CMake options to specify additional compile arguments for conformance tests by Boian Petkantchin · 1 year, 7 months ago
- 83e9839 [ROCm] Fix creation/destruction of HAL executable by Boian Petkantchin · 1 year, 7 months ago
- 1ad7390 [ROCm] fix driver name in conformance tests by Boian Petkantchin · 1 year, 8 months ago
- 5b72a02 Fixing Metal/ROCM buffer mapping validation. (#14699) by Ben Vanik · 1 year, 7 months ago
- 42b983c Removing initial_data from iree_hal_allocator_allocate_buffer. (#14674) by Ben Vanik · 1 year, 7 months ago
- f022d29 Reworking constant upload with a HAL file API. (#14665) by Ben Vanik · 1 year, 7 months ago
- e44ba91 A few fixes for compiling experimental/cuda2. (#14667) by Scott Todd · 1 year, 7 months ago
- 038d2ba Work around scalar support bug in webgpu compilation. (#14629) by Scott Todd · 1 year, 8 months ago
- d1d03cb [metal] Move to hal/drivers and default build for Apple silicon (#14129) by Lei Zhang · 1 year, 8 months ago
- b41df2f [metal] Retain iree_hal_device_t in command buffers (#14588) by Lei Zhang · 1 year, 8 months ago
- 0c58388 [cuda] Use proper symbols to fix compilation errors (#14523) by Lei Zhang · 1 year, 8 months ago
- 80d06c1 [CUDA] Export external CUDA buffers to external device allocation (#14491) by Eugene Zhulenev · 1 year, 8 months ago
- 8fe8689 [cuda] NFC: Remove no-op semaphore implementation (#14484) by Lei Zhang · 1 year, 8 months ago
- 1bb26cb [cuda] Implement HAL semaphore using CUevent objects (#14426) by Lei Zhang · 1 year, 8 months ago
- 280b14b Remove explicit Bazel public visibility from targets. (#14419) by Scott Todd · 1 year, 8 months ago
- dad3f33 [cuda] Enable various HAL CTS and e2e single-op tests (#14327) by Lei Zhang · 1 year, 9 months ago
- 69a5481 [cuda] Port over existing graph command buffer impl (#14326) by Lei Zhang · 1 year, 9 months ago
- 04beef2 [cuda] Port over existing semaphore impl (#14325) by Lei Zhang · 1 year, 9 months ago
- fc521f9 Improve buffer handling in WebGPU sample. (#14163) by Scott Todd · 1 year, 9 months ago
- 69eb9ca Remove the ukernels standalone plugin (#14339) by bjacob · 1 year, 9 months ago
- 5d96935 Fix warnings that `${arch}-unknown-unknown-eabi-elf` was not a correct triple (#14340) by bjacob · 1 year, 9 months ago
- fe87604 [metal] NFC: Simplify file names by dropping `metal_` prefix (#14270) by Lei Zhang · 1 year, 9 months ago
- e737184 Tag CTS tests with the driver they use (#14170) by Geoffrey Martin-Noble · 1 year, 9 months ago
- be24f02 Use Black to format Python files (#14161) by Jakub Kuderski · 1 year, 9 months ago
- 05c9b0d Use cuGetProcAddress to load CUDA entry points (#14056) by Trevor Morris · 1 year, 9 months ago
- df119bd Test the bring-your-own-LLVM path. (#14035) by bjacob · 1 year, 9 months ago
- 60b0764 Allow defining `IREE_HOST_SIZE_T` to other types. (#14040) by Scott Todd · 1 year, 9 months ago
- 028acfb [metal] Improve error handling in command buffer create/destroy by Lei Zhang · 1 year, 10 months ago
- 7c82a3d [metal] Avoid resource set leak in queue execution by Lei Zhang · 1 year, 10 months ago
- 3097b3a [metal] Use pipeline layout to query set and binding count by Lei Zhang · 1 year, 10 months ago
- 9c384b6 [metal] Unify pipeline object creation in MTLLibrary and source paths by Lei Zhang · 1 year, 10 months ago
- f7d9642 [metal] Use the last command buffer for semaphore signaling by Lei Zhang · 1 year, 10 months ago
- 1fa5da6 [metal] Use one resource set to handle queue execution resources by Lei Zhang · 1 year, 10 months ago
- eba9f5a [metal] Manage staging buffer refcount in command buffer lifetime by Lei Zhang · 1 year, 10 months ago
- d7fb981 [metal] Use the kernel layout to query push constant count by Lei Zhang · 1 year, 10 months ago
- e8679ad [metal] NFC: Make code in buffer fill less branchy by Lei Zhang · 1 year, 10 months ago
- ec93093 [metal] Cache a command buffer descriptor in device to deduplicate by Lei Zhang · 1 year, 10 months ago
- 02c14aa [metal] Improve order in device creation by Lei Zhang · 1 year, 10 months ago
- 52a8d0c [metal] Return early with IREE macro to flatten status check by Lei Zhang · 1 year, 10 months ago